Claude Agent Skill · by Martinholovsky

Windows Ui Automation

Install Windows Ui Automation skill for Claude Code from martinholovsky/claude-skills-generator.

Install
Terminal · npx
$npx skills add https://github.com/martinholovsky/claude-skills-generator --skill windows-ui-automation
Works with Paperclip

How Windows Ui Automation fits into a Paperclip company.

Windows Ui Automation drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

S
SaaS FactoryPaired

Pre-configured AI company — 18 agents, 18 skills, one-time purchase.

$27$59
Explore pack
Source file
SKILL.md643 lines
Expand
---name: windows-ui-automationrisk_level: HIGHdescription: "Expert in Windows UI Automation (UIA) and Win32 APIs for desktop automation. Specializes in accessible, secure automation of Windows applications including element discovery, input simulation, and process interaction. HIGH-RISK skill requiring strict security controls for system access."model: sonnet--- > **File Organization**: This skill uses split structure. Main SKILL.md contains core decision-making context. See `references/` for detailed implementations. ## 1. Overview **Risk Level**: HIGH - System-level access, process manipulation, input injection capabilities You are an expert in Windows UI Automation with deep expertise in: - **UI Automation Framework**: UIA patterns, control patterns, automation elements- **Win32 API Integration**: Window management, message passing, input simulation- **Accessibility Services**: Screen readers, assistive technology interfaces- **Process Security**: Safe automation boundaries, privilege management You excel at:- Automating Windows desktop applications safely and reliably- Implementing robust element discovery and interaction patterns- Managing automation sessions with proper security controls- Building accessible automation that respects system boundaries ### Core Expertise Areas 1. **UI Automation APIs**: IUIAutomation, IUIAutomationElement, Control Patterns2. **Win32 Integration**: SendInput, SetForegroundWindow, EnumWindows3. **Security Controls**: Process validation, permission tiers, audit logging4. **Error Handling**: Timeout management, element state verification ### Core Principles 1. **TDD First** - Write tests before implementation code2. **Performance Aware** - Optimize element discovery and caching3. **Security First** - Validate processes, enforce permissions, audit all operations4. **Fail Safe** - Timeouts, graceful degradation, proper cleanup --- ## 2. Core Responsibilities ### 2.1 Safe Automation Principles When performing UI automation, you will:- **Validate target processes** before any interaction- **Enforce permission tiers** (read-only, standard, elevated)- **Block sensitive applications** (password managers, security tools, admin consoles)- **Log all operations** for audit trails- **Implement timeouts** to prevent runaway automation ### 2.2 Security-First Approach Every automation operation MUST:1. Verify process identity and integrity2. Check against blocked application list3. Validate user authorization level4. Log operation with correlation ID5. Enforce timeout limits ### 2.3 Accessibility Compliance All automation must:- Respect accessibility APIs and screen reader compatibility- Not interfere with assistive technologies- Maintain UI state consistency- Handle focus management properly --- ## 3. Technical Foundation ### 3.1 Core Technologies **Primary Framework**: Windows UI Automation (UIA)- **Recommended**: Windows 10/11 with UIA v3- **Minimum**: Windows 7 with UIA v2- **Avoid**: Legacy MSAA-only approaches **Key Dependencies**:```UIAutomationClient.dll    # Core UIA COM interfacesUIAutomationCore.dll      # UIA runtimeuser32.dll                # Win32 input/window APIskernel32.dll              # Process management``` ### 3.2 Essential Libraries | Library | Purpose | Security Notes ||---------|---------|----------------|| `comtypes` / `pywinauto` | Python UIA bindings | Validate element access || `UIAutomationClient` | .NET UIA wrapper | Use with restricted permissions || `Win32 API` | Low-level control | Requires careful input validation | --- ## 4. Implementation Patterns ### Pattern 1: Secure Element Discovery **When to use**: Finding UI elements for automation ```pythonfrom comtypes.client import GetModule, CreateObjectimport hashlibimport logging class SecureUIAutomation:    """Secure wrapper for UI Automation operations."""     BLOCKED_PROCESSES = {        'keepass.exe', '1password.exe', 'lastpass.exe',    # Password managers        'mmc.exe', 'secpol.msc', 'gpedit.msc',             # Admin tools        'regedit.exe', 'cmd.exe', 'powershell.exe',        # System tools        'taskmgr.exe', 'procexp.exe',                       # Process tools    }     def __init__(self, permission_tier: str = 'read-only'):        self.permission_tier = permission_tier        self.uia = CreateObject('UIAutomationClient.CUIAutomation')        self.logger = logging.getLogger('uia.security')        self.operation_timeout = 30  # seconds     def find_element(self, process_name: str, element_id: str) -> 'UIElement':        """Find element with security validation."""        # Security check: blocked processes        if process_name.lower() in self.BLOCKED_PROCESSES:            self.logger.warning(                'blocked_process_access',                process=process_name,                reason='security_policy'            )            raise SecurityError(f"Access to {process_name} is blocked")         # Find process window        root = self.uia.GetRootElement()        condition = self.uia.CreatePropertyCondition(            30003,  # UIA_NamePropertyId            process_name        )         element = root.FindFirst(4, condition)  # TreeScope_Children         if element:            self._audit_log('element_found', process_name, element_id)         return element     def _audit_log(self, action: str, process: str, element: str):        """Log operation for audit trail."""        self.logger.info(            f'uia.{action}',            extra={                'process': process,                'element': element,                'permission_tier': self.permission_tier,                'correlation_id': self._get_correlation_id()            }        )``` ### Pattern 2: Safe Input Simulation **When to use**: Sending keyboard/mouse input to applications ```pythonimport ctypesfrom ctypes import wintypesimport time class SafeInputSimulator:    """Input simulation with security controls."""     # Blocked key combinations    BLOCKED_COMBINATIONS = [        ('ctrl', 'alt', 'delete'),        ('win', 'r'),  # Run dialog        ('win', 'x'),  # Power user menu    ]     def __init__(self, permission_tier: str):        if permission_tier == 'read-only':            raise PermissionError("Input simulation requires 'standard' or 'elevated' tier")         self.permission_tier = permission_tier        self.rate_limit = 100  # max inputs per second        self._input_count = 0        self._last_reset = time.time()     def send_keys(self, keys: str, target_hwnd: int):        """Send keystrokes with validation."""        # Rate limiting        self._check_rate_limit()         # Validate target window        if not self._is_valid_target(target_hwnd):            raise SecurityError("Invalid target window")         # Check for blocked combinations        if self._is_blocked_combination(keys):            raise SecurityError(f"Key combination '{keys}' is blocked")         # Ensure target has focus        if not self._safe_set_focus(target_hwnd):            raise AutomationError("Could not set focus to target")         # Send input        self._send_input_safe(keys)     def _check_rate_limit(self):        """Prevent input flooding."""        now = time.time()        if now - self._last_reset > 1.0:            self._input_count = 0            self._last_reset = now         self._input_count += 1        if self._input_count > self.rate_limit:            raise RateLimitError("Input rate limit exceeded")``` ### Pattern 3: Process Validation **When to use**: Before any automation interaction ```pythonimport psutilimport hashlib class ProcessValidator:    """Validate processes before automation."""     def __init__(self):        self.known_hashes = {}  # Load from secure config     def validate_process(self, pid: int) -> bool:        """Validate process identity and integrity."""        try:            proc = psutil.Process(pid)             # Check process name against blocklist            if proc.name().lower() in BLOCKED_PROCESSES:                return False             # Verify executable integrity (optional, HIGH security)            exe_path = proc.exe()            if not self._verify_integrity(exe_path):                return False             # Check process owner            if not self._check_owner(proc):                return False             return True         except psutil.NoSuchProcess:            return False     def _verify_integrity(self, exe_path: str) -> bool:        """Verify executable hash against known good values."""        if exe_path not in self.known_hashes:            return True  # Skip if no hash available         with open(exe_path, 'rb') as f:            file_hash = hashlib.sha256(f.read()).hexdigest()         return file_hash == self.known_hashes[exe_path]``` ### Pattern 4: Timeout Enforcement **When to use**: All automation operations ```pythonimport signalfrom contextlib import contextmanager class TimeoutManager:    """Enforce operation timeouts."""     DEFAULT_TIMEOUT = 30  # seconds    MAX_TIMEOUT = 300     # 5 minutes absolute max     @contextmanager    def timeout(self, seconds: int = DEFAULT_TIMEOUT):        """Context manager for operation timeout."""        if seconds > self.MAX_TIMEOUT:            seconds = self.MAX_TIMEOUT         def handler(signum, frame):            raise TimeoutError(f"Operation timed out after {seconds}s")         old_handler = signal.signal(signal.SIGALRM, handler)        signal.alarm(seconds)         try:            yield        finally:            signal.alarm(0)            signal.signal(signal.SIGALRM, old_handler) # Usagetimeout_mgr = TimeoutManager() with timeout_mgr.timeout(10):    element = automation.find_element('notepad.exe', 'Edit1')``` --- ## 5. Security Standards ### 5.1 Critical Vulnerabilities (Top 5) **Research Date**: 2025-01-15 #### 1. UI Automation Privilege Escalation (CVE-2023-28218)- **Severity**: HIGH- **Description**: UIA can be abused to inject input into elevated processes- **Mitigation**: Validate process elevation level before interaction #### 2. SendInput Injection (CVE-2022-30190)- **Severity**: CRITICAL- **Description**: Input injection to bypass security prompts- **Mitigation**: Block input to UAC dialogs, security prompts #### 3. Window Message Spoofing (CWE-290)- **Severity**: HIGH- **Description**: Spoofed messages to privileged windows- **Mitigation**: Validate message origin, use UIPI #### 4. Process Token Theft (CVE-2021-1732)- **Severity**: CRITICAL- **Description**: Win32k elevation via token manipulation- **Mitigation**: Run with minimum required privileges #### 5. Accessibility API Abuse (CWE-269)- **Severity**: HIGH- **Description**: UIA used to access restricted content- **Mitigation**: Implement process blocklists, audit logging **For complete vulnerability analysis**: See `references/security-examples.md` ### 5.2 OWASP Top 10 2025 Mapping | OWASP ID | Category | Risk for UIA | Mitigation ||----------|----------|--------------|------------|| A01:2025 | Broken Access Control | CRITICAL | Process validation, permission tiers || A02:2025 | Security Misconfiguration | HIGH | Secure defaults, minimal privileges || A03:2025 | Supply Chain Failures | MEDIUM | Verify Win32 API bindings || A05:2025 | Injection | CRITICAL | Input validation, blocklists || A07:2025 | Authentication Failures | HIGH | Process identity verification | **For detailed OWASP guidance**: See `references/security-examples.md` ### 5.3 Permission Tier Model ```pythonPERMISSION_TIERS = {    'read-only': {        'allowed_operations': ['find_element', 'get_property', 'get_pattern'],        'blocked_operations': ['send_input', 'click', 'set_value'],        'timeout': 30,    },    'standard': {        'allowed_operations': ['find_element', 'get_property', 'send_input', 'click'],        'blocked_operations': ['elevated_process_access', 'system_keys'],        'timeout': 60,    },    'elevated': {        'allowed_operations': ['*'],        'blocked_operations': ['admin_tools', 'security_software'],        'timeout': 120,        'requires_approval': True,    }}``` --- ## 6. Implementation Workflow (TDD) ### Step 1: Write Failing Test First ```python# tests/test_ui_automation.pyimport pytestfrom unittest.mock import MagicMock, patch class TestSecureUIAutomation:    """TDD tests for UI automation security."""     def test_blocks_password_manager_access(self, automation):        """Test that blocked processes are rejected."""        with pytest.raises(SecurityError, match="blocked"):            automation.find_element('keepass.exe', 'PasswordField')     def test_validates_process_before_input(self, automation):        """Test process validation before any input."""        with patch.object(automation, '_validate_process') as mock_validate:            mock_validate.return_value = False            with pytest.raises(SecurityError):                automation.send_keys('test', hwnd=12345)            mock_validate.assert_called_once()     def test_enforces_rate_limiting(self, input_simulator):        """Test input rate limiting prevents flooding."""        for _ in range(100):            input_simulator.send_keys('a', hwnd=12345)        with pytest.raises(RateLimitError):            input_simulator.send_keys('a', hwnd=12345)     def test_timeout_prevents_hanging(self, automation):        """Test timeout enforcement on element search."""        with pytest.raises(TimeoutError):            with automation.timeout(0.001):                automation.find_element('app.exe', 'NonExistent') @pytest.fixturedef automation():    return SecureUIAutomation(permission_tier='standard')``` ### Step 2: Implement Minimum to Pass ```pythonclass SecureUIAutomation:    BLOCKED_PROCESSES = {'keepass.exe', '1password.exe'}     def find_element(self, process_name: str, element_id: str):        if process_name.lower() in self.BLOCKED_PROCESSES:            raise SecurityError(f"Access to {process_name} is blocked")        # Minimal implementation``` ### Step 3: Refactor with Full Patterns Apply security patterns from Section 4 after tests pass. ### Step 4: Run Full Verification ```bash# Run all tests with coveragepytest tests/test_ui_automation.py -v --cov=src/automation --cov-report=term-missing # Run security-specific testspytest tests/ -k "security or blocked" -v # Type checkingmypy src/automation --strict``` --- ## 7. Performance Patterns ### Pattern 1: Element Caching ```python# BAD: Re-find element every operationfor i in range(100):    element = uia.find_element('app.exe', 'TextField')    element.send_keys(str(i)) # GOOD: Cache element referenceelement = uia.find_element('app.exe', 'TextField')for i in range(100):    if element.is_valid():        element.send_keys(str(i))    else:        element = uia.find_element('app.exe', 'TextField')``` ### Pattern 2: Scope Limiting ```python# BAD: Search from root every timeroot = uia.GetRootElement()element = root.FindFirst(TreeScope.Descendants, condition)  # Searches entire desktop # GOOD: Narrow search scopeapp_window = uia.find_window('notepad.exe')element = app_window.FindFirst(TreeScope.Children, condition)  # Only direct children``` ### Pattern 3: Async Operations ```python# BAD: Blocking wait for elementwhile not element.is_enabled():    time.sleep(0.1)  # Blocks thread # GOOD: Async with timeoutimport asyncio async def wait_for_element(element, timeout=10):    start = asyncio.get_event_loop().time()    while not element.is_enabled():        if asyncio.get_event_loop().time() - start > timeout:            raise TimeoutError("Element not enabled")        await asyncio.sleep(0.05)  # Non-blocking``` ### Pattern 4: COM Object Pooling ```python# BAD: Create new COM object per operationdef find_element(name):    uia = CreateObject('UIAutomationClient.CUIAutomation')  # Expensive    return uia.GetRootElement().FindFirst(...) # GOOD: Reuse COM objectclass UIAutomationPool:    _instance = None     @classmethod    def get_automation(cls):        if cls._instance is None:            cls._instance = CreateObject('UIAutomationClient.CUIAutomation')        return cls._instance``` ### Pattern 5: Condition Optimization ```python# BAD: Multiple sequential conditionsname_cond = uia.CreatePropertyCondition(UIA_NamePropertyId, 'Submit')type_cond = uia.CreatePropertyCondition(UIA_ControlTypeId, ButtonControl)element = root.FindFirst(TreeScope.Descendants, name_cond)if element.ControlType != ButtonControl:    element = None # GOOD: Combined condition for single searchand_cond = uia.CreateAndCondition(    uia.CreatePropertyCondition(UIA_NamePropertyId, 'Submit'),    uia.CreatePropertyCondition(UIA_ControlTypeId, ButtonControl))element = root.FindFirst(TreeScope.Descendants, and_cond)``` --- ## 8. Common Mistakes ### 8.1 Critical Security Anti-Patterns #### Never: Automate Without Process Validation ```python# BAD: No validationelement = uia.find_element_by_name('Password')element.send_keys(password) # GOOD: Full validationif validator.validate_process(target_pid):    if automation.permission_tier != 'read-only':        element = automation.find_element(process_name, 'Password')        element.send_keys(password)``` #### Never: Skip Timeout Enforcement ```python# BAD: No timeoutelement = uia.find_element(condition)  # Could hang forever # GOOD: With timeoutwith timeout_mgr.timeout(10):    element = uia.find_element(condition)``` #### Never: Allow System Key Combinations ```python# BAD: Allow any keysdef send_keys(keys):    SendInput(keys) # GOOD: Block dangerous combinationsdef send_keys(keys):    if is_blocked_combination(keys):        raise SecurityError("Blocked key combination")    SendInput(keys)``` --- ## 13. Pre-Implementation Checklist ### Phase 1: Before Writing Code- [ ] Read threat model in `references/threat-model.md`- [ ] Identify target processes and required permission tier- [ ] Write failing tests for security requirements- [ ] Write failing tests for expected functionality- [ ] Define timeout limits for all operations ### Phase 2: During Implementation- [ ] Implement minimum code to pass security tests first- [ ] Process validation for all target interactions- [ ] Blocked application list configured- [ ] Permission tier enforcement active- [ ] Input rate limiting implemented- [ ] Timeout enforcement on all operations- [ ] Audit logging for all actions ### Phase 3: Before Committing- [ ] All tests pass: `pytest tests/ -v`- [ ] Security tests pass: `pytest tests/ -k security`- [ ] Type checking passes: `mypy src/automation --strict`- [ ] No hardcoded credentials or sensitive data- [ ] Audit logs properly configured- [ ] Performance targets met (element lookup <100ms) --- ## 14. Summary Your goal is to create Windows UI automation that is:- **Secure**: Strict process validation, permission tiers, and audit logging- **Reliable**: Timeout enforcement, error handling, and state verification- **Accessible**: Respects accessibility APIs and assistive technologies You understand that UI automation carries significant security risks. You balance automation power with strict controls, ensuring operations are logged, validated, and bounded. **Security Reminders**:1. Always validate target process identity2. Never automate blocked security applications3. Enforce timeouts on all operations4. Log every operation with correlation IDs5. Implement permission tiers appropriate to risk Automation should enhance productivity while maintaining system security boundaries. --- ## References - **Advanced Patterns**: See `references/advanced-patterns.md`- **Security Examples**: See `references/security-examples.md`- **Threat Model**: See `references/threat-model.md`