nixos/shared/linked-dotfiles/opencode/skills/browser-automation/SKILL.md
2025-10-29 18:46:16 -06:00

---
name: browser-automation
description: Use when automating web tasks, filling forms, extracting content, or controlling Chrome - provides Chrome DevTools Protocol automation via use_browser MCP tool for multi-tab workflows, form automation, and content extraction
---
# Browser Automation with Chrome DevTools Protocol
Control Chrome directly via the DevTools Protocol using the `use_browser` MCP tool. It exposes a single unified interface and starts Chrome automatically.
**Core principle:** One tool, action-based interface, zero dependencies.
## When to Use This Skill
**Use when:**
- Automating web forms and interactions
- Extracting content from web pages (text, tables, links)
- Managing authenticated browser sessions
- Multi-tab workflows requiring context switching
- Testing web applications interactively
- Scraping dynamic content loaded by JavaScript
**Don't use when:**
- Need fresh isolated browser instances
- Require PDF/screenshot generation (use Playwright MCP)
- Simple HTTP requests suffice (use curl/fetch)
## Quick Reference
| Task | Action | Key Parameters |
|------|--------|----------------|
| Go to URL | `navigate` | `payload`: URL |
| Wait for element | `await_element` | `selector`, `timeout` |
| Wait for text | `await_text` | `payload`: text to wait for |
| Click element | `click` | `selector` |
| Type text | `type` | `selector`, `payload` (add `\n` to submit) |
| Get content | `extract` | `payload`: 'markdown'\|'text'\|'html' |
| Run JavaScript | `eval` | `payload`: JS code |
| Get attribute | `attr` | `selector`, `payload`: attr name |
| Select dropdown | `select` | `selector`, `payload`: option value |
| Take screenshot | `screenshot` | `payload`: filename |
| List tabs | `list_tabs` | - |
| New tab | `new_tab` | - |
| Close tab | `close_tab` | `tab_index` |
## The use_browser Tool
**Parameters:**
- `action` (required): Operation to perform
- `tab_index` (optional): Tab to operate on (default: 0)
- `selector` (optional): CSS selector or XPath (XPath starts with `/` or `//`)
- `payload` (optional): Action-specific data
- `timeout` (optional): Timeout in ms (default: 5000, max: 60000)
**Returns:** JSON response with result or error
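The defaults and limits in the parameter list can be checked before a call is sent. The sketch below is a hypothetical helper, not part of `use_browser`: the name `normalizeAction` and its error messages are assumptions made for illustration, and only the default values (tab 0, 5000 ms) and the 60000 ms ceiling come from the list above.

```javascript
// Hypothetical pre-flight check for a use_browser call (not part of the tool).
// Applies the documented defaults and rejects values outside the documented limits.
function normalizeAction(params) {
  if (!params || typeof params.action !== "string" || params.action === "") {
    throw new Error("action is required");
  }
  const timeout = params.timeout === undefined ? 5000 : params.timeout;
  if (!Number.isFinite(timeout) || timeout < 0 || timeout > 60000) {
    throw new Error("timeout must be between 0 and 60000 ms");
  }
  return {
    ...params,
    // Documented default: operate on the first tab.
    tab_index: params.tab_index === undefined ? 0 : params.tab_index,
    timeout,
  };
}
```

Running every call through a check like this surfaces a missing `action` or an oversized `timeout` locally instead of as a tool error.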
## Core Pattern
Every browser workflow follows this structure:
```
1. Navigate to page
2. Wait for content to load
3. Interact or extract
4. Validate result
```
**Example:**
```json
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "h1"}
{action: "extract", payload: "text", selector: "h1"}
```
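When the same navigate, wait, extract sequence is repeated for many pages, it can help to generate the action list. The following is a minimal sketch under that assumption; `buildExtractSteps` is a hypothetical name, and how the resulting objects are sent to `use_browser` is left to the caller.

```javascript
// Hypothetical builder for the navigate -> wait -> extract pattern.
// It only produces the action objects; validating the result (step 4)
// still has to happen after the extract call returns.
function buildExtractSteps(url, selector, format = "text") {
  return [
    { action: "navigate", payload: url },
    { action: "await_element", selector },
    { action: "extract", payload: format, selector },
  ];
}
```

Calling `buildExtractSteps("https://example.com", "h1")` yields the three actions from the example above, in order.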
## Common Workflows
### Form Filling
```json
{action: "navigate", payload: "https://app.com/login"}
{action: "await_element", selector: "input[name=email]"}
{action: "type", selector: "input[name=email]", payload: "user@example.com"}
{action: "type", selector: "input[name=password]", payload: "pass123\n"}
{action: "await_text", payload: "Welcome"}
```
Note: `\n` at end submits the form automatically.
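The trailing-newline convention is easy to forget when a form has several fields. As an illustration only, the hypothetical helper below builds the action list from ordered `[selector, value]` pairs and appends `\n` to the last field; the function name and argument shape are assumptions, not part of the tool.

```javascript
// Hypothetical builder for a form-filling workflow.
// fields: ordered array of [selector, value] pairs.
function buildFormSteps(url, fields, successText) {
  if (fields.length === 0) throw new Error("at least one field is required");
  const steps = [
    { action: "navigate", payload: url },
    // Wait for the first field so typing does not start before the form renders.
    { action: "await_element", selector: fields[0][0] },
  ];
  fields.forEach(([selector, value], i) => {
    const isLast = i === fields.length - 1;
    // Only the final field gets the trailing newline that submits the form.
    steps.push({ action: "type", selector, payload: isLast ? value + "\n" : value });
  });
  steps.push({ action: "await_text", payload: successText });
  return steps;
}
```

Passing the email and password fields from the login example, with `"Welcome"` as the success text, reproduces the five actions shown above.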
### Content Extraction
```json
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "body"}
{action: "extract", payload: "markdown"}
```
### Multi-Tab Workflow
```json
{action: "list_tabs"}
{action: "click", tab_index: 2, selector: "a.email"}
{action: "await_element", tab_index: 2, selector: ".content"}
{action: "extract", tab_index: 2, payload: "text", selector: ".amount"}
```
### Dynamic Content
```json
{action: "navigate", payload: "https://app.com"}
{action: "type", selector: "input[name=q]", payload: "query"}
{action: "click", selector: "button.search"}
{action: "await_element", selector: ".results"}
{action: "extract", payload: "text", selector: ".result-title"}
```
### Get Structured Data
```json
{action: "eval", payload: "Array.from(document.querySelectorAll('a')).map(a => ({ text: a.textContent.trim(), href: a.href }))"}
```
## Implementation Steps
### 1. Verify Page Structure
Before building automation, check selectors:
```json
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "body"}
{action: "extract", payload: "html"}
```
### 2. Build Workflow Incrementally
Test each step before adding next:
```json
// Step 1: Navigate and verify
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "form"}
// Step 2: Fill first field and verify
{action: "type", selector: "input[name=email]", payload: "test@example.com"}
{action: "attr", selector: "input[name=email]", payload: "value"}
// Step 3: Complete form
{action: "type", selector: "input[name=password]", payload: "pass\n"}
```
### 3. Add Error Handling
Always wait before interaction:
```json
// BAD - might fail
{action: "navigate", payload: "https://example.com"}
{action: "click", selector: "button"}
// GOOD - wait first
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "button"}
{action: "click", selector: "button"}
```
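The wait-before-interact rule can also be applied mechanically to an existing action list. The sketch below assumes that `click`, `type`, `select`, and `attr` are the actions that need an element to be present; that grouping and the name `withWaits` are illustrative choices, not something the tool defines.

```javascript
// Hypothetical pass that inserts await_element before interactions
// that are not already preceded by a wait on the same selector.
function withWaits(steps) {
  const needsElement = new Set(["click", "type", "select", "attr"]);
  const out = [];
  let waitedFor = null; // selector most recently awaited on the current page
  for (const step of steps) {
    if (step.action === "navigate") waitedFor = null; // new page, earlier waits are stale
    if (step.action === "await_element") waitedFor = step.selector;
    if (needsElement.has(step.action) && step.selector && step.selector !== waitedFor) {
      const wait = { action: "await_element", selector: step.selector };
      if (step.tab_index !== undefined) wait.tab_index = step.tab_index;
      out.push(wait);
      waitedFor = step.selector;
    }
    out.push(step);
  }
  return out;
}
```

Applied to the BAD sequence above, this produces the GOOD one; applied to the GOOD sequence, it returns it unchanged.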
### 4. Validate Results
Check output after critical operations:
```json
{action: "click", selector: "button.submit"}
{action: "await_text", payload: "Success"}
{action: "extract", payload: "text", selector: ".confirmation"}
```
## Selector Strategies
**Use specific selectors:**
- ✅ `button[type=submit]`
- ✅ `#login-button`
- ✅ `.modal button.confirm`
- ❌ `button` (too generic)
**XPath for complex queries:**
```json
{action: "extract", selector: "//h2 | //h3", payload: "text"}
{action: "click", selector: "//button[contains(text(), 'Submit')]"}
```
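Because the parameter list says a selector is treated as XPath when it starts with `/` or `//`, the rule can be mirrored locally to confirm how a selector will be read. This is a sketch of that stated rule only; `selectorKind` is a hypothetical name and the tool's real parsing may differ.

```javascript
// Hypothetical classifier mirroring the documented rule:
// a leading "/" (which also covers "//") means XPath, anything else is CSS.
function selectorKind(selector) {
  return selector.trim().startsWith("/") ? "xpath" : "css";
}
```

Under this rule an XPath expression that begins with a parenthesis, such as `(//a)[1]`, would be classified as CSS, so expressions like that may need rewriting to start with `/`.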
**Test selectors first:**
```json
{action: "eval", payload: "document.querySelector('button.submit') !== null"}
```
## Common Mistakes
### Timing Issues
**Problem:** Clicking before element loads
```json
{action: "navigate", payload: "https://example.com"}
{action: "click", selector: "button"} // ❌ Fails if slow
```
**Solution:** Always wait
```json
{action: "navigate", payload: "https://example.com"}
{action: "await_element", selector: "button"} // ✅ Waits
{action: "click", selector: "button"}
```
### Generic Selectors
**Problem:** Matches wrong element
```json
{action: "click", selector: "button"} // ❌ First button only
```
**Solution:** Be specific
```json
{action: "click", selector: "button.login-button"} // ✅ Specific
```
### Missing Tab Management
**Problem:** Tab indices change after closing tabs
```json
{action: "close_tab", tab_index: 1}
{action: "click", tab_index: 2, selector: "a"} // ❌ Index shifted
```
**Solution:** Re-list tabs
```json
{action: "close_tab", tab_index: 1}
{action: "list_tabs"} // ✅ Get updated indices
{action: "click", tab_index: 1, selector: "a"} // Now correct
```
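The index shift in the example above follows a simple rule if tab indices are positional, which is what the example suggests. The sketch below encodes that assumption; `indexAfterClose` is a hypothetical helper, and re-running `list_tabs` remains the reliable way to get current indices.

```javascript
// Hypothetical helper: predicts a tab's index after another tab is closed,
// assuming indices are positions in an ordered list of tabs.
function indexAfterClose(oldIndex, closedIndex) {
  if (oldIndex === closedIndex) return null; // that tab no longer exists
  // Tabs after the closed one move down by one; earlier tabs keep their index.
  return oldIndex > closedIndex ? oldIndex - 1 : oldIndex;
}
```

For the example above, `indexAfterClose(2, 1)` gives `1`, which matches the corrected `tab_index` in the solution.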
### Insufficient Timeout
**Problem:** Default 5s timeout too short
```json
{action: "await_element", selector: ".slow-content"} // ❌ Times out
```
**Solution:** Increase timeout
```json
{action: "await_element", selector: ".slow-content", timeout: 30000} // ✅
```
## Advanced Patterns
### Wait for AJAX Complete
```json
{action: "eval", payload: `
  new Promise(resolve => {
    const check = () => {
      if (!document.querySelector('.spinner')) {
        resolve(true);
      } else {
        setTimeout(check, 100);
      }
    };
    check();
  })
`}
```
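The polling promise above never settles if the spinner stays on the page. One possible variant, sketched here under the same assumption as the original (that `eval` waits for a returned promise), adds a deadline and resolves to `false` when it passes. The helper name `buildWaitUntilGonePayload` and its defaults are hypothetical.

```javascript
// Hypothetical generator for an eval payload that polls until an element
// is gone, giving up after maxWaitMs instead of polling forever.
function buildWaitUntilGonePayload(selector, maxWaitMs = 10000, intervalMs = 100) {
  const sel = JSON.stringify(selector); // quoted safely for embedding in JS source
  return `new Promise((resolve) => {
  const deadline = Date.now() + ${Number(maxWaitMs)};
  const check = () => {
    if (!document.querySelector(${sel})) resolve(true);
    else if (Date.now() >= deadline) resolve(false);
    else setTimeout(check, ${Number(intervalMs)});
  };
  check();
})`;
}
```

The returned string is meant to be used as the `payload` of an `eval` action; a `false` result then signals that the element was still present when the deadline passed.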
### Extract Table Data
```json
{action: "eval", payload: "Array.from(document.querySelectorAll('table tr')).map(row => Array.from(row.cells).map(cell => cell.textContent.trim()))"}
```
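The table payload above returns an array of rows, each an array of cell strings. If the first row holds the column headers, the result can be reshaped into objects after it comes back. The sketch below assumes exactly that layout; `rowsToRecords` is a hypothetical post-processing helper.

```javascript
// Hypothetical post-processing step: turns [[header...], [cells...], ...]
// into an array of objects keyed by the header row.
function rowsToRecords(rows) {
  if (rows.length === 0) return [];
  const [header, ...body] = rows;
  return body.map((cells) =>
    // Missing trailing cells become empty strings rather than undefined.
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]))
  );
}
```

Tables without a header row, or with merged cells, would need different handling.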
### Handle Modals
```json
{action: "click", selector: "button.open-modal"}
{action: "await_element", selector: ".modal.visible"}
{action: "type", selector: ".modal input[name=username]", payload: "testuser"}
{action: "click", selector: ".modal button.submit"}
{action: "eval", payload: `
  new Promise(resolve => {
    const check = () => {
      if (!document.querySelector('.modal.visible')) resolve(true);
      else setTimeout(check, 100);
    };
    check();
  })
`}
```
### Access Browser Storage
```json
// Get cookies
{action: "eval", payload: "document.cookie"}
// Get localStorage
{action: "eval", payload: "JSON.stringify(localStorage)"}
// Set localStorage
{action: "eval", payload: "localStorage.setItem('key', 'value')"}
```
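`document.cookie` returns a single string of `name=value` pairs separated by semicolons, so the result of the first call usually needs parsing. A minimal sketch of that step follows; `parseCookieString` is a hypothetical helper that keeps values exactly as returned and does not attempt any decoding.

```javascript
// Hypothetical parser for the string returned by document.cookie.
function parseCookieString(cookieString) {
  const cookies = {};
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // entries without "=" are ignored in this sketch
    // Split on the first "=" only, since values may contain "=" themselves.
    cookies[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
  }
  return cookies;
}
```

Cookies marked `HttpOnly` are not exposed through `document.cookie`, so they will not appear in the parsed result.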
## Real-World Impact
**Before:** Manual form filling, 5 minutes per submission
**After:** Automated workflow, 30 seconds per submission (10x faster)
**Before:** Copy-paste from multiple tabs, error-prone
**After:** Multi-tab extraction with validation, zero errors
**Before:** Unreliable scraping with arbitrary delays
**After:** Event-driven waiting, 100% reliability
## Additional Resources
See `references/examples.md` for:
- Complete e-commerce workflows
- Multi-step form automation
- Advanced scraping patterns
- Infinite scroll handling
- Cross-site data correlation
Chrome DevTools Protocol docs: https://chromedevtools.github.io/devtools-protocol/