Install

npx skills add https://github.com/inferen-sh/skills --skill twitter-automation
---
name: puppeteer-automation
description: Expert guidance for browser automation using Puppeteer with best practices for web scraping, testing, screenshot capture, and JavaScript execution in headless Chrome.
---

# Puppeteer Browser Automation

You are an expert in Puppeteer, Node.js browser automation, web scraping, and building reliable automation scripts for Chrome and Chromium browsers.

## Core Expertise

- Puppeteer API and browser automation patterns
- Page navigation and interaction
- Element selection and manipulation
- Screenshot and PDF generation
- Network request interception
- Headless and headful browser modes
- Performance optimization and memory management
- Integration with testing frameworks (Jest, Mocha)

## Key Principles

- Write clean, async/await-based code for readability
- Use proper error handling with try/catch blocks
- Implement robust waiting strategies for dynamic content
- Close browser instances properly to prevent memory leaks
- Follow modular design patterns for reusable automation code
- Handle browser context and page lifecycle appropriately

## Project Setup

```bash
npm init -y
npm install puppeteer
```

### Basic Structure

```javascript
const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch({
    // true selects the new headless mode on Puppeteer v22+;
    // earlier releases that support it use 'new'
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    // Your automation code here
  } finally {
    // Always release the browser, even if the automation throws
    await browser.close();
  }
}

main().catch(console.error);
```

## Browser Launch Options

```javascript
const browser = await puppeteer.launch({
  headless: true,   // true for headless mode ('new' on releases before v22), false for a visible browser
  slowMo: 50,       // Slow down each operation by 50 ms for debugging
  devtools: true,   // Open DevTools automatically (implies a visible browser)
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--disable-gpu',
    '--window-size=1920,1080'
  ],
  defaultViewport: { width: 1920, height: 1080 }
});
```

## Page Navigation

```javascript
// Navigate to URL
await page.goto('https://example.com', {
  waitUntil: 'networkidle2', // Wait until at most 2 connections remain for 500 ms
  timeout: 30000
});

// Wait options:
// - 'load': Wait for load event
// - 'domcontentloaded': Wait for DOMContentLoaded event
// - 'networkidle0': No network connections for 500ms
// - 'networkidle2': No more than 2 network connections for 500ms

// Navigate back/forward
await page.goBack();
await page.goForward();

// Reload page
await page.reload({ waitUntil: 'networkidle2' });
```

## Element Selection

### Query Selectors

```javascript
// Single element (resolves to null when nothing matches)
const element = await page.$('selector');

// Multiple elements
const elements = await page.$$('selector');

// Wait for element
const visibleElement = await page.waitForSelector('selector', {
  visible: true,
  timeout: 5000
});

// XPath selection (page.$x was removed in Puppeteer v22; older releases use page.$x)
const xpathElements = await page.$$('::-p-xpath(//xpath/expression)');
```

### Evaluation in Page Context

```javascript
// Get text content
const text = await page.$eval('selector', el => el.textContent);

// Get attribute
const href = await page.$eval('a', el => el.getAttribute('href'));

// Multiple elements
const texts = await page.$$eval('.items', elements =>
  elements.map(el => el.textContent)
);

// Execute arbitrary JavaScript
const result = await page.evaluate(() => {
  return document.title;
});
```

## Page Interactions

### Clicking

```javascript
await page.click('button#submit');

// Click with options
await page.click('button', {
  button: 'left',  // 'left', 'right', 'middle'
  clickCount: 1,
  delay: 100       // Time between mousedown and mouseup
});

// Click and wait for navigation
// Start waiting before the click so the navigation is not missed
await Promise.all([
  page.waitForNavigation(),
  page.click('a.nav-link')
]);
```

### Typing

```javascript
// Type text
await page.type('input#username', 'myuser', { delay: 50 });

// Clear and type (a triple click selects the existing value)
await page.click('input#username', { clickCount: 3 });
await page.type('input#username', 'newvalue');

// Press keys
await page.keyboard.press('Enter');
await page.keyboard.down('Shift');
await page.keyboard.press('Tab');
await page.keyboard.up('Shift');
```

### Form Handling

```javascript
// Select dropdown
await page.select('select#country', 'us');

// Check checkbox
await page.click('input[type="checkbox"]');

// File upload
const inputFile = await page.$('input[type="file"]');
await inputFile.uploadFile('/path/to/file.pdf');
```

## Waiting Strategies

```javascript
// Wait for selector
await page.waitForSelector('.loaded');

// Wait for selector to disappear
await page.waitForSelector('.loading', { hidden: true });

// Wait for function (optional chaining avoids a throw while .count is missing)
await page.waitForFunction(
  () => document.querySelector('.count')?.textContent === '10'
);

// Wait for navigation
await page.waitForNavigation({ waitUntil: 'networkidle2' });

// Wait for network request
await page.waitForRequest(request =>
  request.url().includes('/api/data')
);

// Wait for network response
await page.waitForResponse(response =>
  response.url().includes('/api/data') && response.status() === 200
);

// Fixed timeout (use sparingly)
// page.waitForTimeout was removed in Puppeteer v22; a plain timer works on every version
await new Promise(resolve => setTimeout(resolve, 1000));
```

## Screenshots and PDFs

### Screenshots

```javascript
// Full page screenshot
await page.screenshot({
  path: 'screenshot.png',
  fullPage: true
});

// Element screenshot
const element = await page.$('.chart');
await element.screenshot({ path: 'chart.png' });

// Screenshot options
await page.screenshot({
  path: 'screenshot.jpg',
  type: 'jpeg',  // 'png' or 'jpeg'
  quality: 80,   // jpeg only, 0-100; not accepted for png
  clip: { x: 0, y: 0, width: 800, height: 600 }
});
```

### PDF Generation

```javascript
await page.pdf({
  path: 'document.pdf',
  format: 'A4',
  printBackground: true,
  margin: { top: '20px', right: '20px', bottom: '20px', left: '20px' }
});
```

## Network Interception

```javascript
// Enable request interception
await page.setRequestInterception(true);

// Register ONE request handler: each intercepted request must be
// aborted, continued, or responded to exactly once.
page.on('request', request => {
  // Block images and stylesheets
  if (['image', 'stylesheet'].includes(request.resourceType())) {
    request.abort();
  } else {
    // Modify requests: pass through with an extra header
    request.continue({
      headers: {
        ...request.headers(),
        'X-Custom-Header': 'value'
      }
    });
  }
});

// Monitor responses
page.on('response', async response => {
  if (response.url().includes('/api/')) {
    try {
      const data = await response.json();
      console.log('API Response:', data);
    } catch (error) {
      // Body was not JSON or was no longer available
      console.error('Could not parse response:', error.message);
    }
  }
});
```

## Authentication and Cookies

```javascript
// Basic HTTP authentication
await page.authenticate({
  username: 'user',
  password: 'pass'
});

// Set cookies
await page.setCookie({
  name: 'session',
  value: 'abc123',
  domain: 'example.com'
});

// Get cookies
const cookies = await page.cookies();

// Clear cookies
await page.deleteCookie({ name: 'session' });
```

## Browser Context and Multiple Pages

```javascript
// Create an isolated (incognito-style) context
// Named createIncognitoBrowserContext() before Puppeteer v22
const context = await browser.createBrowserContext();
const page = await context.newPage();

// Multiple pages
const page1 = await browser.newPage();
const page2 = await browser.newPage();

// Get all pages
const pages = await browser.pages();

// Handle popups (the popup may still be at about:blank when the event fires)
page.on('popup', popup => {
  console.log('Popup URL:', popup.url());
});
```

## Error Handling

```javascript
async function scrapeWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    let browser;
    try {
      browser = await puppeteer.launch();
      const page = await browser.newPage();

      // Set timeout
      page.setDefaultTimeout(30000);

      await page.goto(url, { waitUntil: 'networkidle2' });
      return await page.$eval('.content', el => el.textContent);
    } catch (error) {
      console.error(`Attempt ${i + 1} failed:`, error.message);
      if (i === maxRetries - 1) throw error;
      // Linear backoff: 2 s, 4 s, ...
      await new Promise(r => setTimeout(r, 2000 * (i + 1)));
    } finally {
      // Close the browser on success and on failure to avoid leaked processes
      if (browser) await browser.close();
    }
  }
}
```

## Performance Optimization

```javascript
// Disable unnecessary features
await page.setRequestInterception(true);
page.on('request', request => {
  const blockedTypes = ['image', 'stylesheet', 'font'];
  if (blockedTypes.includes(request.resourceType())) {
    request.abort();
  } else {
    request.continue();
  }
});

// Reuse browser instance
const browser = await puppeteer.launch();

async function scrape(url) {
  const page = await browser.newPage();
  try {
    await page.goto(url);
    // ... scraping logic
  } finally {
    await page.close(); // Close page, not browser
  }
}

// Use connection pool for parallel scraping
const { Cluster } = require('puppeteer-cluster');
```

## Key Dependencies

- puppeteer
- puppeteer-core (for custom Chrome installations)
- puppeteer-cluster (for parallel scraping)
- puppeteer-extra (for plugins)
- puppeteer-extra-plugin-stealth (anti-detection)

## Best Practices

1. Always close browser instances in finally blocks
2. Use `waitForSelector` before interacting with elements
3. Prefer `networkidle2` over `networkidle0` for faster loads
4. Use the stealth plugin for anti-bot bypass
5. Implement proper error handling and retries
6. Monitor memory usage in long-running scripts
7. Use browser context for isolated sessions
8. Set reasonable timeouts for all operations
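
The retry and cleanup practices in the list above can be sketched as two small helpers. This is a minimal sketch, not part of Puppeteer: the names `withRetry` and `withPage` and the options `retries` and `baseDelayMs` are hypothetical, and the only Puppeteer calls assumed are `browser.newPage()` and `page.close()`.

```javascript
// Hypothetical helper: retry any async task with linear backoff.
// `task` receives the 1-based attempt number.
async function withRetry(task, { retries = 3, baseDelayMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Wait longer after each failure: baseDelayMs, 2 * baseDelayMs, ...
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the most recent error
  throw lastError;
}

// Hypothetical helper: open a page on an already-launched browser,
// run `work(page)`, and close the page even when `work` throws.
async function withPage(browser, work) {
  const page = await browser.newPage();
  try {
    return await work(page);
  } finally {
    await page.close();
  }
}
```

With a browser obtained from `puppeteer.launch()`, a call might look like `withRetry(() => withPage(browser, async page => { await page.goto(url); return page.title(); }))`. Passing the browser in as an argument keeps one instance shared across calls, so only pages are opened and closed per attempt.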