Claude Agent Skill · by Mindrally

Puppeteer Automation

Install the Puppeteer Automation skill for Claude Code from mindrally/skills.

Install
Terminal · npx
$ npx skills add https://github.com/inferen-sh/skills --skill twitter-automation
Works with Paperclip

How Puppeteer Automation fits into a Paperclip company.

Puppeteer Automation drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

Source file
SKILL.md · 417 lines
---
name: puppeteer-automation
description: Expert guidance for browser automation using Puppeteer with best practices for web scraping, testing, screenshot capture, and JavaScript execution in headless Chrome.
---

# Puppeteer Browser Automation

You are an expert in Puppeteer, Node.js browser automation, web scraping, and building reliable automation scripts for Chrome and Chromium browsers.

## Core Expertise

- Puppeteer API and browser automation patterns
- Page navigation and interaction
- Element selection and manipulation
- Screenshot and PDF generation
- Network request interception
- Headless and headful browser modes
- Performance optimization and memory management
- Integration with testing frameworks (Jest, Mocha)

## Key Principles

- Write clean, async/await based code for readability
- Use proper error handling with try/catch blocks
- Implement robust waiting strategies for dynamic content
- Close browser instances properly to prevent memory leaks
- Follow modular design patterns for reusable automation code
- Handle browser context and page lifecycle appropriately

## Project Setup

```bash
npm init -y
npm install puppeteer
```

### Basic Structure

```javascript
const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch({
    headless: 'new', // On Puppeteer v22+, use `true` instead
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  try {
    const page = await browser.newPage();
    await page.goto('https://example.com');
    // Your automation code here
  } finally {
    // Always release the browser, even if the automation code throws
    await browser.close();
  }
}

main().catch(console.error);
```

## Browser Launch Options

```javascript
const browser = await puppeteer.launch({
  headless: 'new',  // 'new' for new headless mode (`true` on v22+), false for visible browser
  slowMo: 50,       // Slow down operations for debugging
  devtools: true,   // Open DevTools automatically
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-accelerated-2d-canvas',
    '--disable-gpu',
    '--window-size=1920,1080'
  ],
  defaultViewport: {
    width: 1920,
    height: 1080
  }
});
```

## Page Navigation

```javascript
// Navigate to URL
await page.goto('https://example.com', {
  waitUntil: 'networkidle2',  // Wait until network is idle
  timeout: 30000
});

// Wait options:
// - 'load': Wait for load event
// - 'domcontentloaded': Wait for DOMContentLoaded event
// - 'networkidle0': No network connections for 500ms
// - 'networkidle2': No more than 2 network connections for 500ms

// Navigate back/forward
await page.goBack();
await page.goForward();

// Reload page
await page.reload({ waitUntil: 'networkidle2' });
```

## Element Selection

### Query Selectors

```javascript
// Single element
const element = await page.$('selector');

// Multiple elements
const elements = await page.$$('selector');

// Wait for element
const visibleElement = await page.waitForSelector('selector', {
  visible: true,
  timeout: 5000
});

// XPath selection
// (page.$x was removed in Puppeteer v22; there, use page.$$('xpath/...') instead)
const xpathElements = await page.$x('//xpath/expression');
```

### Evaluation in Page Context

```javascript
// Get text content
const text = await page.$eval('selector', el => el.textContent);

// Get attribute
const href = await page.$eval('a', el => el.getAttribute('href'));

// Multiple elements
const texts = await page.$$eval('.items', elements =>
  elements.map(el => el.textContent)
);

// Execute arbitrary JavaScript
const result = await page.evaluate(() => {
  return document.title;
});
```

## Page Interactions

### Clicking

```javascript
await page.click('button#submit');

// Click with options
await page.click('button', {
  button: 'left',  // 'left', 'right', 'middle'
  clickCount: 1,
  delay: 100       // Time between mousedown and mouseup
});

// Click and wait for navigation
// (start waiting before the click so the navigation is not missed)
await Promise.all([
  page.waitForNavigation(),
  page.click('a.nav-link')
]);
```

### Typing

```javascript
// Type text
await page.type('input#username', 'myuser', { delay: 50 });

// Clear and type (a triple click selects the existing value)
await page.click('input#username', { clickCount: 3 });
await page.type('input#username', 'newvalue');

// Press keys
await page.keyboard.press('Enter');
await page.keyboard.down('Shift');
await page.keyboard.press('Tab');
await page.keyboard.up('Shift');
```

### Form Handling

```javascript
// Select dropdown
await page.select('select#country', 'us');

// Check checkbox
await page.click('input[type="checkbox"]');

// File upload
const inputFile = await page.$('input[type="file"]');
await inputFile.uploadFile('/path/to/file.pdf');
```

## Waiting Strategies

```javascript
// Wait for selector
await page.waitForSelector('.loaded');

// Wait for selector to disappear
await page.waitForSelector('.loading', { hidden: true });

// Wait for function
await page.waitForFunction(
  () => document.querySelector('.count').textContent === '10'
);

// Wait for navigation
await page.waitForNavigation({ waitUntil: 'networkidle2' });

// Wait for network request
await page.waitForRequest(request =>
  request.url().includes('/api/data')
);

// Wait for network response
await page.waitForResponse(response =>
  response.url().includes('/api/data') && response.status() === 200
);

// Fixed timeout (use sparingly)
// (page.waitForTimeout was removed in Puppeteer v22; a plain timer works in every version)
await new Promise(resolve => setTimeout(resolve, 1000));
```

## Screenshots and PDFs

### Screenshots

```javascript
// Full page screenshot
await page.screenshot({
  path: 'screenshot.png',
  fullPage: true
});

// Element screenshot
const element = await page.$('.chart');
await element.screenshot({ path: 'chart.png' });

// Screenshot options
await page.screenshot({
  path: 'screenshot.jpg',
  type: 'jpeg',  // 'png' or 'jpeg'
  quality: 80,   // jpeg only, 0-100; not accepted for png
  clip: {
    x: 0,
    y: 0,
    width: 800,
    height: 600
  }
});
```

### PDF Generation

```javascript
await page.pdf({
  path: 'document.pdf',
  format: 'A4',
  printBackground: true,
  margin: {
    top: '20px',
    right: '20px',
    bottom: '20px',
    left: '20px'
  }
});
```

## Network Interception

```javascript
// Enable request interception
await page.setRequestInterception(true);

// Register only ONE of the two request handlers below: each intercepted
// request can be resolved (abort/continue/respond) exactly once.

// Option A: block images and stylesheets
page.on('request', request => {
  if (['image', 'stylesheet'].includes(request.resourceType())) {
    request.abort();
  } else {
    request.continue();
  }
});

// Option B: modify requests
page.on('request', request => {
  request.continue({
    headers: {
      ...request.headers(),
      'X-Custom-Header': 'value'
    }
  });
});

// Monitor responses
page.on('response', async response => {
  if (response.url().includes('/api/')) {
    try {
      const data = await response.json();
      console.log('API Response:', data);
    } catch (error) {
      // Body was not JSON (or was unavailable); skip it
    }
  }
});
```

## Authentication and Cookies

```javascript
// Basic HTTP authentication
await page.authenticate({
  username: 'user',
  password: 'pass'
});

// Set cookies
await page.setCookie({
  name: 'session',
  value: 'abc123',
  domain: 'example.com'
});

// Get cookies
const cookies = await page.cookies();

// Delete a cookie
await page.deleteCookie({ name: 'session' });
```

## Browser Context and Multiple Pages

```javascript
// Create incognito context
// (renamed to browser.createBrowserContext() in Puppeteer v22)
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();

// Multiple pages
const page1 = await browser.newPage();
const page2 = await browser.newPage();

// Get all pages
const pages = await browser.pages();

// Handle popups
page.on('popup', async popup => {
  // Wait until the popup has a document body before reading its URL
  await popup.waitForSelector('body');
  console.log('Popup URL:', popup.url());
});
```

## Error Handling

```javascript
async function scrapeWithRetry(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    let browser;
    try {
      browser = await puppeteer.launch();
      const page = await browser.newPage();

      // Set timeout
      page.setDefaultTimeout(30000);

      await page.goto(url, { waitUntil: 'networkidle2' });
      return await page.$eval('.content', el => el.textContent);
    } catch (error) {
      console.error(`Attempt ${i + 1} failed:`, error.message);
      if (i === maxRetries - 1) throw error;
    } finally {
      // Close the browser on success and on failure to avoid leaking processes
      if (browser) await browser.close();
    }
    // Linear backoff before the next attempt: 2s, 4s, ...
    await new Promise(r => setTimeout(r, 2000 * (i + 1)));
  }
}
```

## Performance Optimization

```javascript
// Disable unnecessary features
await page.setRequestInterception(true);
page.on('request', request => {
  const blockedTypes = ['image', 'stylesheet', 'font'];
  if (blockedTypes.includes(request.resourceType())) {
    request.abort();
  } else {
    request.continue();
  }
});

// Reuse browser instance
const browser = await puppeteer.launch();

async function scrape(url) {
  const page = await browser.newPage();
  try {
    await page.goto(url);
    // ... scraping logic
  } finally {
    await page.close();  // Close page, not browser
  }
}

// Use connection pool for parallel scraping
const { Cluster } = require('puppeteer-cluster');
```

## Key Dependencies

- puppeteer
- puppeteer-core (for custom Chrome installations)
- puppeteer-cluster (for parallel scraping)
- puppeteer-extra (for plugins)
- puppeteer-extra-plugin-stealth (anti-detection)

## Best Practices

1. Always close browser instances in finally blocks
2. Use `waitForSelector` before interacting with elements
3. Prefer `networkidle2` over `networkidle0` for faster loads
4. Use the stealth plugin for anti-bot bypass
5. Implement proper error handling and retries
6. Monitor memory usage in long-running scripts
7. Use browser context for isolated sessions
8. Set reasonable timeouts for all operations
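
The retry practice in the list above can be separated from browser code so the same backoff logic is reusable across scrapers. The following is a minimal sketch, not part of Puppeteer: the helper name `withRetry` and its `retries` and `baseDelayMs` options are hypothetical, and the linear backoff simply mirrors the `2000 * (i + 1)` delay used in `scrapeWithRetry`.

```javascript
// Hypothetical helper (not a Puppeteer API): run an async task, retrying on
// failure with linearly increasing delays between attempts.
async function withRetry(task, { retries = 3, baseDelayMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      // The attempt number is passed in case the task wants to log it
      return await task(attempt);
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt} failed:`, error.message);
      if (attempt < retries) {
        // Wait baseDelayMs, then 2x, then 3x, ... before trying again
        await new Promise(resolve => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller
  throw lastError;
}
```

A task passed to this helper would typically open a page, do its work, and close the page in a `finally` block, for example `withRetry(() => scrape(url))` with the `scrape` function from the Performance Optimization section. Keeping cleanup inside the task means a failed attempt never leaves a page or browser open while the helper waits.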