Name: Paddleocr Text Recognition
Author: Aidenwu0209

Install

$ npx skills add https://github.com/aidenwu0209/paddleocr-skills --skill paddleocr-text-recognition


Source file: SKILL.md (234 lines, markdown)

---
name: paddleocr-text-recognition
description: Extracts text (with locations) from images and PDF documents using PaddleOCR.
metadata:
  openclaw:
    requires:
      env:
        - PADDLEOCR_OCR_API_URL
        - PADDLEOCR_ACCESS_TOKEN
        - PADDLEOCR_OCR_TIMEOUT
      bins:
        - python
    primaryEnv: PADDLEOCR_ACCESS_TOKEN
    emoji: "🔤"
    homepage: https://github.com/PaddlePaddle/PaddleOCR/tree/main/skills/paddleocr-text-recognition
---

# PaddleOCR Text Recognition Skill

## When to Use This Skill

Invoke this skill in the following situations:
- Extract text from images (screenshots, photos, scans)
- Extract text from PDFs or document images
- Extract text and positions from structured documents (invoices, receipts, forms, tables)
- Extract text from URLs or local files that point to images/PDFs

Do not use this skill in the following situations:
- Plain text files that can be read directly with the Read tool
- Code files or markdown documents
- Tasks that do not involve image-to-text conversion

## How to Use This Skill

**⛔ MANDATORY RESTRICTIONS - DO NOT VIOLATE ⛔**

1. **ONLY use PaddleOCR Text Recognition API** - Execute the script `python scripts/ocr_caller.py`
2. **NEVER read images directly** - Do NOT read images yourself
3. **NEVER offer alternatives** - Do NOT suggest "I can try to read it" or similar
4. **IF API fails** - Display the error message and STOP immediately
5. **NO fallback methods** - Do NOT attempt OCR any other way

If the script execution fails (API not configured, network error, etc.):
- Show the error message to the user
- Do NOT offer to help using your vision capabilities
- Do NOT ask "Would you like me to try reading it?"
- Simply stop and wait for user to fix the configuration

### Basic Workflow

1. **Identify the input source**:
   - User provides URL: Use the `--file-url` parameter
   - User provides local file path: Use the `--file-path` parameter
   - User uploads image: Save it first, then use `--file-path`

   **Input type note**:
   - Supported file types depend on the model and endpoint configuration.
   - Follow the official endpoint/API documentation for the exact supported formats.

2. **Execute OCR**:
   ```bash
   python scripts/ocr_caller.py --file-url "URL provided by user" --pretty
   ```
   Or for local files:
   ```bash
   python scripts/ocr_caller.py --file-path "file path" --pretty
   ```

   **Default behavior: save raw JSON to a temp file**:
   - If `--output` is omitted, the script saves automatically under the system temp directory
   - Default path pattern: `<system-temp>/paddleocr/text-recognition/results/result_<timestamp>_<id>.json`
   - If `--output` is provided, it overrides the default temp-file destination
   - If `--stdout` is provided, JSON is printed to stdout and no file is saved
   - In save mode, the script prints the absolute saved path on stderr: `Result saved to: /absolute/path/...`
   - In default/custom save mode, read and parse the saved JSON file before responding
   - Use `--stdout` only when you explicitly want to skip file persistence

3. **Parse JSON response**:
   - In default/custom save mode, load JSON from the saved file path shown by the script
   - Check the `ok` field: `true` means success, `false` means error
   - Extract text: `text` field contains all recognized text
   - If `--stdout` is used, parse the stdout JSON directly
   - Handle errors: If `ok` is false, display `error.message`

4. **Present results to user**:
   - Display extracted text in a readable format
   - If the text is empty, the image may contain no text
   - In save mode, always tell the user the saved file path and that full raw JSON is available there

### IMPORTANT: Complete Output Display

**CRITICAL**: Always display the COMPLETE recognized text to the user. Do NOT truncate or summarize the OCR results.

- The output JSON contains complete output, including full text in `text` field
- **You MUST display the entire `text` content to the user**, no matter how long it is
- Do NOT use phrases like "Here's a summary" or "The text begins with..."
- Do NOT truncate with "..." unless the text truly exceeds reasonable display limits
- The user expects to see ALL the recognized text, not a preview or excerpt

**Correct approach**:
```
I've extracted the text from the image. Here's the complete content:

[Display the entire text here]
```

**Incorrect approach**:
```
I found some text in the image. Here's a preview:
"The quick brown fox..." (truncated)
```

### Usage Examples

**Example 1: URL OCR**:
```bash
python scripts/ocr_caller.py --file-url "https://example.com/invoice.jpg" --pretty
```

**Example 2: Local File OCR**:
```bash
python scripts/ocr_caller.py --file-path "./document.pdf" --pretty
```

**Example 3: OCR With Explicit File Type**:
```bash
python scripts/ocr_caller.py --file-url "https://example.com/input" --file-type 1 --pretty
```

**Example 4: Print JSON Without Saving**:
```bash
python scripts/ocr_caller.py --file-url "https://example.com/input" --stdout --pretty
```

### Understanding the Output

The output JSON structure is as follows:
```json
{
  "ok": true,
  "text": "All recognized text here...",
  "result": { ... },
  "error": null
}
```

**Key fields**:
- `ok`: `true` for success, `false` for error
- `text`: Complete recognized text
- `result`: Raw API response (for debugging)
- `error`: Error details if `ok` is false

> Raw result location (default): the temp-file path printed by the script on stderr

### First-Time Configuration

You can generally assume that the required environment variables have already been configured. Only when an OCR task fails should you analyze the error message to determine whether it is caused by a configuration issue. If it is indeed a configuration problem, you should notify the user to fix it.

**When API is not configured**:

The error will show:
```
CONFIG_ERROR: PADDLEOCR_OCR_API_URL not configured. Get your API at: https://paddleocr.com
```

**Configuration workflow**:

1. **Show the exact error message** to the user (including the URL).

2. **Guide the user to configure securely**:
   - Recommend configuring through the host application's standard method (e.g., settings file, environment variable UI) rather than pasting credentials in chat.
   - List the required environment variables:
     ```
     - PADDLEOCR_OCR_API_URL
     - PADDLEOCR_ACCESS_TOKEN
     - Optional: PADDLEOCR_OCR_TIMEOUT
     ```

3. **If the user provides credentials in chat anyway** (accept any reasonable format), for example:
   - `PADDLEOCR_OCR_API_URL=https://xxx.paddleocr.com/ocr, PADDLEOCR_ACCESS_TOKEN=abc123...`
   - `Here's my API: https://xxx and token: abc123`
   - Copy-pasted code format
   - Any other reasonable format
   - **Security note**: Warn the user that credentials shared in chat may be stored in conversation history. Recommend setting them through the host application's configuration instead when possible.

   Then parse and validate the values:
   - Extract `PADDLEOCR_OCR_API_URL` (look for URLs with `paddleocr.com` or similar)
   - Confirm `PADDLEOCR_OCR_API_URL` is a full endpoint ending with `/ocr`
   - Extract `PADDLEOCR_ACCESS_TOKEN` (long alphanumeric string, usually 40+ chars)

4. **Ask the user to confirm the environment is configured**.

5. **Retry only after confirmation**:
   - Once the user confirms the environment variables are available, retry the original OCR task

### Error Handling

**Authentication failed**:
```
API_ERROR: Authentication failed (403). Check your token.
```
- Token is invalid, reconfigure with correct credentials

**Quota exceeded**:
```
API_ERROR: API rate limit exceeded (429)
```
- Daily API quota exhausted, inform user to wait or upgrade

**No text detected**:
- `text` field is empty
- Image may be blank, corrupted, or contain no text

### Tips for Better Results

If recognition quality is poor, suggest:
- Check if the image is clear and contains text
- Provide a higher resolution image if possible

## Reference Documentation

For in-depth understanding of the OCR system, refer to:
- `references/output_schema.md` - Output format specification

> **Note**: Model version, capabilities, and supported file formats are determined by your API endpoint (`PADDLEOCR_OCR_API_URL`) and its official API documentation.

## Testing the Skill

To verify the skill is working properly:
```bash
python scripts/smoke_test.py
```

This tests configuration and API connectivity.
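
For reference, the result-handling steps of the Basic Workflow (find the saved path on stderr, load the JSON, check `ok`, then return `text` or surface `error.message`) could be sketched in Python roughly as follows. This is an illustrative sketch only and is not part of the skill's scripts: the helper names `find_saved_path`, `load_result`, and `extract_text` are hypothetical, and the code assumes nothing beyond the `Result saved to:` stderr line and the `ok` / `text` / `error` fields described in this document.

```python
import json
import re
from pathlib import Path

# Matches the stderr line documented above: "Result saved to: /absolute/path/..."
SAVED_PATH_RE = re.compile(r"^Result saved to:\s*(.+)$", re.MULTILINE)


def find_saved_path(stderr_text):
    """Return the path from a 'Result saved to:' line, or None if absent."""
    match = SAVED_PATH_RE.search(stderr_text)
    return match.group(1).strip() if match else None


def load_result(path):
    """Load the saved result JSON from disk (hypothetical helper)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def extract_text(payload):
    """Return the full `text` on success; raise with `error.message` otherwise."""
    if not payload.get("ok"):
        error = payload.get("error") or {}
        # Assumption: `error` is an object with a `message` field, as the
        # workflow's reference to `error.message` suggests.
        if isinstance(error, dict):
            message = error.get("message", "unknown error")
        else:
            message = str(error)
        raise RuntimeError(message)
    # An empty string is a valid result: the image may contain no text.
    return payload.get("text", "")
```

A caller would pass the script's captured stderr to `find_saved_path`, load that file with `load_result`, and display the return value of `extract_text` in full, or show the raised message and stop if the call failed.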
