Claude Agent Skill · by Johnlindquist

Gemini Image

Install Gemini Image skill for Claude Code from johnlindquist/claude.

Install
$ npx skills add https://github.com/johnlindquist/claude --skill gemini-image

Source file: SKILL.md (212 lines)
---
name: gemini-image
description: Analyze images using Gemini's vision capabilities. Use for image analysis, text extraction from screenshots, and visual content understanding.
---

# Gemini Image Analysis

Analyze images using Gemini Pro's vision capabilities.

## Prerequisites

```bash
pip install google-generativeai
export GEMINI_API_KEY=your_api_key
```

## CLI Reference

### Basic Image Analysis

```bash
# Analyze an image
gemini -m pro -f /path/to/image.png "Describe this image in detail"

# With specific question
gemini -m pro -f screenshot.png "What error message is shown?"

# Multiple images
gemini -m pro -f image1.png -f image2.png "Compare these two images"
```

## Analysis Operations

### General Description

```bash
gemini -m pro -f image.png "Describe this image comprehensively:
1. Main subject/content
2. Colors and composition
3. Text visible (if any)
4. Context and purpose
5. Notable details"
```

### Extract Text (OCR)

```bash
gemini -m pro -f screenshot.png "Extract all text from this image.
Format as plain text, preserving layout where possible.
Include any text in buttons, labels, or UI elements."
```

### Code from Screenshot

```bash
gemini -m pro -f code-screenshot.png "Extract the code from this screenshot.
Provide as properly formatted code with correct indentation.
Note any parts that are unclear or partially visible."
```

### UI Analysis

```bash
gemini -m pro -f ui-screenshot.png "Analyze this UI:
1. What application/website is this?
2. What page/screen is shown?
3. Main UI elements and their purpose
4. User flow/actions available
5. Any UX issues or suggestions"
```

### Error Analysis

```bash
gemini -m pro -f error-screenshot.png "Analyze this error:
1. What error is shown?
2. What is the likely cause?
3. How to fix it?
4. Any related information visible?"
```

### Diagram Understanding

```bash
gemini -m pro -f diagram.png "Explain this diagram:
1. What type of diagram is this?
2. Main components and their relationships
3. Data/process flow
4. Key takeaways"
```

## Specific Use Cases

### Debug Screenshot

```bash
gemini -m pro -f debug-screen.png "I'm debugging an issue. From this screenshot:
1. What is the current state?
2. What errors or warnings are visible?
3. What should I look at?
4. Suggested next steps"
```

### Compare Before/After

```bash
gemini -m pro -f before.png -f after.png "Compare these before and after images:
1. What changed?
2. Is this an improvement?
3. Any issues in the 'after' version?
4. Anything missing?"
```

### Design Feedback

```bash
gemini -m pro -f design.png "Provide design feedback:
1. Visual hierarchy
2. Color usage
3. Typography
4. Spacing and alignment
5. Accessibility concerns
6. Suggestions for improvement"
```

### Data Extraction

```bash
gemini -m pro -f chart.png "Extract data from this chart:
1. Chart type
2. Data series and values
3. Axes labels and ranges
4. Key trends or insights
5. Output as structured data if possible"
```

### Form Analysis

```bash
gemini -m pro -f form.png "Analyze this form:
1. Form purpose
2. Fields and their types
3. Required vs optional
4. Validation rules visible
5. UX suggestions"
```

## Workflow Patterns

### Screenshot to Issue

```bash
# Capture screenshot (macOS)
screencapture -i /tmp/bug.png

# Analyze and format as issue
gemini -m pro -f /tmp/bug.png "Create a bug report from this screenshot:

## Summary
[One-line description]

## Steps to Reproduce
[Inferred from screenshot]

## Expected Behavior
[What should happen]

## Actual Behavior
[What the screenshot shows]

## Environment
[Any visible system info]"
```

### UI to Code

```bash
gemini -m pro -f ui-design.png "Generate React component code that recreates this UI:
- Use Tailwind CSS for styling
- Make it responsive
- Include proper TypeScript types
- Add appropriate accessibility attributes"
```

### Documentation

```bash
gemini -m pro -f app-screen.png "Write user documentation for this screen:
- What this screen is for
- How to use each feature
- Common tasks
- Tips and notes"
```

## Image Types Supported

- PNG, JPEG, GIF, WebP
- Screenshots
- Photos
- Diagrams and charts
- UI mockups
- Code snippets
- Documents

## Best Practices

1. **Use clear images** - Higher quality = better analysis
2. **Crop to relevant area** - Remove unnecessary context
3. **Ask specific questions** - Vague prompts get vague answers
4. **Provide context** - Tell Gemini what you're looking for
5. **Verify extracted text** - OCR isn't perfect
6. **Multiple angles** - Use multiple images for complex subjects
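
When running one prompt across several files, it can help to skip anything that is not one of the supported formats listed above before calling the CLI. The following is a minimal sketch, not part of the skill itself: the function names `is_supported_image` and `analyze_images` are hypothetical, the mapping of JPEG to both `.jpg` and `.jpeg` is an assumption, and the `gemini -m pro -f <file> "<prompt>"` invocation is taken as-is from the examples in this skill.

```bash
#!/usr/bin/env bash
# Sketch: filter files by the formats listed under "Image Types Supported"
# (PNG, JPEG, GIF, WebP) before handing them to the gemini CLI.
# Both function names are hypothetical helpers, not part of any tool.

# Succeeds (exit status 0) when the file name ends in a supported extension.
# Assumption: JPEG files may use either .jpg or .jpeg.
is_supported_image() {
  local name
  # Lowercase the name so .PNG and .png are treated alike.
  name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$name" in
    *.png|*.jpg|*.jpeg|*.gif|*.webp) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs one prompt against every supported image passed as an argument.
# Unsupported files are reported on stderr and skipped.
analyze_images() {
  local prompt="$1"
  shift
  local file
  for file in "$@"; do
    if is_supported_image "$file"; then
      gemini -m pro -f "$file" "$prompt"
    else
      printf 'skipping unsupported file: %s\n' "$file" >&2
    fi
  done
}
```

After sourcing the functions, a call such as `analyze_images "Extract all text from this image." ./screenshots/*` would run the OCR prompt once per supported file. The check looks only at the file extension, so a mislabeled file would still be passed through.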