Name: Videoagent Image Studio
Author: Pexoai

Install

Terminal · npx

$ npx skills add https://github.com/pexoai/pexo-skills --skill videoagent-image-studio

Works with Paperclip

How Videoagent Image Studio fits into a Paperclip company.

Videoagent Image Studio drops into any Paperclip agent that handles video work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat, with no prompt engineering and no tool wiring.

Source file

SKILL.md · 234 lines · markdown

---
name: videoagent-image-studio
version: 2.0.0
author: "wells"
emoji: "🎨"
tags:
  - video
  - image-generation
  - midjourney
  - flux
  - gemini
  - fal
  - ideogram
  - recraft
description: >
  Tired of juggling 8 API keys? This skill gives you one-command access to Midjourney, Flux, Ideogram, and more, with zero setup. Use when you want to generate any image without worrying about API keys.
homepage: https://github.com/pexoai/image-studio-skill
metadata:
  openclaw:
    emoji: "🎨"
    install:
      - id: node
        kind: node
        label: "No dependencies needed; all calls go through the hosted proxy"
---

# 🎨 VideoAgent Image Studio

**Use when:** User asks to generate, draw, create, or make any kind of image, photo, illustration, icon, logo, or artwork.

Generate images with 8 state-of-the-art AI models. This skill automatically picks the best model for the job and handles all the complexity, including Midjourney's async polling, so you can focus on the conversation.

---

## Quick Reference

| User Intent | Model | Speed |
|---|---|---|
| Artistic, cinematic, painterly | `midjourney` | ~15s |
| Photorealistic, portrait, product | `flux-pro` | ~8s |
| General purpose, balanced | `flux-dev` | ~10s |
| Quick draft, fast iteration | `flux-schnell` | ~2s |
| Image with text, logo, poster | `ideogram` | ~10s |
| Vector art, icon, flat design | `recraft` | ~8s |
| Anime, stylized illustration | `sdxl` | ~5s |
| Gemini-powered, consistent style | `nano-banana` | ~12s |

---

## How to Generate an Image

### Step 1: Enhance the prompt

Before calling the script, expand the user's prompt with style, lighting, and quality descriptors appropriate for the chosen model.

- **Midjourney**: Add `cinematic lighting`, `ultra detailed`, `--v 7`, `--style raw`
- **Flux**: Add `masterpiece`, `highly detailed`, `sharp focus`, `professional photography`
- **Ideogram**: Be explicit about text content, font style, and layout
- **Recraft**: Specify `vector illustration`, `flat design`, `icon style`

### Step 2: Run the script

```bash
node {baseDir}/tools/generate.js \
  --model <model_id> \
  --prompt "<enhanced prompt>" \
  --aspect-ratio <ratio>
```

**All parameters:**

| Parameter | Default | Description |
|---|---|---|
| `--model` | `flux-dev` | Model ID from the table above |
| `--prompt` | *(required)* | The image generation prompt |
| `--aspect-ratio` | `1:1` | `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `21:9` |
| `--num-images` | `1` | Number of images (1–4; Midjourney always returns 4) |
| `--negative-prompt` | *(none)* | Things to avoid (not supported by Midjourney) |
| `--seed` | *(none)* | Seed for reproducibility |

### Step 3: Return the result

The script always waits and returns the final image URL(s). No polling required.

```json
{
  "success": true,
  "model": "flux-pro",
  "imageUrl": "https://...",
  "images": ["https://..."]
}
```

Send the `imageUrl` to the user.

---

## Midjourney Actions

After generating a 4-image grid with Midjourney, offer the user these options:

```bash
# Upscale image #2 (subtle, preserves details)
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action upscale \
  --index 2 \
  --job-id <job_id>

# Create a strong variation of image #3
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action variation \
  --index 3 \
  --job-id <job_id> \
  --variation-type 1

# Regenerate with same prompt
node {baseDir}/tools/generate.js \
  --model midjourney \
  --action reroll \
  --job-id <job_id>
```

**Upscale types:** `0` = Subtle (default, best for photos), `1` = Creative (best for illustrations)

**Variation types:** `0` = Subtle (default), `1` = Strong (dramatic changes)

---

## Example Conversations

**User:** "Draw a snow leopard on a snowy mountain with cinematic lighting"

```bash
# Choose midjourney for artistic quality
node {baseDir}/tools/generate.js \
  --model midjourney \
  --prompt "a majestic snow leopard on a snowy mountain peak, cinematic lighting, dramatic atmosphere, ultra detailed --ar 16:9 --v 7" \
  --aspect-ratio 16:9
```

> 🎨 Done! Which one to upscale? (U1-U4) Or create a variant? (V1-V4)

---

**User:** "Use Flux to generate a perfume product poster, white background"

```bash
# Choose flux-pro for photorealistic product shots
node {baseDir}/tools/generate.js \
  --model flux-pro \
  --prompt "a luxury perfume bottle on a clean white background, professional product photography, soft shadows, 8k, highly detailed" \
  --aspect-ratio 3:4
```

---

**User:** "Show me a quick draft"

```bash
# flux-schnell for instant previews
node {baseDir}/tools/generate.js \
  --model flux-schnell \
  --prompt "..." \
  --aspect-ratio 1:1
```

---

**User:** "Make me an App icon, flat style, blue theme"

```bash
# recraft for vector/icon style
node {baseDir}/tools/generate.js \
  --model recraft \
  --prompt "a minimal flat design app icon, blue color scheme, simple geometric shapes, vector style, white background"
```

---

## Setup

**Zero API keys needed!** All requests go through a hosted proxy that handles authentication server-side.

The skill works out of the box: just install and use.

### Advanced: Custom proxy or token

If you want to use your own proxy or a persistent token, set these environment variables:

```json
{
  "skills": {
    "entries": {
      "videoagent-image-studio": {
        "enabled": true,
        "env": {
          "IMAGE_STUDIO_PROXY_URL": "https://your-proxy.vercel.app",
          "IMAGE_STUDIO_TOKEN": "your_token_here"
        }
      }
    }
  }
}
```

| Variable | Required | Description |
|---|---|---|
| `IMAGE_STUDIO_PROXY_URL` | No | Custom proxy base URL (default: `https://image-gen-proxy.vercel.app`) |
| `IMAGE_STUDIO_TOKEN` | No | Persistent token (auto-obtained if not set, 100 free uses per token) |

To deploy your own proxy, see the [videoagent-audio-studio proxy](../videoagent-audio-studio/proxy/) as a reference implementation. You'll need `FAL_KEY` and `LEGNEXT_KEY` as Vercel environment variables.

---

## Changelog

### v2.0.0
- **Simplified async**: The script now blocks until Midjourney completes. No more `--async` / `--poll` flags needed in SKILL.md instructions.
- **Unified output format**: All models return the same `{ success, imageUrl, images }` shape.
- **Reference images for Nano Banana**: Pass `--reference-images "url1,url2"` for character/style consistency across generations.

### v1.3.0
- Added non-blocking async mode for Midjourney (`--async` + `--poll`).

### v1.2.0
- Midjourney turbo mode enabled by default (~10-20s).

### v1.1.0
- Switched Midjourney provider from TTAPI to Legnext.ai for better stability.

### v1.0.0
- Initial release with Midjourney, Flux, SDXL, Nano Banana, Ideogram, Recraft.
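
As a rough illustration of how the parameter table and the unified `{ success, imageUrl, images }` result shape described above could be used from Node, here is a minimal sketch. The helpers `buildGenerateArgs` and `extractImageUrls` are hypothetical and not part of the skill. Only the flag names, model IDs, aspect ratios, and success-case JSON fields come from SKILL.md. Skipping `--num-images` and `--negative-prompt` for Midjourney, and treating a missing or false `success` as an error, are assumptions about reasonable behavior rather than documented behavior of `generate.js`.

```javascript
// Hypothetical helpers (not shipped with the skill). They only assemble and
// validate arguments for tools/generate.js and read its JSON output.

// Values copied from the Quick Reference and parameter tables in SKILL.md.
const MODELS = [
  "midjourney", "flux-pro", "flux-dev", "flux-schnell",
  "ideogram", "recraft", "sdxl", "nano-banana",
];
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "21:9"];

// Build the argument list for generate.js, applying the documented defaults.
function buildGenerateArgs({
  model = "flux-dev",
  prompt,
  aspectRatio = "1:1",
  numImages = 1,
  negativePrompt,
  seed,
} = {}) {
  if (!prompt) throw new Error("--prompt is required");
  if (!MODELS.includes(model)) throw new Error(`unknown model: ${model}`);
  if (!ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error(`unsupported aspect ratio: ${aspectRatio}`);
  }
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new Error("--num-images must be an integer from 1 to 4");
  }

  const args = ["--model", model, "--prompt", prompt, "--aspect-ratio", aspectRatio];

  // Assumption: since Midjourney always returns 4 images and does not support
  // negative prompts, this sketch simply omits both flags for that model.
  if (model !== "midjourney") {
    args.push("--num-images", String(numImages));
    if (negativePrompt) args.push("--negative-prompt", negativePrompt);
  }
  if (seed !== undefined) args.push("--seed", String(seed));
  return args;
}

// Parse the script's JSON output and return every image URL it contains.
function extractImageUrls(stdout) {
  const result = JSON.parse(stdout);
  // Assumption: the failure shape is not documented, so anything without
  // success === true is treated as an error here.
  if (result.success !== true) throw new Error("image generation failed");
  if (Array.isArray(result.images) && result.images.length > 0) {
    return result.images;
  }
  return [result.imageUrl];
}
```

One possible use is to pass the returned array to `child_process.execFileSync("node", [scriptPath, ...args])`, where `scriptPath` points at `tools/generate.js` under the skill's base directory, and then hand the captured output to `extractImageUrls`. Passing arguments as an array avoids shell quoting problems with long prompts.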
