Claude Agent Skill · by Affaan M

Fal Ai Media

Install Fal Ai Media skill for Claude Code from affaan-m/everything-claude-code.

Install
Terminal · npx
$ npx skills add https://github.com/affaan-m/everything-claude-code --skill fal-ai-media
Works with Paperclip

How Fal Ai Media fits into a Paperclip company.

Fal Ai Media drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

Paired pack: SaaS Factory, a pre-configured AI company with 18 agents and 18 skills. One-time purchase: $27 (regularly $59).
Source file: SKILL.md (284 lines)
---
name: fal-ai-media
description: Unified media generation via fal.ai MCP — image, video, and audio. Covers text-to-image (Nano Banana), text/image-to-video (Seedance, Kling, Veo 3), text-to-speech (CSM-1B), and video-to-audio (ThinkSound). Use when the user wants to generate images, videos, or audio with AI.
origin: ECC
---

# fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

## When to Activate

- User wants to generate images from text prompts
- Creating videos from text or images
- Generating speech, music, or sound effects
- Any media generation task
- User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

## MCP Requirement

The fal.ai MCP server must be configured. Add to `~/.claude.json`:

```json
"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}
```

Get an API key at [fal.ai](https://fal.ai).

## MCP Tools

The fal.ai MCP provides these tools:

- `search` — Find available models by keyword
- `find` — Get model details and parameters
- `generate` — Run a model with parameters
- `result` — Retrieve the output of an async generation
- `status` — Check job status
- `cancel` — Cancel a running job
- `estimate_cost` — Estimate generation cost
- `models` — List popular models
- `upload` — Upload files for use as inputs

---

## Image Generation

### Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

```
generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)
```

### Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.
```
generate(
  app_id: "fal-ai/nano-banana-pro",
  input_data: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)
```

### Common Image Parameters

| Param | Type | Options | Notes |
|-------|------|---------|-------|
| `prompt` | string | required | Describe what you want |
| `image_size` | string | `square`, `portrait_4_3`, `landscape_16_9`, `portrait_16_9`, `landscape_4_3` | Aspect ratio |
| `num_images` | number | 1-4 | How many to generate |
| `seed` | number | any integer | Reproducibility |
| `guidance_scale` | number | 1-20 | How closely to follow the prompt (higher = more literal) |

### Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

```
# First upload the source image
upload(file_path: "/path/to/image.png")

# Then generate with image input
generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)
```

---

## Video Generation

### Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

```
generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)
```

### Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

```
generate(
  app_id: "fal-ai/kling-video/v3/pro",
  input_data: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)
```

### Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.
```
generate(
  app_id: "fal-ai/veo-3",
  input_data: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)
```

### Image-to-Video

Start from an existing image:

```
generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)
```

### Video Parameters

| Param | Type | Options | Notes |
|-------|------|---------|-------|
| `prompt` | string | required | Describe the video |
| `duration` | string | `"5s"`, `"10s"` | Video length |
| `aspect_ratio` | string | `"16:9"`, `"9:16"`, `"1:1"` | Frame ratio |
| `seed` | number | any integer | Reproducibility |
| `image_url` | string | URL | Source image for image-to-video |

---

## Audio Generation

### CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

```
generate(
  app_id: "fal-ai/csm-1b",
  input_data: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)
```

### ThinkSound (Video-to-Audio)

Generate matching audio from video content.
```
generate(
  app_id: "fal-ai/thinksound",
  input_data: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)
```

### ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

```python
import os

import requests

resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json"
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
    }
)
with open("output.mp3", "wb") as f:
    f.write(resp.content)
```

### VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

```python
# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")
```

---

## Cost Estimation

Before generating, check estimated cost:

```
estimate_cost(
  estimate_type: "unit_price",
  endpoints: {
    "fal-ai/nano-banana-pro": {
      "unit_quantity": 1
    }
  }
)
```

## Model Discovery

Find models for specific tasks:

```
search(query: "text to video")
find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])
models()
```

## Tips

- Use `seed` for reproducible results when iterating on prompts
- Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
- For video, keep prompts descriptive but concise — focus on motion and scene
- Image-to-video produces more controlled results than pure text-to-video
- Check `estimate_cost` before running expensive video generations

## Related Skills

- `videodb` — Video processing, editing, and streaming
- `video-editing` — AI-powered video editing workflows
- `content-engine` — Content creation for social platforms
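For long-running video or audio jobs, the individual tools above compose into a submit-and-poll loop. A minimal sketch in the same tool-call style as the examples in SKILL.md; note the `request_id` parameter name and the shape of the `generate` response are assumptions here, so check the actual response from your MCP server version before relying on them:

```
# 1. Check the price before committing
estimate_cost(
  estimate_type: "unit_price",
  endpoints: { "fal-ai/seedance-1-0-pro": { "unit_quantity": 1 } }
)

# 2. Submit the job; the response should include an id for the async request
generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s"
  }
)

# 3. Poll until the job finishes, then fetch the output
status(request_id: "<request_id>")   # repeat until the job reports completion
result(request_id: "<request_id>")   # returns the generated media
```

Polling with `status` rather than blocking on `generate` keeps the agent responsive during multi-minute video renders, and `cancel` can abort a job whose cost estimate was misjudged.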