Claude Agent Skill · by Affaan M

Foundation Models On Device

Install Foundation Models On Device skill for Claude Code from affaan-m/everything-claude-code.

Install
Terminal · npx
$npx skills add https://github.com/affaan-m/everything-claude-code --skill foundation-models-on-device
Works with Paperclip

How Foundation Models On Device fits into a Paperclip company.

Foundation Models On Device drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

S
SaaS FactoryPaired

Pre-configured AI company — 18 agents, 18 skills, one-time purchase.

$27$59
Explore pack
Source file
SKILL.md243 lines
Expand
---name: foundation-models-on-devicedescription: Apple FoundationModels framework for on-device LLM — text generation, guided generation with @Generable, tool calling, and snapshot streaming in iOS 26+.--- # FoundationModels: On-Device LLM (iOS 26) Patterns for integrating Apple's on-device language model into apps using the FoundationModels framework. Covers text generation, structured output with `@Generable`, custom tool calling, and snapshot streaming — all running on-device for privacy and offline support. ## When to Activate - Building AI-powered features using Apple Intelligence on-device- Generating or summarizing text without cloud dependency- Extracting structured data from natural language input- Implementing custom tool calling for domain-specific AI actions- Streaming structured responses for real-time UI updates- Need privacy-preserving AI (no data leaves the device) ## Core Pattern — Availability Check Always check model availability before creating a session: ```swiftstruct GenerativeView: View {    private var model = SystemLanguageModel.default     var body: some View {        switch model.availability {        case .available:            ContentView()        case .unavailable(.deviceNotEligible):            Text("Device not eligible for Apple Intelligence")        case .unavailable(.appleIntelligenceNotEnabled):            Text("Please enable Apple Intelligence in Settings")        case .unavailable(.modelNotReady):            Text("Model is downloading or not ready")        case .unavailable(let other):            Text("Model unavailable: \(other)")        }    }}``` ## Core Pattern — Basic Session ```swift// Single-turn: create a new session each timelet session = LanguageModelSession()let response = try await session.respond(to: "What's a good month to visit Paris?")print(response.content) // Multi-turn: reuse session for conversation contextlet session = LanguageModelSession(instructions: """    You are a cooking assistant.    Provide recipe suggestions based on ingredients.    Keep suggestions brief and practical.    """) let first = try await session.respond(to: "I have chicken and rice")let followUp = try await session.respond(to: "What about a vegetarian option?")``` Key points for instructions:- Define the model's role ("You are a mentor")- Specify what to do ("Help extract calendar events")- Set style preferences ("Respond as briefly as possible")- Add safety measures ("Respond with 'I can't help with that' for dangerous requests") ## Core Pattern — Guided Generation with @Generable Generate structured Swift types instead of raw strings: ### 1. Define a Generable Type ```swift@Generable(description: "Basic profile information about a cat")struct CatProfile {    var name: String     @Guide(description: "The age of the cat", .range(0...20))    var age: Int     @Guide(description: "A one sentence profile about the cat's personality")    var profile: String}``` ### 2. Request Structured Output ```swiftlet response = try await session.respond(    to: "Generate a cute rescue cat",    generating: CatProfile.self) // Access structured fields directlyprint("Name: \(response.content.name)")print("Age: \(response.content.age)")print("Profile: \(response.content.profile)")``` ### Supported @Guide Constraints - `.range(0...20)` — numeric range- `.count(3)` — array element count- `description:` — semantic guidance for generation ## Core Pattern — Tool Calling Let the model invoke custom code for domain-specific tasks: ### 1. Define a Tool ```swiftstruct RecipeSearchTool: Tool {    let name = "recipe_search"    let description = "Search for recipes matching a given term and return a list of results."     @Generable    struct Arguments {        var searchTerm: String        var numberOfResults: Int    }     func call(arguments: Arguments) async throws -> ToolOutput {        let recipes = await searchRecipes(            term: arguments.searchTerm,            limit: arguments.numberOfResults        )        return .string(recipes.map { "- \($0.name): \($0.description)" }.joined(separator: "\n"))    }}``` ### 2. Create Session with Tools ```swiftlet session = LanguageModelSession(tools: [RecipeSearchTool()])let response = try await session.respond(to: "Find me some pasta recipes")``` ### 3. Handle Tool Errors ```swiftdo {    let answer = try await session.respond(to: "Find a recipe for tomato soup.")} catch let error as LanguageModelSession.ToolCallError {    print(error.tool.name)    if case .databaseIsEmpty = error.underlyingError as? RecipeSearchToolError {        // Handle specific tool error    }}``` ## Core Pattern — Snapshot Streaming Stream structured responses for real-time UI with `PartiallyGenerated` types: ```swift@Generablestruct TripIdeas {    @Guide(description: "Ideas for upcoming trips")    var ideas: [String]} let stream = session.streamResponse(    to: "What are some exciting trip ideas?",    generating: TripIdeas.self) for try await partial in stream {    // partial: TripIdeas.PartiallyGenerated (all properties Optional)    print(partial)}``` ### SwiftUI Integration ```swift@State private var partialResult: TripIdeas.PartiallyGenerated?@State private var errorMessage: String? var body: some View {    List {        ForEach(partialResult?.ideas ?? [], id: \.self) { idea in            Text(idea)        }    }    .overlay {        if let errorMessage { Text(errorMessage).foregroundStyle(.red) }    }    .task {        do {            let stream = session.streamResponse(to: prompt, generating: TripIdeas.self)            for try await partial in stream {                partialResult = partial            }        } catch {            errorMessage = error.localizedDescription        }    }}``` ## Key Design Decisions | Decision | Rationale ||----------|-----------|| On-device execution | Privacy — no data leaves the device; works offline || 4,096 token limit | On-device model constraint; chunk large data across sessions || Snapshot streaming (not deltas) | Structured output friendly; each snapshot is a complete partial state || `@Generable` macro | Compile-time safety for structured generation; auto-generates `PartiallyGenerated` type || Single request per session | `isResponding` prevents concurrent requests; create multiple sessions if needed || `response.content` (not `.output`) | Correct API — always access results via `.content` property | ## Best Practices - **Always check `model.availability`** before creating a session — handle all unavailability cases- **Use `instructions`** to guide model behavior — they take priority over prompts- **Check `isResponding`** before sending a new request — sessions handle one request at a time- **Access `response.content`** for results — not `.output`- **Break large inputs into chunks** — 4,096 token limit applies to instructions + prompt + output combined- **Use `@Generable`** for structured output — stronger guarantees than parsing raw strings- **Use `GenerationOptions(temperature:)`** to tune creativity (higher = more creative)- **Monitor with Instruments** — use Xcode Instruments to profile request performance ## Anti-Patterns to Avoid - Creating sessions without checking `model.availability` first- Sending inputs exceeding the 4,096 token context window- Attempting concurrent requests on a single session- Using `.output` instead of `.content` to access response data- Parsing raw string responses when `@Generable` structured output would work- Building complex multi-step logic in a single prompt — break into multiple focused prompts- Assuming the model is always available — device eligibility and settings vary ## When to Use - On-device text generation for privacy-sensitive apps- Structured data extraction from user input (forms, natural language commands)- AI-assisted features that must work offline- Streaming UI that progressively shows generated content- Domain-specific AI actions via tool calling (search, compute, lookup)