Claude Agent Skill · by Dpearson2699

Natural Language

Install Natural Language skill for Claude Code from dpearson2699/swift-ios-skills.

Install
Terminal · npx
$npx skills add https://github.com/vercel-labs/agent-skills --skill vercel-react-native-skills
Works with Paperclip

How Natural Language fits into a Paperclip company.

Natural Language drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

S
SaaS FactoryPaired

Pre-configured AI company — 18 agents, 18 skills, one-time purchase.

$27$59
Explore pack
Source file
SKILL.md412 lines
Expand
---name: natural-languagedescription: "Tokenize, tag, and analyze natural language text using Apple's NaturalLanguage framework and translate between languages with the Translation framework. Use when adding language identification, sentiment analysis, named entity recognition, part-of-speech tagging, text embeddings, or in-app translation to iOS/macOS/visionOS apps."--- # NaturalLanguage + Translation Analyze natural language text for tokenization, part-of-speech tagging, namedentity recognition, sentiment analysis, language identification, and word/sentenceembeddings. Translate text between languages with the Translation framework.Targets Swift 6.3 / iOS 26+. > This skill covers two related frameworks: **NaturalLanguage** (`NLTokenizer`, `NLTagger`, `NLEmbedding`) for on-device text analysis, and **Translation** (`TranslationSession`, `LanguageAvailability`) for language translation. ## Contents - [Setup](#setup)- [Tokenization](#tokenization)- [Language Identification](#language-identification)- [Part-of-Speech Tagging](#part-of-speech-tagging)- [Named Entity Recognition](#named-entity-recognition)- [Sentiment Analysis](#sentiment-analysis)- [Text Embeddings](#text-embeddings)- [Translation](#translation)- [Common Mistakes](#common-mistakes)- [Review Checklist](#review-checklist)- [References](#references) ## Setup Import `NaturalLanguage` for text analysis and `Translation` for languagetranslation. No special entitlements or capabilities are required forNaturalLanguage. Translation requires iOS 17.4+ / macOS 14.4+. ```swiftimport NaturalLanguageimport Translation``` NaturalLanguage classes (`NLTokenizer`, `NLTagger`) are **not thread-safe**.Use each instance from one thread or dispatch queue at a time. ## Tokenization Segment text into words, sentences, or paragraphs with `NLTokenizer`. ```swiftimport NaturalLanguage func tokenizeWords(in text: String) -> [String] {    let tokenizer = NLTokenizer(unit: .word)    tokenizer.string = text     let range = text.startIndex..<text.endIndex    return tokenizer.tokens(for: range).map { String(text[$0]) }}``` ### Token Units | Unit | Description ||---|---|| `.word` | Individual words || `.sentence` | Sentences || `.paragraph` | Paragraphs || `.document` | Entire document | ### Enumerating with Attributes Use `enumerateTokens(in:using:)` to detect numeric or emoji tokens. ```swiftlet tokenizer = NLTokenizer(unit: .word)tokenizer.string = text tokenizer.enumerateTokens(in: text.startIndex..<text.endIndex) { range, attributes in    if attributes.contains(.numeric) {        print("Number: \(text[range])")    }    return true // continue enumeration}``` ## Language Identification Detect the dominant language of a string with `NLLanguageRecognizer`. ```swiftfunc detectLanguage(for text: String) -> NLLanguage? {    NLLanguageRecognizer.dominantLanguage(for: text)} // Multiple hypotheses with confidence scoresfunc languageHypotheses(for text: String, max: Int = 5) -> [NLLanguage: Double] {    let recognizer = NLLanguageRecognizer()    recognizer.processString(text)    return recognizer.languageHypotheses(withMaximum: max)}``` Constrain the recognizer to expected languages for better accuracy on short text. ```swiftlet recognizer = NLLanguageRecognizer()recognizer.languageConstraints = [.english, .french, .spanish]recognizer.processString(text)let detected = recognizer.dominantLanguage``` ## Part-of-Speech Tagging Identify nouns, verbs, adjectives, and other lexical classes with `NLTagger`. ```swiftfunc tagPartsOfSpeech(in text: String) -> [(String, NLTag)] {    let tagger = NLTagger(tagSchemes: [.lexicalClass])    tagger.string = text     var results: [(String, NLTag)] = []    let range = text.startIndex..<text.endIndex    let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]     tagger.enumerateTags(in: range, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in        if let tag {            results.append((String(text[tokenRange]), tag))        }        return true    }    return results}``` ### Common Tag Schemes | Scheme | Output ||---|---|| `.lexicalClass` | Part of speech (noun, verb, adjective) || `.nameType` | Named entity type (person, place, organization) || `.nameTypeOrLexicalClass` | Combined NER + POS || `.lemma` | Base form of a word || `.language` | Per-token language || `.sentimentScore` | Sentiment polarity score | ## Named Entity Recognition Extract people, places, and organizations. ```swiftfunc extractEntities(from text: String) -> [(String, NLTag)] {    let tagger = NLTagger(tagSchemes: [.nameType])    tagger.string = text     var entities: [(String, NLTag)] = []    let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace, .joinNames]     tagger.enumerateTags(        in: text.startIndex..<text.endIndex,        unit: .word,        scheme: .nameType,        options: options    ) { tag, tokenRange in        if let tag, tag != .other {            entities.append((String(text[tokenRange]), tag))        }        return true    }    return entities}// NLTag values: .personalName, .placeName, .organizationName``` ## Sentiment Analysis Score text sentiment from -1.0 (negative) to +1.0 (positive). ```swiftfunc sentimentScore(for text: String) -> Double? {    let tagger = NLTagger(tagSchemes: [.sentimentScore])    tagger.string = text     let (tag, _) = tagger.tag(        at: text.startIndex,        unit: .paragraph,        scheme: .sentimentScore    )    return tag.flatMap { Double($0.rawValue) }}``` ## Text Embeddings Measure semantic similarity between words or sentences with `NLEmbedding`. ```swiftfunc wordSimilarity(_ word1: String, _ word2: String) -> Double? {    guard let embedding = NLEmbedding.wordEmbedding(for: .english) else { return nil }    return embedding.distance(between: word1, and: word2, distanceType: .cosine)} func findSimilarWords(to word: String, count: Int = 5) -> [(String, Double)] {    guard let embedding = NLEmbedding.wordEmbedding(for: .english) else { return [] }    return embedding.neighbors(for: word, maximumCount: count, distanceType: .cosine)}``` Sentence embeddings compare entire sentences. ```swiftfunc sentenceSimilarity(_ s1: String, _ s2: String) -> Double? {    guard let embedding = NLEmbedding.sentenceEmbedding(for: .english) else { return nil }    return embedding.distance(between: s1, and: s2, distanceType: .cosine)}``` ## Translation ### System Translation Overlay Show the built-in translation UI with `.translationPresentation()`. ```swiftimport SwiftUIimport Translation struct TranslatableView: View {    @State private var showTranslation = false    let text = "Hello, how are you?"     var body: some View {        Text(text)            .onTapGesture { showTranslation = true }            .translationPresentation(                isPresented: $showTranslation,                text: text            )    }}``` ### Programmatic Translation Use `.translationTask()` for programmatic translations within a view context. ```swiftstruct TranslatingView: View {    @State private var translatedText = ""    @State private var configuration: TranslationSession.Configuration?     var body: some View {        VStack {            Text(translatedText)            Button("Translate") {                configuration = .init(source: Locale.Language(identifier: "en"),                                      target: Locale.Language(identifier: "es"))            }        }        .translationTask(configuration) { session in            let response = try await session.translate("Hello, world!")            translatedText = response.targetText        }    }}``` ### Batch Translation Translate multiple strings in a single session. ```swift.translationTask(configuration) { session in    let requests = texts.enumerated().map { index, text in        TranslationSession.Request(sourceText: text,                                    clientIdentifier: "\(index)")    }    let responses = try await session.translations(from: requests)    for response in responses {        print("\(response.sourceText) -> \(response.targetText)")    }}``` ### Checking Language Availability ```swiftlet availability = LanguageAvailability()let status = await availability.status(    from: Locale.Language(identifier: "en"),    to: Locale.Language(identifier: "ja"))switch status {case .installed: break    // Ready to translate offlinecase .supported: break    // Needs downloadcase .unsupported: break  // Language pair not available}``` ## Common Mistakes ### DON'T: Share NLTagger/NLTokenizer across threads These classes are not thread-safe and will produce incorrect results or crash. ```swift// WRONGlet sharedTagger = NLTagger(tagSchemes: [.lexicalClass])DispatchQueue.concurrentPerform(iterations: 10) { _ in    sharedTagger.string = someText  // Data race} // CORRECTawait withTaskGroup(of: Void.self) { group in    for _ in 0..<10 {        group.addTask {            let tagger = NLTagger(tagSchemes: [.lexicalClass])            tagger.string = someText            // process...        }    }}``` ### DON'T: Confuse NaturalLanguage with Core ML NaturalLanguage provides built-in linguistic analysis. Use Core ML for customtrained models. They complement each other via `NLModel`. ```swift// WRONG: Trying to do NER with raw Core MLlet coreMLModel = try MLModel(contentsOf: modelURL) // CORRECT: Use NLTagger for built-in NERlet tagger = NLTagger(tagSchemes: [.nameType]) // Or load a custom Core ML model via NLModellet nlModel = try NLModel(mlModel: coreMLModel)tagger.setModels([nlModel], forTagScheme: .nameType)``` ### DON'T: Assume embeddings exist for all languages Not all languages have word or sentence embeddings available on device. ```swift// WRONG: Force unwraplet embedding = NLEmbedding.wordEmbedding(for: .japanese)! // CORRECT: Handle nilguard let embedding = NLEmbedding.wordEmbedding(for: .japanese) else {    // Embedding not available for this language    return}``` ### DON'T: Create a new tagger per token Creating and configuring a tagger is expensive. Reuse it for the same text. ```swift// WRONG: New tagger per wordfor word in words {    let tagger = NLTagger(tagSchemes: [.lexicalClass])    tagger.string = word} // CORRECT: Set string once, enumeratelet tagger = NLTagger(tagSchemes: [.lexicalClass])tagger.string = fullTexttagger.enumerateTags(in: fullText.startIndex..<fullText.endIndex,                     unit: .word, scheme: .lexicalClass, options: []) { tag, range in    return true}``` ### DON'T: Ignore language hints for short text Language detection on short strings (under ~20 characters) is unreliable.Set constraints or hints to improve accuracy. ```swift// WRONG: Detect language of a single wordlet lang = NLLanguageRecognizer.dominantLanguage(for: "chat")  // French or English? // CORRECT: Provide contextlet recognizer = NLLanguageRecognizer()recognizer.languageHints = [.english: 0.8, .french: 0.2]recognizer.processString("chat")``` ## Review Checklist - [ ] `NLTokenizer` and `NLTagger` instances used from a single thread- [ ] Tagger created once per text, not per token- [ ] Language detection uses constraints/hints for short text- [ ] `NLEmbedding` availability checked before use (returns nil if unavailable)- [ ] Translation `LanguageAvailability` checked before attempting translation- [ ] `.translationTask()` used within a SwiftUI view hierarchy- [ ] Batch translation uses `clientIdentifier` to match responses to requests- [ ] Sentiment scores handled as optional (may return nil for unsupported languages)- [ ] `.joinNames` option used with NER to keep multi-word names together- [ ] Custom ML models loaded via `NLModel`, not raw Core ML ## References - Extended patterns (custom models, contextual embeddings, gazetteers): [references/translation-patterns.md](references/translation-patterns.md)- [Natural Language framework](https://sosumi.ai/documentation/naturallanguage)- [NLTokenizer](https://sosumi.ai/documentation/naturallanguage/nltokenizer)- [NLTagger](https://sosumi.ai/documentation/naturallanguage/nltagger)- [NLEmbedding](https://sosumi.ai/documentation/naturallanguage/nlembedding)- [NLLanguageRecognizer](https://sosumi.ai/documentation/naturallanguage/nllanguagerecognizer)- [Translation framework](https://sosumi.ai/documentation/translation)- [TranslationSession](https://sosumi.ai/documentation/translation/translationsession)- [LanguageAvailability](https://sosumi.ai/documentation/translation/languageavailability)