Name: Fine Tuning Expert
Author: Jeffallan

Install

$ npx skills add https://github.com/jeffallan/claude-skills --skill fine-tuning-expert

Source file

SKILL.md (162 lines, markdown)

---
name: fine-tuning-expert
description: "Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets, setting hyperparameters for fine-tuning runs, adapter training, transfer learning, finetuning with Hugging Face PEFT, OpenAI fine-tuning, instruction tuning, RLHF, DPO, or quantizing and deploying fine-tuned models. Trigger terms include: LoRA, QLoRA, PEFT, finetuning, fine-tuning, adapter tuning, LLM training, model training, custom model."
license: MIT
metadata:
  author: https://github.com/Jeffallan
  version: "1.1.0"
  domain: data-ml
  triggers: fine-tuning, fine tuning, finetuning, LoRA, QLoRA, PEFT, adapter tuning, transfer learning, model training, custom model, LLM training, instruction tuning, RLHF, model optimization, quantization
  role: expert
  scope: implementation
  output-format: code
  related-skills: devops-engineer
---

# Fine-Tuning Expert

Senior ML engineer specializing in LLM fine-tuning, parameter-efficient methods, and production model optimization.

## Core Workflow

1. **Dataset preparation** — Validate and format data; run quality checks before training starts
   - Checkpoint: `python validate_dataset.py --input data.jsonl` — fix all errors before proceeding
2. **Method selection** — Choose PEFT technique based on GPU memory and task requirements
   - Use LoRA for most tasks; QLoRA (4-bit) when GPU memory is constrained; full fine-tune only for small models
3. **Training** — Configure hyperparameters, monitor loss curves, checkpoint regularly
   - Checkpoint: validation loss should keep decreasing; a sustained rise while training loss still falls signals overfitting
4. **Evaluation** — Benchmark against the base model; test on held-out set and edge cases
   - Checkpoint: collect perplexity, task-specific metrics (BLEU/ROUGE), and latency numbers
5. **Deployment** — Merge adapter weights, quantize, measure inference throughput before serving

## Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| LoRA/PEFT | `references/lora-peft.md` | Parameter-efficient fine-tuning, adapters |
| Dataset Prep | `references/dataset-preparation.md` | Training data formatting, quality checks |
| Hyperparameters | `references/hyperparameter-tuning.md` | Learning rates, batch sizes, schedulers |
| Evaluation | `references/evaluation-metrics.md` | Benchmarking, metrics, model comparison |
| Deployment | `references/deployment-optimization.md` | Model merging, quantization, serving |

## Minimal Working Example — LoRA Fine-Tuning with Hugging Face PEFT

```python
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model, TaskType
from trl import SFTTrainer
import torch

# 1. Load base model and tokenizer
model_id = "meta-llama/Meta-Llama-3-8B"  # Hub repo ID for Llama 3 8B (gated; requires accepting the license)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# 2. Configure LoRA adapter
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,               # rank — increase for more capacity, decrease to save memory
    lora_alpha=32,      # scaling factor; typically 2× rank
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # verify: should be ~0.1–1% of total params

# 3. Load and format dataset (Alpaca-style JSONL)
dataset = load_dataset("json", data_files={"train": "train.jsonl", "test": "test.jsonl"})

def format_prompt(example):
    return {"text": f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"}

dataset = dataset.map(format_prompt)

# 4. Training arguments
training_args = TrainingArguments(
    output_dir="./checkpoints",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,     # effective batch size = 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,                 # always use warmup
    fp16=False,
    bf16=True,
    logging_steps=10,
    eval_strategy="steps",
    eval_steps=100,
    save_steps=200,                    # must be a multiple of eval_steps when loading the best model
    load_best_model_at_end=True,
)

# 5. Train
# Note: newer TRL releases expect dataset_text_field and the sequence-length
# limit on SFTConfig rather than on SFTTrainer; check the installed version.
trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    dataset_text_field="text",
    max_seq_length=2048,
)
trainer.train()

# 6. Save adapter weights only
model.save_pretrained("./lora-adapter")
tokenizer.save_pretrained("./lora-adapter")
```

**QLoRA variant** — to enable 4-bit quantization, load the base model with this configuration in place of the `from_pretrained` call in step 1:
```python
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
```

**Merge adapter into base model for deployment:**
```python
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "./lora-adapter").merge_and_unload()
merged.save_pretrained("./merged-model")
```

## Constraints

### MUST DO
- Validate dataset quality before training
- Use parameter-efficient methods for large models (>7B)
- Monitor training/validation loss curves
- Document hyperparameters and training config
- Version datasets and model checkpoints
- Always include a learning rate warmup

### MUST NOT DO
- Skip data quality validation
- Overfit on small datasets — use regularisation (dropout, weight decay) and early stopping
- Merge incompatible adapters (mismatched rank, base model, or target modules)
- Deploy without evaluation against a held-out set and latency benchmark

## Output Templates

When implementing fine-tuning, always provide:
1. **Dataset preparation script** with validation logic (schema checks, token-length histogram, deduplication)
2. **Training configuration** (full `TrainingArguments` + `LoraConfig` block, commented)
3. **Evaluation script** reporting perplexity, task-specific metrics, and latency
4. **Brief design rationale** — why this PEFT method, rank, and learning rate were chosen for this task
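
The skill names a `validate_dataset.py` checkpoint and asks for schema checks, a token-length histogram, and deduplication, but it does not show what such a script contains. The following is a minimal sketch of those three checks, not the author's script. It assumes the Alpaca-style `instruction`/`output` fields used in the example above; the function name `validate_jsonl_lines`, the 256-wide histogram buckets, and the whitespace word count used as a stand-in for real tokenizer lengths are all illustrative assumptions.

```python
import json
from collections import Counter

# Assumed schema: the Alpaca-style fields used by format_prompt in the example.
REQUIRED_FIELDS = ("instruction", "output")


def validate_jsonl_lines(lines, max_tokens=2048, bucket=256):
    """Check JSONL lines; return (clean_records, errors, length_histogram).

    Token counts are approximated by whitespace-separated words (an
    assumption); swap in the real tokenizer for accurate lengths.
    """
    clean, errors, seen = [], [], set()
    histogram = Counter()
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append(f"line {lineno}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {lineno}: expected a JSON object")
            continue
        # Schema check: every required field must be a non-empty string.
        missing = [
            field for field in REQUIRED_FIELDS
            if not isinstance(record.get(field), str) or not record[field].strip()
        ]
        if missing:
            errors.append(f"line {lineno}: missing or empty field(s): {', '.join(missing)}")
            continue
        # Deduplication on the exact (instruction, output) pair.
        key = (record["instruction"].strip(), record["output"].strip())
        if key in seen:
            errors.append(f"line {lineno}: duplicate example")
            continue
        seen.add(key)
        # Approximate length check against the training sequence limit.
        n_tokens = len(record["instruction"].split()) + len(record["output"].split())
        if n_tokens > max_tokens:
            errors.append(f"line {lineno}: about {n_tokens} tokens exceeds {max_tokens}")
            continue
        histogram[(n_tokens // bucket) * bucket] += 1  # bucket lower bound -> count
        clean.append(record)
    return clean, errors, dict(histogram)
```

A caller might run `clean, errors, hist = validate_jsonl_lines(open("data.jsonl", encoding="utf-8"))` and refuse to start training while `errors` is non-empty, which matches the "fix all errors before proceeding" checkpoint in the workflow.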
