Claude Agent Skill · by Jeffallan

Fine Tuning Expert

Install Fine Tuning Expert skill for Claude Code from jeffallan/claude-skills.

Install
Terminal · npx
$npx skills add https://github.com/jeffallan/claude-skills --skill fine-tuning-expert
Works with Paperclip

How Fine Tuning Expert fits into a Paperclip company.

Fine Tuning Expert drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.

S
SaaS FactoryPaired

Pre-configured AI company — 18 agents, 18 skills, one-time purchase.

$27$59
Explore pack
Source file
SKILL.md162 lines
Expand
---name: fine-tuning-expertdescription: "Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets, setting hyperparameters for fine-tuning runs, adapter training, transfer learning, finetuning with Hugging Face PEFT, OpenAI fine-tuning, instruction tuning, RLHF, DPO, or quantizing and deploying fine-tuned models. Trigger terms include: LoRA, QLoRA, PEFT, finetuning, fine-tuning, adapter tuning, LLM training, model training, custom model."license: MITmetadata:  author: https://github.com/Jeffallan  version: "1.1.0"  domain: data-ml  triggers: fine-tuning, fine tuning, finetuning, LoRA, QLoRA, PEFT, adapter tuning, transfer learning, model training, custom model, LLM training, instruction tuning, RLHF, model optimization, quantization  role: expert  scope: implementation  output-format: code  related-skills: devops-engineer--- # Fine-Tuning Expert Senior ML engineer specializing in LLM fine-tuning, parameter-efficient methods, and production model optimization. ## Core Workflow 1. **Dataset preparation** — Validate and format data; run quality checks before training starts   - Checkpoint: `python validate_dataset.py --input data.jsonl` — fix all errors before proceeding2. **Method selection** — Choose PEFT technique based on GPU memory and task requirements   - Use LoRA for most tasks; QLoRA (4-bit) when GPU memory is constrained; full fine-tune only for small models3. **Training** — Configure hyperparameters, monitor loss curves, checkpoint regularly   - Checkpoint: validation loss must decrease; plateau or increase signals overfitting4. **Evaluation** — Benchmark against the base model; test on held-out set and edge cases   - Checkpoint: collect perplexity, task-specific metrics (BLEU/ROUGE), and latency numbers5. **Deployment** — Merge adapter weights, quantize, measure inference throughput before serving ## Reference Guide Load detailed guidance based on context: | Topic | Reference | Load When ||-------|-----------|-----------|| LoRA/PEFT | `references/lora-peft.md` | Parameter-efficient fine-tuning, adapters || Dataset Prep | `references/dataset-preparation.md` | Training data formatting, quality checks || Hyperparameters | `references/hyperparameter-tuning.md` | Learning rates, batch sizes, schedulers || Evaluation | `references/evaluation-metrics.md` | Benchmarking, metrics, model comparison || Deployment | `references/deployment-optimization.md` | Model merging, quantization, serving | ## Minimal Working Example — LoRA Fine-Tuning with Hugging Face PEFT ```pythonfrom datasets import load_datasetfrom transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArgumentsfrom peft import LoraConfig, get_peft_model, TaskTypefrom trl import SFTTrainerimport torch # 1. Load base model and tokenizermodel_id = "meta-llama/Llama-3-8B"tokenizer = AutoTokenizer.from_pretrained(model_id)tokenizer.pad_token = tokenizer.eos_token model = AutoModelForCausalLM.from_pretrained(    model_id,    torch_dtype=torch.bfloat16,    device_map="auto",) # 2. Configure LoRA adapterlora_config = LoraConfig(    task_type=TaskType.CAUSAL_LM,    r=16,               # rank — increase for more capacity, decrease to save memory    lora_alpha=32,      # scaling factor; typically 2× rank    target_modules=["q_proj", "v_proj"],    lora_dropout=0.05,    bias="none",)model = get_peft_model(model, lora_config)model.print_trainable_parameters()  # verify: should be ~0.1–1% of total params # 3. Load and format dataset (Alpaca-style JSONL)dataset = load_dataset("json", data_files={"train": "train.jsonl", "test": "test.jsonl"}) def format_prompt(example):    return {"text": f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"} dataset = dataset.map(format_prompt) # 4. Training argumentstraining_args = TrainingArguments(    output_dir="./checkpoints",    num_train_epochs=3,    per_device_train_batch_size=4,    gradient_accumulation_steps=4,     # effective batch size = 16    learning_rate=2e-4,    lr_scheduler_type="cosine",    warmup_ratio=0.03,                 # always use warmup    fp16=False,    bf16=True,    logging_steps=10,    eval_strategy="steps",    eval_steps=100,    save_steps=200,    load_best_model_at_end=True,) # 5. Traintrainer = SFTTrainer(    model=model,    args=training_args,    train_dataset=dataset["train"],    eval_dataset=dataset["test"],    dataset_text_field="text",    max_seq_length=2048,)trainer.train() # 6. Save adapter weights onlymodel.save_pretrained("./lora-adapter")tokenizer.save_pretrained("./lora-adapter")``` **QLoRA variant** — add these lines before loading the model to enable 4-bit quantization:```pythonfrom transformers import BitsAndBytesConfig bnb_config = BitsAndBytesConfig(    load_in_4bit=True,    bnb_4bit_quant_type="nf4",    bnb_4bit_compute_dtype=torch.bfloat16,    bnb_4bit_use_double_quant=True,)model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")``` **Merge adapter into base model for deployment:**```pythonfrom peft import PeftModel base = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)merged = PeftModel.from_pretrained(base, "./lora-adapter").merge_and_unload()merged.save_pretrained("./merged-model")``` ## Constraints ### MUST DO- Validate dataset quality before training- Use parameter-efficient methods for large models (>7B)- Monitor training/validation loss curves- Document hyperparameters and training config- Version datasets and model checkpoints- Always include a learning rate warmup ### MUST NOT DO- Skip data quality validation- Overfit on small datasets — use regularisation (dropout, weight decay) and early stopping- Merge incompatible adapters (mismatched rank, base model, or target modules)- Deploy without evaluation against a held-out set and latency benchmark ## Output Templates When implementing fine-tuning, always provide:1. **Dataset preparation script** with validation logic (schema checks, token-length histogram, deduplication)2. **Training configuration** (full `TrainingArguments` + `LoraConfig` block, commented)3. **Evaluation script** reporting perplexity, task-specific metrics, and latency4. **Brief design rationale** — why this PEFT method, rank, and learning rate were chosen for this task