Fine-tuning
Training a base model on a smaller, specific dataset to teach it a new skill or voice.
Basic
Fine-tuning takes a pre-trained foundation model and continues training it on task-specific data. Common use cases: customer-support tone, domain knowledge (legal, medical), structured output formats, and specific language dialects. Methods range from full fine-tuning (update all parameters, expensive) to LoRA or QLoRA (update only a small adapter, cheap); as of 2026, most production fine-tuning uses LoRA.
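A minimal sketch of the LoRA idea with toy dimensions (all names and sizes here are illustrative, not a real training loop): the base weight W stays frozen, only a small low-rank adapter BA is trained, and discarding the adapter recovers the base model exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, r = 64, 64, 8                 # toy layer dims and LoRA rank (illustrative)

W = rng.normal(size=(d, k))         # frozen base weight: never updated
A = rng.normal(size=(r, k)) * 0.01  # trainable adapter factor
B = np.zeros((d, r))                # zero-initialized, so BA starts as a no-op

def forward(x, use_adapter=True):
    """Base output plus the low-rank LoRA delta B @ A."""
    delta = B @ A if use_adapter else np.zeros((d, k))
    return (W + delta) @ x

x = rng.normal(size=k)
# Until B is trained away from zero, the adapter changes nothing:
assert np.allclose(forward(x, True), forward(x, False))
```

"Reversible" here is literal: serving with `use_adapter=False` (or deleting A and B) gives back the untouched base model.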
Deep
Fine-tuning recipes: Supervised Fine-Tuning (SFT) on labeled demonstrations is the most common. Direct Preference Optimization (DPO) trains on preference pairs and is often preferred over RLHF for alignment because it skips reward modeling; RLHF is still used for safety-critical alignment. LoRA (Low-Rank Adaptation) trains small adapter matrices while freezing the base weights, making it 10-100× cheaper than a full fine-tune and fully recoverable to the base model. QLoRA adds 4-bit quantization to LoRA for consumer-GPU training. Cost: a full fine-tune of a 7B model runs roughly $5-20K on rented H100s; the same workload with LoRA runs roughly $50-500. Adapter-based fine-tuning is reversible: discard the adapter and you are back to the base model.
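SFT is ordinary next-token cross-entropy on the demonstration data. A toy sketch of that objective (tiny vocabulary and hand-written logits, not a real model):

```python
import math

def sft_loss(logits, target_ids):
    """Mean cross-entropy of the demonstration tokens under the model's logits.

    logits: per-position lists of raw scores over the vocabulary
    target_ids: the demonstration tokens the model should reproduce
    """
    total = 0.0
    for scores, t in zip(logits, target_ids):
        log_z = math.log(sum(math.exp(s) for s in scores))
        total += log_z - scores[t]      # -log p(target token)
    return total / len(target_ids)

# A model that scores the demo tokens highly gets low loss:
good = sft_loss([[5.0, 0.0, 0.0], [0.0, 5.0, 0.0]], [0, 1])
bad  = sft_loss([[0.0, 5.0, 0.0], [5.0, 0.0, 0.0]], [0, 1])
assert good < bad
```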
Expert
LoRA approximates the fine-tune delta ΔW as BA, where B ∈ R^{d×r} and A ∈ R^{r×k} with r ≪ min(d, k). Typical ranks are r = 8-64; at rank 16 the adapter holds under 0.1% of the full weight count. QLoRA applies NF4 quantization to the frozen base so fine-tuning fits on single-GPU setups, at a quality cost of roughly 1-2 benchmark points. DPO loss: -log σ(β(log π_θ(y_w|x) - log π_θ(y_l|x) - log π_ref(y_w|x) + log π_ref(y_l|x))). Catastrophic forgetting is the main risk; mitigate it with low learning rates (~1e-5), few epochs (1-3), and by mixing general-purpose data into the training set.
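Both the DPO loss above and the adapter-size claim can be checked numerically. A sketch with illustrative log-probabilities and a single 4096×4096 weight matrix (real fractions depend on which matrices LoRA targets):

```python
import math

def dpo_loss(beta, pi_w, pi_l, ref_w, ref_l):
    """-log σ(β((log π_θ(y_w) - log π_θ(y_l)) - (log π_ref(y_w) - log π_ref(y_l)))).

    Arguments are log-probabilities of the chosen (w) and rejected (l) responses
    under the policy (pi_*) and the frozen reference model (ref_*).
    """
    margin = beta * ((pi_w - pi_l) - (ref_w - ref_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Widening the policy's preference margin over the reference lowers the loss:
assert dpo_loss(0.1, -1.0, -5.0, -2.0, -2.0) < dpo_loss(0.1, -2.0, -2.0, -2.0, -2.0)

# Adapter size at rank 16 for one 4096x4096 weight matrix:
d = k = 4096
r = 16
adapter = r * (d + k)          # parameters in B (d x r) plus A (r x k)
full = d * k
assert adapter / full < 0.01   # ~0.8% of this matrix; across a whole model
                               # (embeddings, MLPs untouched) it falls below 0.1%
```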
Depending on why you're here
- SFT, DPO, and RLHF are the three common recipes
- LoRA/QLoRA for parameter-efficient tuning
- Catastrophic forgetting is the main pitfall
- Start with LoRA: cheapest, fastest, reversible
- Need ~1K-10K high-quality examples for most tasks
- Fine-tune when RAG can't solve it: tone, format, or latency-critical knowledge
- Fine-tuning infra (Together, Replicate, Hugging Face) is commoditizing fast
- The real moat is in domain-specific datasets and tuning recipes
- Enterprise fine-tuning revenue concentrates at closed-model providers
- Teaching a smart AI a new trick without retraining it from scratch
- Cheaper than building your own AI
- Turns a general-purpose model into a specialist
Often confused with
RAG injects context at query time with no weight changes; fine-tuning changes the weights. Use RAG for fast-changing knowledge, fine-tuning for stable patterns.
Prompt engineering changes the inputs; fine-tuning changes the model itself. Prompts are free; fine-tuning costs money but yields more reliable behavior.
Most teams reaching for fine-tuning should reach for better RAG first. When RAG isn't enough, try LoRA before a full fine-tune, always.
Read the primary sources
- LoRA paper (Microsoft, 2021), arxiv.org
- DPO paper (Stanford, 2023), arxiv.org
- QLoRA paper (UW, 2023), arxiv.org