Two Training Phases, Two Different Goals
When people talk about “training an LLM,” they often mix together two distinct processes: pretraining and fine-tuning. They both involve adjusting the model’s internal parameters, but they differ in purpose, data, cost, and the kinds of behavior changes you should expect.
Pretraining is about building broad language competence: the model learns general patterns of text, code, and structured language from very large datasets. The objective is usually to predict the next token (or fill in missing text), which forces the model to internalize grammar, facts that appear frequently in data, common reasoning patterns, and many domain conventions.
Fine-tuning is about specializing or aligning: you start from a pretrained model and train it further on a narrower dataset so it behaves in a more specific way. Fine-tuning can teach a model a particular writing style, a domain vocabulary, a task format, or a set of safety and policy preferences.
A useful mental model is: pretraining creates a capable “generalist,” while fine-tuning turns that generalist into a “specialist” (or a “well-behaved assistant”) without rebuilding everything from scratch.
What Changes Inside the Model?
Both pretraining and fine-tuning update the model’s parameters (the weights). The difference is the scope and strength of the update.
Continue in our app.
You can listen to the audiobook with the screen off, receive a free certificate for this course, and also have access to 5,000 other free online courses.
Or continue reading below...Download the app
- Pretraining updates are massive and foundational. The model sees huge, diverse corpora and learns broad statistical regularities. This is where most of the model’s general capabilities come from.
- Fine-tuning updates are targeted. The model sees a much smaller, curated dataset and shifts its behavior in specific directions. The goal is not to relearn language, but to adjust how the model uses what it already knows.
Because fine-tuning is narrower, it can produce dramatic changes in output style and task performance, even though the amount of training data is tiny compared to pretraining. At the same time, fine-tuning can also introduce risks like overfitting or “forgetting” some general behaviors if done carelessly.
Pretraining: Learning General Language and World Regularities
Typical objective
Most modern LLMs are pretrained with an objective like next-token prediction. The model is shown a sequence and learns to predict what comes next. Over billions or trillions of tokens, this encourages the model to represent syntax, semantics, discourse structure, and many domain patterns.
Data characteristics
Pretraining data is typically:
- Large-scale (orders of magnitude larger than any fine-tuning set).
- Diverse (many topics, writing styles, and formats such as prose, documentation, code, and Q&A).
- Noisy (contains contradictions, outdated info, inconsistent quality, and mixed intents).
This diversity is why pretrained models can respond to many prompts, but it is also why they can produce inconsistent tone, follow unwanted patterns, or reflect biases present in data.
Compute and time
Pretraining is expensive. It requires large clusters of GPUs/TPUs, careful distributed training, and long runs. This is why most teams do not pretrain from scratch; they start from an existing pretrained model and adapt it.
What pretraining is good for
- General language fluency across many topics.
- Broad coding and text transformation abilities.
- Learning common formats (emails, summaries, lists, documentation styles).
- Capturing widely repeated facts and conventions from the training distribution.
What pretraining is not good for
- Guaranteed correctness. The model learns patterns, not a verified database of truth.
- Strict adherence to your organization’s policies. Unless those policies are strongly represented in training data, the model will not reliably follow them.
- Consistent behavior across edge cases. Pretraining alone does not ensure the model will refuse unsafe requests or follow a specific response format every time.
Fine-Tuning: Specializing Behavior After Pretraining
Fine-tuning starts with a pretrained model and continues training on a smaller dataset to shape outputs. There are multiple flavors of fine-tuning, and the term is sometimes used broadly. The most common practical categories are:
- Supervised fine-tuning (SFT): Train on input-output pairs (prompt → ideal answer). This teaches the model to respond in a desired format and style.
- Preference-based tuning (often via reward modeling and optimization): Train the model to prefer certain responses over others (helpful, harmless, policy-compliant). This is commonly used to align assistant behavior.
- Parameter-efficient fine-tuning (PEFT): Update only a small set of parameters (e.g., adapters/LoRA) rather than the whole model, reducing cost and making deployment easier.
What fine-tuning is good for
- Consistent formatting: Always output JSON with specific keys, always use a certain template, always include citations in a specific form (if your data shows that).
- Domain style and terminology: Customer support tone, legal drafting conventions, medical note structure (with appropriate safeguards).
- Task specialization: Classification, extraction, routing, or structured generation for a narrow set of tasks.
- Policy alignment: Teaching refusal patterns, safe completion styles, and compliance with internal guidelines.
What fine-tuning is not good for
- Injecting large amounts of new factual knowledge reliably. Fine-tuning can add some knowledge, but it is not a dependable “database update.” If you need up-to-date facts, you typically use external data sources at inference time.
- Fixing fundamental capability gaps. If the base model cannot do multi-step reasoning or handle long instructions well, fine-tuning may help at the margins but won’t fully transform it.
- Replacing good prompting and evaluation. Fine-tuning without clear specs and tests often yields a model that is confidently wrong in a more consistent style.
Choosing Between Pretraining and Fine-Tuning (and When You Need Neither)
In practice, you rarely choose “pretraining vs fine-tuning” as equals. Pretraining from scratch is a major research and infrastructure effort. Fine-tuning is the common adaptation tool. But there is a third option: no additional training, just careful prompting and system-level constraints.
Use a pretrained model as-is when
- Your task is general (summarization, rewriting, brainstorming, basic Q&A).
- You can tolerate some variability in tone and format.
- You can specify requirements clearly in the prompt and validate outputs downstream.
Use fine-tuning when
- You need consistent outputs (format, tone, refusal behavior) across many calls.
- You have a stable task definition and enough high-quality examples.
- You want to reduce prompt length by baking instructions into the model’s behavior.
Consider pretraining (or continued pretraining) when
- You have a large corpus in a specialized language or format not well covered by existing models (e.g., internal codebase patterns, niche technical notation).
- You need a base model that is fundamentally better at your domain before alignment and instruction-following.
- You have the compute budget and expertise to manage large-scale training and evaluation.
Some teams do continued pretraining (also called domain-adaptive pretraining): you take a pretrained model and keep training it on a large domain corpus using the original pretraining objective. This can improve domain fluency before doing instruction fine-tuning. It sits between “pretraining from scratch” and “fine-tuning for instructions.”
Practical Example: Same Base Model, Different Outcomes
Imagine you have a general pretrained model and you want it to act as a customer support assistant for a specific product.
- Without fine-tuning: You can prompt it with guidelines, but it may drift in tone, forget required disclaimers, or invent policies.
- With supervised fine-tuning: You train on thousands of real (sanitized) support conversations rewritten into ideal responses. The model learns the preferred tone, the step-by-step troubleshooting flow, and the standard closing actions (e.g., “offer to create a ticket”).
- With preference-based tuning: You add comparisons where one response follows policy and another violates it. The model learns to prefer compliant behavior even when the user pushes.
The base knowledge of language and general problem-solving comes from pretraining; the reliable “support agent persona” comes from fine-tuning.
Step-by-Step: Designing a Fine-Tuning Project
Fine-tuning succeeds when you treat it like an engineering project with clear specifications and tests, not like a magical upgrade.
Step 1: Write a behavior specification
Define what “good” means in observable terms. Examples:
- Output must be valid JSON with keys:
action,confidence,message. - Never claim to have performed real-world actions (e.g., “I reset your account”) unless the system actually did.
- When uncertain, ask exactly one clarifying question.
This spec becomes your labeling guide and your evaluation checklist.
Step 2: Decide the tuning type
- SFT if you have clear ideal answers and want consistent formatting and tone.
- Preference-based if you want to encode “better vs worse” judgments, especially for safety and policy compliance.
- PEFT if you need lower cost, faster iteration, or multiple specialized variants.
Step 3: Build a high-quality dataset
Quality matters more than size once you have enough coverage. Your dataset should include:
- Typical cases (the majority of real requests).
- Edge cases (ambiguous inputs, missing info, conflicting instructions).
- Adversarial cases (users trying to bypass policy or force the model to output restricted content).
- Negative examples (what the model should not do), especially useful for preference tuning.
Keep prompts and outputs consistent. If you want the model to ask clarifying questions, include many examples where the ideal output is a clarifying question rather than a guess.
Step 4: Choose an evaluation set before training
Hold out a test set that represents real usage. Include automatic checks where possible:
- JSON validity checks.
- Regex checks for required fields.
- Policy keyword checks (as a baseline, not the only safeguard).
- Human review for nuanced criteria (helpfulness, tone, correctness).
If you don’t define success metrics up front, you can easily “improve” the model in ways that don’t matter in production.
Step 5: Run a baseline and error analysis
Before fine-tuning, run the base model with your best prompt and measure performance. Categorize failures:
- Format failures (invalid structure, missing fields).
- Instruction-following failures (ignores constraints).
- Knowledge/correctness failures (wrong facts, hallucinations).
- Safety/policy failures (answers it should refuse).
This tells you whether fine-tuning is the right lever. For example, if most failures are “wrong facts,” fine-tuning may not be the best fix; you might need external verification or retrieval.
Step 6: Train with conservative settings and iterate
Fine-tuning can overfit quickly. Practical tactics:
- Start with a small learning rate and fewer epochs.
- Monitor validation loss and task metrics.
- Prefer more diverse examples over many near-duplicates.
Iterate by adding examples that target observed failure modes. This is often more effective than simply increasing dataset size.
Step 7: Test for regressions
After fine-tuning, check that you didn’t break general capabilities you still need. Common regressions include:
- Reduced helpfulness on general queries.
- Over-refusal (model refuses safe requests).
- Overconfidence in narrow domain answers.
Keep a “general ability” test suite alongside your domain suite.
How Fine-Tuning Interacts With Prompting
Fine-tuning does not eliminate the need for prompts; it changes what prompts need to do.
- Before fine-tuning: Prompts often include long instruction blocks (“You are a helpful assistant… output JSON… follow these rules…”).
- After fine-tuning: Many of those rules can be shortened because the model has learned them. Prompts can focus on the variable parts: the user’s request and any runtime context.
A practical workflow is to start with prompting, collect failures, and then fine-tune to reduce recurring failures. This avoids training a model before you understand the real distribution of requests.
Common Failure Modes and How to Address Them
Overfitting to the training style
If your fine-tuning data always uses the same phrasing, the model may mimic it too strongly and become brittle. Fix by diversifying prompts and including paraphrases.
Catastrophic forgetting (behavior drift)
Heavy fine-tuning can degrade general skills. Mitigations include:
- Using smaller learning rates and fewer epochs.
- Mixing in a small amount of general instruction data.
- Using PEFT methods to limit how much the base model changes.
Teaching the model to “sound right” rather than “be right”
If your dataset rewards confident, polished answers even when uncertain, the model may learn to bluff. Include examples where the correct behavior is to ask a clarifying question, state uncertainty, or provide a verification step.
Misalignment with real production inputs
If training prompts are clean but real user inputs are messy, performance will disappoint. Include realistic noise: typos, incomplete messages, mixed languages, and short fragments.
Pretraining vs Fine-Tuning in Terms of Product Requirements
Latency and cost at inference time
Pretraining and fine-tuning happen offline, but they influence inference costs indirectly. A well-fine-tuned model can reduce the need for long prompts and repeated clarifications, which can lower token usage and speed up responses.
Maintainability
Fine-tuning creates a versioned artifact (a model checkpoint or adapter). That can be easier to manage than a complex prompt that evolves informally. However, it also introduces a lifecycle: dataset updates, retraining, regression testing, and deployment controls.
Governance and compliance
If you need the model to follow strict rules, fine-tuning can help, but you still need external safeguards (input filtering, output validation, and logging). Fine-tuning improves average behavior; it does not guarantee perfect compliance.
Mini Walkthrough: Creating an SFT Dataset for Structured Output
Suppose you want the model to convert free-form user requests into a structured action plan for an internal tool. You want consistent JSON output.
1) Define the schema
{ "intent": "...", "entities": { "...": "..." }, "missing_info": ["..."], "next_step": "..."}2) Write labeling rules
- intent must be one of a fixed set (e.g.,
reset_password,update_billing,cancel_account). - entities includes only fields explicitly present in the user text.
- missing_info lists what is required to proceed (e.g., account email) if absent.
- next_step is either a clarifying question or a safe instruction for the user.
3) Create examples that cover ambiguity
Include cases like:
- User: “I can’t get in.” (ambiguous: password? MFA? locked account?)
- User: “Cancel it.” (missing which subscription/product)
- User: “Change my card to the one ending 1234.” (entity present but may require verification)
4) Add counterexamples
Include examples where the model must not fabricate entities:
- User does not provide an email → model must not invent one.
- User does not specify product → model must ask which product.
5) Train, then validate with strict parsers
After fine-tuning, run outputs through a JSON parser and schema validator. Track failure rates and feed the most common failures back into the dataset as new training examples.
When Fine-Tuning Is the Wrong Tool
Fine-tuning is attractive because it feels like “making the model smarter,” but many real problems are better solved elsewhere:
- Need up-to-date or authoritative facts: Use external sources and verification rather than trying to bake facts into weights.
- Need deterministic business rules: Implement rules in code and use the model for interpretation and drafting, not for final decisions.
- Need guaranteed formatting: Fine-tuning helps, but you should still validate and repair outputs programmatically.
In other words, fine-tuning is best for shaping language behavior and consistency, not for replacing system design.