Fine-tuning
Technique
Definition
Continuing to train a pre-trained model on a domain-specific or task-specific dataset to specialize its behavior. Ranges from full fine-tuning (updating all weights) to parameter-efficient methods like LoRA and QLoRA.
Related Terms
LoRA (Low-Rank Adaptation)
A parameter-efficient fine-tuning technique that adds small trainable rank-decomposition matrices alongside frozen model layers. Achieves performance close to full fine-tuning while training less than 1% of the parameters. An industry standard for adapting LLMs.
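The idea can be sketched in a few lines: the frozen weight `W` is left untouched, and a low-rank update `B @ A` is added to its output. This is a minimal numpy illustration, not a real training setup; all names (`W`, `A`, `B`, `lora_forward`) and the dimensions are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2                          # hidden dimension, low rank (r << d)

W = rng.normal(size=(d, d))          # frozen pre-trained weight (not trained)
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

def lora_forward(x, alpha=16.0):
    # y = x W^T + x (B A)^T * (alpha / r)
    # B is zero at initialization, so the adapted model starts out
    # producing exactly the same outputs as the frozen base model.
    return x @ W.T + (x @ A.T @ B.T) * (alpha / r)

x = rng.normal(size=(1, d))
base = x @ W.T
adapted = lora_forward(x)

# Only A and B are trained: 2*r*d parameters instead of d*d.
trainable = A.size + B.size
frozen = W.size
```

Because only `A` and `B` receive gradients, the optimizer state and the checkpoint of the adapter stay tiny, which is what makes the method parameter-efficient.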
QLoRA (Quantized LoRA)
Combines 4-bit quantization of the frozen base model with LoRA fine-tuning, enabling fine-tuning of 65B-parameter models on a single 48 GB GPU. Published by Tim Dettmers et al. (2023), it made fine-tuning of very large models broadly accessible.
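The core trick can be sketched as: store the frozen weights in 4 bits, dequantize them on the fly for the forward pass, and apply a full-precision LoRA correction on top. This numpy sketch uses simple absmax quantization to 16 signed levels for illustration (the actual QLoRA paper uses the NF4 data type, which is more involved); all names and shapes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)   # frozen base weight

def quantize_4bit(w):
    # Absmax quantization to signed 4-bit integers in [-8, 7]
    # (illustrative; QLoRA itself uses the NF4 format).
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

q, scale = quantize_4bit(W)
W_hat = dequantize(q, scale)        # lossy reconstruction of W

# Full-precision LoRA adapter trained on top of the quantized base.
r = 2
A = rng.normal(size=(r, 8)).astype(np.float32) * 0.01
B = np.zeros((8, r), dtype=np.float32)

x = rng.normal(size=(1, 8)).astype(np.float32)
y = x @ W_hat.T + x @ A.T @ B.T     # quantized base + LoRA delta
```

Memory drops because the base weights occupy 4 bits each instead of 16 or 32, while only the small `A` and `B` matrices are kept in full precision and trained.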
RLHF (Reinforcement Learning from Human Feedback)
Training LLMs using human preference signals: human raters compare model outputs, a reward model is trained on these preferences, then the LLM is fine-tuned via RL to maximize the reward. Used to align ChatGPT, Claude, and similar assistants.
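The reward-model step above is typically trained with a Bradley-Terry-style preference loss: the loss is small when the model scores the human-preferred output above the rejected one. A minimal numpy sketch, with scalar rewards standing in for reward-model outputs; the function name is an assumption.

```python
import numpy as np

def reward_model_loss(r_chosen, r_rejected):
    # Bradley-Terry preference loss: -log sigmoid(r_chosen - r_rejected).
    # Minimized when the reward model assigns a higher score to the
    # output that human raters preferred.
    margin = r_chosen - r_rejected
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

correct_ranking = reward_model_loss(2.0, 0.0)    # preferred output scored higher
wrong_ranking = reward_model_loss(0.0, 2.0)      # preferred output scored lower
```

Once trained, the reward model's score is used as the RL objective (usually with a KL penalty against the original model) in the final fine-tuning stage.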
DPO (Direct Preference Optimization)
An alignment training method that optimizes the model directly on human preference pairs (preferred vs. rejected responses) without training a separate reward model. Simpler and more stable than RLHF, and increasingly preferred for preference tuning.
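The DPO objective can be written directly in terms of log-probabilities: an implicit reward is defined as the log-probability ratio between the policy and a frozen reference model, and a sigmoid loss pushes the margin between chosen and rejected responses apart. A minimal numpy sketch with scalar sequence log-probabilities; the function name and values are illustrative assumptions.

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # Implicit reward: beta * (log pi(y|x) - log pi_ref(y|x)).
    # The loss is -log sigmoid of the reward margin between the
    # chosen and rejected responses, so no reward model is needed.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))

# Policy assigns more probability to the chosen response than the
# reference does, and less to the rejected one: low loss.
good = dpo_loss(-1.0, -2.0, -1.5, -1.5)
# The reverse preference: high loss.
bad = dpo_loss(-2.0, -1.0, -1.5, -1.5)
```

`beta` controls how far the policy may drift from the reference model, playing a role analogous to the KL penalty coefficient in RLHF.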