flowchart LR
A[Large raw data] --> B[Pretraining]
B --> C[Base model]
C --> D[Fine-tuning on task/domain]
D --> E[Deployed model]
E --> F[Inference in production]
Pretraining
The model learns general language or multimodal patterns from very large datasets. This phase is expensive in compute and data, so it is usually done by model providers.
Fine-tuning
The model is adapted to a specific domain's behavior, tone, format, or tasks using curated examples.
Example: fine-tuning a model to summarize legal contracts in a consistent style.
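To make "curated examples" concrete, here is a minimal sketch of how such a dataset might be prepared. It assumes a JSONL layout with hypothetical `prompt` and `completion` fields; real fine-tuning services each define their own schema, so treat the field names and the sample contract text as illustrative only.

```python
import json

# Hypothetical curated examples: each pairs a contract excerpt with a
# summary written in the target style. Field names are assumptions.
examples = [
    {
        "prompt": "Summarize: The Supplier shall deliver the goods within 30 days of the order date.",
        "completion": "Delivery: Supplier must deliver within 30 days of the order.",
    },
    {
        "prompt": "Summarize: Either party may terminate this agreement with 60 days written notice.",
        "completion": "Termination: Either party, with 60 days written notice.",
    },
]


def validate(records):
    """Return the indices of records with a missing or empty field."""
    bad = []
    for i, record in enumerate(records):
        prompt = record.get("prompt", "").strip()
        completion = record.get("completion", "").strip()
        if not prompt or not completion:
            bad.append(i)
    return bad


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


jsonl_text = to_jsonl(examples)
```

Validating before serializing is a cheap way to catch empty or malformed pairs, which tend to hurt a small curated dataset more than a large one.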
Inference
The runtime stage, in which users send prompts and receive outputs. Inference quality depends on model choice, prompt quality, and any retrieved context supplied with the prompt.
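As a rough illustration of how retrieval context reaches the model at inference time, the sketch below ranks a few passages by word overlap with the query and places the best ones in the prompt. The scoring function, the prompt template, and the sample passages are all assumptions made for this example; production systems typically use embedding-based search instead of word overlap.

```python
def score(query, passage):
    """Count the lowercase whitespace-separated words shared by both texts."""
    return len(set(query.lower().split()) & set(passage.lower().split()))


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that places retrieved context before the question."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical knowledge-base passages.
passages = [
    "The termination clause requires 60 days written notice.",
    "Payment is due within 30 days of invoice.",
    "The office is closed on public holidays.",
]
query = "How many days notice does termination require?"
prompt = build_prompt(query, passages, k=2)
```

The resulting `prompt` string is what would be sent to the model, so both the wording of the template and the quality of the retrieved passages directly shape the output.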
In real products
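Many teams start with a base model plus prompt engineering and RAG (retrieval-augmented generation).
Fine-tuning is added later if prompting and retrieval alone do not give enough control over the model's behavior.
Inference optimization focuses on reducing latency and cost.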
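The latency and cost trade-off can be estimated with simple arithmetic before any optimization work begins. The sketch below assumes per-token pricing with separate input and output rates, and a latency model of time-to-first-token plus steady generation speed. All numbers and parameter names are hypothetical placeholders, not real provider figures.

```python
def estimate_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost of one request, given prices per 1,000 input and output tokens."""
    return (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )


def estimate_latency(output_tokens, time_to_first_token_s, tokens_per_second):
    """Seconds until the full response arrives, assuming steady generation."""
    return time_to_first_token_s + output_tokens / tokens_per_second


# Hypothetical request: 2,000 prompt tokens (including retrieved context)
# and a 500-token answer, with made-up prices and speeds.
cost = estimate_cost(2000, 500, price_in_per_1k=0.5, price_out_per_1k=1.5)
latency = estimate_latency(500, time_to_first_token_s=0.5, tokens_per_second=50)
```

Under this model, trimming retrieved context lowers input cost, while shortening the answer lowers both output cost and total latency.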