Models process text as tokens, not full words. A token can be a whole word, part of a word, a punctuation mark, or a symbol.
Why it matters: usage limits and pricing are often token-based.
The context window is the maximum number of tokens the model can consider in one request. Everything counts toward it: the prompt, the conversation history, any retrieved documents, and the generated output.
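To make the budget arithmetic concrete, here is a minimal sketch that checks whether a request fits in the context window. The function name `fits_context` and the example numbers (including the 8192-token window) are hypothetical, chosen only for illustration; real limits depend on the model you use.

```python
def fits_context(prompt_tokens, history_tokens, retrieved_tokens,
                 max_output_tokens, context_window):
    """Return (fits, remaining) for a request against a context window.

    All arguments are token counts. The output budget is included because
    generated tokens share the same window as the input.
    """
    used = prompt_tokens + history_tokens + retrieved_tokens + max_output_tokens
    remaining = context_window - used
    return remaining >= 0, remaining


# Hypothetical request against a hypothetical 8192-token window.
fits, remaining = fits_context(
    prompt_tokens=1200,
    history_tokens=2500,
    retrieved_tokens=3000,
    max_output_tokens=1000,
    context_window=8192,
)
print(fits, remaining)
```

When the result is negative, the usual options are to trim history, retrieve fewer documents, or lower the output budget.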
Temperature controls randomness in token selection. Lower values make the output more predictable; higher values make it more varied.
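One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before converting them to probabilities. The sketch below assumes that formulation; the function name `softmax_with_temperature` and the example logits are made up for illustration, and a given provider may implement the setting differently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature.

    Assumes temperature > 0. Values below 1 sharpen the distribution;
    values above 1 flatten it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 1.5)
print([round(x, 3) for x in low])
print([round(x, 3) for x in high])
```

At the low temperature nearly all probability goes to the top-scoring token, while at the high temperature the three candidates end up much closer together.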
Top-p (nucleus sampling) limits which tokens are eligible: the model selects the next token from the smallest set of candidates whose cumulative probability is at least p.
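The selection rule for top-p can be sketched as follows: rank the candidates by probability, keep adding them until the running total reaches p, then sample only from that set. The function names and the toy probabilities are hypothetical; real models work over a vocabulary of many thousands of tokens.

```python
import random


def top_p_candidates(probs, p):
    """Return the smallest list of (token, prob) pairs whose
    cumulative probability is at least p, most likely first."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    return nucleus


def sample_top_p(probs, p, rng=random):
    """Sample one token from the top-p set, weighted by probability."""
    nucleus = top_p_candidates(probs, p)
    tokens = [token for token, _ in nucleus]
    weights = [prob for _, prob in nucleus]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token distribution (illustrative values only).
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "zebra": 0.05}
print([token for token, _ in top_p_candidates(probs, 0.9)])
print(sample_top_p(probs, 0.9))
```

With p = 0.9 the unlikely token "zebra" is excluded entirely, which is why a lower top-p tends to produce safer, more conventional output.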
| Task | Temperature | Top-p |
|---|---|---|
| Policy/compliance answers | Low | Low to medium |
| Brainstorming ideas | Medium to high | Medium to high |
| Code generation | Low to medium | Medium |