Tokens, Context Window, Temperature, Top-p

Tokens

Models process text as tokens rather than whole words. A token can be a word, part of a word, a punctuation mark, or a symbol.

Why it matters: usage limits and pricing are often token-based.

Context window

The context window is the maximum number of tokens the model can consider in one request. Everything counts toward it: the prompt, the conversation history, any retrieved documents, and the generated output.
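The budget arithmetic can be sketched as below. This is a minimal illustration, not any provider's API: the function name, its parameters, and the 8,000-token window are all hypothetical, and real token counts would come from the model's own tokenizer.

```python
def tokens_left_for_output(context_window, prompt, conversation, retrieved_docs):
    """Return how many tokens remain for the model's output.

    All arguments are token counts. The names are hypothetical and
    chosen only for this illustration.
    """
    used = prompt + conversation + retrieved_docs
    # Never report a negative budget; 0 means the input alone fills the window.
    return max(context_window - used, 0)


# Illustrative numbers only (assumed, not taken from any real model).
budget = tokens_left_for_output(
    context_window=8000, prompt=500, conversation=2500, retrieved_docs=4000
)
print(budget)  # 8000 - (500 + 2500 + 4000) = 1000
```

If the result is 0, something has to be trimmed (older conversation turns or retrieved documents) before the model has room to answer.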

Temperature

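Temperature controls randomness in token selection. Lower values concentrate probability on the most likely tokens, making output more deterministic; higher values flatten the distribution, making output more varied.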
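One common way temperature is applied is to divide the model's raw scores (logits) by the temperature before converting them to probabilities with a softmax. The sketch below assumes that formulation; the function name and the example logits are made up for illustration, and a real inference stack may handle edge cases (such as temperature 0) differently.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature.

    Hypothetical helper for illustration. Assumes temperature > 0.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running this shows the top token's probability shrinking as the temperature rises, which is why higher settings produce more varied output.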

Top-p (nucleus sampling)

The model samples the next token from the smallest set of most likely tokens whose cumulative probability is at least p.
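The selection rule can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names and the toy probability table are hypothetical, and production implementations operate on full vocabulary tensors rather than small dictionaries.

```python
import random


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability >= p.

    probs maps token -> probability. Returns the kept tokens with their
    probabilities renormalized to sum to 1. Hypothetical helper.
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


def sample_top_p(probs, p, rng=random):
    """Sample one token from the nucleus selected by top_p_filter."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy distribution over four candidate tokens (made-up values).
toy = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(sorted(top_p_filter(toy, 0.75)))  # ['a', 'the']
print(sample_top_p(toy, 0.75))          # either 'the' or 'a'
```

With p = 0.75, the two most likely tokens already cover enough probability, so the unlikely tail ("cat", "zebra") can never be sampled. Raising p lets more of the tail in.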

Practical settings

Task                        Temperature       Top-p
Policy/compliance answers   Low               Low to medium
Brainstorming ideas         Medium to high    Medium to high
Code generation             Low to medium     Medium