Tokens, Context Window, Temperature, Top-p

Tokens

Models process text as tokens, not whole words. A token can be a word, part of a word, a punctuation mark, or a symbol.

Why it matters: usage limits and pricing are often token-based.
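To make the idea of word pieces concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary and the function name toy_tokenize are made up for illustration; real tokenizers learn a much larger vocabulary from data and may split text differently.

```python
# Hypothetical toy vocabulary, for illustration only.
VOCAB = {"token", "ization", "un", "believ", "able", "is", "!", " "}

def toy_tokenize(text, vocab=VOCAB):
    """Split text into tokens by always taking the longest vocabulary match."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry matches: fall back to a single character.
            tokens.append(text[i])
            i += 1
    return tokens

tokens = toy_tokenize("tokenization is unbelievable!")
print(tokens)
print(len(tokens), "tokens")
```

Three words and one punctuation mark become nine tokens here, which is why token counts, not word counts, are what limits and pricing are measured in.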

Context Window

The context window is the maximum number of tokens the model can consider in one request. Everything counts toward it: the prompt, the conversation history, any retrieved documents, and the generated output.
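The budget described above is plain addition, so it can be checked before sending a request. This is a sketch only: the function name fits_context and the 8192-token window are assumptions for illustration, not values from any particular model.

```python
def fits_context(prompt_tokens, conversation_tokens, retrieved_tokens,
                 max_output_tokens, context_window):
    """Return (fits, remaining) for a request against a context window."""
    used = (prompt_tokens + conversation_tokens
            + retrieved_tokens + max_output_tokens)
    return used <= context_window, context_window - used

# Assumed window size of 8192 tokens, chosen only for the example.
print(fits_context(500, 3000, 4000, 1000, context_window=8192))
print(fits_context(500, 2000, 4000, 1000, context_window=8192))
```

When the first value is False, something has to shrink: older conversation turns, the retrieved documents, or the space reserved for output.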

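Temperature

Temperature controls randomness in token selection. Lower values make the output more predictable; higher values make it more varied.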
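A common way to apply temperature is to divide the model's raw scores (logits) by it before converting them to probabilities with softmax. The sketch below assumes that formulation; the logit values and the function name are invented for the example.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, probability concentrates on the top-scoring token; as it rises, the distribution flattens and less likely tokens get picked more often.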

Top-p

With top-p (also called nucleus sampling), the model samples the next token from the smallest set of most likely tokens whose cumulative probability is at least p.
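The selection rule above can be sketched in a few lines: sort tokens by probability, keep adding them until the running total reaches p, then renormalize and sample from what is left. The token probabilities and function names here are invented for illustration.

```python
import random

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability >= p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    # Renormalize so the kept probabilities sum to 1.
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

def sample_top_p(probs, p, rng=random):
    """Sample one token from the top-p set."""
    nucleus = top_p_filter(probs, p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]

# Made-up next-token distribution.
probs = {"the": 0.5, "a": 0.3, "cat": 0.15, "zebra": 0.05}
print(top_p_filter(probs, 0.75))
print(sample_top_p(probs, 0.75))
```

With p = 0.75 only "the" and "a" survive, so the unlikely "zebra" can never be chosen. Raising p toward 1 lets more of the tail back in.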