Decoding Parameters and Output Control

Core controls

| Parameter                    | What it controls                     | Typical usage                      |
|------------------------------|--------------------------------------|------------------------------------|
| temperature                  | randomness of sampling               | 0.2-0.4 factual, 0.8-1.2 creative  |
| top_k                        | sample from top-k candidates         | conservative diversity             |
| top_p                        | sample from nucleus probability mass | adaptive diversity                 |
| repetition/frequency penalty | reduce repeated tokens               | long outputs, summaries            |
| max_tokens                   | hard output length cap               | cost and latency control           |
| stop sequences               | forced termination string            | structured response boundaries     |
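
To make the first three rows concrete, here is a minimal sketch of how temperature, top_k, and top_p might act on a single vector of logits. It uses only the Python standard library. The function names (softmax, filter_candidates, sample_token) are hypothetical, and the order of operations (temperature, then top-k, then top-p over the original probabilities) is an assumption; real inference libraries may order or renormalize these steps differently.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_candidates(probs, top_k=None, top_p=None):
    """Return renormalized {token_index: prob} after top-k, then top-p."""
    # Rank token indices from most to least probable.
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    if top_k is not None:
        # top_k: keep a fixed number of the most probable tokens.
        ranked = ranked[:top_k]
    if top_p is not None:
        # top_p: keep the smallest prefix whose cumulative mass reaches top_p.
        kept, mass = [], 0.0
        for index, p in ranked:
            kept.append((index, p))
            mass += p
            if mass >= top_p:
                break
        ranked = kept
    total = sum(p for _, p in ranked)
    return {index: p / total for index, p in ranked}


def sample_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample one token index from the filtered distribution."""
    probs = softmax(logits, temperature)
    candidates = filter_candidates(probs, top_k=top_k, top_p=top_p)
    indices = list(candidates)
    weights = [candidates[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]
```

In this sketch, top_k always keeps the same number of candidates, while top_p keeps more candidates when the distribution is flat and fewer when it is peaked, which is the sense in which the table calls it adaptive.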

Practical presets

Important tuning note

Avoid extreme values on both top_k and top_p at the same time. Tune one primary sampling control first, then temperature.
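
One way to apply this note is a small configuration check, sketched below. The helper name check_sampling_config and the "moderate" ranges for top_k and top_p are illustrative assumptions chosen for the example, not recommended values; adjust them to the model and task at hand.

```python
def is_extreme(value, low, high):
    """True when a value is set and falls outside the [low, high] range."""
    return value is not None and not (low <= value <= high)


def check_sampling_config(top_k=None, top_p=None,
                          k_range=(5, 200), p_range=(0.5, 0.98)):
    """Return a list of warnings for a sampling configuration.

    k_range and p_range are hypothetical thresholds for what counts as
    a moderate setting; they are assumptions made for this sketch.
    """
    k_extreme = is_extreme(top_k, *k_range)
    p_extreme = is_extreme(top_p, *p_range)
    if k_extreme and p_extreme:
        return [
            f"top_k={top_k} and top_p={top_p} are both outside the assumed "
            "moderate ranges; tune one sampling control first, then temperature"
        ]
    return []
```

A configuration that leaves one of the two controls unset, or keeps at least one of them in the moderate range, produces no warning in this sketch.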