
Temperature

A knob that controls how random or deterministic the AI's output is · 0 = same answer every time, higher = more varied.

Level 1

Temperature scales the probability distribution over next tokens before sampling. At temperature 0, the model always picks the highest-probability token (deterministic, boring, consistent). At 1.0, sampling matches the model's learned distribution (balanced). At 2.0, the distribution flattens and output becomes erratic. Most production code uses temperatures between 0.0 and 0.7.
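The mechanics above can be sketched in a few lines of numpy. This is a toy sampler, not any vendor's actual decoder; the function name and example logits are invented for illustration.

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Scale logits by temperature, softmax, then sample one token id."""
    if temperature == 0:
        # Temperature 0 is conventionally implemented as greedy argmax,
        # since dividing by zero is undefined.
        return int(np.argmax(logits))
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled -= scaled.max()  # subtract max for numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.5, -1.0]
print(sample_with_temperature(logits, 0, rng))    # always token 0
print(sample_with_temperature(logits, 1.0, rng))  # varies run to run
```

Running the T=0 call repeatedly always returns token 0; at higher temperatures, lower-probability tokens start appearing.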

Level 2

Math: logits are divided by temperature before softmax. Lower temperature sharpens the distribution; higher temperature flattens it. Temperature 0 isn't technically in the formula (it would mean dividing by zero) · it's implemented as argmax (pick the highest logit). Common uses: 0 for structured extraction and factual Q&A; 0.3-0.7 for balanced chat; 0.8-1.2 for creative writing. Temperature and top-p are often used together but shouldn't both be tuned aggressively · pick one.
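The sharpening/flattening effect is easy to see numerically. A minimal sketch, assuming three-token logits of [3, 2, 1] (made up for illustration):

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Divide logits by T before softmax: lower T sharpens, higher T flattens."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [3.0, 2.0, 1.0]
for t in (0.5, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# top-token probability: ~0.87 at T=0.5, ~0.66 at T=1.0, ~0.51 at T=2.0
```

The same logits yield a near-greedy distribution at T=0.5 and a near-uniform one as T grows, which is exactly why high temperatures read as erratic.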

Level 3

Temperature = 0 breaks logit ties by token order in most implementations, which is why some "deterministic" calls still vary across infrastructure. True determinism requires a seed parameter (OpenAI supports one; Anthropic doesn't). Temperature affects reasoning quality non-linearly · too low produces brittle reasoning, too high introduces errors. For reasoning models, temperature is usually set in the 0-0.3 range for correctness.
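The tie-breaking point can be seen in a toy numpy example (illustrative only · real inference stacks differ, which is the point):

```python
import numpy as np

# Two tokens share the exact maximum logit. np.argmax returns the FIRST
# max index, i.e. ties are broken by token order. A runtime that scans
# differently, or whose floating-point math produces a slightly different
# logit, can pick the other token · one way "temperature 0" still varies.
logits = np.array([1.5, 2.0, 2.0, 0.3])
print(np.argmax(logits))  # 1, not 2
```

In practice exact ties are rare; the more common culprit is tiny floating-point differences across hardware flipping which logit is largest.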

The takeaway for you
If you are a
Researcher
  • logit_i / T before softmax
  • True determinism needs a seed · temperature 0 alone isn't enough
  • Reasoning quality peaks at 0.0-0.3
If you are a
Builder
  • 0 for structured extraction and factual tasks
  • 0.7 for chat · natural variation without chaos
  • Don't combine aggressive temperature + aggressive top-p
If you are a
Investor
  • Temperature is an API primitive · universal across vendors
  • Reasoning model temperature defaults drive reliability perception
  • Most API calls use defaults · pricing matters more to most users
If you are a
Curious · Normie
  • How creative vs robotic the AI sounds
  • Low = same answer every time, high = wild answers
  • Most apps use a low-to-medium setting
Gecko's take

Temperature is over-tuned in most production code. Use 0 for structured, 0.7 for chat, and move on.

Rules of thumb: 0.0-0.2 for factual/structured, 0.5-0.7 for chat, 0.8-1.0 for creative. Above 1.2 usually gets weird.