Chain of Thought
Prompting the model to show its reasoning steps before the final answer dramatically improves math, logic, and multi-step tasks.
Basic
Chain-of-thought (CoT) prompting was introduced in a 2022 Google paper, which showed that providing worked, step-by-step examples lets models solve problems they previously failed; a follow-up 2022 paper showed the zero-shot variant, where simply adding "Let's think step by step" has a similar effect. Modern models trained on CoT data produce step-by-step reasoning by default. Reasoning models (OpenAI o1, Claude with Extended Thinking, DeepSeek R1) extend this further by generating internal CoT traces that are hidden from the user but billed as output tokens.
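The zero-shot trigger is nothing more than string concatenation. A minimal sketch, with `build_cot_prompt` as an illustrative helper (not a real library function) and the actual model call left out:

```python
# Zero-shot CoT: append the trigger phrase to any question before
# sending it to a chat/completions API of your choice.
COT_TRIGGER = "Let's think step by step."

def build_cot_prompt(question: str) -> str:
    """Wrap a question with the zero-shot chain-of-thought trigger."""
    return f"{question}\n\n{COT_TRIGGER}"

prompt = build_cot_prompt(
    "A bat and a ball cost $1.10 together. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)
```

The resulting `prompt` is what you pass to the model; everything after this point is ordinary generation.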
Deep
CoT exploits the transformer's autoregressive generation: each token conditions on all previous tokens, so by generating intermediate reasoning the model spends more compute before committing to a final answer and can catch errors in earlier steps. Zero-shot CoT ("Let's think step by step") was the original trigger phrase; few-shot CoT instead provides worked examples in the prompt. Modern models fine-tuned on CoT-rich data produce CoT without any trigger. Self-consistency improves CoT by sampling multiple reasoning paths and taking a majority vote on the final answer. Tree-of-Thought extends CoT to branching exploration. For tasks with verifiable answers, sample-and-verify schemes dramatically outperform single-shot CoT.
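Self-consistency reduces to "sample k traces, extract each final answer, majority-vote." A sketch, with `sample_model` as a hypothetical stand-in for any temperature-sampled completion call:

```python
# Self-consistency: sample k independent reasoning traces and return
# the most common final answer.
from collections import Counter

def self_consistency(sample_model, prompt: str, k: int = 5) -> str:
    answers = [sample_model(prompt) for _ in range(k)]
    # Majority vote over final answers; ties break toward the answer
    # seen first (Counter preserves insertion order).
    return Counter(answers).most_common(1)[0][0]

# Demo with a fake sampler whose traces occasionally disagree:
fake_samples = iter(["42", "42", "17", "42", "42"])
result = self_consistency(lambda p: next(fake_samples), "Q?", k=5)
# result == "42"
```

In practice the answer-extraction step (parsing the final number or choice out of a free-form trace) is where most of the engineering effort goes.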
Expert
CoT's mechanism: each generated step token adds roughly one forward pass of compute, so an n-step reasoning trace spends on the order of n times the FLOPs of a direct-answer pass. This matches theoretical results showing that transformers need either depth or sequence length to solve certain problem classes. Self-consistency samples k traces and picks the most common final answer; it works well when the reasoning distribution is unimodal around the correct answer. Tree-of-Thought (2023) searches a tree of partial reasoning states with an explicit value function: more expensive, but it handles multi-modal reasoning distributions. Chain-of-Verification and Reflexion add self-critique loops, with diminishing returns past 3-5 iterations.
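The Tree-of-Thought search loop can be sketched as a beam search over partial reasoning states. Here `expand` and `value` are hypothetical stand-ins; in a real system both would call the model (to propose continuations and to score states), and the demo below substitutes a toy integer search so the control flow is runnable:

```python
# Toy Tree-of-Thought: breadth-first expansion of partial states,
# keeping the top-b states per depth under an explicit value function.
def tree_of_thought(root, expand, value, beam=2, depth=3):
    frontier = [root]
    for _ in range(depth):
        children = [c for state in frontier for c in expand(state)]
        if not children:
            break
        # Keep the `beam` highest-value partial states.
        frontier = sorted(children, key=value, reverse=True)[:beam]
    return max(frontier, key=value)

# Demo: states are integers expanding to 2n and 2n+1; value rewards
# closeness to a target, so the search steers toward it.
target = 13
best = tree_of_thought(
    1,
    expand=lambda n: [2 * n, 2 * n + 1],
    value=lambda n: -abs(n - target),
)
# best == 13
```

The branching factor and beam width are where ToT's extra cost comes from: every kept state at every depth is another batch of model calls.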
CoT-native training (DeepSeek R1's RL recipe) made CoT a core model capability rather than a prompt trick. Every frontier model now does CoT internally.
Depending on why you're here
- Generated intermediate steps give the model more compute per final-answer token
- Zero-shot trigger: "Let's think step by step"
- Self-consistency and Tree-of-Thought are the main extensions
- For math, logic, and multi-step tasks: include a CoT trigger or use a reasoning model
- For chat and short answers: skip CoT; it costs tokens without upside
- Hide or show CoT depending on UX; most users want the final answer only
- CoT-native training is the core recipe behind the reasoning-model boom
- Reasoning models charge for hidden CoT tokens, a new revenue stream
- CoT effectiveness plateaus around ~10K tokens, limiting the upside of pure test-time compute
- Asking the AI to "show its work" before answering
- Makes AI far better at math and logic puzzles
- The reason reasoning models like o1 work
CoT was a hack in 2022 and a core capability by 2025. The next benchmarks will measure CoT efficiency, not CoT presence.
Read the primary sources
- Chain-of-Thought paper (Google, 2022), arxiv.org
- Self-Consistency paper (2022), arxiv.org
- Tree-of-Thought paper (2023), arxiv.org