Chain-of-Thought
Chain-of-Thought (CoT) is a prompting technique that gets an LLM to generate a step-by-step reasoning process before its final answer. Formalized by Wei et al. at Google Research in 2022, it has become the standard technique for lifting LLM accuracy on complex reasoning tasks.
Why It Matters
Early LLMs struggled with arithmetic, logic, and multi-step reasoning. In the original Wei paper, PaLM 540B solved only 17.9% of problems on the GSM8K grade-school math benchmark with basic prompting — but 56.9% with Chain-of-Thought. Same model, same questions, roughly 3x better accuracy simply by giving the model "room to think." Since then, Claude, GPT, and Gemini have all internalized CoT as a core prompting pattern.
How It Works
CoT's core idea is making the LLM write out its reasoning first, then state the conclusion — instead of jumping straight to an answer. Because transformers condition each token on prior tokens, emitting intermediate reasoning puts that content into context and raises the final answer's quality. More "thinking tokens" give the model more "reasoning space."
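At the prompt level, the difference is just what you ask for before the answer. A minimal sketch of a direct prompt versus a zero-shot CoT prompt (the `build_prompt` helper and its formatting are illustrative assumptions, not a standard API):

```python
def build_prompt(question: str, cot: bool = False) -> str:
    """Build a direct prompt, or a zero-shot CoT prompt that asks the
    model to write out its reasoning before the final answer."""
    prompt = f"Q: {question}\nA:"
    if cot:
        # The zero-shot CoT trigger phrase: it makes the model emit
        # intermediate reasoning tokens, which then sit in context and
        # condition the final answer.
        prompt += " Let's think step by step."
    return prompt

question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
            "more than the ball. How much does the ball cost?")
print(build_prompt(question))            # direct: invites an immediate answer
print(build_prompt(question, cot=True))  # CoT: invites reasoning first
```

The only change between the two prompts is the trailing trigger sentence; everything else about the request is identical, which is what makes the accuracy gains from CoT so striking.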
Main Variants
Zero-Shot CoT: Add a single line like "Let's think step by step," with no examples. Proposed by Kojima et al. in 2022, it's the simplest and surprisingly effective form.
Few-Shot CoT: Include 2–3 example problems with their step-by-step reasoning in the prompt so the model imitates the structure.
Self-Consistency: Sample multiple CoT answers for the same question and pick the most common final conclusion — a "vote" over reasoning paths, more accurate than a single CoT.
Tree of Thoughts (ToT): Explore reasoning as a tree instead of a line, expanding only high-scoring branches. Good for complex planning and puzzles.
ReAct: Reasoning + Acting. Combines CoT with tool calls in a "think → act → observe → think again" loop. The standard prompting pattern for AI agents.
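Of these variants, self-consistency is easy to sketch without any model calls: sample several CoT completions at nonzero temperature, extract each one's final answer, and take the majority. A minimal sketch, assuming each completion ends with an "Answer:" line (a formatting convention imposed here for illustration, not part of the method itself):

```python
from collections import Counter

def self_consistency_vote(samples: list[str]) -> str:
    """Pick the most common final answer among sampled CoT completions."""
    finals = []
    for text in samples:
        # Scan from the end for the final-answer line of this chain.
        for line in reversed(text.splitlines()):
            if line.startswith("Answer:"):
                finals.append(line.removeprefix("Answer:").strip())
                break
    # Majority vote over reasoning paths: different chains that reach
    # the same conclusion reinforce each other.
    return Counter(finals).most_common(1)[0][0]

samples = [
    "Let the ball cost x; the bat costs x + 1.00 ...\nAnswer: $0.05",
    "If the ball were $0.10 the total would be $1.20, too high ...\nAnswer: $0.05",
    "Subtracting: 1.10 - 1.00 = 0.10 ...\nAnswer: $0.10",
]
print(self_consistency_vote(samples))  # → $0.05
```

Note that the vote is over final answers only; the reasoning chains themselves may disagree step by step, which is exactly why sampling multiple paths helps.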
When CoT Helps
CoT doesn't help equally on all tasks.
Very effective: Math, logic puzzles, multi-step reasoning, complex decision-making, code debugging.
Less effective: Simple classification, sentiment analysis, summarization, and translation — where the answer is already immediate and CoT mostly adds latency.
2026 trend: Frontier models now ship with built-in "reasoning modes" (OpenAI o1, Claude Opus extended thinking) that run CoT automatically, so users no longer need to write CoT prompts manually. Prompt-engineering effort is shifting toward other quality levers instead.
GEO Implications
CoT isn't a technique content writers apply directly, but it shapes what content LLMs find easiest to cite. If a blog post walks through complex concepts with explicit step-by-step logic, LLMs have an easier time using that section as grounding for their own reasoning. Explanations that unpack "why this follows" beat single-line conclusions when AI search picks what to quote.