LLM Reasoning Playbook

Build Your Recipe

A guided, axis-by-axis workflow that turns your problem into a per-axis framework recipe — with an interactive builder, skip/nest rules, and a copyable card.

This page turns "here's my problem" into "here's my recipe" — a short, ordered plan that names one framework per axis, skips the axes you don't need, and stacks what's left in the right order. Answer a few questions in the builder and copy out a recipe card, or read the same steps below.

The mental model — it really is a recipe. Each of the seven axes is one kind of ingredient. A good recipe uses at most one ingredient from each kind, and leaves out the kinds the dish doesn't need. Order matters too: you plan and prep first (that shapes the whole dish), you cook the main ingredients in the pan (grounding runs inside the loop), and you taste and adjust at the end (checking wraps everything). Two frameworks share one axis only when one cooks inside the other — like a filling that bakes inside its pastry.

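Three moves people miss. These are the quiet steps that separate a real recipe from a pile of ingredients: grounding the problem before you pick anything (Step 0), skipping the axes the task doesn't need, and nesting two frameworks on one axis when one runs inside the other.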


Step 0 — Ground it first

Before you pick any framework, look at your actual problem and write down the rules a good answer must respect. This is the single highest-value step, and no framework does it for you.

It's like checking what's in your fridge before you start cooking — not halfway through.

Use a Step-Back prompt to pull the principles out:

## Task
{{YOUR_PROBLEM}}

## Input
<input>
{{A SAMPLE OF YOUR REAL DATA}}
</input>

## Step back
Before solving, list the governing principles this answer must respect — the domain
rules, the traps hiding in this specific data, and what would make a result
misleading. Do not solve yet.

## Output
Principles: a short, numbered list

Keep that list handy. It goes in the recipe card at the end, and it steers every pick you make below.
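If you'd rather fill the Step-Back template from code, here is a minimal sketch. It assumes you keep the template as a plain string and pass the result to whatever model client you already use; the function name `build_step_back_prompt` and the sample task and data are made up for illustration.

```python
# The Step 0 template from above, with the two placeholders as format fields.
STEP_BACK_TEMPLATE = """## Task
{task}

## Input
<input>
{sample}
</input>

## Step back
Before solving, list the governing principles this answer must respect: the domain
rules, the traps hiding in this specific data, and what would make a result
misleading. Do not solve yet.

## Output
Principles: a short, numbered list
"""


def build_step_back_prompt(task: str, sample: str) -> str:
    """Return the Step 0 prompt with the task and a real data sample filled in."""
    if not task.strip() or not sample.strip():
        raise ValueError("Step 0 needs both a task and a sample of real data")
    return STEP_BACK_TEMPLATE.format(task=task.strip(), sample=sample.strip())


# Hypothetical task and rows, only to show the shape of the call.
prompt = build_step_back_prompt(
    task="Chart how sleep and mood influence each other.",
    sample="date,sleep_hours,mood\n2024-01-01,7.5,4\n2024-01-02,6.0,2",
)
print(prompt)
```

Refusing an empty sample is deliberate: the point of Step 0 is that the principles come from the real data, not from the task description alone.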


Build it, axis by axis

Answer the questions for each axis, then pick one framework or mark the axis as skipped. Work top to bottom: the axes are listed in the order you'll assemble them (E, D, A, C, F, B, G), not alphabetically.

Axis reference — pick one per axis, or skip

Work top to bottom (build order). The builder above uses exactly this logic.

E Abstraction

Does a general rule, law, or set of domain principles decide what a correct answer looks like?

Like remembering the formula before you plug in the numbers.

| If you need… | Use |
|--------------|-----|
| Yes: name the principle first | Step-Back E1 |
| I lack good examples; let it recall its own | Analogical E2 |

Skip when there's no useful higher rule, or the fine details are the whole point. Two on one axis? Rarely stacked — Step-Back simply goes first.

D Decomposition

Does the task split into parts or stages?

Like a recipe with numbered steps — chop, then sauté, then simmer.

| If you need… | Use |
|--------------|-----|
| Ordered parts, each builds on the last | Least-to-Most D1 |
| Needs a plan before doing | Plan-and-Solve D2 |
| Several lookups chained together | Self-Ask D3 |

Skip when it's really one step. Two on one axis? Pick one — decomposition styles don't stack.

A Topology

What shape does the thinking take?

Like picking your way through a maze — walk straight, scout the forks and back out, or sketch the whole map first.

| If you need… | Use |
|--------------|-----|
| One straight line | Chain of Thought A1 |
| Long / messy input to read | Thread of Thought A2 |
| Explore options, back out of dead ends | Tree of Thoughts A3 |
| Solve parts, then merge them | Graph of Thoughts A4 |
| A list of points, want it fast | Skeleton-of-Thought A5 |

Never skip this axis: some topology always applies (default to Chain of Thought). Two on one axis? Pick one shape.

C Grounding

Does the answer depend on something outside the model?

Like reaching for a calculator or making a phone call — you have to step outside your own head.

| If you need… | Use |
|--------------|-----|
| Exact math / crunching real numbers or tables | PAL / PoT C2 |
| Live info, search, tools, or actions | ReAct C1 |
| Fact-check claims it generated | Chain-of-Verification C3 |

Skip when nothing needs looking up or exact computation. Two on one axis? The common two-on-one case: run PAL (code) INSIDE a ReAct loop, or wrap either in CoVe to fact-check.

F Self-Eval

Can you actually check the output — and do you get more than one try?

Like proofreading before you hand in the test — only worth it if you can still fix your answer.

| If you need… | Use |
|--------------|-----|
| One pass, and a real check exists (tests, tool, lookup) | Self-Refine F1 |
| A retry loop with a clear pass/fail each time | Reflexion F2 |

Skip when there's no real checker — unaided self-critique can make objective answers worse. Two on one axis? Pick one.
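To make "a retry loop with a clear pass/fail each time" concrete, here is a simplified sketch of the loop shape. It is not the full Reflexion method, just the control flow around a real checker. `generate` and `check` are hypothetical stand-ins: in practice `generate` would call your model and `check` would run tests, a tool, or a lookup.

```python
from typing import Callable, Optional


def refine_with_checker(
    generate: Callable[[Optional[str]], str],
    check: Callable[[str], Optional[str]],
    max_tries: int = 3,
) -> tuple[str, bool]:
    """Retry until `check` passes or the tries run out.

    generate(feedback) returns a draft; feedback is None on the first try.
    check(draft) returns None on a pass, or a failure message to feed back.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_tries):
        draft = generate(feedback)
        feedback = check(draft)
        if feedback is None:
            return draft, True  # the external check passed
    return draft, False  # out of tries; the last draft is unverified


# Stubs standing in for a model call and a real checker.
def stub_generate(feedback: Optional[str]) -> str:
    return "5" if feedback is None else "4"


def stub_check(draft: str) -> Optional[str]:
    return None if int(draft) == 2 + 2 else f"{draft} is not 2 + 2"


answer, passed = refine_with_checker(stub_generate, stub_check)
print(answer, passed)  # 4 True
```

The loop only ever trusts `check`, never the model's opinion of its own draft, which is the reason to skip this axis when no real checker exists.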

B Sampling

Is there one discrete answer you could vote on, and do you need extra reliability?

Like asking several friends the same question and going with the answer most of them give.

| If you need… | Use |
|--------------|-----|
| Discrete answer, want it steady | Self-Consistency B1 |
| Free-form answers, want robustness | Universal Self-Consistency B2 |

Skip when there's no single answer to vote on (a build or open-ended task), or cost is tight. Two on one axis? Sampling wraps the whole recipe.
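The "ask several friends" idea comes down to a majority vote over sampled answers. A minimal sketch, assuming you have already collected the final answers from several runs as strings; the normalization step (trim and lowercase) is an assumption of this example, and real tasks may need something stricter.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the share of samples that agree with it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    # Light normalization so "42" and "42 " count as the same vote.
    normalized = [a.strip().lower() for a in answers]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)


# Hypothetical final answers from five sampled runs.
samples = ["42", "42 ", "41", "42", "40"]
answer, agreement = majority_vote(samples)
print(answer, agreement)  # 42 0.6
```

A low agreement share is worth surfacing rather than hiding: it tells you the vote was close. On an exact tie, `most_common` returns whichever answer appeared first.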

G Steering

Do you need to nudge focus, tone, or domain — without dictating the answer?

Like a coach shouting one keyword from the sideline — a nudge, not a script.

| If you need… | Use |
|--------------|-----|
| Yes, cheaply | Directional Stimulus (light) G1 |

Skip when no nudge is needed, or you're exploring and don't want to anchor early. Two on one axis? A cue injected into another template.
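The per-axis rules above can be written down as a small lookup plus a validator. This is a sketch under stated assumptions, not the builder's actual code: the names `BUILD_ORDER`, `AXIS_OPTIONS`, and `validate_picks` are invented here, and the sketch allows two picks only on Grounding (C), which is stricter than the text for Abstraction, where stacking is rare but not ruled out.

```python
# Build order from the axis reference: E, D, A, C, F, B, G.
BUILD_ORDER = ["E", "D", "A", "C", "F", "B", "G"]

# Frameworks offered on each axis, as listed in the tables above.
AXIS_OPTIONS = {
    "E": {"Step-Back", "Analogical"},
    "D": {"Least-to-Most", "Plan-and-Solve", "Self-Ask"},
    "A": {"Chain of Thought", "Thread of Thought", "Tree of Thoughts",
          "Graph of Thoughts", "Skeleton-of-Thought"},
    "C": {"PAL / PoT", "ReAct", "Chain-of-Verification"},
    "F": {"Self-Refine", "Reflexion"},
    "B": {"Self-Consistency", "Universal Self-Consistency"},
    "G": {"Directional Stimulus"},
}

# Assumption of this sketch: only Grounding may hold two nested picks.
NESTABLE_AXES = {"C"}


def validate_picks(picks: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Check picks against the axis rules and return them in build order.

    A missing axis or an empty list means "skip".
    """
    unknown_axes = set(picks) - set(BUILD_ORDER)
    if unknown_axes:
        raise ValueError(f"unknown axes: {sorted(unknown_axes)}")
    ordered = []
    for axis in BUILD_ORDER:
        chosen = list(picks.get(axis, []))
        unknown = set(chosen) - AXIS_OPTIONS[axis]
        if unknown:
            raise ValueError(f"{sorted(unknown)} not offered on axis {axis}")
        if axis == "A" and not chosen:
            chosen = ["Chain of Thought"]  # topology is never skipped
        if len(chosen) > 1 and axis not in NESTABLE_AXES:
            raise ValueError(f"axis {axis} takes one pick, got {chosen}")
        if chosen:
            ordered.append((axis, chosen))
    return ordered


recipe = validate_picks({
    "E": ["Step-Back"],
    "D": ["Plan-and-Solve"],
    "A": ["Skeleton-of-Thought"],
    "C": ["ReAct", "PAL / PoT"],  # the two-on-one-axis case
    "F": ["Self-Refine"],
    "G": ["Directional Stimulus"],
})
for axis, chosen in recipe:
    print(axis, " + ".join(chosen))
```

Sampling (B) is simply left out of the input, so it drops out of the result, which mirrors marking an axis as skipped.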


Assemble outside-in

Once you have your picks, stack them in this order. Earlier layers shape the prompt; later layers wrap it or run inside it.

Think of it like getting dressed: base layers first, coat and scarf last. You can't put the coat on before the shirt.

flowchart TB
    S0["① Ground it first: Step-Back principles (E)"] --> S1["② Shape the prompt: decompose (D) → topology (A)"]
    S1 --> S2["③ Ground in the loops: PAL / ReAct / CoVe (C)"]
    S2 --> S3["④ Wrap with checking: Self-Refine (F) · Self-Consistency (B)"]
    S3 --> S4["⑤ Steer lightly, if needed: Directional Stimulus (G)"]

Drop the layers you skipped. A reusable assembly template — Markdown scaffolding with XML only around the injected data:

## Principles (from Step 0)
<principles>
{{PRINCIPLES}}
</principles>

## Plan (D)
{{DECOMPOSITION_INSTRUCTION}}

## Reason (A)
{{TOPOLOGY_INSTRUCTION}}

## Ground (C) — runs inside the loops
{{GROUNDING_INSTRUCTION}}   // e.g. write code (PAL) inside each tool step (ReAct)

## Check (F) / Vote (B) — wraps the whole thing
{{SELFCHECK_OR_SAMPLING_INSTRUCTION}}

## Steer (G) — optional
Focus on: {{STIMULUS}}
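"Drop the layers you skipped" is easy to automate. A minimal sketch, assuming each layer's instruction arrives as a string keyed by a short name; the keys (`principles`, `plan`, `reason`, `ground`, `check`, `steer`) and the function `assemble_prompt` are invented for this example.

```python
# (key, heading, body pattern) for each layer, in assembly order.
SECTIONS = [
    ("principles", "## Principles (from Step 0)", "<principles>\n{}\n</principles>"),
    ("plan", "## Plan (D)", "{}"),
    ("reason", "## Reason (A)", "{}"),
    ("ground", "## Ground (C): runs inside the loops", "{}"),
    ("check", "## Check (F) / Vote (B): wraps the whole thing", "{}"),
    ("steer", "## Steer (G): optional", "Focus on: {}"),
]


def assemble_prompt(layers: dict[str, str]) -> str:
    """Build the prompt in assembly order, dropping any layer left empty."""
    parts = []
    for key, heading, body in SECTIONS:
        text = layers.get(key, "").strip()
        if not text:
            continue  # skipped axis: omit the whole section, heading included
        parts.append(f"{heading}\n{body.format(text)}")
    return "\n\n".join(parts)


# Hypothetical instructions; Ground and Check are skipped here.
prompt = assemble_prompt({
    "principles": "1. The sample is small, so treat every link as a maybe.",
    "plan": "Write the full plan before solving.",
    "reason": "Think step by step.",
    "steer": "leading indicator",
})
print(prompt)
```

Only the principles are wrapped in XML tags, matching the template's rule of keeping XML around injected data and Markdown for the scaffolding.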

The recipe card

Fill one of these per task. It's the whole recipe on a card: what you picked, why, what you skipped, the principles, and the assembly line. (The builder above fills it in for you.)

# Recipe: {{TASK}}

## Domain principles (Step 0)
1. ...

## Picks — one per axis; skip freely
| Axis            | Pick | Why | Skipped? |
|-----------------|------|-----|----------|
| E Abstraction   |      |     |          |
| D Decomposition |      |     |          |
| A Topology      |      |     |          |
| C Grounding     |      |     |          |
| F Self-Eval     |      |     |          |
| B Sampling      |      |     |          |
| G Steering      |      |     |          |

## Assembly (outside-in)
{{pick → pick → [nested] → wrap}}
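If you fill in many cards, generating them is less error-prone than copying the table by hand. A rough sketch, assuming picks are stored as a mapping from axis name to a `(pick, why)` pair, with `None` as the pick for a skipped axis; `render_card` and that data shape are assumptions of this example, not the builder's format.

```python
from typing import Optional

AXES = ["E Abstraction", "D Decomposition", "A Topology", "C Grounding",
        "F Self-Eval", "B Sampling", "G Steering"]


def render_card(
    task: str,
    principles: list[str],
    picks: dict[str, tuple[Optional[str], str]],
    assembly: str,
) -> str:
    """Render a recipe card; axes missing from `picks` are marked as skipped."""
    lines = [f"# Recipe: {task}", "", "## Domain principles (Step 0)"]
    lines += [f"{i}. {p}" for i, p in enumerate(principles, start=1)]
    lines += ["", "## Picks", "| Axis | Pick | Why | Skipped? |",
              "|------|------|-----|----------|"]
    for axis in AXES:
        pick, why = picks.get(axis, (None, ""))
        skipped = "" if pick else "skip"
        lines.append(f"| {axis} | {pick or ''} | {why} | {skipped} |")
    lines += ["", "## Assembly (outside-in)", assembly]
    return "\n".join(lines)


# Hypothetical two-pick recipe, just to show the output shape.
card = render_card(
    task="demo",
    principles=["Small sample"],
    picks={
        "A Topology": ("Chain of Thought", "one line"),
        "B Sampling": (None, "nothing to vote on"),
    },
    assembly="Chain of Thought",
)
print(card)
```

Keeping the reason for a skipped axis in the "Why" column records the decision, so a later reader can tell a deliberate skip from an oversight.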

Worked in full — the health-tracker dashboard

Here's the whole procedure on one real problem. It's a build task, not a single-answer question — so it behaves differently from the Q&A stacks in Reasoning-Framework-Worked-Examples: it skips an axis and nests two frameworks on one axis, which real work does all the time.

The problem. "Build a comprehensive set of charts in the Dashboard sheet of my health tracker that show how tracked items — drinks, sleep, mood, stress — influence each other."

Step 0 — principles (grounded in the real sheet). Before choosing anything, look at the data and write the rules a good answer must respect:

  1. The sample is small, so treat every link (correlation) you find as a maybe, not a fact.
  2. Two things can drift the same way over time without one causing the other (a shared-time-trend confound) — think ice-cream sales and sunburns, both just rising in summer.
  3. Compare a day to the day before, not to itself (use a lag) — last night's sleep affects today's mood.
  4. Rank the days rather than trusting the exact numbers (rank correlation), and always show the dot plot (the scatter), not just a single score.
  5. The 1–5 mood and stress ratings are rankings, not real amounts (an ordinal scale) — a 4 isn't "twice" a 2, so don't do math on them as if it were.

That list is the part a pile of frameworks won't give you. It came from looking at the sheet, not from any technique.
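Principles 3 and 4 translate directly into a calculation: pair each day's value with the next day's outcome, then correlate the ranks instead of the raw numbers. Below is a standard-library sketch of that idea. The function names and the numbers are invented for illustration (they are not from the real sheet), and with a sample this small the result is a maybe, per principle 1.

```python
from statistics import mean


def ranks(values: list[float]) -> list[float]:
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x: list[float], y: list[float]) -> float:
    """Plain Pearson correlation; undefined when either series is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def lagged_spearman(cause: list[float], effect: list[float], lag: int = 1) -> float:
    """Rank correlation between cause on day t - lag and effect on day t."""
    if lag < 0 or len(cause) != len(effect) or len(cause) - lag < 3:
        raise ValueError("need equal-length series with at least 3 lagged pairs")
    x = cause[: len(cause) - lag]   # earlier days
    y = effect[lag:]                # the days that follow them
    return pearson(ranks(x), ranks(y))


# Made-up week of data: hours slept, and a 1-5 mood rating.
sleep_hours = [7.5, 6.0, 8.0, 5.5, 7.0, 6.5, 8.5]
mood = [3, 4, 2, 5, 2, 4, 3]
rho = lagged_spearman(sleep_hours, mood, lag=1)
print(f"lag-1 Spearman rho = {rho:.2f} from {len(mood) - 1} pairs; treat as a maybe")
```

Working on ranks also respects principle 5, since it never treats a mood of 4 as twice a mood of 2. It does nothing about principle 2, though: a shared time trend can still inflate the score, so the scatter plot stays mandatory.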

The picks.

| Axis | Pick | Why | Skipped? |
|------|------|-----|----------|
| E Abstraction | Step-Back | Establish what counts as a valid "influence" before charting anything. Top pick. | |
| D Decomposition | Plan-and-Solve | Design the full chart spec before building a single panel. | |
| A Topology | Skeleton-of-Thought | "Comprehensive set" = list out the panels first, then fill in each one. | |
| C Grounding | PAL inside ReAct | Work out the links and lags in code, against the real cells the tool loop reads. The two-on-one-axis case. | |
| F Self-Eval | Self-Refine | Catch coincidental (spurious) patterns before shipping. | |
| B Sampling | | Nothing discrete to vote on. | skip |
| G Steering | DSP (light) | Steer toward "leading indicator / early warning." | |

Assembly (outside-in). Step-Back → Plan-and-Solve → Skeleton-of-Thought → [per chart: PAL inside ReAct] → Self-Refine, with a light Directional Stimulus cue throughout.

Why this beats a plain prompt. The frameworks give the structure, but the list of principles — go easy on a small sample, watch for things that just move together, compare each day to the one before, rank the days and show the dots, don't do math on 1–5 ratings — is what keeps the charts honest. And that came from Step 0.


See Advanced-AI-Reasoning-Framework-Playbook for what each framework is and when to use it, and Reasoning-Framework-Worked-Examples for more finished stacks — including this one as Example 4.