Chain-of-thought (CoT): decompose the problem into intermediate reasoning steps before giving the final answer
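A minimal sketch of a few-shot CoT prompt, using the classic tennis-ball exemplar; the `cot_prompt` helper is illustrative, not from the source:

```python
# Few-shot CoT: the exemplar demonstrates intermediate steps, so the model
# imitates step-by-step reasoning before committing to a final answer.
FEW_SHOT_COT = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 tennis balls. 5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

def cot_prompt(question: str) -> str:
    # Fill the target question into the few-shot template.
    return FEW_SHOT_COT.format(question=question)
```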

Least-to-most prompting: decomposition + sequential subproblem solving, which improves generalization from easy to harder problems
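A sketch of the two-stage least-to-most loop, assuming a hypothetical `llm(prompt) -> str` completion function:

```python
# Least-to-most: stage 1 decomposes the question into simpler subproblems;
# stage 2 solves them in order, appending each answer to the context so
# later (harder) subproblems can build on earlier answers.
def least_to_most(llm, question: str) -> str:
    decomposition = llm(
        f"Q: {question}\nList the subproblems needed to solve this, "
        "from easiest to hardest, one per line:"
    )
    context = f"Q: {question}\n"
    answer = ""
    for sub in decomposition.splitlines():
        if not sub.strip():
            continue
        answer = llm(f"{context}Subproblem: {sub}\nAnswer:")
        context += f"Subproblem: {sub}\nAnswer: {answer}\n"
    return answer  # the last subproblem's answer resolves the original question
```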

Zero-shot CoT: appending "Let's think step by step" elicits reasoning without exemplars, but performs worse than few-shot CoT
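For contrast with the few-shot template above, the zero-shot variant is just the trigger phrase (a sketch):

```python
# Zero-shot CoT: append the trigger phrase instead of worked exemplars.
def zero_shot_cot_prompt(question: str) -> str:
    return f"Q: {question}\nA: Let's think step by step."
```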

Analogical prompting (analogy: recall a related problem): have the model self-generate relevant exemplars and knowledge before solving the target problem
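A sketch of an analogical prompt template; the wording paraphrases the technique's style and the helper name is mine:

```python
# Analogical prompting: the model first recalls and solves related problems,
# then uses them as self-generated exemplars for the target problem.
ANALOGICAL_TEMPLATE = """Problem: {question}

Instructions:
1. Recall three relevant and distinct problems, and solve each of them.
2. Then solve the initial problem, drawing on those examples.
"""

def analogical_prompt(question: str) -> str:
    return ANALOGICAL_TEMPLATE.format(question=question)
```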

CoT Reasoning without Prompting (CoT decoding): branching on the top-k candidates for the first token reveals latent CoT paths, showing the model has built-in reasoning ability even without a prompt
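A rough sketch of CoT decoding with a HuggingFace causal LM: branch on the top-k first-token candidates, then decode each branch greedily. The paper's confidence-based answer scoring is omitted, and `gpt2` is only a stand-in model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")    # stand-in model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Q: I have 3 apples, my dad has 2 more apples than me. How many apples do we have in total?\nA:"
ids = tok(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    first_logits = model(ids).logits[0, -1]    # scores for the first new token
top_k = torch.topk(first_logits, k=5).indices  # k alternative starting tokens

for t in top_k:
    # Force one alternative first token, then continue with greedy decoding.
    branch = torch.cat([ids, t.view(1, 1)], dim=-1)
    out = model.generate(branch, max_new_tokens=40, do_sample=False)
    print(repr(tok.decode(out[0, ids.shape[-1]:])))
```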

Self-consistency: sample multiple diverse reasoning paths, then select the final answer that appears most frequently, i.e. argmax P(answer | question) marginalized over reasoning paths. Empirically, when consistency exceeds 80%, accuracy approaches 100%
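A minimal self-consistency sketch; `llm(prompt, temperature)` is a hypothetical sampling function, and the extraction regex assumes the CoT format "The answer is N":

```python
import re
from collections import Counter

def extract_answer(completion: str) -> str:
    # Assumes the CoT ends with "The answer is <number>"; falls back to the last line.
    m = re.search(r"answer is\s*(-?[\d.]+)", completion, re.IGNORECASE)
    if m:
        return m.group(1)
    lines = completion.strip().splitlines()
    return lines[-1] if lines else ""

def self_consistency(llm, prompt: str, n: int = 40, temperature: float = 0.7) -> str:
    # Sample n diverse reasoning paths, then majority-vote on the final answer:
    # argmax over answer frequency, marginalizing out the reasoning path.
    answers = [extract_answer(llm(prompt, temperature=temperature)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```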

Universal self-consistency (USC): for free-form outputs where exact-match voting fails, let the LLM itself select the most consistent answer among the sampled responses
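A sketch of USC under the same hypothetical `llm` interface: show all sampled responses back to the model and ask it to pick one:

```python
import re

def universal_self_consistency(llm, prompt: str, n: int = 8) -> str:
    # Sample free-form responses, then have the model self-select the most
    # consistent one by majority consensus.
    responses = [llm(prompt, temperature=0.7) for _ in range(n)]
    numbered = "\n\n".join(f"Response {i + 1}:\n{r}" for i, r in enumerate(responses))
    selector = (
        f"Task: {prompt}\n\n{numbered}\n\n"
        "Evaluate these responses and select the most consistent one "
        "based on majority consensus. Reply with the response number only."
    )
    choice = llm(selector, temperature=0.0)
    m = re.search(r"\d+", choice)
    idx = min(max(int(m.group()) - 1, 0), n - 1) if m else 0
    return responses[idx]
```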

Limitations

  1. LLMs are easily distracted by irrelevant context in the prompt
  2. LLMs cannot reliably self-correct reasoning; prompted self-correction can change a correct answer into a wrong one
  3. multi-LLM debate performs worse than simple self-consistency
  4. premise order matters: reordering premises in the prompt degrades accuracy

Suggestions

  1. define the right problem to work on
  2. solve it from first principles