📝 Text ● Intermediate ● Updated 2026-05-19

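Systematic Debugging with LingCode

Most "debugging" is pattern-matching on a guess, then checking the guesses one by one. Systematic debugging looks slower at every step but is faster overall: reproduce, isolate, hypothesize, test, narrow, and only then propose a fix. LingCode will follow this method as long as you supply it.

Why guess-and-patch is the slow road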


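The debugging loop that feels fast goes like this: read the stack trace, guess a cause from the shape of the error, try a fix, and see whether the error goes away. It feels fast because each iteration is short. It is actually slow, for three reasons:

  • The first guess is almost never the root cause. It is just the most recently seen similar pattern: correlation, not causation.
  • Without isolating the problem first, you cannot tell whether the fix worked. The error may have stopped for any number of reasons, and you declare it solved until it comes back at 2 a.m.
  • "No longer crashes" is not the same as "fixed." A try/catch wrapped around the symptom makes the tests look green while hiding the underlying state corruption.

Systematic debugging trades the dopamine of fast iteration for certainty: "I know what went wrong, and I know my fix targets exactly that." The skill in this tutorial gives LingCode an explicit five-step protocol so that it does not fall into pattern-matching on your behalf.

What you need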

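  • LingCode: download the installer.
  • A bug: a failing test, a crash, wrong output, or "it used to work."
  • The ability to run the failing code locally. A bug you cannot reproduce is a different, harder problem.

Step 1: Reproduce before reading any code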


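The first thing LingCode does is reproduce the bug locally and record the exact command, input, and observed output. No reading source, no speculating, just reproducing: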

Before reading any source, reproduce the bug. Report:
- Exact command run.
- Exact input (paste it; don't paraphrase).
- Exact output / error / stack trace (paste verbatim).
- Whether it's deterministic (run it 3 times — same result?).

If you can't reproduce, stop. Ask for: the minimal failing
input, environment differences, or a recording. Don't guess
at causes for bugs you haven't seen.

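If LingCode cannot reproduce the bug, every later step is going through the motions. The honest answer is "I need a reproducible case," not an attempted fix.

The difference between deterministic and intermittent matters. If the bug fires only one run in five, pin down the variable that controls the flakiness (timing, ordering, randomness) before moving on. A flaky bug "solved" by changing something and not seeing it again will come back sooner or later.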
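
The determinism check can be scripted. The following is a minimal sketch, not part of LingCode: the helper names `run_repro` and `is_deterministic` and the placeholder command are assumptions made for illustration. It runs a command several times and compares exit code, stdout, and stderr verbatim.

```python
import subprocess
import sys


def run_repro(command, runs=3):
    """Run the reproduction command several times and collect the results."""
    results = []
    for _ in range(runs):
        proc = subprocess.run(command, capture_output=True, text=True)
        # Keep the exit code and both streams verbatim; do not paraphrase.
        results.append((proc.returncode, proc.stdout, proc.stderr))
    return results


def is_deterministic(results):
    """True when every run produced the same exit code and output."""
    return len(set(results)) == 1


if __name__ == "__main__":
    # Hypothetical placeholder: swap in the exact command that fails for you.
    command = [sys.executable, "-c", "raise SystemExit(1)"]
    results = run_repro(command)
    print("command:", command)
    print("deterministic:", is_deterministic(results))
```

Output that embeds timestamps or memory addresses will differ between runs even for a deterministic bug, so such fields may need to be stripped before comparing.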

Step 2: Isolate the smallest failing case


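Most bug reports come wrapped in a full application, a long test suite, or a complicated input. LingCode's next job is to peel away layers until the smallest possible test triggers the bug: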

Reduce the failing case. Aim for:
- A single failing test (or a 10-line script) that triggers
  the bug.
- The smallest input that still reproduces.
- The narrowest call path — strip middleware, unrelated
  state, optional config.

Each reduction must still reproduce. If a reduction stops
reproducing, you've found something — either the bug is in
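the part you removed, or your reduction was wrong. Note it.

"Each reduction must still reproduce" is an iron rule. Deleting code until the bug disappears, without asking why it disappeared, means you are fixing the wrong thing.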

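When the failing input is a list (lines, records, test steps), the reduction rule can be sketched as a greedy one-at-a-time loop. This is a simplified illustration, not full delta debugging; `reduce_failing_input` and the `still_fails` predicate are hypothetical names, and the example failure condition is invented.

```python
def reduce_failing_input(items, still_fails):
    """Greedily drop items one at a time, keeping only removals
    after which the bug still reproduces."""
    if not still_fails(items):
        raise ValueError("starting input does not reproduce the bug")
    current = list(items)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            if still_fails(candidate):
                # The removed item was irrelevant: keep the smaller input.
                current = candidate
                changed = True
                break
            # Otherwise the item is needed to reproduce; leave it in place.
    return current


# Invented example: the bug fires only when both 3 and 7 are present.
def still_fails(items):
    return 3 in items and 7 in items


minimal = reduce_failing_input(list(range(10)), still_fails)
print(minimal)
```

Every item that survives is one whose removal stopped the reproduction, which is exactly the kind of finding the prompt asks to note.
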
Step 3: Form a falsifiable hypothesis


Now, and only now, LingCode reads the code and forms a theory. The theory must come with a prediction that can be checked:

Form a single hypothesis. It must:
- Name the specific function / state / interaction you think
  is wrong.
- Predict what will happen if your hypothesis is correct
  AND a tweak is applied that should change behavior.
- Predict what will happen if your hypothesis is wrong.

"I think it's a race condition" is not a hypothesis. "I think
read_config() returns before write_config() completes on cold
start, so adding a 100ms sleep before read should make the
bug rarer but not gone" is a hypothesis.

Holding two hypotheses at once is fine. Holding three means you are not really reading the code; you are brainstorming. Narrow first, then test.
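
One way to keep a hypothesis honest is to write it down as a record with all the required parts. The sketch below is purely illustrative: the `Hypothesis` class and its field names are assumptions, and the check only verifies that the parts exist and that the two predictions differ.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    target: str    # specific function / state / interaction suspected
    tweak: str     # change that should alter behavior if the theory holds
    if_right: str  # predicted observation when the theory is correct
    if_wrong: str  # predicted observation when the theory is wrong

    def is_falsifiable(self):
        """Testable only if every part is filled in and the two
        predictions are distinguishable from each other."""
        parts = [self.target, self.tweak, self.if_right, self.if_wrong]
        if any(not part.strip() for part in parts):
            return False
        return self.if_right.strip() != self.if_wrong.strip()


# "I think it's a race condition" names no target and predicts nothing.
vague = Hypothesis(target="", tweak="", if_right="", if_wrong="")

# The read_config() example from the prompt above, written out in full.
specific = Hypothesis(
    target="read_config() returns before write_config() completes on cold start",
    tweak="add a 100ms sleep before the read",
    if_right="bug becomes rarer but does not disappear",
    if_wrong="bug frequency is unchanged",
)

print(vague.is_falsifiable(), specific.is_falsifiable())
```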

Step 4: Test the hypothesis, not the fix


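This step is the dividing line between systematic debugging and guess-and-patch. The goal of the test is to show whether the hypothesis is right or wrong, not to make the bug go away: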

Test the hypothesis. Pick the cheapest probe that distinguishes
right-from-wrong:
- A print / log at the suspected boundary, showing the value
  you predicted.
- A breakpoint or debugger inspection.
- A unit test that exercises the suspected interaction.
- A diff of the value before and after the suspected mutation.

If the probe contradicts the hypothesis, the hypothesis is
wrong. Go back to step 3. Do not "adjust" the hypothesis to
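fit: write a new one.

"The error went away" is not confirmation. An error can disappear for many reasons unrelated to your theory: caching, timing, side effects of the test environment. The hypothesis is confirmed only when the probe shows the result you predicted.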

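A cheap probe can be as small as a decorator that records what crosses the suspected boundary. The sketch below assumes a hypothetical suspect function `load_timeout` and a hypothetical theory about it; neither comes from LingCode. The point is that the probe compares an observed value with a prediction written down beforehand.

```python
import functools


def probe(calls):
    """Decorator that records each call's arguments and return value."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            calls.append({"args": args, "kwargs": kwargs, "result": result})
            return result
        return wrapper
    return decorate


calls = []


# Hypothetical suspect. Theory: load_timeout() returns None when the key
# is missing, rather than falling back to a default value.
@probe(calls)
def load_timeout(config):
    return config.get("timeout")


predicted = None      # written down before running the probe
load_timeout({})      # the reduced failing input from step 2
observed = calls[-1]["result"]

if observed == predicted:
    print("hypothesis confirmed: probe shows the predicted value")
else:
    print("hypothesis rejected: write a new one")
```

Nothing here changes the behavior of the suspect function, so the probe cannot make the bug go away by accident.
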
Step 5: Narrow to the minimal fix


Once the hypothesis is confirmed, LingCode writes the smallest change that addresses the actual cause rather than the symptom:

Propose the fix. It should:
- Address the cause identified in step 4, not the symptom in
  step 1.
- Be the smallest change that achieves that.
- Come with a regression test that fails before the fix and
  passes after. (No regression test = no proof the fix works.)

Then re-run the original reproduction from step 1 and the reduced case from step 2.
If either still fires, the fix is incomplete; go back to step 3.
If it doesn't, run the full test suite to check for collateral
damage. Document one-line cause + one-line fix in the commit
message.

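The regression test is the proof that you understood the bug. If you cannot write a test that catches it, you have not really understood it; you have only made it stop happening for now.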
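
The "fails before, passes after" property can be checked directly by running the same test against both versions. The bug below is invented for illustration (an empty string parsed as port 0 instead of a default), and all names are hypothetical.

```python
DEFAULT_PORT = 8080


def parse_port_buggy(raw):
    # Before the fix: an empty string silently becomes port 0.
    return int(raw or 0)


def parse_port_fixed(raw):
    # After the fix: a missing value falls back to the default port.
    return int(raw) if raw else DEFAULT_PORT


def regression_check(parse_port):
    """Return True when the implementation handles the failing input
    and still handles an ordinary one."""
    return parse_port("") == DEFAULT_PORT and parse_port("443") == 443


print("before fix:", regression_check(parse_port_buggy))
print("after fix:", regression_check(parse_port_fixed))
```

In a real project the check would be a single test in the existing suite, run once on the commit before the fix and once after.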

When to break protocol


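Systematic debugging is the right default. There are two cases where it is overkill:

  • Simple typos and obvious mistakes. Line 42 throws "Cannot read property 'name' of undefined" and line 41 is missing an await on an async call: just fix it, no formal hypothesis required.
  • Production fires. If users are down, the right move is the fastest known mitigation (roll back, turn off the feature flag, restart). Do the systematic debugging afterwards, in the post-incident review, once the system is stable.

Everything in between (flaky tests, "works on my machine" reports, intermittent crashes, unexplained performance regressions) is worth the protocol. Have LingCode ask up front, "is this a fire or an investigation?":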

Before debugging, classify:
- FIRE: users down, ship a mitigation now, investigate after.
- INVESTIGATION: bug is contained, follow the 5 steps.
- TYPO: obvious error visible in the diff, just fix it.

Don't run a full systematic-debug ritual on a typo.
Don't skip steps on an investigation because it "looks easy."
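
The triage gate above amounts to a small decision function. The sketch below is one possible encoding; the `Triage` enum, the `classify` function, and its two boolean inputs are assumptions, since the prompt itself leaves the judgment to LingCode.

```python
from enum import Enum


class Triage(Enum):
    FIRE = "mitigate now, investigate after"
    INVESTIGATION = "follow the five steps"
    TYPO = "just fix it"


def classify(users_down, obvious_in_diff):
    """Pick a triage class; an outage outranks everything else."""
    if users_down:
        return Triage.FIRE
    if obvious_in_diff:
        return Triage.TYPO
    return Triage.INVESTIGATION


print(classify(users_down=False, obvious_in_diff=False).name)
```

Checking for an outage first reflects the ordering in the text: even an obvious-looking error gets a mitigation first when users are down.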

Using this method in LingCode


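The full debugging protocol is packaged as a skill. Drop it into your LingCode skills folder and LingCode will invoke it automatically whenever it encounters a test failure or crash: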

---
name: systematic-debugging
description: "Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes. Triggers: 'this is broken', 'why does X fail', 'debug this', 'find the bug', stack trace pasted, failing test output, runtime exception, 'doesn't work', 'unexpected behavior'. Actions: five-step protocol: reproduce → isolate → hypothesize → test the hypothesis (not the fix) → narrow to minimal fix with a regression test. Triage gate: FIRE (drop everything) / INVESTIGATION (full protocol) / TYPO (one-line fix, skip protocol). Anti-pattern: pattern-match-and-guess, fix-before-reproducing."
---

Debug systematically, not by pattern-matching. Most "fixes"
that make errors disappear without proving cause come back.

Classify first: FIRE (mitigate now), INVESTIGATION (follow
the protocol), or TYPO (just fix it).

For INVESTIGATIONS:

1. REPRODUCE. Before reading any source. Capture exact
   command, input, output. Confirm determinism. If you
   can't reproduce, stop and ask — don't guess.

2. ISOLATE. Reduce to the smallest failing case. Each
   reduction must still reproduce. If a reduction stops
   reproducing, investigate why before continuing.

3. HYPOTHESIZE. One specific theory naming function /
   state / interaction. Include a falsifiable prediction:
   what should happen if right, what if wrong.

4. TEST the hypothesis, not the fix. Use the cheapest
   probe that distinguishes right from wrong (log, debugger,
   diff). "The error went away" is not confirmation —
   the probe must show what you predicted.

5. NARROW to a minimal fix that addresses the cause, not
   the symptom. Write a regression test that fails before
   and passes after. Re-run original repro. If it still
   fires, the fix is incomplete.

Document cause + fix in one line each in the commit.

Save it as ~/.lingcode/skills/systematic-debugging/SKILL.md. See Installing Skills for the exact location and how skills are discovered.
