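TDD with human authors is about design discipline. TDD with an AI assistant is something else: it is the cheapest defense against code that looks plausible but does not actually run correctly. Write the test first, and LingCode cannot fake a pass.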
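The classic TDD literature frames the red-green-refactor loop as a design discipline: writing the test first forces you to think about the interface before the implementation. That still holds. But when you pair with LingCode, a sharper benefit emerges: a failing test is the one constraint that plausible-looking code cannot fake.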
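TDD with LingCode is therefore first and foremost an anti-hallucination tool. The skill in this tutorial keeps LingCode inside the red-green-refactor loop, so you can audit its output by running the code instead of reading it closely for subtle bugs.
The first step is to ask LingCode for a test, not a function. State explicitly that you want it to fail: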
Write the first failing test for <feature>. Do NOT write the
implementation. The test must:
- Reference the public API by name (file, function, signature)
even if it doesn't exist yet.
- Assert behavior the user actually cares about, not internal
structure.
- Run and fail with a "no such symbol" / "not implemented" /
"expected X got Y" error — not a syntax error.
Then run the test and paste the exact failure output.
"粘贴确切的失败输出"这一步就是审计环节。如果 LingCode 只说"测试已写好,它会失败"而不展示失败信息,就自己运行一遍。一种常见的偏差是写出一个在到达任何断言之前就崩溃的测试——那不算真正的红色,也无法约束实现。
Once you have confirmed red, ask LingCode for the smallest change that makes the test pass:
The test is red. Write the minimum implementation to make this
ONE test pass. Constraints:
- No code for behavior the test doesn't exercise yet.
- No extra abstractions ("in case we need them later").
- No error handling for cases this test doesn't cover.
Run the test. Paste the exact output. Confirm green.
"不为测试尚未覆盖的行为写代码"是防止 LingCode 热心添加未经测试边界情况的规则。如果测试只覆盖了正常路径的字符串输入,而 LingCode 添加了一个空值检查,那个检查就是未经测试的代码——要么删掉它,要么为它补一个测试。
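Now extend coverage one assertion at a time. The discipline: each new red exposes one missing behavior; each new green adds only enough code to cover it: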
Add ONE more test case. Pick the next unhandled behavior:
- An edge case (empty input, boundary value, off-by-one).
- A different equivalence class (negative number, unicode,
missing field).
- An error path (invalid input, dependency failure).
Confirm it fails. Then add minimum code to make it pass.
Confirm everything is still green.
Repeat until every behavior in <feature spec> has at least
one test.
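The temptation is to write five tests at once and then implement them together. Resist it. Writing several tests at once degrades back into "write the function and hope the tests pass", which is exactly the way of working TDD is meant to prevent.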
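One iteration of this loop might look like the following sketch. It assumes a hypothetical `slugify` function with one existing test, then adds a single edge case (repeated spaces). `slugify_v1` and `slugify_v2` are illustrative names for the implementation before and after the change.

```python
import io
import unittest


def slugify_v1(title):
    # A minimal first implementation: enough for the original test only.
    return title.lower().replace(" ", "-")


def slugify_v2(title):
    # Minimum change for the new test: split on any run of whitespace.
    return "-".join(title.lower().split())


def make_tests(slugify):
    """Build the same test class against a given implementation."""
    class SlugifyTest(unittest.TestCase):
        def test_lowercases_and_joins_words_with_hyphens(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

        def test_collapses_repeated_spaces(self):  # the ONE new test
            self.assertEqual(slugify("Hello   World"), "hello-world")

    return SlugifyTest


def run(test_case):
    """Run one TestCase class quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


red = run(make_tests(slugify_v1))    # new test fails, old test still passes
green = run(make_tests(slugify_v2))  # whole suite passes again
print(len(red.failures), green.wasSuccessful())
```

Because only one test was added, the single failure in the red run points at exactly one missing behavior.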
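A passing test does not always mean the test is doing what you think it is doing. False greens come from three places, and LingCode reliably produces all three:
assertTrue(result) passes for any truthy value. The correct form is assertEqual(result, expected), where expected is a value computed by hand. The simplest way to detect a false green is to break the implementation on purpose and re-run: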
For each test that just turned green, mutate the implementation
in one of these ways and confirm the test now fails:
- Replace the return value with a hard-coded wrong value.
- Comment out the body and return early.
- Skip the side effect the function should produce.
If the test still passes after a destructive mutation, the test
is not actually exercising the behavior. Fix the test.
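This is mutation testing in miniature, and when pairing with LingCode it is the single highest-payoff habit. Five seconds of "break it, re-run, confirm red" saves an afternoon spent investigating why a "fully tested" function still shipped a bug.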
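The sketch below shows the check on a hypothetical `total_price` function (the function, its mutant, and the test names are invented for illustration). A weak `assertTrue` test stays green when the implementation is replaced by a hard-coded wrong value, while an `assertEqual` test against a hand-computed value turns red.

```python
import io
import unittest


def total_price(prices):
    return sum(prices)


def total_price_mutant(prices):
    # Destructive mutation: hard-coded wrong return value.
    return 1


def weak_tests(total):
    class WeakTest(unittest.TestCase):
        def test_total(self):
            self.assertTrue(total([2, 3]))  # passes for any truthy value
    return WeakTest


def strong_tests(total):
    class StrongTest(unittest.TestCase):
        def test_total(self):
            self.assertEqual(total([2, 3]), 5)  # 2 + 3, computed by hand
    return StrongTest


def passes(test_case):
    """Run one TestCase class quietly and report whether it was all green."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()


weak_survives_mutation = passes(weak_tests(total_price_mutant))
strong_survives_mutation = passes(strong_tests(total_price_mutant))
print(weak_survives_mutation, strong_survives_mutation)
```

A test that survives the mutation, as the weak one does here, is the one that needs fixing.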
Refactoring is safe only once the suite covers every behavior and is all green. Ask LingCode to improve the structure without changing behavior:
All tests are green. Refactor the implementation for:
- Name clarity.
- Removing duplication (DRY only when names converge — don't
over-extract).
- Extracting helper functions that have a single reason to
change.
- Replacing conditional pyramids with early returns.
After each refactor step, re-run the full suite. If a test
goes red, the refactor changed behavior — revert and try
again. Don't "fix" the test to match new behavior.
"不要修改测试以匹配新行为"是核心规则。测试就是规格。如果重构破坏了规格,是重构出了问题,不是规格。
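TDD is the default way to handle logic. In a few cases, though, it is overkill or even counterproductive: throwaway exploration code, purely presentational UI, and build, environment, or data configuration.
Have LingCode classify the task before it starts: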
Before starting, classify the task:
- LOGIC: anything with branches, state changes, or
computed return values. Use TDD.
- EXPLORATION: throwaway code to learn. Skip TDD; delete after.
- UI: pure presentation. Verify by eye.
- CONFIG: build / env / data. Verify by running.
For LOGIC, do not write implementation code without a failing
test first.
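When a bug shows up in code that was not written with TDD, the cheapest path is to backfill a test: write a failing test that reproduces the bug, then fix until green. That failing test doubles as the regression test the debugging protocol requires: one artifact, two jobs.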
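Code written with TDD has bugs too, but they surface earlier and smaller, usually as a missing equivalence class nobody thought to test. Add the test, watch it fail, then fix. The loop is the same as for feature work.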
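A sketch of that bug-fix loop, using a hypothetical `page_count` function with an off-by-one error on exact multiples (the function and both versions are invented for illustration). The new test reproduces the bug against the old code, then stays in the suite as the regression test once the fix is in.

```python
import io
import unittest


def page_count_buggy(total_items, page_size):
    # Pre-existing code: off by one when total_items is an exact multiple.
    return total_items // page_size + 1


def page_count_fixed(total_items, page_size):
    # Ceiling division using integers only.
    return -(-total_items // page_size)


def make_tests(page_count):
    """Build the same test class against a given implementation."""
    class PageCountTest(unittest.TestCase):
        def test_partial_last_page(self):  # the case someone already tested
            self.assertEqual(page_count(25, 10), 3)

        def test_exact_multiple(self):  # reproduces the bug; kept as regression test
            self.assertEqual(page_count(20, 10), 2)

    return PageCountTest


def run(test_case):
    """Run one TestCase class quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


before_fix = run(make_tests(page_count_buggy))
after_fix = run(make_tests(page_count_fixed))
print(len(before_fix.failures), after_fix.wasSuccessful())
```

The exact-multiple input is the missing equivalence class here: the original test covered only a partially filled last page.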
The TDD discipline is packaged as a skill. Drop it into your skills folder and LingCode invokes it automatically before writing logic:
---
name: test-driven-development
description: Use when implementing any feature or bugfix, before writing implementation code. Triggers: 'write tests', 'add test coverage', 'TDD', new feature, new function, regression fix, 'verify it works', 'test first'. Actions: red (write failing test, paste output to verify real red), green-minimum (smallest code that passes), mutation-test each new green (catch false greens), refactor only while green. Reframes TDD as anti-hallucination tool — stops AI from writing plausible code that doesn't actually work. The test is the spec; AI can't fake passing.
---
Write logic via red-green-refactor. With an AI assistant, TDD's
primary value is as an anti-hallucination tool — a failing test
is the one constraint plausible-looking code can't fake.
Classify first: LOGIC (use TDD), EXPLORATION / UI / CONFIG
(skip, verify another way).
For LOGIC:
1. RED. Write the first failing test before any implementation.
Reference the public API by name. Assert user-visible
behavior, not internal structure. Run it. Paste the exact
failure output. If it already passes, the test is wrong —
rewrite.
2. GREEN. Write the MINIMUM implementation to make this one
test pass. No code for untested behavior. No "in case"
abstractions. Run it. Paste the exact pass output.
3. ONE-AT-A-TIME. Add the next failing test (edge case,
equivalence class, error path). Confirm red. Add minimum
code. Confirm green. Repeat until the spec is covered.
4. WATCH FOR FALSE GREEN. For each new green test, mutate the
implementation destructively (wrong return value, no-op,
skipped side effect). If the test still passes, the test
isn't exercising the behavior. Fix the test.
5. REFACTOR. Only with all tests green. Improve names, remove
real duplication, simplify control flow. Re-run after each
step. If a test goes red, REVERT the refactor — do not
"fix" the test.
For bugs in pre-existing code: write a failing test that
reproduces the bug, then fix until green. The test becomes
the regression test.
Save it as ~/.lingcode/skills/test-driven-development/SKILL.md. See Installing Skills for the exact location and for how skills are discovered.