TDD
TDD works in a tight feedback loop of three steps:
- write a failing test
- write just enough code to make the test pass
- refactor
The key point is to think about the test before the implementation,
which also forces you to think about the API before the implementation,
and to work in small incremental steps.
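A minimal sketch of one pass through the loop, in Python. The function name slugify and its behaviour are made up for illustration; the point is only that the test exists first and fixes the API before any implementation does.

```python
import re

# Step 1 (red): write the test first. It pins down the API
# (a function that takes a title and returns a slug) before any code exists.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  TDD in 3 steps!  ") == "tdd-in-3-steps"

# Step 2 (green): the simplest code that makes the test pass.
# Step 3 (refactor): tidy up (here, a precompiled pattern) while the test stays green.
_NON_ALNUM = re.compile(r"[^a-z0-9]+")

def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return _NON_ALNUM.sub("-", title.lower()).strip("-")

test_slugify()  # passes silently; an AssertionError would mean we are back at red
```

The next increment would start with a new failing assertion (for example, a decision about non-ASCII input) rather than with a change to slugify.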
Test suite as oracle
With LLM-generated code, having a good test suite becomes even more valuable
- although not if the LLM just writes the tests after the implementation!
You should iterate with the LLM on the tests, in particular:
- tests that check the code does what it should with valid input (happy path)
- tests for what happens with invalid input (unhappy path)
- end goal is a test suite that more or less specifies the code
If you understand the test suite, and it covers all the key behaviours of the code,
then you can be fairly confident in the correctness of the generated code, even without understanding the code!
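As a sketch of what such a specifying test suite can look like, assume a hypothetical function parse_age with an invented valid range of 0 to 150. The two test functions can be read and reviewed on their own, without looking at the function body.

```python
def parse_age(text: str) -> int:
    """Hypothetical function under test: parse a human age from user input."""
    value = int(text.strip())  # int() raises ValueError for non-numeric input
    if not 0 <= value <= 150:  # the range 0..150 is an assumption for this example
        raise ValueError(f"age out of range: {value}")
    return value

def raises_value_error(text: str) -> bool:
    """Helper: True if parse_age rejects the input with ValueError."""
    try:
        parse_age(text)
    except ValueError:
        return True
    return False

# Happy path: valid input gives the expected result.
def test_valid_input():
    assert parse_age("42") == 42
    assert parse_age(" 7 ") == 7    # surrounding whitespace is tolerated
    assert parse_age("0") == 0      # boundary values are part of the spec
    assert parse_age("150") == 150

# Unhappy path: invalid input is rejected in a defined way.
def test_invalid_input():
    for bad in ["", "abc", "-1", "151", "4.5"]:
        assert raises_value_error(bad), bad

test_valid_input()
test_invalid_input()
```

If the LLM later rewrites parse_age, these tests still say what "correct" means: which inputs are accepted, which are rejected, and how.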
Anecdote: one OpenAI developer still writes their tests by hand; the LLM then writes all the code.
Context
The LLM that wrote the code has all that code in its context.
To get a less biased opinion, clear the context and/or use a different model:
- /review in claude/codex does a review with a fresh context window
- /clear, then ask it e.g. to review the tests and whether they cover all relevant cases
- use another model to review code and tests, and ask it questions