2024-10-17

Unit tests as documentation


Role of Unit Tests as Documentation

Many argue unit tests are a useful form of documentation: they show concrete inputs/outputs, expected behaviors, and “contracts” that should remain stable over time.
Tests can help new contributors understand how to call APIs, what shapes of data are expected, and how subsystems interact.
Some treat tests (especially in TDD) as executable specifications or “living specs,” particularly useful for legacy or poorly understood code.
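As a minimal sketch of what "tests as documentation" can look like, the example below pairs a small function with tests that spell out its inputs and outputs. The `slugify` helper and its rules are hypothetical, invented only for illustration; they do not come from the article or the thread.

```python
import re


def slugify(title: str) -> str:
    """Hypothetical helper: turn a title into a URL-friendly slug."""
    # Lowercase, replace each run of non-alphanumerics with one hyphen,
    # then trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


# Each test reads as one line of the function's contract.
def test_lowercases_and_joins_words_with_hyphens():
    assert slugify("Unit Tests as Docs") == "unit-tests-as-docs"


def test_collapses_punctuation_and_surrounding_whitespace():
    assert slugify("  Hello,   World!  ") == "hello-world"


def test_empty_title_yields_empty_slug():
    assert slugify("") == ""
```

A new contributor can learn how to call `slugify` and what to expect back without reading its body. The functions follow the `test_` naming convention that pytest collects, but they can also be called directly.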

Limits and Critiques

Many commenters push back strongly on treating tests as the documentation, i.e. as a replacement for written docs.
Tests and code show “what happens”, but often not the “why”: business rules, trade‑offs, and domain concepts.
Tests enumerate finite cases and may omit important edge conditions; absence of a test does not equal undefined behavior.
For end‑users and non‑coders, tests are essentially useless as docs.
Poor tests (flaky, incomplete, mislabeled) can mislead as much as stale prose.

Integration vs Unit Tests

Several commenters contend integration tests are far more valuable as documentation of real usage patterns and behavior across components.
Others defend a mix (test pyramid): unit tests catch fine‑grained issues and are easier to debug; integration tests validate realistic end‑to‑end flows.
Commenters debate whether integration‑heavy strategies can replace most unit tests, with concerns about combinatorial explosion and debugging cost.

Keeping Docs and Tests in Sync

The thread highlights tools and language features where examples in documentation are executable tests (Python/Rust doctests, Elixir, D, etc.).
This helps ensure examples stay valid, though prose explanations can still rot.
Some projects generate docs from tests or embed tests in docs to get “executable documentation.”
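For Python, the standard-library `doctest` module is one way to get this effect: examples written in a docstring are executed and compared against the output shown. The sketch below assumes a hypothetical `parse_version` function and a hypothetical `run_examples` helper; only the `doctest` calls are real library API.

```python
import doctest


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse a "major.minor.patch" string into a tuple of ints.

    >>> parse_version("1.4.2")
    (1, 4, 2)
    >>> parse_version("10.0.0") > parse_version("9.9.9")
    True
    >>> parse_version("1.4")
    Traceback (most recent call last):
        ...
    ValueError: expected major.minor.patch, got '1.4'
    """
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return (major, minor, patch)


def run_examples() -> doctest.TestResults:
    """Hypothetical helper: run the docstring examples of parse_version."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass the function explicitly so the examples can resolve its name.
    tests = finder.find(
        parse_version, "parse_version", globs={"parse_version": parse_version}
    )
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)
```

If the function's behavior changes, the docstring examples fail, so the examples cannot silently go stale. The one-line prose summary at the top of the docstring is not checked, which is the "prose can still rot" caveat in practice.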

Test Quality: Names, Structure, and Readability

To be useful as documentation, tests should be simple, independent, and focused on clear scenarios.
Descriptive test names and/or BDD‑style descriptions are promoted, but many note social friction in enforcing good naming.
Others claim organization, fixtures, and comments around tests matter more than function names.
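The naming point can be illustrated with a short contrast. The `Cart` type and the discount rule below are hypothetical, chosen only to give the tests something to describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cart:
    """Hypothetical domain object used only for this illustration."""
    subtotal_cents: int
    is_member: bool


def total_cents(cart: Cart) -> int:
    """Hypothetical rule: members get 10% off orders of $50.00 or more."""
    if cart.is_member and cart.subtotal_cents >= 5_000:
        return cart.subtotal_cents * 90 // 100
    return cart.subtotal_cents


# Vague: the name says nothing about the scenario or the expected outcome.
def test_total_2():
    assert total_cents(Cart(5_000, True)) == 4_500


# Descriptive: each name states the condition and the expected result,
# and each test builds its own input so it can be read in isolation.
def test_member_gets_ten_percent_off_at_fifty_dollars():
    cart = Cart(subtotal_cents=5_000, is_member=True)
    assert total_cents(cart) == 4_500


def test_member_pays_full_price_just_below_fifty_dollars():
    cart = Cart(subtotal_cents=4_999, is_member=True)
    assert total_cents(cart) == 4_999


def test_non_member_pays_full_price_regardless_of_subtotal():
    cart = Cart(subtotal_cents=10_000, is_member=False)
    assert total_cents(cart) == 10_000
```

Read as a list, the descriptive names alone outline the discount rule, including its boundary; `test_total_2` checks the same behavior but tells a reader nothing until they open the body.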

AI and Tests/Docs

People report success feeding test suites to LLMs to generate human‑readable documentation for under‑documented libraries.
Conversely, AI‑generated tests that merely mirror implementation can create noisy “regression tests” with little specification value.
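One way to picture a test that "merely mirrors implementation" is a test that recomputes the expected value with the same formula as the code under test. The `shipping_cents` function and its pricing rule below are hypothetical, used only to show the contrast.

```python
def shipping_cents(weight_grams: int) -> int:
    """Hypothetical rule: 500 cents up to 1 kg, plus 200 per extra started kg."""
    if weight_grams <= 1_000:
        return 500
    extra_kg = -(-(weight_grams - 1_000) // 1_000)  # ceiling division
    return 500 + 200 * extra_kg


# Mirrors the implementation: the expected value is derived with the same
# formula, so a bug in the formula would be copied into the test as well.
def test_shipping_mirrors_implementation():
    weight = 2_500
    expected = 500 + 200 * -(-(weight - 1_000) // 1_000)
    assert shipping_cents(weight) == expected


# States the specification: concrete inputs with independently known results.
def test_parcel_up_to_one_kilogram_costs_flat_rate():
    assert shipping_cents(1_000) == 500


def test_each_started_extra_kilogram_adds_200_cents():
    assert shipping_cents(1_001) == 700
    assert shipping_cents(2_500) == 900
```

The first test pins current behavior and can flag accidental changes, but it documents nothing a reader could not get from the function body. The other two tests state outcomes that can be checked against the pricing rule itself.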