When to test, when to wait

When you first start working on a difficult, open-ended problem, the codebase is like a jigsaw puzzle. The pieces are scattered and you’re not really sure how things connect. At first, your actions are haphazard. You shuffle and reshuffle the pieces into clumps by color, and randomly try to fit them together.

There’s sort of a rich-get-richer phenomenon – when you really understand a codebase, you have immediate and correct intuition about how to proceed. When you’re roaming in the wilderness, though, the only real recourse is to fall back on a sort of evolutionary, quasi-intelligently-random process of trial and error that plays out over the course of dozens or hundreds of individual touches on the code. At the start, the only way to develop a meaningful understanding of a new, non-trivial problem is to try lots and lots and lots of different approaches.

It’s like the “burn-in” phase in machine learning algorithms – a period of volatility and randomness precedes the stable, workable result. As the codebase becomes more established, each iteration of changes becomes increasingly “shallow,” where the shallowest possible change is a commit that just adds new code, and doesn’t change anything that’s already there.

Anyway, though, my argument is this: Don’t write tests during an intense burn-in phase. There are two related risks here:

  1. Testing too early adds “latency” to each progressive iteration of experimental changes during the burn-in period. The burn-in is a series of fast, almost improvisational updates to the codebase. It’s like evolving bacteria – things change so fast because each generation only lives a couple of minutes. Premature testing is risky because it drags out the lifespan of each of these updates at exactly the time when they need to be as short as possible. Assuming that development time is finite, this means that the burn-in phase will churn through fewer iterations. My theory here is that it’s just the total number of iterations that matters, not any metric of quality – you can’t compensate by being especially clever over the course of a smaller number of generations.
  2. The presence of tests can prematurely ossify the codebase before it’s had time to stabilize. Why? Because passing tests can make you numb to bad programming. The tests pass, right? Maybe, but it’s not hard to write passing tests for awful application code. When I block in tests too soon during a burn-in period, I think it can actually reduce the probability that I’ll go back and thoroughly refactor the rough areas. Never use tests as Febreze for smelly application code. All good code has passing tests; but not all passing tests are testing good code.

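To make the second point concrete, here’s a minimal sketch (the function and test names are hypothetical, invented for illustration) of a test that passes cleanly against application code that still badly needs refactoring – a mutable default argument, duplicated branches, and logic crammed into one function:

```python
def get_user_display(user, extras=[]):  # mutable default argument: a classic smell
    # Duplicated branches that a refactor should collapse into one path.
    if user.get("nickname"):
        name = user["nickname"].strip().title()
    else:
        name = user.get("name", "anonymous").strip().title()
    # Ad-hoc string building instead of a join.
    for extra in extras:
        name = name + " " + extra
    return name


def test_get_user_display():
    # Both assertions pass -- the green checkmark says nothing about
    # the quality of the code under test.
    assert get_user_display({"name": "ada lovelace"}) == "Ada Lovelace"
    assert get_user_display({"nickname": " ada "}) == "Ada"


test_get_user_display()
```

The test suite is green, but nothing about it pushes back on the smells above – which is exactly how early tests can lull you into leaving rough code alone.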
Of course, some projects are inherently stable and don’t require a burn-in period. I’ve written enough Omeka plugins at this point that I can sit down and write confident, well-organized code from the start.

When this is the case, I start testing immediately. Why wait? In the context of modern web development, this might actually be the rule, not the exception. I suspect that this is why test-driven development is so effective when paired with opinionated tools like Rails. If burning-in on a project is the process of building your own set of “rails” and structural conventions for the codebase, then Rails (or Django, or Zend, etc.) comes “pre-burned-in.”

So, to sum up – when you know what you’re doing, test first. But when you don’t know your tools, or when you’re trying to build something exotic or conceptually fuzzy, test early, test often – but maybe don’t test immediately.