Three ways to make unit testing more effective

Writing unit tests for your code. Tedious waste of time or an awesome productivity tool? Depending on how unit testing is implemented in a project, it can be either. To understand how to write good unit tests, it helps to give some thought to why an application or library should be unit tested in the first place.

This post isn't meant to be a tutorial on unit testing. Although it contains links to the relevant techniques, it assumes some familiarity with writing unit tests. My intention is to share three practices that have made unit testing useful for me. Whether you agree or disagree, I'd like to hear from you!

Use unit testing to design good APIs

As many unit testing devotees will tell you, the best time to start writing tests is before you even start writing your unit. Test-driven development is claimed to increase code quality and programmer productivity. While this may well be true, in my experience the true value of TDD comes from producing good APIs for modules.

When starting to write a class (or function, or module, or whatever is a natural unit in a given programming language), the programmer has a general idea of what it should do. Maybe it's WidgetTweaker, UserCredentialStorage, or TournamentScoreboard. Whatever the purpose of the class is, the programmer has to decide on an interface that other code will use to interact with it. And what piece of code will interact with it before it is released to production? The unit test, of course!

Test-driven development is iterative. Start by writing one test case that performs one action with the class and verifies its result. Say, it adds one match to TournamentScoreboard and verifies that the winner gets a point. You have already identified the need for two methods (adding a match, querying points). Implement them and the test passes. Rinse and repeat: What about getting the overall winner of a tournament? Tie-breakers if multiple teams have the same number of points? Handling exceptional situations like trying to add matches from different seasons? Disqualifying a team for violating the rules? By writing the tests first, you get to choose the most intuitive and programmer-friendly way to interact with the class. Or, in modern software parlance: API first.
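
Here is a rough sketch of what that first iteration could look like in Python. The method names add_match and points, the team names, and the one-point-per-win rule are my assumptions for illustration, not a prescribed design. The test was written first, and the class contains only as much as the test demands.

```python
import unittest


class TournamentScoreboard:
    """Hypothetical scoreboard; one point per win is an assumed rule."""

    def __init__(self):
        self._points = {}

    def add_match(self, winner, loser):
        # Register the loser too, so a team without wins is still known.
        self._points.setdefault(loser, 0)
        self._points[winner] = self._points.get(winner, 0) + 1

    def points(self, team):
        # Unknown teams report zero points in this sketch.
        return self._points.get(team, 0)


class TournamentScoreboardTest(unittest.TestCase):
    def test_winner_gets_a_point(self):
        board = TournamentScoreboard()
        board.add_match(winner="Lions", loser="Tigers")
        self.assertEqual(board.points("Lions"), 1)
        self.assertEqual(board.points("Tigers"), 0)
```

Saved as a module, this can be run with python -m unittest. The next iteration (overall winner, tie-breakers) would start the same way: with a new test method that calls the API you wish you had.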

Summary: Test-driven development helps in developing a small, self-contained piece of functionality quickly. When it's time to integrate it with the rest of the application, it already comes with a great API.

View unit testing as an opportunity to refactor

If you're a software developer, chances are you're not just writing classes from scratch. Sometimes you need to touch some legacy code. Maybe the original author has already left the company, and no one can explain some obscure function call to a seemingly unrelated subsystem. And the class initializes something like two dozen dependencies and interacts with them in a complicated way. You'd rather just change as little as you can, and let the stuff around it keep doing what it has been doing reasonably well all these years.

How do you verify that what you're doing behaves as you intended? If there are no unit tests available, then you'll probably have to do something like this:

  1. Identify what behavior of the application depends on the code in the class you're about to touch
  2. Carefully change what you must
  3. Test that the behavior of the application remains as is, modulo what you actually intended to change

This is obviously a time-consuming and frustrating experience. If the class in question is at the heart of a high-traffic e-commerce website or an embedded medical device, it may even be terrifying. Wouldn't it be nice if the next time you or your colleague touches that code, they could verify their changes quickly and in isolation?

Unit tests are a good solution for testing quickly and in isolation. But that's only possible if the class is reasonably decoupled from its dependencies. If you're working with a class that both initializes real dependencies (a handle to a device driver, a database connection, etc.) and implements some business logic interacting with them, jumping right into writing tests is likely a waste of time. You should first ensure that the class is testable.

There are many ways to do this. I have found the following recipe useful. Go through these very time-consuming and frustrating steps once, and you can be confident that the next time you touch the class it will be much less painful:

  1. Identify what behavior of the application depends on the code in the class you're about to touch
  2. Identify the different functions of the class. These might be things like: initialize device drivers/database handles, control access to them, validate user input, translate user input into batches of commands to the device or database, etc.
  3. Pull these functions apart and into their own classes. Don't make the classes depend on each other directly. Make them depend on interfaces. Inject objects implementing those interfaces at runtime.
  4. While refactoring, test that the behavior of the application remains as is
  5. Write unit tests for the newly created classes. Because the classes are decoupled thanks to the work done earlier, you can use mocks instead of the real dependencies when writing the tests.
  6. Carefully change what you must, and don't forget to unit test that too
  7. Ensure that the test suite passes and your changes work as expected
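
To make steps 3 and 5 a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: CommandSink stands in for whatever interface you extract for the device or database, and InputTranslator for a piece of business logic pulled out of the legacy class. The point is only that the class receives its dependency from the outside, so the test can hand it a mock.

```python
from typing import Protocol
from unittest.mock import Mock, call


class CommandSink(Protocol):
    """Assumed interface for whatever executes commands (device, database)."""

    def send(self, command: str) -> None: ...


class InputTranslator:
    """Hypothetical business-logic class extracted during refactoring."""

    def __init__(self, sink: CommandSink):
        # The dependency is injected; nothing real is initialized here.
        self._sink = sink

    def submit(self, user_input: str) -> int:
        # Split on ';', drop empty parts, and normalize to upper case.
        commands = [p.strip().upper() for p in user_input.split(";") if p.strip()]
        if not commands:
            raise ValueError("no commands in input")
        for command in commands:
            self._sink.send(command)
        return len(commands)


def test_submit_sends_each_command():
    sink = Mock()  # stands in for the real device or database
    translator = InputTranslator(sink)
    assert translator.submit("start; stop") == 2
    assert sink.send.call_args_list == [call("START"), call("STOP")]
```

In production code, the object passed to InputTranslator would be the class that owns the real driver or connection. The test never needs to know that class exists.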

Summary: Sometimes you need to refactor before writing tests even makes sense. While it's painful at first, the hard work pays off the next time you need to make changes to the class.

Make good use of code coverage metrics

Code coverage in its multiple variants is surely the first metric that comes to mind when trying to quantify the... well... coverage of the test suite. The higher the code coverage, the more robust the testing, right? I tend to agree, but there are caveats. I would even call using code coverage as a metric for software quality an antipattern.

The motivation for tracking code coverage in an enterprise setting is understandable. The (project) managers are accountable for code someone else is writing, and it's a reasonable demand to have visibility into the work of the developers they are managing. Quantitative data is better than opinions, and automatically generated quantitative data is even better. And a CI system spitting out code coverage metrics after running the unit test suites is certainly quantitative data that is well supported by ubiquitous tools.

How do you know what coverage you should aim for? Is a code base with 90% coverage better tested than a code base with 50% coverage? Maybe, but what if the former is a typical utility library, and the latter is a device driver? And should I be worried if the coverage was 80% last week and is now down to 70%? Perhaps; at least it would raise an eyebrow.

More important than trying to guess an appropriate code coverage target for a particular application is understanding the story behind the numbers. Too many times have I seen code where a developer has faithfully replicated bugs in the test suite, or broken the encapsulation of the production interface to tweak the internals mid-test. It's painfully clear that these tests were written only to drive the coverage metrics up. If a software quality metric drives bad engineering practice, then it's a very bad metric indeed.

I recommend using the coverage metrics to verify hypotheses. What do you expect the coverage of a fresh piece of business logic developed with TDD to be (hint: 100%)? Should the coverage of the legacy class I just refactored be closer to 20% or 80%? Was there even a test to cover the error condition that caused an outage? My rule of thumb is that if you don't have a rough idea of what the code coverage ought to be, you probably shouldn't care what it actually is.

Summary: High code coverage doesn't always mean tested code. Make hypotheses about expected code coverage, and use the coverage tools to verify those hypotheses.