When are you done testing?

On my way to the airport a few months ago, I happened to catch an interesting story on NPR. Tom’s of Maine was looking to remove fossil fuels from their deodorant. After developing what seemed to be a good formula in the lab, they sent it out to customers for real-world testing. When the testers reported back positively, the company released the new formula. That’s when they started getting complaints. It turns out that the new formula had problems in warm weather. All of the testers were in New England and tested it during the winter months.

The folks at Tom’s of Maine certainly thought they were giving their product a thorough test. That they didn’t was a failure of imagination. Perhaps the most valuable skill I’ve gained in my career is the ability to imagine more possible ways something can fail. Indeed, I consider that the hallmark of “senior” status. I don’t consider myself particularly excellent in this regard, just better than I was a decade ago. Much of the ability to imagine failures comes from getting burned by not anticipating what can go wrong.

That’s what makes quality assurance a challenging (and underrated) discipline. It’s more than just trying some things and seeing what breaks. Good QA first requires identifying the entire universe of possible failures and then designing tests to make sure the outcome in each of those cases is the desired one.

When are you done testing? Hopefully when you’ve exhausted the space of possible conditions. In reality, it’s closer to when you’ve exhausted the space of likely conditions that are worth handling. Excluding conditions you don’t care about (e.g. not testing your software on end-of-life operating systems because you’re not going to provide support for those platforms) is a great way to shrink the universe.
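One way to picture "shrinking the universe" is to write the universe down. The sketch below is only an illustration: the three dimensions (OS status, network state, disk state) and their values are made up for the example, not taken from any real project. It enumerates every combination, drops the ones we have decided not to support, and reports which of the remaining cases nobody has tested yet.

```python
from itertools import product

# Hypothetical dimensions, chosen only for illustration.
OS_STATUS = ["current", "previous", "end-of-life"]
NETWORK = ["online", "offline", "flaky"]
DISK = ["plenty", "full"]


def is_worth_handling(condition):
    """Exclude conditions we have explicitly decided not to support."""
    os_status, _network, _disk = condition
    return os_status != "end-of-life"


def test_matrix():
    """Every combination of conditions that is still worth testing."""
    universe = product(OS_STATUS, NETWORK, DISK)
    return [c for c in universe if is_worth_handling(c)]


def untested(matrix, covered):
    """Conditions in the matrix that no test has exercised yet."""
    return [c for c in matrix if c not in covered]


if __name__ == "__main__":
    # The "New England in winter" mistake: only the happy path was tried.
    covered = {("current", "online", "plenty")}
    for condition in untested(test_matrix(), covered):
        print("not yet tested:", condition)
```

With these made-up values, excluding end-of-life systems cuts the universe from 18 cases to 12. A real product will have far more dimensions than this, which is exactly why deciding what you don't care about matters.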

If you never test it, it doesn’t exist

Did you hear the one about the Texas couple who spent seven years paying for an alarm system that never worked? It’s easy to blame the vendor (especially since it’s Comcast): 1) the system was not correctly installed, and 2) when the homeowner noticed, the customer support agent said the system hadn’t reported in since 2007. Certainly Comcast shoulders a lot of the blame. After some pressure, they agreed to refund the full seven years’ worth of payments. However, the Leeson family is responsible as well. In seven years of paying for an alarm system, they apparently never tested it themselves.

A service that is never tested does not exist. If you don’t test it when you don’t need it, you can’t count on it being available when you do. It’s why emergency managers test outdoor warning sirens. It’s why hospitals test their generators. It’s why sysadmins test their backups. So here’s my challenge to you, dear reader: think about systems you rely on and test them — before you need them.
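For the backup case, "testing" means actually restoring, not just checking that the backup job exited cleanly. Here is a minimal sketch of that idea, assuming the backup is a tar archive whose member paths are relative to the directory it was taken from. The function name `verify_backup` and that archive layout are my assumptions for the example; adapt them to whatever your backup tool produces.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256(path):
    """Hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def verify_backup(archive_path, source_dir):
    """Restore the archive to a scratch directory and compare it to the source.

    Returns the relative paths of files that are missing from the restore
    or whose contents differ. An empty list means the restore matched.
    Assumes archive members are stored relative to source_dir.
    """
    source_dir = Path(source_dir)
    mismatches = []
    with tempfile.TemporaryDirectory() as scratch:
        # Only do this with archives you created yourself.
        with tarfile.open(archive_path) as archive:
            archive.extractall(scratch)
        for original in sorted(source_dir.rglob("*")):
            if not original.is_file():
                continue
            relative = original.relative_to(source_dir)
            restored = Path(scratch) / relative
            if not restored.is_file() or sha256(restored) != sha256(original):
                mismatches.append(str(relative))
    return mismatches
```

Run something like this on a schedule, and treat a non-empty result the same way you would treat a siren that stays silent during its monthly test.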