Today I read two articles in quick succession about the code coverage metric, its use, and its potential harm.
First I read Increasing Code Coverage May Be Harmful, where Dan Mange describes how adding tests just to increase code coverage might cause problems. Second, Meera Subbarao asks: is code coverage important?
I think there are a lot of problems with the whole code coverage discussion. The saying about the ‘tool’ and the ‘fool’ comes to mind.
So this is my list of problems I have with 99% of all code coverage discussions:
- Code coverage isn’t properly defined at all. Are you talking about statement, block, path, or method coverage? Which exact flavor thereof? Without a definition, the term code coverage is pretty much useless.
- Code coverage of 100%, of any kind, suggests some kind of completeness of the tests, which is pretty much bullshit. If your tests are wrong, coverage says nothing. If your test is correct (in the sense of checking for the correct result), that doesn’t mean the code will produce the correct result in all possible cases. Other things like configuration, annotations, and third-party libraries, including operating systems and hardware, often go completely untested although some metric says we have a code coverage of 100%. So it is important to realize that 100% code coverage of your preferred flavor is just some completely arbitrary degree of completeness for the test suite.
- Closely related to the second point is the assumed ideal of a completely tested code base. Complete coverage in the sense of guaranteed surfacing of every possible bug is impossible, no matter how trivial your program is. Even for the most trivial program it is easy to construct hundreds of test cases.
- In most projects only automated unit tests contribute to the reported code coverage. But normally there is a lot of manual testing to accompany that. So while 100% coverage doesn’t mean there are no bugs, 30% doesn’t mean the software is bug-ridden.
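To make the first two points concrete, here is a minimal Python sketch. The function safe_ratio and its test are hypothetical, made up for illustration and not taken from either article. The single test executes every statement, so a statement coverage tool would report 100%, yet the "condition is false" branch is never taken and the bug for negative denominators goes unnoticed.

```python
def safe_ratio(numerator, denominator):
    """Hypothetical function under test: should return 0.0 only for a zero denominator."""
    result = 0.0
    if denominator > 0:  # bug: negative denominators are treated like zero
        result = numerator / denominator
    return result


def test_positive_denominator():
    # Executes every statement of safe_ratio, but only one of the two branches.
    assert safe_ratio(6, 3) == 2.0


test_positive_denominator()  # passes
print(safe_ratio(6, -3))     # prints 0.0, although -2.0 would be correct
```

Measured as statement coverage this is "complete"; measured as branch coverage it is not. And even full branch coverage would not catch the bug unless some test happens to use a negative denominator and checks the result.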
So where is the real target for testing? As so often, it is a tradeoff: effort invested in developing and maintaining test code vs. bugs left undetected in the code base. So one needs to define the limits of the budget and the limit on bugs that may stay undetected in the code base. These again are constrained by feasibility. Some kinds of bugs are difficult to find by tests: bugs in extremely simple code can hide when the test code is more complex, and therefore more error-prone, than the code under test. Other bugs you cannot possibly find by tests: features that are implemented but not required (called surprises by Robert V. Binder in his book Testing Object Oriented Systems: Models, Patterns and Tools (Addison-Wesley Object Technology)) as well as features that are required but not implemented.
When you are working on software that controls neither life support systems nor large amounts of money, I’d consider it quite natural that a good part of the code isn’t covered by automated tests at all, and that the tests concentrate on the most complex, most error-prone pieces of the software.
So what to do? Whenever your actions are controlled by just one number, stop. Code coverage is a valuable tool for finding areas that need testing.
But before blindly writing tests, ask some questions first.
- Do we need to increase the quality (in the sense of decreasing the number of bugs)? OK, most of the time the answer to this one is a big yes.
- How much do we want to increase it?
- How much are we willing to pay for that?
- What is the best way to increase the quality? Increasing code coverage? Increasing the thoroughness of the tests? This in turn leads to the question of fault models (what kind of bugs are we looking for) and test strategies (how to hunt bugs effectively AND efficiently).
And if the team doesn’t want to answer all these questions but still wants to increase the code quality through tests, I’d opt for some training, possibly by reading the already mentioned book by Robert V. Binder. I have some criticism of the book as well, but that must wait for another post. At least it clarifies some facts about what tests can and can’t achieve.