Test coverage’s hidden meaning…

Published in New Work Development, Feb 12, 2018

Most developers share an underlying feeling of obligation to test their code. Usually this leads to employing automated testing frameworks in order to keep these tests easily repeatable and to be able to perform a quick regression check on the code. The results are then used in a wide range of situations, from deciding whether to deploy or release a new version of an application to checking whether one is on the right track while developing a particular new feature.

At a certain point, though, we might want to start gathering some health information regarding our test suite. Questions like “Did I really test enough?” or “Are these slow full stack tests really necessary here?” come to mind. Usually at this point you start to stumble across one term more and more: test coverage.

At XING we use test coverage as a tool in various ways, ranging from its traditional form to a tool that is used in a feedback loop while developing new features. It is, however, very rarely used as a metric. This article aims to illustrate why we usually do not rely on test coverage as a metric but still find useful components in it that can be utilized in daily development.

Test coverage is intended to answer questions related to the code executed by the tests. Generally, observing the code covered by tests means looking at the pieces of code that are actually executed while running the test suite. This leads to different subclasses of test coverage. Branch coverage, for example, illustrates how many of the branches of code are covered. Any if-clause in your code creates two branches: one in which the condition is true and the attached then-part of your if-block is executed, and one in which the condition evaluates to false and the block is not executed.
See the following example for an illustration:
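A minimal sketch of such an if-clause in Ruby follows. The method name `shipping_cost` and its values are assumptions made up for illustration. Each branch sits on its own line, so executed lines and executed branches coincide.

```ruby
# Hypothetical example: free shipping above a threshold.
def shipping_cost(order_total)
  if order_total >= 50
    cost = 0   # branch 1: the condition is true
  else
    cost = 5   # branch 2: the condition is false
  end
  cost
end
```

A test suite that only ever calls `shipping_cost(60)` never executes the line inside the else part, and a report would flag exactly that line.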

So far branching is quite straightforward. It looks almost like lines executed, right? Assuming we always write our code in this very verbose way, branch coverage can be equivalent to the executed lines. But this is usually not the case for real-world applications. Let’s take a look at this excerpt instead:
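As a sketch of the kind of excerpt meant here (both method names and all values are assumptions for illustration), a two-way decision can live on a single line. Any call executes that line, no matter which side of the decision actually runs.

```ruby
# Hypothetical example: condition and both outcomes share one line.
def shipping_cost_compact(order_total)
  order_total >= 50 ? 0 : 5
end

# Hypothetical example: an early return guarded by a postfix if.
def discount_rate(customer)
  return 0.0 if customer.nil?      # guard and return on the same line
  customer[:member] ? 0.1 : 0.0    # two outcomes, one line
end
```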

Yes, the famous ternary operator (?:) can be quite problematic. It puts a branch-off point and the code of both branches on a single line. Similarly, explicit return statements tend to kill branches off early. I guess it would be quite bad news to tell you now that most coverage report tools only report line-based coverage. There are a few languages that make it easier to build proper abstract syntax trees to represent the potential execution paths of a program and to traverse that tree properly. Most languages that we use in web development, e.g. Ruby, JavaScript, PHP or Python, don’t belong on that list.

Implications of line-based coverage

Being limited to line-based test coverage reports should not automatically make these reports worthless to you. They still provide quite a wealth of information, even though a complete answer to the question “Have I tested everything?” is not possible.

Let’s dive into the why first, before we concern ourselves with the actual pieces of information we can glean from these reports. As we have seen in the previous example, line-based coverage only determines whether a particular line has been executed while the test suite ran. This means that lines of code which contain multiple branch-off points (e.g. a ternary operator or a postfix if-clause) cannot be properly reported. By extension, a report of 100% test coverage from a line-based reporter does not necessarily reflect full branch coverage: some branches may never have been executed. This is fine, however, as long as we are aware of it and keep the implications in mind. Even full branch coverage does not result in the mystical 100% test coverage that we might be interested in. Why?

This is due to a definition problem. We say that branch coverage puts a number on the percentage of all branches of code that have been executed. But what we are really interested in is coverage of all eventualities our program might encounter, and the number of those is much higher than the number of branches. This is especially true for dynamically typed interpreted languages. Statically typed languages, where one needs to define the type of every argument to a function and every variable or constant declared, have the benefit of being able to check the adherence to these types at compile time. This means that a program that does not adhere to these types does not compile and therefore will never reach execution.

So right off the bat dynamically typed languages, like Ruby for example, are at a disadvantage. But before you run to your team lead or the rest of your team and propose to only use statically typed languages from now on, keep in mind that 100% test coverage is not possible even for those languages. This is due to the fact that we can’t see what we didn’t implement unless we wrote a test for it. If, for example, we forgot to check for null values or to handle subclass types (which can be passed as arguments to methods that expect the superclass, so even in statically typed languages types are not fully locked in), we won’t see that this is not covered unless there is actually a test which covers it. This test would then fail, which would definitely lead us to implement the missing piece of code. But unless we know that we need to test this edge case, we will never see it in a coverage report. In the coverage report we would only see it as an additional execution of the same line or branch.
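To make this concrete, here is a small sketch built around a hypothetical `initials` helper. A single happy-path call executes every line of the method, so a report would show it as fully covered, yet the missing nil check stays invisible until someone thinks of exercising it.

```ruby
# Hypothetical helper, fully "covered" by one happy-path test.
def initials(name)
  name.split.map { |part| part[0].upcase }.join
end

initials("ada lovelace")   # executes every line of the method

# The unhandled case only shows up once it is exercised explicitly:
begin
  initials(nil)
rescue NoMethodError => e
  warn "unhandled nil input: #{e.class}"
end
```

Once the nil check has been added, the coverage report looks the same as before; the nil test merely adds another execution of lines that were already marked as covered.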

And while this may sound like a potentially far-fetched edge-case scenario, I would argue that for any given application with enough complexity, e.g. where one can’t explain the details of each scenario to their proverbial grandmother, this will be the common case once an application shows 100% (or even 95%) line-based test coverage. That is because the lines not yet covered tend to belong to trivial cases like rarely called getters and setters. Therefore all of the remaining complexity lies in code that was already executed at least once.

Keeping all this in mind, what can we actually use test coverage reports for?

The benefits of a test coverage report

Ultimately a test coverage report can be a very useful tool, if utilized properly. It provides an interesting high-level perspective on a project. It splits code in a distinctly visible fashion into two classes: covered and not-covered code. It therefore lends itself quite easily to being a task-finding tool in later stages of the project, when one is very confident that the application works as intended but wants to keep it that way, prove it to others or just create reliable code.

1. discover untested code

The detailed coverage report can point one directly to the modules/classes and methods/functions that have not been executed as part of the tests. It can therefore reveal code that is potentially obsolete, because it is not executed at all, or that is not properly tested yet. Especially in the later stages of a project this can also point (harmlessly) to trivial lines of code, e.g. simple getter definitions, that are mostly used for introspection by a developer and not during the execution of the program. More often, however, it points to something that we missed. Our example shows a main execution path that was not covered by any tests. The method is supposed to add an element to our Set if it does not exist yet. In any case it is supposed to return the matching object. However, the scenario in which nothing is added to the Set is not covered by tests.
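A sketch of what such a method could look like is given below. The `Tag` struct, the `TagRegistry` class and the `find_or_add` method are assumptions for illustration, not the code behind the original report.

```ruby
require "set"

# Hypothetical value object; Struct instances compare by their fields.
Tag = Struct.new(:name)

# Hypothetical registry wrapping a Set.
class TagRegistry
  def initialize
    @tags = Set.new
  end

  def size
    @tags.size
  end

  # Adds the tag unless an equal one is present; always returns the stored object.
  def find_or_add(tag)
    existing = @tags.find { |candidate| candidate == tag }
    if existing
      existing          # never reached if the tests only add new elements
    else
      @tags.add(tag)
      tag
    end
  end
end
```

If the test suite only ever adds tags that are not in the registry yet, the `existing` line shows up as not covered, which is the missed scenario described above.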

2. discover previously unknown edge cases

In the later stages of development it is usually not complete classes or methods that are not covered (assuming that one is already monitoring the quality of their test suite) but single lines or small groups of lines instead. More often than not you will discover that they pertain to the else case of an if-clause. We tend to write our code in such a way that the then part represents the positive outcome, the so-called happy path. For most sections of our code this is also the common case. Therefore the uncommon sad paths can often be found inside an else case and most definitely in rescue/catch blocks.
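The following sketch shows this typical shape. The `load_settings` method is hypothetical; it uses `JSON.parse` and `JSON::ParserError` from Ruby's standard library.

```ruby
require "json"

# Hypothetical loader: the then part is the happy path, while the else part
# and the rescue block hold the sad paths that tests tend to forget.
def load_settings(raw)
  if raw && !raw.strip.empty?
    JSON.parse(raw)   # happy path
  else
    {}                # sad path: nothing to parse
  end
rescue JSON::ParserError
  {}                  # sad path: malformed input
end
```

A suite that only feeds valid JSON into this method leaves both `{}` lines unexecuted, and each of them hints at a test case that is still missing.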

3. validate the execution path of an individual test

One does not necessarily need to run the test coverage tool on their complete test suite. During development of a new feature it can be quite helpful to run the newly minted test in an isolated fashion in order to see how it actually moves through the code. We tend to pride ourselves on being able to execute the program in our head and think of the paths the user's instructions take while they move through our code base. But with any complex system the number of possibilities can seem endless, even when we reduce the scope to a proverbial unit. Test coverage tools can aid us in our journey to better understand the code we wrote mere moments ago.
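As a rough sketch of this idea, the snippet below uses Ruby's built-in Coverage module directly instead of a reporting tool (it assumes Ruby 2.5 or newer). The helper `line_hits_for` and the `discount` method are hypothetical. The code under test is written to a temporary file because Coverage only tracks files that are loaded after it has been started.

```ruby
require "coverage"
require "tempfile"

# Hypothetical helper: loads +source+ with line coverage enabled, runs the
# given block (our single "test") and returns the per-line hit counts.
def line_hits_for(source)
  file = Tempfile.new(["traced", ".rb"])
  file.write(source)
  file.close
  Coverage.start(lines: true)
  load file.path
  yield
  _path, report = Coverage.result.find { |path, _| File.basename(path) == File.basename(file.path) }
  report[:lines]
ensure
  file&.unlink
end

hits = line_hits_for(<<~RUBY) { discount(100, true) }
  def discount(total, member)
    if member
      total * 0.9
    else
      total
    end
  end
RUBY

# "x" marks executed lines, "!" marks executable lines that were never reached,
# a blank marks lines without executable code (else, end).
hits.each_with_index do |count, index|
  marker = count.nil? ? " " : (count.zero? ? "!" : "x")
  puts "#{marker} line #{index + 1}"
end
```

Running only this one call makes it easy to see that the non-member path is still waiting for a test of its own.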

Conclusion

When we start looking at test coverage as more of a tool and less like a metric we are able to take a closer look at different parts of our code and use this relatively simple tool for introspection. It allows us to discover pathways through our code that we had not thought of yet. This might uncover bugs, missing error handling or even the opportunity for a useful feature. At the very least it will help us improve our test suite though.

Over the course of this short article I’ve used Ruby to illustrate example cases. Ruby, like most dynamically typed interpreted languages, traditionally only supported line-based coverage reports, since its internal tracking utilities only allowed for line-based granularity. The end-of-year release of Ruby 2.5 changed this, however: the Ruby core team integrated branch-based coverage tracking. While it will take a while for the actual test coverage tools to integrate and utilize this in their representation, I think this is certainly something to be enthusiastic about. It allows us to pull the curtain back even further, even though there are still some things that’ll remain in the shadows.
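For the curious, here is a minimal sketch of that branch tracking, assuming Ruby 2.5 or newer. `Coverage.start(lines: true, branches: true)` is part of the built-in Coverage module; the helper `coverage_for` and the `label` method are hypothetical.

```ruby
require "coverage"
require "tempfile"

# Hypothetical helper: loads +source+ with line and branch coverage enabled,
# runs the block and returns the raw coverage data for that file.
def coverage_for(source)
  file = Tempfile.new(["sample", ".rb"])
  file.write(source)
  file.close
  Coverage.start(lines: true, branches: true)
  load file.path
  yield
  _path, report = Coverage.result.find { |path, _| File.basename(path) == File.basename(file.path) }
  report
ensure
  file&.unlink
end

report = coverage_for(<<~RUBY) { label(1) }
  def label(n)
    n > 0 ? "positive" : "non-positive"
  end
RUBY

line_hits   = report[:lines].compact                       # executable lines only
branch_hits = report[:branches].values.flat_map(&:values)  # one count per branch

puts "executable lines never run: #{line_hits.count(0)}"
puts "branches never run:         #{branch_hits.count(0)}"
```

With only the positive input exercised, every executable line should count as run while one side of the ternary does not, which is precisely the gap a line-based report cannot show.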