Code Coverage - Tales From the Trenches

Paul Johnson

paul@pjcj.net

http://www.pjcj.net

YAPC::Europe::2003


Introduction

I gave a presentation on code coverage last year at YAPC::Eu in Munich (http://www.pjcj.net/testing_and_code_coverage/). That presentation focussed very much on the theoretical aspects of code coverage, the different types of coverage criteria, and the types of errors they could expose. This year I want to focus more on the practical aspects of using code coverage techniques with Perl. If you are unfamiliar with code coverage it might be useful to look at that presentation before continuing with this one, although I will try to make this one as self-contained as possible.

There are two coverage modules available on CPAN: Devel::Coverage and Devel::Cover. I compared and contrasted them last year. I will only be looking at Devel::Cover in this presentation.


Basic Example

Let's jump in at the deep end. Well, OK then, the shallow end. Here is a trivial little program we will use as our first example.

     $ cat tests/statement
     #!/usr/local/bin/perl
     my $pi = 3.14159;
     my $r  = 4;
     my $c  = 2 * $pi * $r;
     my $a  = $pi * $r * $r;

It's not very flexible, it doesn't do much, and what it does do is kept secret. However, it might be the start of something great. Let's run it and see what code coverage tells us.

     $ perl -MDevel::Cover tests/statement
     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.
     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/statement                            100.00    n/a    n/a 100.00 100.00
     Total                                      100.00    n/a    n/a 100.00 100.00
     ------------------------------------------ ------ ------ ------ ------ ------

OK. We will look at that output and what it means shortly. For now, let's press forward and see the results.

     $ cover -report html
     Reading database from cover_db
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/statement                            100.00    n/a    n/a 100.00 100.00
     Total                                      100.00    n/a    n/a 100.00 100.00
     ------------------------------------------ ------ ------ ------ ------ ------
     HTML output sent to tests/statement/cover_db/cover_db.html

If we look at the output we see

   Coverage Summary
   Database: cover_db
   File            stmt  branch cond time  total
   tests/statement 100.0  n/a   n/a  100.0 100.0
   Total           100.0  n/a   n/a  100.0 100.0

Not too much new there, but we can take a closer look by following the link tests/statement.

    File Coverage
           File: tests/statement
       Coverage: 100.0%
   Perl version: 5.8.0
       Platform: solaris
   line stmt branch cond time                                     code
    1                         #!/usr/local/bin/perl
    2
    3                         # Copyright 2003, Paul Johnson (pjcj@cpan.org)
    4
    5                         # This software is free.  It is licensed under the same terms as Perl itself.
    6
    7                         # The latest version of this software should be available from my homepage:
    8                         # http://www.pjcj.net
    9
    10   1               105  my $pi = 3.14159;
    11   1                14  my $r  = 4;
    12   1                30  my $c  = 2 * $pi * $r;
    13   1                17  my $a  = $pi * $r * $r;

Now that's a little more interesting. We can see that we executed each of the four statements once and hence achieved our 100% statement coverage. We also get a rough idea of how long each line took to run. Don't rely on those timing figures too much.

Console output

Let's go back and look at some of that output in a little more detail. First, how do we run the program and invoke Devel::Cover?

Devel::Cover can be thought of as having four phases. There is a setup phase which is run before your program. There is the actual running of your program combined with the collection of the coverage data. There is an end phase in which the coverage data is collected and saved to a database. And finally, there is a phase in which the collected data is manipulated and reports generated. This last phase will be discussed in a little more detail later.

The first three phases happen with this command:

     $ perl -MDevel::Cover tests/statement

That tells perl to use Devel::Cover before starting to run the program. This provides the hook for the setup phase. Your program is then run, and when it has finished the coverage data are analysed and stored.

The first line of output

     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.

is simply for information. We see which version of Devel::Cover we are running and which coverage criteria we are collecting. Then we have some details of which modules we are interested in:

     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris

Normally you won't be interested in any code from the perl core, or any modules you might have installed from CPAN or elsewhere. Devel::Cover lets you specify exactly which files you want to collect coverage information for.

At this point the program is run, but with this particular program there is no output. Finally, a brief summary of the coverage is shown:

     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/statement                            100.00    n/a    n/a 100.00 100.00
     Total                                      100.00    n/a    n/a 100.00 100.00
     ------------------------------------------ ------ ------ ------ ------ ------

This tells us that we have coverage data for the file tests/statement, that we got 100% statement coverage (we executed all the statements in the file), that there were no branches or conditions in the code, and that running the code in that file took all the time.

You can tell Devel::Cover to be quiet with the -silent option if you don't want these messages.


Hilo example

Let's look at a slightly more realistic example.

     $ cat tests/hilo
     #!/usr/local/bin/perl -w
     use strict;
     my $number = @ARGV ? shift : int rand(100) + 1;
     my $guess = -1;
     while ($guess != $number) {
         print "What is your guess: ";
         $guess = <STDIN>;
         if ($guess > $number) {
             print "Too high\n";
         } elsif ($guess < $number) {
             print "Too low\n";
         } else {
             print "Right!\n";
         }
     }

This little program is a guess-the-number game. It will choose a number between 1 and 100 which you have to guess. After each guess it will tell you if the guess is too high or too low. For testing purposes it is also possible to pass in the number to be guessed on the command line. Let's test it.

     $ perl -MDevel::Cover tests/hilo 2
     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.
     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris
     What is your guess: 2
     Right!
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/hilo                                  80.00  50.00    n/a 100.00  68.75
     Total                                       80.00  50.00    n/a 100.00  68.75
     ------------------------------------------ ------ ------ ------ ------ ------

We can see the output is very similar to the first example, but now you can see the program running in the middle. We can also see that we have managed 80% statement coverage and 50% branch coverage with that test. More details can be found in the coverage report. The report is generated as for the previous example, and looking at the report we can see which parts of the program have not been exercised.

    File Coverage
           File: tests/hilo
       Coverage: 68.8%
   Perl version: 5.8.0
       Platform: solaris
   line stmt branch cond  time                                      code
    1                           #!/usr/local/bin/perl -w
    2
    3                           # Copyright 2003, Paul Johnson (pjcj@cpan.org)
    4
    5                           # This software is free.  It is licensed under the same terms as Perl itself.
    6
    7                           # The latest version of this software should be available from my homepage:
    8                           # http://www.pjcj.net
    9
    10                          use strict;
    11
    12   1     50         153   my $number = @ARGV ? shift : int rand(100) + 1;
    13   1                 18   my $guess = -1;
    14
    15   1                 57   while ($guess != $number) {
         1                 91
    16   1                 26       print "What is your guess: ";
    17   1               846603     $guess = <STDIN>;
    18   1     50         105       if ($guess > $number) {
               50
    19   0                              print "Too high\n";
    20                              } elsif ($guess < $number) {
    21   0                              print "Too low\n";
    22                              } else {
    23   1                162           print "Right!\n";
    24                              }
    25                          }

We can see that there are 10 executable statements in this program. Eight of them have been executed once, and two of them have not been run at all. We can also see that there are three branches, each of which has 50% coverage. We will look at the branches in more detail later. It is normally easier and more productive to work on improving statement coverage before any of the other criteria.
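
The percentages in the summary can be reproduced by hand. The sketch below is not part of Devel::Cover. It assumes, based only on the figures shown above, that the total pools the statement and branch counts and leaves the time column out; the sub name percent is made up for illustration.

```perl
use strict;
use warnings;

# Percentage of covered items, formatted as in the summary table.
sub percent {
    my ($covered, $total) = @_;
    return "n/a" unless $total;
    return sprintf "%.2f", 100 * $covered / $total;
}

# Counts read off the first hilo report: 8 of 10 statements executed, and
# 3 of 6 branch outcomes (3 branches, each with a true and a false path).
my %covered = (stmt => 8,  branch => 3);
my %total   = (stmt => 10, branch => 6);

my $stmt   = percent($covered{stmt},   $total{stmt});
my $branch = percent($covered{branch}, $total{branch});

# Assumption: the total pools every criterion except time.
my $overall = percent($covered{stmt} + $covered{branch},
                      $total{stmt}   + $total{branch});

print "stmt $stmt branch $branch total $overall\n";
# prints "stmt 80.00 branch 50.00 total 68.75"
```

The same arithmetic with 9 statements and 4 branch outcomes gives 81.25, and with 10 and 5 gives 93.75, which matches the later runs.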

Looking at the statements which have not been executed we can see that they are the two lines which tell us that the guess was either too high or too low. Well, that's not surprising, since using my powers of ESP I had managed to guess the answer straight away. So let's try again, and this time I'll guess too high.

     $ perl -MDevel::Cover tests/hilo 2
     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.
     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris
     What is your guess: 3
     Too high
     What is your guess: 2
     Right!
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/hilo                                  90.00  66.67    n/a 100.00  81.25
     Total                                       90.00  66.67    n/a 100.00  81.25
     ------------------------------------------ ------ ------ ------ ------ ------

    File Coverage
           File: tests/hilo
       Coverage: 81.2%
   Perl version: 5.8.0
       Platform: solaris
   line stmt branch cond  time                                       code
    1                            #!/usr/local/bin/perl -w
    2
    3                            # Copyright 2003, Paul Johnson (pjcj@cpan.org)
    4
    5                            # This software is free.  It is licensed under the same terms as Perl itself.
    6
    7                            # The latest version of this software should be available from my homepage:
    8                            # http://www.pjcj.net
    9
    10                           use strict;
    11
    12   2     50          311   my $number = @ARGV ? shift : int rand(100) + 1;
    13   2                 36    my $guess = -1;
    14
    15   2                 113   while ($guess != $number) {
         3                 219
    16   3                 68        print "What is your guess: ";
    17   3               3309155     $guess = <STDIN>;
    18   3    100          280       if ($guess > $number) {
               50
    19   1                 545           print "Too high\n";
    20                               } elsif ($guess < $number) {
    21   0                               print "Too low\n";
    22                               } else {
    23   2                 311           print "Right!\n";
    24                               }
    25                           }

So that extra guess has increased our statement coverage to 90%, and branch coverage to 67%. But notice that the statement counts don't represent only that test. The counts have been added to what was already there. This is important because you will not normally have one test that tests the entire program, and it is the combined coverage which is important. Now let's guess too low.
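
The way counts accumulate across runs can be pictured with a small sketch. This is only an illustration, not how Devel::Cover stores its data: the hashes %run1 and %run2 and the sub merge_counts are hypothetical, and the per-run numbers are simply the differences between the two reports above.

```perl
use strict;
use warnings;

# Hypothetical per-run data: line number => times executed in that run.
my %run1 = (12 => 1, 13 => 1, 16 => 1, 17 => 1, 18 => 1, 23 => 1);
my %run2 = (12 => 1, 13 => 1, 16 => 2, 17 => 2, 18 => 2, 19 => 1, 23 => 1);

# Add one run's counts into the accumulated totals.
sub merge_counts {
    my ($total, $run) = @_;
    $total->{$_} += $run->{$_} for keys %$run;
    return $total;
}

my %total;
merge_counts(\%total, $_) for \%run1, \%run2;

# Show the combined count for each line, in line order.
printf "line %2d: %d\n", $_, $total{$_} for sort { $a <=> $b } keys %total;
```

Line 19 appears only in the second run, yet it counts towards the combined coverage, which is the point of merging.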

     $ perl -MDevel::Cover tests/hilo 2
     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.
     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris
     What is your guess: 1
     Too low
     What is your guess: 2
     Right!
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/hilo                                 100.00  83.33    n/a 100.00  93.75
     Total                                      100.00  83.33    n/a 100.00  93.75
     ------------------------------------------ ------ ------ ------ ------ ------

    File Coverage
           File: tests/hilo
       Coverage: 93.8%
   Perl version: 5.8.0
       Platform: solaris
   line stmt branch cond  time                                       code
    1                            #!/usr/local/bin/perl -w
    2
    3                            # Copyright 2003, Paul Johnson (pjcj@cpan.org)
    4
    5                            # This software is free.  It is licensed under the same terms as Perl itself.
    6
    7                            # The latest version of this software should be available from my homepage:
    8                            # http://www.pjcj.net
    9
    10                           use strict;
    11
    12   3     50          471   my $number = @ARGV ? shift : int rand(100) + 1;
    13   3                 53    my $guess = -1;
    14
    15   3                 170   while ($guess != $number) {
         5                 851
    16   5                 104       print "What is your guess: ";
    17   5               7378468     $guess = <STDIN>;
    18   5    100          454       if ($guess > $number) {
              100
    19   1                 545           print "Too high\n";
    20                               } elsif ($guess < $number) {
    21   1                 150           print "Too low\n";
    22                               } else {
    23   3                 487           print "Right!\n";
    24                               }
    25                           }

Fig 5 - tests/hilo File Coverage 3

So that's our 100% statement coverage. It is now time to look at branch coverage, currently standing at 83%. Following the link on one of the branches takes us to a detailed report of the branch coverage.

    Branch Coverage
           File: tests/hilo
       Coverage: 83.3%
   Perl version: 5.8.0
       Platform: solaris
   line  %  coverage            branch
    12  50   T    F  @ARGV ? :
    18  100  T    F  if ($guess > $number) { }
        100  T    F  elsif ($guess < $number) { }

This shows us the three branches in the program. In this case each branch has a true and a false path. We can see that both paths have been taken on each of the comparisons of $guess with $number. It is the test for @ARGV which has only ever been true. (The output is tastefully coloured to assist in this determination.) We've not tested the random number part.
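
To make the idea of true and false paths concrete, here is a minimal sketch that records which outcomes of each condition have been seen. Devel::Cover gathers this information from the ops as they run, not by wrapping conditions like this; the sub record_branch and the hash %outcome are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical recorder: branch label => { T => count, F => count }.
my %outcome;

# Record the outcome of a condition and pass its value through.
sub record_branch {
    my ($label, $value) = @_;
    $outcome{$label}{ $value ? "T" : "F" }++;
    return $value;
}

# Replay the three guesses made so far against the number 2.
my $number = 2;
for my $guess (3, 1, 2) {
    if    (record_branch("if",    $guess > $number)) { }
    elsif (record_branch("elsif", $guess < $number)) { }
}
record_branch("argv", 1);   # the number was always given on the command line

# A branch is fully covered once both T and F have been seen.
for my $label (sort keys %outcome) {
    my $taken = grep { $outcome{$label}{$_} } qw(T F);
    printf "%-5s %3d%%\n", $label, 100 * $taken / 2;
}
```

Only the argv entry stays at 50%, because its false path has never been taken.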

     $ perl -MDevel::Cover tests/hilo
     Devel::Cover 0.2010: Collecting coverage data for branch, condition, statement and time.
     Selecting packages matching:
     Ignoring packages matching:
     Ignoring packages in:
         .
         /usr/local/pkg/perl-5.8.0/lib/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/5.8.0/sun4-solaris
         /usr/local/pkg/perl-5.8.0/lib/site_perl
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0
         /usr/local/pkg/perl-5.8.0/lib/site_perl/5.8.0/sun4-solaris
     What is your guess: 50
     Too low
     What is your guess: 75
     Too low
     What is your guess: 88
     Too high
     What is your guess: 82
     Too low
     What is your guess: 85
     Too high
     What is your guess: 84
     Right!
     ------------------------------------------ ------ ------ ------ ------ ------
     File                                         stmt branch   cond   time  total
     ------------------------------------------ ------ ------ ------ ------ ------
     tests/hilo                                 100.00 100.00    n/a 100.00 100.00
     Total                                      100.00 100.00    n/a 100.00 100.00
     ------------------------------------------ ------ ------ ------ ------ ------

And there we have it, 100% statement and branch coverage. You will note that we have used four tests to achieve this coverage. It is easy to see that we could have done the same with two tests. At some point I will write a tool (or, if I am lucky, someone else will) that will determine which tests are not necessary in order to increase the coverage and provide an optimal order in which to run the tests such that as much coverage as possible is achieved as soon as possible.
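
One way such a tool might work is a greedy ordering: repeatedly pick the test that adds the most coverage not yet seen. The sketch below is only an illustration under stated assumptions. The test names and the %covers data are hypothetical, modelled on the four hilo runs, and a greedy choice is not guaranteed to be optimal in general.

```perl
use strict;
use warnings;

# Hypothetical input: test name => coverage items that test exercises.
my %covers = (
    arg_right => [qw(argv:T if:F elsif:F)],
    arg_high  => [qw(argv:T if:T if:F elsif:F)],
    arg_low   => [qw(argv:T if:F elsif:T elsif:F)],
    random    => [qw(argv:F if:T if:F elsif:T elsif:F)],
);

# Greedy ordering: repeatedly pick the test that adds the most new items.
sub greedy_order {
    my ($covers) = @_;
    my (%seen, @order);
    my %left = %$covers;
    while (%left) {
        my ($best, $best_gain) = (undef, 0);
        for my $test (sort keys %left) {
            my $gain = grep { !$seen{$_} } @{ $left{$test} };
            ($best, $best_gain) = ($test, $gain) if $gain > $best_gain;
        }
        last unless defined $best;   # the remaining tests add nothing new
        $seen{$_} = 1 for @{ delete $left{$best} };
        push @order, $best;
    }
    return @order;
}

print join(", ", greedy_order(\%covers)), "\n";
# prints "random, arg_high"
```

With this data two tests are enough, and the two that were left out are the ones that add no new coverage.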


Command line Options

Command line options can be passed to Devel::Cover in the following fashion:

    perl -MDevel::Cover=-db,cover_db,-coverage,statement,time yourprog args

It's pretty ugly but there's not much scope for improving it. Some of the options available include:

    -coverage criterion - Turn on coverage for the specified criterion.

By default the coverage criteria collected and reported on are statement, branch, condition and time coverage. Other criteria available are path and pod coverage. You can also use all and none. (I'm not sure why you would specify none.)

    -db cover_db        - Store results in coverage db (default cover_db).

Devel::Cover stores its data in what I lovingly call a database. It's actually a directory with a file written by Data::Dumper. But one day it might be so much more. The default database is cover_db. You can make it anything you want.

    -merge val          - Merge databases, for multiple test benches (default on).

Normally when you run a test you want to merge the coverage results with the previous coverage results. Occasionally you want to start from scratch and ignore what happened before.

    -summary val        - Print summary information iff val is true (default on).
    -silent val         - Don't print any output (default off).

These options control the output from Devel::Cover.

    -inc path           - Set prefixes of files to ignore (default @INC).
    +inc path           - Append to prefixes of files to ignore.
    -ignore RE          - Ignore files matching RE.
    -select RE          - Only report on files matching RE.

Most Perl programs will pull in a fairly large number of modules that either come as standard or which are available on CPAN. Normally you won't want to collect coverage data on those modules - you are only interested in your programs and modules. Devel::Cover maintains three lists to help decide which files will be covered. The rules are that if a filename matches one of the select regular expressions then data is collected on that file. Otherwise, if a filename matches one of the ignore regular expressions then data is not collected on that file. Finally, if a filename is in one of the ignore directories then data is not collected on that file. The ignore directories are seeded from @INC, but you can replace or add to that list.
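
The order in which the three lists are consulted can be sketched as follows. This is not Devel::Cover's code: the list contents and the sub want_coverage are hypothetical, and the final default of collecting data when nothing matches is an assumption.

```perl
use strict;
use warnings;

# Hypothetical lists; Devel::Cover's own data structures may differ.
my @select     = (qr/MyApp/);
my @ignore     = (qr/\.t$/);
my @ignore_dir = ("/usr/local/lib/perl5");

# Apply the three rules in the order described above.
sub want_coverage {
    my ($file) = @_;
    return 1 if grep { $file =~ $_ } @select;                    # selected
    return 0 if grep { $file =~ $_ } @ignore;                    # ignored by RE
    return 0 if grep { index($file, "$_/") == 0 } @ignore_dir;   # ignored dir
    return 1;   # assumption: anything left over is covered
}

for my $file ("/usr/local/lib/perl5/MyApp/Util.pm",
              "t/basic.t",
              "/usr/local/lib/perl5/strict.pm",
              "bin/hilo") {
    printf "%-40s %s\n", $file, want_coverage($file) ? "cover" : "skip";
}
```

Note that the first file is covered even though it lives in an ignored directory, because the select list is checked first.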


TODO

Timings are elapsed time.

The branches of an if/elsif chain are all reported on the line of the if.

gcov


The four phases of Devel::Cover

Earlier we noted that there are four phases involved in the running of Devel::Cover. Let's look at those in a little more detail.

The setup phase

It is necessary for Devel::Cover to get control of the perl interpreter before the test is run in order to do some setup. There are three important things which happen in this phase. First, the test program is subjected to a superficial analysis. Secondly, the coverage criteria to be collected are determined. And finally, most importantly, perl's runops function is replaced with a custom version.

The runops function is a small C subroutine which is the heart of the perl runtime. Its job is to take the optree representing your program and run the appropriate function for each op. This is not very difficult, since the tricky parts are all handled elsewhere. The important thing as far as Devel::Cover is concerned is that the runops function is pluggable, meaning that at runtime I can replace this function with one of my own. Devel::Cover makes use of this to substitute its own runops function which also collects information about which ops have been run, how long they took to run and, to a certain extent, in which order they were run.
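
The idea of a pluggable run loop can be shown with a toy model written in Perl. The real runops is C code inside perl and works on real ops, so everything here (the %ops table and the subs runops_standard and runops_cover) is a hypothetical stand-in that only mirrors the shape of the technique.

```perl
use strict;
use warnings;

# Toy model only: each "op" is a code ref that does some work and returns
# the name of the next op, or undef when the program is finished.
my $n = 0;
my %ops = (
    init => sub { $n = 0; "test" },
    test => sub { $n < 3 ? "body" : undef },
    body => sub { $n++;   "test" },
);

# Plain loop: run ops until there is no next op.
sub runops_standard {
    my ($op) = @_;
    $op = $ops{$op}->() while defined $op;
}

# Replacement loop: does the same work but also counts each op it runs.
my %count;
sub runops_cover {
    my ($op) = @_;
    while (defined $op) {
        $count{$op}++;
        $op = $ops{$op}->();
    }
}

my $runops = \&runops_cover;   # the "plug": swap in the counting loop
$runops->("init");

print "$_: $count{$_}\n" for sort keys %count;
```

The program being run is unchanged. Only the loop that drives it has been replaced, which is why no changes to the target program are needed.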

The run phase

During this phase the target program is run, but using Devel::Cover's runops function, and data about the ops run are collected.

The end phase

The setup phase adds an END block which is called when the target program ends. In this block the data collected during the run phase are analysed and then stored in a database.
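
A minimal sketch of the end phase, assuming nothing about Devel::Cover's real file layout: an END block writes the collected data with Data::Dumper into a file in a directory, and the data can be read back by evaluating that file. The file name counts, the subs save_db and load_db, and the sample data are all hypothetical.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the data gathered during the run phase.
my %count = (10 => 1, 11 => 1, 12 => 1, 13 => 1);

# Write the data as Perl source to a file inside the database directory.
sub save_db {
    my ($dir, $data) = @_;
    local $Data::Dumper::Sortkeys = 1;
    open my $fh, ">", "$dir/counts" or die "Cannot write $dir/counts: $!";
    print $fh Data::Dumper->Dump([$data], ["data"]);
    close $fh or die "Cannot close $dir/counts: $!";
}

# Read the data back by evaluating the dumped source.
sub load_db {
    my ($dir) = @_;
    open my $fh, "<", "$dir/counts" or die "Cannot read $dir/counts: $!";
    my $source = do { local $/; <$fh> };
    close $fh;
    my $data;
    eval $source;    # the dumped source assigns to $data
    die $@ if $@;
    return $data;
}

my $dir = tempdir(CLEANUP => 1);

# Runs when the program ends, after the "run phase" above has finished.
END {
    save_db($dir, \%count);
    my $stored = load_db($dir);
    print scalar(keys %$stored), " lines stored\n";
}
```

A temporary directory is used here only so that the sketch cleans up after itself.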

The report phase

This is simply the part in which you query the database to find out more about the coverage information. At the moment this will probably be by running cover and selecting one of the output formats, but there is no reason why other tools to access the database could not be written. There is even an API available for that purpose.


Conclusion

Devel::Cover is not perfect. There are bugs and there is work to do, but it is in a state where it can help improve the quality of code by showing weak points in test suites.