Devel::Cover Projects

The following projects require a reasonable amount of work and might be suitable for a student to work on as part of Google's Summer of Code in which The Perl Foundation is taking part, or for someone to work on as part of a grant from The Perl Foundation. Or you might fancy doing one just for fun.

Go mad
It should be possible to integrate Devel::Cover with Larry's MAD work that Nicholas has recently integrated into bleadperl. The MAD code should allow the exact source code to be reproduced from data stored when perl parses the code.
At present, Devel::Cover reports its findings using source code information that is either read from the original source code or regenerated from the optree using B::Deparse. Both of these approaches have their problems.
Reading from the source file requires first being able to find it. The file could have been moved or deleted, or be unreadable for some other reason. There might never have been a file in the first place, or a #line directive could be lying about the file name or line number. A line of source code also does not always map to a single statement. And even when the file is still there, perl doesn't store the absolute path to it. Devel::Cover has some reasonably complicated code designed to notice when chdir happens, so that relative paths can be mapped to absolute ones and the source file found, but there are still edge cases where this doesn't fully work. It would be nice to be able to bypass the whole mess and use the MAD data.
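The relative path problem can be pictured with a small sketch. It is written in Python purely for illustration and is not Devel::Cover's code; the names PathTracker, note_chdir, note_file and resolve are hypothetical. It assumes the tool is told about every change of directory, which is exactly the part that is hard to guarantee in practice.

```python
import posixpath


class PathTracker:
    """Remember the absolute path of each file at the moment it is first seen."""

    def __init__(self, start_dir):
        self.cwd = start_dir   # directory the program started in
        self.absolute = {}     # name as perl saw it -> absolute path

    def note_chdir(self, new_dir):
        # Must be called for every chdir, otherwise later files resolve wrongly.
        self.cwd = posixpath.normpath(posixpath.join(self.cwd, new_dir))

    def note_file(self, filename):
        # Only the first sighting counts: that is when the relative name was valid.
        path = posixpath.normpath(posixpath.join(self.cwd, filename))
        self.absolute.setdefault(filename, path)

    def resolve(self, filename):
        # None means the file was never seen, so the source cannot be located.
        return self.absolute.get(filename)
```

A missed chdir, or a file name recorded before tracking started, is enough to make resolve return the wrong path, which is why working from stored parse data instead would be attractive.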
Using B::Deparse has the obvious problem that it does not reproduce the original source code exactly, which can be confusing.
Devel::Cover currently uses B::Deparse to report branch and condition coverage and reads the source file to report on other criteria.
I haven't had time to investigate the MAD code, but it seems to me to be just what the doctor ordered in this case.
Unreachable code
Sometimes we write code that we do not expect to be executed for one reason or another. Devel::Cover includes preliminary support for marking such code and setting an error condition if it is exercised, rather than if it isn't. Extending this support could involve:
- Ensuring that all criteria can be marked as unreachable. This includes devising a scheme to be able to mark specific branches and conditions as unreachable.
- Providing a mechanism whereby code could be thus marked in the source file itself, probably by using comments but possibly in some other fashion. At present the unreachable code information is stored in a file, with minimal command-line support for managing it. This needs to be extended, and it might be nice to add some form of GUI support, possibly tied in to a reporting mechanism.
- Being able to specify different reasons for code being unreachable, and being able to select between them in reports.
- Ensuring that the reports correctly report on the unreachable code.
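One way the comment-based marking and the inverted error condition might fit together is sketched below in Python, for illustration only. The marker syntax "# unreachable: reason" and the function names are assumptions invented for this sketch, not an existing Devel::Cover interface.

```python
import re

# Hypothetical marker: "# unreachable" with an optional ": reason" at end of line.
MARKER = re.compile(r"#\s*unreachable(?::\s*(?P<reason>.*))?\s*$")


def find_unreachable(source):
    """Return {line_number: reason} for lines carrying the marker comment."""
    marked = {}
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            marked[number] = (match.group("reason") or "unspecified").strip()
    return marked


def classify(hits, marked):
    """hits maps line_number -> execution count; return line_number -> status."""
    status = {}
    for number, count in hits.items():
        if number in marked:
            # The usual rule is inverted: running marked code is the error.
            status[number] = "error: unreachable code was run" if count else "ok (unreachable)"
        else:
            status[number] = "covered" if count else "uncovered"
    return status
```

Keeping the reason alongside each marked line is what would let a report filter by reason, as suggested in the list above.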
Reports
A lot of people like the current default HTML report that Devel::Cover produces, but it is not without its problems. Some of the data gathered is not shown in the default report. The report does not work with unreachable code (see the Unreachable code section). The thresholds for the colours in the report are fixed. More importantly, the condition coverage report is actually incorrect for certain complicated conditions. Some investigation is required to determine the best way to report on complicated conditions.
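As an illustration of what configurable thresholds could look like, here is a minimal Python sketch. The threshold values, colour names and the function colour_for are made up for the example and are not the values the current report uses.

```python
# Hypothetical defaults: (minimum percentage, colour).
DEFAULT_THRESHOLDS = [(100.0, "green"), (90.0, "yellow"), (75.0, "orange")]


def colour_for(percentage, thresholds=DEFAULT_THRESHOLDS, fallback="red"):
    """Return the colour of the highest threshold that the percentage reaches."""
    for minimum, colour in sorted(thresholds, reverse=True):
        if percentage >= minimum:
            return colour
    return fallback
```

A project with a lower bar could then pass its own list, for example colour_for(60, [(50, "green")]).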
I have started work on a new, cleaner HTML report in an attempt to solve these problems, but only the basics are in place.
These HTML reports are nice in that they are static, but a strong case can be made for an interactive report, either as a thick client using Tk, gtk or some other cross-platform widget set, or as a fully buzzword-compliant thin client, full of AJAXy goodness. In the latter case it might be nice to bundle a minimal webserver, as Catalyst does for example, so that it doesn't become necessary to install and configure a complete webserver just to view your coverage results. An interactive report could also be used to provide information about the source code, such as which parts are unreachable (see the Unreachable code section).
Profiling
Devel::Cover has basic profiling support named time coverage, which measures how long individual perl ops take to run. This isn't very accurate, but it can be useful to show rough timing information. Someone might have a good idea on how to improve this, but a better idea might be to somehow integrate with a real profiler and then display the profiling information together with the coverage information. Devel::Cover reports have the concept of "annotations", which allow extra information to be displayed in the coverage report. For example, there is an SVK annotation which shows who last edited each line and when, so you can see who checked in that uncovered code. This facility might need to be extended to cope with profiling information, or the information might need to be more tightly integrated with a report. This work could well be merged with the Reports section.
Threads support
At the moment Devel::Cover doesn't work with perl threads. One of the main reasons for this is that Devel::Cover modifies the perl optree, and this optree is shared between all threads. In particular, op_ppaddr is overridden to collect condition coverage information. I have done some work to lock sections of code which mess with the optree, but there is evidently more work to be done.
Or a totally different approach could be taken. For example, it might be worth considering duplicating the optree for each thread. (This may even be useful for core perl itself.) Or it could be interesting to look at other ways for Devel::Cover to gather the code coverage data without having to modify the optree so drastically, or even at all. Since I devised the data gathering scheme, five bits per opcode have become available, which might prove useful here.
Test analysis
When a construct, in particular a branch or condition, is not exercised it might be helpful to be able to explain what needs to be done in order to exercise it.
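For a simple short-circuiting condition such as "A and B", the explanation could be derived from the outcomes that have been observed. The Python sketch below is only an illustration; the outcome model (three outcomes, with the right operand unevaluated when the left is false) and the name missing_outcomes are assumptions made for the example.

```python
# Outcomes of "A and B" as (left, right); right is None when it was never
# evaluated because the left operand short-circuited the condition.
REQUIRED = {
    (False, None): "left operand false (right operand is never evaluated)",
    (True, False): "left operand true and right operand false",
    (True, True): "both operands true",
}


def missing_outcomes(observed):
    """List descriptions of the outcomes that no test has produced yet."""
    return [text for outcome, text in REQUIRED.items() if outcome not in observed]
```

Turning "left operand false" into concrete advice about test inputs is the hard part of the project and is not attempted here.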
cpancover
cpancover is a program which runs Devel::Cover on a number of modules and aggregates the results. I also have a program which installs perl, apache, mod_perl and lots of modules ready for cpancover to be run. It might be interesting to set up a smoke system to automatically build these modules and check their coverage.
Possibly even more useful might be a system whereby people could run this, or generate coverage databases in some other way, and upload them to a central server which would merge and display the results. In this way the coverage over multiple operating systems and environments could be measured.
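The merging step might look something like the following Python sketch. The data layout (one mapping of (file, line) to execution count per environment) and the name merge_runs are assumptions for illustration; they are not the format of Devel::Cover's coverage database.

```python
from collections import Counter


def merge_runs(runs):
    """Merge per-environment hit counts.

    runs maps environment name -> {(file, line): execution count}.
    Returns (total count per location, environments that exercised each location).
    """
    merged = Counter()
    environments = {}
    for environment, hits in runs.items():
        for location, count in hits.items():
            merged[location] += count
            if count:
                environments.setdefault(location, set()).add(environment)
    return dict(merged), environments
```

Recording which environments exercised each location makes it possible to spot code that is only ever covered on one operating system.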
Path coverage
At the moment path coverage is not collected. There are a number of problems to solve here, starting with determining what constitutes a path and how the data should be collected and reported.
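To see why the definition matters, consider the simplest possible model: a subroutine containing a sequence of independent two-way branches and no loops. The Python sketch below, an illustration under that assumption only, treats a path as the tuple of directions taken at each branch.

```python
from itertools import product


def all_paths(branch_count):
    """Every combination of directions through branch_count two-way branches."""
    return set(product((True, False), repeat=branch_count))


def branch_coverage(executed, branch_count):
    """Fraction of (branch, direction) pairs seen in the executed paths."""
    seen = {(i, path[i]) for path in executed for i in range(branch_count)}
    return len(seen) / (2 * branch_count)


def path_coverage(executed, branch_count):
    """Fraction of all possible paths that were actually executed."""
    return len(set(executed) & all_paths(branch_count)) / (2 ** branch_count)
```

With two branches, the runs (True, True) and (False, False) give full branch coverage but only half the paths. The number of paths doubles with each added branch, and loops make it unbounded, so a practical definition has to limit what counts as a path.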
Mutation coverage
If you have 100% coverage of the current criteria (and an increasing number of projects do, or are getting there) or at least if you have a very high coverage, mutation coverage might still be able to show you problems in your code. The idea is that random changes are made to your code, for example a <= might be changed to <, and your tests are run again. If they still all pass, then they are not sufficient and there could still be lurking bugs in your code.
This fits in well with existing code coverage because running a complete test suite can be expensive. Devel::Cover already knows which tests exercise which constructs and so it can rerun only the appropriate subset of the tests to check the mutation.
Test suite optimisation
Devel::Cover knows which tests exercise which constructs and how long each test takes. This gives enough information to be able to provide an optimal ordering for the tests and to say which tests are redundant, at least from the coverage point of view. I have made a start on a report to do this, but it is very basic. This alone is probably not enough for a complete project, but it probably would be if combined with some way to break the results down by individual test point.
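One simple approach is a greedy ordering: repeatedly run next whichever test adds the most new coverage per second. The Python sketch below illustrates this under stated assumptions; it is not the existing report, the input layout and the name order_tests are hypothetical, and a greedy choice is an approximation rather than a guaranteed optimum.

```python
def order_tests(tests):
    """Order tests by new coverage gained per second of run time.

    tests maps test name -> (duration in seconds, set of constructs exercised).
    Durations are assumed to be greater than zero.
    Returns (ordered test names, tests that add no new coverage).
    """
    remaining = dict(tests)
    covered = set()
    ordered = []

    def gain(name):
        duration, constructs = remaining[name]
        return len(constructs - covered) / duration

    while remaining:
        # sorted() makes ties resolve alphabetically, so results are repeatable.
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # nothing left contributes anything new
        ordered.append(best)
        covered |= remaining.pop(best)[1]

    return ordered, sorted(remaining)
```

A test reported as redundant here is redundant only with respect to the coverage criteria measured; it may still check behaviour that coverage cannot see.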
Tests
Devel::Cover can't be run on itself. This is a fundamental limitation, but it's a real shame. Some way to make that possible would be wonderful. Failing that, adding tests the old-fashioned way would also be very useful. I fear the coverage of Devel::Cover itself is much lower than it should be.
Something else
Anything else you might think of that is useful, interesting and exciting.

Hopefully someone will find something here that piques their interest.
