Analyzing unit tests and interpretation speed

Pascal Cuoq - 1st Mar 2011

I once claimed in a comment that with the value analysis, when the source code is fully available, you can analyze many executable C programs with maximum precision. For this to work, all inputs must be decided in advance and provided to the analysis. Such an analysis can therefore be likened to running a deterministic unit or integration test. In fact, existing unit and integration tests are prime candidates for this kind of analysis.
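As an illustration, here is a sketch of the kind of self-contained, deterministic test meant here (the function under test and its expected result are made up for this example, not taken from the post). All inputs are fixed in the source, so the value analysis can interpret every statement with maximum precision:

#include <string.h>
#include <assert.h>

/* Function under test: in-place reversal of a buffer. */
static void reverse(char *buf, size_t len)
{
  for (size_t i = 0; i < len / 2; i++) {
    char tmp = buf[i];
    buf[i] = buf[len - 1 - i];
    buf[len - 1 - i] = tmp;
  }
}

int main(void)
{
  char s[] = "frama-c";              /* the input is decided in advance */
  reverse(s, strlen(s));
  assert(strcmp(s, "c-amarf") == 0); /* deterministic expected output */
  return 0;
}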

One of the ideas to remember from this post on function memcpy is that even horribly low-level code can be analyzed precisely as long as you have precise inputs (for instance, code that copies a pointer one char at a time). There are some known situations where the analysis may start losing precision even though the program is not doing anything clearly wrong. Tell us about it if you find one: some of the known precision-loss situations are remediable; it is all just a question of priorities.
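To make the memcpy-like pattern concrete, here is a small sketch of byte-by-byte pointer copying (this particular program is mine, not from the memcpy post). With the input state fully determined, the analysis can track p2 down to the exact address it ends up holding:

int x = 42;

int main(void)
{
  int *p1 = &x;
  int *p2;
  unsigned char *src = (unsigned char *)&p1;
  unsigned char *dst = (unsigned char *)&p2;
  /* Copy the pointer p1 into p2 one char at a time, as memcpy would. */
  for (unsigned int i = 0; i < sizeof p1; i++)
    dst[i] = src[i];
  /* With precise inputs, the analysis knows that p2 == &x here. */
  return *p2;
}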

Such a maximally precise analysis provides better guarantees than merely executing the program with the inputs you have crafted: it detects all undefined behaviors within the scope of the value analysis. This is more systematic and reliable than letting executions chance upon issues in the code, even when assisted by dynamic tools such as Valgrind.

Still, in order to get a maximally precise analysis, you need to unroll all loops by passing a sufficiently large argument to option -slevel. If you have ever tried it, you may have noticed that it can be slow. But how slow? While I was writing the beginning of the Skein tutorial, the first meaningful analysis took about 10 seconds on the computer I had then (note that those 10 seconds were for analyzing a complete, useful program, not a unit test). I had no idea how much time executing the same code took, because it was too fast to measure. I assumed the difference was somewhere between a factor of 1000 and a factor of 10000 for "normal" programs such as that one.
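As a rough sketch of what such a fully unrolled analysis looks like (the program and the -slevel value below are illustrative; this is not the Skein code from the tutorial):

/* count.c -- a small deterministic program with a bounded loop */
unsigned int crc;                    /* zero-initialized global */

int main(void)
{
  for (unsigned int i = 0; i < 1000; i++)
    crc = crc * 31 + i;              /* some deterministic computation */
  return crc != 0;
}

/* Hypothetical invocation asking for complete unrolling; 2000 is simply
   a value comfortably larger than the number of states to keep separate:
     frama-c -val -slevel 2000 count.c */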

Note: if you wanted to make the comparison fairer, you could compare the time taken by the value analysis to the time taken by compilation plus execution. But that is not the point I am interested in here.

I have been running and analyzing more of these deterministic programs recently, so I have had a chance to improve my estimate: it is more like a factor between 10000 and 100000. It depends very much on the shape of the analyzed program: the value analysis remembers the values of all variables at each traversed statement, so each additional global variable makes the analysis marginally slower on every statement, even statements that do not access it. And this is only one of the fundamental differences between the two.

A good rule of thumb is that a program that enumerates all the values of a 32-bit variable (say, because of a bug) may take a few minutes to execute, as long as the piece of code executed 2^32 times is not too long. It is about the same for complete unrolling in the value analysis, except that the cut-off size for the variable is 16 bits instead of 32.

So, in the race between execution and completely unrolled analysis, execution triumphs for the program unsigned int i; main() { while(++i); }. It still has an advantage for the program unsigned short i; main() { while(++i); } (did we say we were including compilation time again?). And finally, positions are reversed for the program unsigned short i=17; main(){ while (i+=42); }, where execution literally takes forever, whereas, when completely unrolled, the value analysis quickly diagnoses the program as non-terminating.
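To make the last case concrete, here is that one-liner written out as a complete file, together with a hypothetical way of launching the analysis (the -slevel value is a guess of mine, not taken from the post):

/* nonterm.c -- the last program above, written out in full */
unsigned short i = 17;

int main(void)
{
  while (i += 42)      /* i wraps modulo 65536 and never reaches 0 */
    ;
  return 0;
}

/* Hypothetical invocation:
     frama-c -val -slevel 100000 nonterm.c
   The compiled program loops forever; the unrolled analysis, after
   enumerating the reachable values of i, reports everything after the
   loop as unreachable, i.e. it diagnoses non-termination. */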

One handicap of the value analysis, which is also an advantage, is that it records previously seen states. It can therefore detect, when i becomes 17 again, that this variable has been 17 before, and it knows what happens next: a lot of computation takes place and then i is 17 once more. Recording values also means that it is possible to inspect them in the graphical user interface. And one last advantage of the value analysis over execution is that it can target just about any architecture, regardless of the architecture it is running on: it can analyze a program for a 64-bit big-endian architecture while running on a 32-bit little-endian one.
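As a sketch of that last point (the -machdep option and the machdep names x86_32 and x86_64 are assumptions of mine, not mentioned in the post), the same file can be analyzed for two different targets:

/* layout.c -- values that depend on the target architecture */
unsigned long s;

int main(void)
{
  s = sizeof(long) + sizeof(void *);  /* 8 on 32-bit targets, 16 on LP64 */
  return 0;
}

/* Hypothetical invocations selecting the target architecture:
     frama-c -val -machdep x86_32 layout.c     (analysis finds s == 8)
     frama-c -val -machdep x86_64 layout.c     (analysis finds s == 16) */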

A factor of up to 100000 may seem so expensive as to be impractical, but you should try the Carbon release on examples of increasing sizes before dismissing it altogether as unrealistic. The value analysis's algorithms for making efficient use of time and memory have become rather sophisticated over time. And 100000 is not that bad: we have very fast desktop computers nowadays, while some of the C programs we are interested in are still small, either because they have no particular reason to grow and fill the computation time available, or because they must run on power-limited mobile devices.

Lastly, fully unrolling deterministic programs is nice because, in this mode, there are never any false positives that waste human time without revealing bugs. But the value analysis can be useful even when it is not completely unrolled. For instance, it also concludes that the earlier program with while (i+=42); does not terminate even without being unrolled.
