Oxygen is stricter about types and why you should get used to it
Pascal Cuoq - 27th Jul 2012I have just sent a list of changewishes (1 2) to a static analysis competition mailing-list and that reminded me of a blog post I had to write on the strictness of the type-checker in upcoming Frama-C release Oxygen. This is the blog post.
This post is not about uninitialized variables
The static analysis competition in question evaluates each participating tool on many testcases. A good number of these testcases are rejected by the current Frama-C. This is why I was writing to the mailing-list: Frama-C cannot participate in the competition if the testcases are not changed (Frama-C does not have to participate of course. It's just that it looks fun and we would probably like to be in it).
In fact many of the testcases were already problematic for Frama-C version 20111001 (Nitrogen) the version in the current Debian Ubuntu FreeBSD NetBSD and other Unix distributions. Indeed a lot of the testcases rely on uninitialized variables for entropy which this post by Xi Wang and this post by Mark Shroyer show is wrong. Instead of the problem that is supposed to be detected (or not) Nitrogen detects the uninitialized use. I covered this already; this blog post is not about uninitialized variables (keep reading!).
This post is about C type-checking
While trying to get a list of uninitialized-variable-using testcases I realized that something had changed since my last evaluation of the competition's benchmarks. Many of them were now rejected at type-checking!
The new problem is many testcases in the benchmarks call functions without having provided a prototype and some ultimately define the called function with a type incompatible with the type inferred at the call site. Frama-C used to be lenient about typing issues but after fixing one soundness bug too many that was caused by the leniency we have decided to throw in the towel. Virgile described one of the typing issues for which Frama-C used to be too lenient. It was not the only one.
This is an unusual position to take. In the article A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World the makers of a commercially successful bug-finding tool state:
Our product's requirements roughly form a "least common denominator" set needed by any tool that uses non-trivial analysis to check large amounts of code across many organizations; the tool must find and parse the code [...]
The above is wise: to be maximally useful the static analyzer should choke on as few idiosyncrasies as possible. The analyzed code can be assumed to compile with some compiler and to go through some tests (there must still be a few of these right)? Why warn when a function is called without a prototype if the compiler accepted it? Why reject when the function's implementation has a type that is incompatible with the type that was inferred at the call site? This is probably not the issue the user is looking for (if it was the user would have fixed it when eir compiler warned for it).
Oxygen is strict and that's it
Well our answer is that we tried and we found that it was too much work to try to be sound and precise with these constraints as exemplified by Virgile's blog post. Show us a tool that accepts your weakly typed program and we will show you a tool that probably isn't sound and precise at the same time (we have kept the examples that led to the decision. These examples demonstrate real issues masked by lenient typing. If you think you have found a sound analyzer that is at the same time conveniently permissive on typing it will be our pleasure to provide the examples for you to try).
We hope that Frama-C will still be useful with these new restrictions on typing. Fortunately for us there are more real worlds than the expression “the Real World” in the cited article's title might lead you to think (and this quip should not be taken as a reproach towards the authors of “A Few Billion Lines of Code Later: Using Static Analysis to Find Bugs in the Real World” a very useful document with an appropriately chosen title considering its intended public).