Features in Frama-C Nitrogen, part 1

Pascal Cuoq - 14th Oct 2011

Here is a series of posts that highlight interesting features in the recently released Frama-C Nitrogen 20111001. There is new functionality in Nitrogen, that we hope will entice both existing and prospective users. Other improvements yet will only have meaning for existing users. I will start off with two items from this second category.

Nitrogen compiles with OCaml 3.10.2.

The only non-optional dependency for compiling Nitrogen is an OCaml compiler. But which one? Despite a now long history, the OCaml language is still evolving relatively rapidly. We have never been happier with our choice of an implementation language: it is evolving in a direction that suits us, and it is evolving conservatively (compare Perl 6, PHP 5, Python 3, …). On the other hand, Frama-C is large and contains tricky code; the plug-in architecture with dynamic loading in general on the one hand, and the hash-consing programming technique on the other hand, are only two examples. It is large enough and tricky enough to reveal many of the subtle difference between any two versions of the OCaml compiler.

Nitrogen compiles with the latest version of OCaml, of course. That's 3.12.1 at the time of this writing. We already know that it won't compile as-is with the future OCaml 3.13 (a patch will be released in due time). Similarly, support for older versions has to be gained, version by version. If you have only written medium-sized OCaml programs, you could easily underestimate the difficulty of this. I was lucky enough not to have to deal with it much during this cycle, but some of my colleagues had to. It always is a struggle. Sometimes the equivalent of #define/#if constructs from C pre-processing would help, but this is not idiomatic in OCaml. Again, the only non-optional dependency for compiling Nitrogen is an OCaml compiler, so we won't use fine solutions such as Cppo.

For Windows and Mac OS X OCaml is not part of the operating system so we ship the version we like together with the binary package (if we make one). With Unix the issues are different: there are too many flavors for us to distribute binaries but there are efficient package systems and distributions to painlessly bring in required dependencies. Often Frama-C itself is packaged in binary or source form as part of these distributions thanks to the work of many volunteers. It may take some time for packages to be created for Nitrogen and some users do not want to wait. Linus Token for instance may rely on a computer he bought two years ago. Frama-C works fine on two-years old computers as seen in the next section. Linus installed his Unix distribution of choice when he bought his computer and now he expects Frama-C's sources to compile with the OCaml version from his distribution (OCaml programmers can be expected to have the latest OCaml compiler installed from its own sources but Linus is not an OCaml programmer). The Unix distribution installed two years ago was on average 3 months old at that time and it may have been frozen for stability 3 months before being released. For Linus Frama-C has to compile with a 2.5-year-old compiler. And then there are industrial users who like to trail a little bit on the Linux front but at the same time want all the latest Frama-C features. For Nitrogen we chose to retain compatibility with OCaml 3.10.2 released in February 2008 and all OCaml versions released since.

The value analysis is up to four times faster on realistic examples

The second Nitrogen feature I want to highlight today the value analysis' speed. Here is a quick and dirty comparison for programs that could already be analyzed with Carbon. There are new alarms and precision improvements in Nitrogen but I made sure that in this comparison the two versions were doing the same amount of work.

For this comparison I did not cherry-pick benchmarks. I looked for programs of varying sizes in the archives of the blog and used the three that came first having decided in advance that I wouldn't dismiss any results I didn't like. Each analysis was run three times and the median time was kept. This is on a Core 2 mobile processor and the frequency is pinned at 1.5GHz through CoolBook. In plain English the timings are for an oldish but not antiquated notebook.

The options I used were:

-slevel 1000 -val count.c
-slevel 10000 -val -no-results -cpp-command "gcc -C -E -m32 -Dprintf=Frama_C_show_each" sha1.c
-val -slevel 1000 -slevel-function main:0 *.c -cpp-command "gcc -m32 -C -E -I. "

The programs count.c and sha1.c came from this post. For sha1.c I had to disable the endianness detection code implemented with a switch to put the Carbon and Nitrogen versions on an equal footing. With the Carbon version there used to be a difficult choice to make in presence of switch remember? I could also have used option -simplify-cfg for the Carbon execution but then the analyzers would not have been analyzing exactly the same program. The Skein-256 example is the verification that an arbitrary number of calls to Skein_256_Update(... 80) never cause a run-time error using Josh Ko's instrumentation.

                             count.c     sha1.c      Skein-256
Carbon                       0m0.491s    0m2.862s    1m10.201s
Nitrogen without Zarith      0m0.410s    0m1.724s    0m37.782s
Nitrogen with Zarith         0m0.313s    0m0.962s    0m16.700s
Total speed-up                 1.5          3            4

How did the value analysis improve so much overall? As exemplified by the timings with and without Zarith this is the result of many small enhancements and refinements at all levels.

Conclusion

As promised the two features described here are only worth noting for faithful users. It only matters that Frama-C compiles with OCaml 3.10.2 if you intend to compile it at all and you only care that it is much faster than before if you were already using it. Even so some of the existing users may not notice them. This is the kind of feature that I like because it does not complicate the user interface —the value analysis benchmarks above use the same options and produce the same results— and improves the software nonetheless. Existing users and people who try Frama-C for the first time with Nitrogen will both have a better time of it thanks to the effort spent on the two points described in this post and on tens of others big and small some of which I hope will receive their own minute of spotlight in this series. I have been forced to allude to one other small improvement a better treatment of switch in the value analysis that I like even better because it removes the need to learn one option.