On memcpy (part 2: the OCaml source code approach)

Pascal Cuoq - 31st Jan 2011

When picking up the title for the previous post on function memcpy I anticipated this second part would describe the Frama_C_memcpy built-in function. The subtitle "part 1: the source code approach" seemed a good idea since the first part was about using C source code to tell the analyzer what this function memcpy it often encounters is.

I did not think further and did not realize that I would spend this sequel post explaining that a built-in function is an OCaml function that directly transforms an abstract state into another abstract state without going through the tedious instruction-by-instruction interpretation of C code. So this second post is still about source code.

It is not source code that it is recommended or even documented how to write yourself. But it is a convenient way for value analysis experts to introduce certain new features.

What new features? Smoking hot blazing fast new features.

Also features that work completely right neither the first nor the second time. The Formula One of features if you will. In fact you could be forgiven for calling them Virgin Racing features although I prefer to think of them as Ferrari features.

As just stated builtins are about using OCaml functions to describe the effect of an unavailable C function. For instance calls to Frama_C_memcpy(d s l) are interpreted with an OCaml function that in a single step copies a slice of memory of length l*8 bits from s to d. And it is fast. Behold:

char s[10000]  d[10000];
main()
{
  // Make problem representative
  for (int i=0; i<100; i++)
    s[100*i] = i;
  // Time one or the other
  //  c_memcpy(d  s  10000);
  Frama_C_memcpy(d  s  10000);
  // Check that d contains the right thing
  Frama_C_dump_each();
  return 0;
}

(download the entire program)

Using the C version of memcpy and sufficient unrolling you get the expected contents for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
...
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
         ...
user   0m23.155s

Using the Frama_C_memcpy built-in you get the same precise result for d:

$ time bin/toplevel.opt -val -slevel 20000 memcpytest.c -no-results
...
        d[0..99] ∈ {0; }
         [100] ∈ {1; }
         [101..199] ∈ {0; }
         [200] ∈ {2; }
         ...
user   0m0.402s

You also get nearly 23 seconds of your life back.

So what are the things that can go wrong with a builtin replacement function?

  1. If the C code that is not analyzed would have caused an alarm to be emitted the built-in should emit that alarm too or at the very least be provided with a header file that offers an ACSL pre-condition along with a prototype for the builtin. Both alternatives work but it is easy to forget to do either.
  2. All cases of imprecise arguments should be handled. Support for an imprecise length was added recently. Very imprecise source or destination pointers may still cause the analysis to abort (it is being done and I am not sure where we were the last time we were working on this).
  3. The builtin function calls the value analysis' internal functions in contexts that differ from the usual well tested circumstances occurring during the analysis of normal C programs. Strange behaviors or bugs may be uncovered by the user. As an example I was just preparing the next section of this post where I would have shown how imprecise lengths were now handled. I was going to suggest modifying the above test program thus:
  Frama_C_memcpy(d  s  Frama_C_nondet(150 10000));

This brought up the following two remarks: first there is an internal function for moving slices of the abstract state around that is slower than it should be and it shows on this test. Secondly the value analysis in Carbon will tell you that bits 1200 to 79999 of d contain either 0 or a very large number a number with about 23400 decimal digits that start with 2164197 and end with 85824. The value analysis is right of course. If the length argument to Frama_C_memcpy was 150 then that slice of memory contains zero. If the length was 10000 the numbers in d that were copied from s can be interpreted as representing exactly that 78800-bit number. This is the value analysis' way of telling you that it's either one or the other: either the length was 150 or it was 10000. This is more informative than telling you cell by cell that the cell contains zero or another number because then you wouldn't see that it's either all one or all the other. The value analysis is rather cute that way.

Note: if the 78800-bit number explanation seems terribly complicated do not worry there will be a better explanation for this issue in a future post. We just need another new feature in order to be able to show you.

In conclusion not everyone may want a Formula One car. But Frama_C_memcpy should at least work for completely unrolled deterministic programs where it can speed up analysis quite a bit compared to the interpretation of a C memcpy.

Thanks to Julien for help with motorsports metaphors and to Boris for help in improving Frama_C_memcpy (it used to be worse).

Pascal Cuoq
31st Jan 2011