On memcpy (part 1: the source code approach)

Pascal Cuoq - 27th Jan 2011

memcpy() is one of the few functions standardized as part of C itself instead of an additional API. But that's not what makes it interesting from a static analysis point of view. I think what makes it interesting is that it is used often, and often for tasks that can be described in high-level terms, whereas its implementations range from the low-level to the horribly low-level.

Here is a typical memcpy implementation:

void * 
memcpy(void *dst, void *src, size_t length) 
{ 
  int i; 
  unsigned char *dp = dst; 
  unsigned char *sp = src; 
  for (i=0; i<length; i++) 
    *dp++ = *sp++; 
  return dst; 
} 

And here are snippets of an analysis context that uses it (download the entire file):

struct S { int *p; int *q; int i; double d; int j; }; 
... 
  struct S s  d; 
  s.p = &a; 
  s.q = Frama_C_nondet_ptr(&a  &b); 
  s.i = Frama_C_interval(17  42); 
  s.j = 456; 
  s.d = Frama_C_double_interval(3.0  7.0); 
  memcpy(&d  &s  sizeof(struct S)); 

Analyzing with frama-c -val .../builtin.c m.c produces the following information about the contents of s which is exactly what we could have expected:

          s.p ∈ {{ &a ;}} 
           .q ∈ {{ &a ; &b ;}} 
           .i ∈ [17..42] 
           .d ∈ [3. .. 7.] 
           .j ∈ {456; } 

On the other hand the information for d is much less precise:

 d ∈ {{ garbled mix of &{a; b; } (origin: Misaligned {m.c:11; }) }} or UNINITIALIZED 

Each field of d is only guaranteed to contain a value that was not computed from base addresses other than &a and &b. The fields of d are not even guaranteed to have been initialized. An ideally precise analysis would tell us that d contains the same values as those displayed for s.

Side remark: if you do run the analysis at home you can ignore the messages about imprecise values for the moment. These are not alarms to act upon but simply informational messages to help trace when the value analysis is imprecise on large programs where the imprecision starts.

If you have been reading this blog from the beginning perhaps as a continuation from the tutorial you may have noticed by now that whenever the value analysis is not precise enough the first thing to try is option -slevel.

frama-c -val .../builtin.c m.c -slevel 100

          s.p ∈ {{ &a ;}} 
           .q ∈ {{ &a ; &b ;}} 
           .i ∈ [17..42] 
           .d ∈ [3. .. 7.] 
           .j ∈ {456; } 
          d.p ∈ {{ &a ;}} 
           .q[bits 0 to 7]# ∈ {{ &a ; &b ;}} misaligned 0%32  
           .q[bits 8 to 15]# ∈ {{ &a ; &b ;}} misaligned 0%32  
           .q[bits 16 to 23]# ∈ {{ &a ; &b ;}} misaligned 0%32  
           .q[bits 24 to 31]# ∈ {{ &a ; &b ;}} misaligned 0%32  
           .i[bits 0 to 7]# ∈ [17..42] misaligned 0%32  
           .i[bits 8 to 15]# ∈ [17..42] misaligned 0%32  
           .i[bits 16 to 23]# ∈ [17..42] misaligned 0%32  
           .i[bits 24 to 31]# ∈ [17..42] misaligned 0%32  
           .d[bits 0 to 7]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 8 to 15]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 16 to 23]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 24 to 31]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 32 to 39]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 40 to 47]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 48 to 55]# ∈ [3. .. 7.] misaligned 32%64  
           .d[bits 56 to 63]# ∈ [3. .. 7.] misaligned 32%64  
           .j ∈ {456; } 

Good news: all fields of d are now guaranteed to be initialized. In addition the information available for each field is more precise than before. The information about fields .j and .p has been perfectly preserved during the memcpy from s to d whereas fields .q .i and .d are only known to contain several 8-bit slices of values. This is what the "misaligned ..." part of each line mean.

For instance line ".i[bits 0 to 7]# ∈ [17..42] misaligned 0%32" uses this mention in order to avoid giving the impression that bits 0 to 7 of field .i contain [17..42]. Instead there was a 32-bit value [17..42] somewhere in memory and a 8-bit slice of that value was copied to the first 8 bits or .i.

The fact that all 4 (respectively 8) slices of each field are followed by the same "misaligned" mention with the same numerical values mean that all slices for that field could come from the same value and have been copied in order with the first source slice going to the first destination slice ... However the value analysis is not able to guarantee that this is what happened. It has lost the information that bits 0 to 7 and bits 8 to 15 of .i come from the same value which is why the contents of .i are printed thus instead of just displaying that .i is [17..42].

It is important not to stitch together say bit 0 to 7 of {{ &a ; &b ;}} misaligned 0%32 and bit 8 to 31 of {{ &a ; &b ;}} misaligned 0%32 into {{ &a ; &b ;}} when you do not know that they come from the same value. Consider the example below:

  int a  b; 
  int *p = Frama_C_nondet_ptr(&a  &b); 
  int *q = Frama_C_nondet_ptr(&a  &b); 
  *(char*)&p = *(char*)&q; 

After analyzing this piece of code the value analysis predicts for p:

          p[bits 0 to 7]# ∈ {{ &a ; &b ;}} misaligned 0%32  
           [bits 8 to 31]# ∈ {{ &a ; &b ;}} misaligned 0%32  

It would be incorrect to predict that p contains {{ &a ; &b ;}} here because the first call to Frama_C_nondet_ptr may return &a while the second one returns &b. Without the information that the two slices originate from the same value it is incorrect to stitch them together as if they did.

The value analysis does better when there is only one possible concrete value inside the set at hand. Fields .p and .j in the original program although they were copied char by char could be reconstituted because the value analysis knows that there is only one possible way to be {{ &a ; }} (or { 456; }) which implies that properly ordered slices when put side by side can only rebuild &a.

This memcpy if it is called with large values of length in the program and the program is analyzed with the -slevel option may occupy a large fraction of the analysis time. This is because although the analyzer does not keep the information that the small bites of values come from the same value which it could use it keeps the information that there exist plenty of equalities between source and destination chars — and does not use them. This should be considered a known issue with an open-ended fix date.

And this performance issue naturally brings up another performance issue with which it is suitable to conclude this first part of the memcpy discussion. At the other end of the spectrum what if the user finds the analysis of memcpy too slow even without -slevel (or with -slevel-function memcpy:0 to force a rough analysis of memcpy)?

To this user we say: replace your memcpy function by the one below.

void * 
memcpy(void *dst  void *src  size_t length) 
{ 
  unsigned char *dp = dst; 
  unsigned char *sp = src; 
  for (;Frama_C_interval(0 1);) 
    { 
      dp[Frama_C_interval(0 length-1)] = sp[Frama_C_interval(0 length-1)]; 
    } 
  return dst; 
} 

While it it may not be immediately clear why one would want to replace the original executable memcpy with this non-executable one we can at least notice that all the executions of the original memcpy are captured by this one. At each iteration of the original memcpy the condition is either 0 or 1 and one of the characters sp[Frama_C_interval(0 length-1)] is copied to dp[Frama_C_interval(0 length-1)].

Now to investigate why analyzing this function is faster let us look at the states propagated by the analyzer when analyzing the for loop of the original memcpy. It starts the loop with:

        i ∈ {0; } 
        dp ∈ {{ &d ;}} 
        sp ∈ {{ &s ;}} 
        d completely uninitialized 

Here is the sequence of states at the point after another char has been copied before dp sp and i are incremented:

        i ∈ {0; } 
        dp ∈ {{ &d ;}} 
        sp ∈ {{ &s ;}} 
        d.p[bits 0 to 7]# ∈ {{ &a ;}} misaligned 0%32  
         {.p[bits 8 to 31]; .q; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ {0; 1; } 
        dp ∈ {{ &d + {0; 1; } ;}} 
        sp ∈ {{ &s + {0; 1; } ;}} 
        d.p[bits 0 to 15] ∈ 
         {{ garbled mix of &{a; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.p[bits 16 to 31]; .q; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ {0; 1; 2; } 
        dp ∈ {{ &d + {0; 1; 2; } ;}} 
        sp ∈ {{ &s + {0; 1; 2; } ;}} 
        d.p[bits 0 to 23] ∈ 
         {{ garbled mix of &{a; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.p[bits 24 to 31]; .q; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ {0; 1; 2; 3; } 
        dp ∈ {{ &d + {0; 1; 2; 3; } ;}} 
        sp ∈ {{ &s + {0; 1; 2; 3; } ;}} 
        d.p ∈ 
         {{ garbled mix of &{a; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.q; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ {0; 1; 2; 3; 4; } 
        dp ∈ {{ &d + {0; 1; 2; 3; 4; } ;}} 
        sp ∈ {{ &s + {0; 1; 2; 3; 4; } ;}} 
        d{.p; .q[bits 0 to 7]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.q[bits 8 to 31]; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ [0..15] 
        dp ∈ {{ &d + {0; 1; 2; 3; 4; 5; } ;}} 
        sp ∈ {{ &s + {0; 1; 2; 3; 4; 5; } ;}} 
        d{.p; .q[bits 0 to 15]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.q[bits 16 to 31]; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ [0..16] 
        dp ∈ {{ &d + {0; 1; 2; 3; 4; 5; 6; } ;}} 
        sp ∈ {{ &s + {0; 1; 2; 3; 4; 5; 6; } ;}} 
        d{.p; .q[bits 0 to 23]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.q[bits 24 to 31]; .i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ [0..23] 
        dp ∈ {{ &d + [0..7] ;}} 
        sp ∈ {{ &s + [0..7] ;}} 
        d{.p; .q; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.i; .d; .j; } ∈ UNINITIALIZED 
        i ∈ [0..23] 
        dp ∈ {{ &d + [0..8] ;}} 
        sp ∈ {{ &s + [0..8] ;}} 
        d{.p; .q; .i[bits 0 to 7]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.i[bits 8 to 31]; .d; .j; } ∈ UNINITIALIZED 
        i ∈ [0..23] 
        dp ∈ {{ &d + [0..15] ;}} 
        sp ∈ {{ &s + [0..15] ;}} 
        d{.p; .q; .i; .d[bits 0 to 31]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.d[bits 32 to 63]; .j; } ∈ UNINITIALIZED 
        i ∈ [0..23] 
        dp ∈ {{ &d + [0..16] ;}} 
        sp ∈ {{ &s + [0..16] ;}} 
        d{.p; .q; .i; .d[bits 0 to 39]; } ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 
         {.d[bits 40 to 63]; .j; } ∈ UNINITIALIZED 
        i ∈ [0..23] 
        dp ∈ {{ &d + [0..23] ;}} 
        sp ∈ {{ &s + [0..23] ;}} 
        d ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {m.c:12; }) }} or UNINITIALIZED 

In contrast with this long sequence of abstract states going through the same point of the for loop until a fixpoint is finally found the analysis of the modified memcpy reaches the same abstract state in one iteration realizes that it is a fixpoint and goes on with its life:

        dp ∈ {{ &d ;}} 
        sp ∈ {{ &s ;}} 
        d ∈ 
         {{ garbled mix of &{a; b; } (origin: Merge {mfast.c:32; }) }} or UNINITIALIZED 

In short the modified memcpy voluntarily loses information so that finding the loop's fixpoint becomes straightforward. The analysis of the original memcpy has to discover the fixpoint by gradually losing information and seeing what values stick. In fact the process of discovering the fixpoint may overshoot and end up less precise as well as slower than the improved memcpy. For imprecise analyses of programs that use memcpy when the user is interested in the verification of the program itself rather than memcpy it is recommended to use the replacement version.

Pascal Cuoq
27th Jan 2011