Donut gibberish

Pascal Cuoq - 7th Aug 2011

Hey, I left out one alarm last time:

donut.c:15 ... out of bounds read. assert \valid(". -~:;=!*#$@"+tmp_7); 

This corresponds to ". -~:;=!*#$@"[N>0?N:0] in the obfuscated code. I wanted to have a blog post about this construct in particular because I was curious whether it would break the content management system's URL generation.

The reason why this alarm is a false alarm is a bit difficult to see in context so let us focus on the function below which contains only the relevant computations.

double sin(double)  cos(double); 
/*@ ensures 0 <= \esult <= 11 ; */ 
int compute(float A  float B  float i  float j) 
  float c=sin(i)  l=cos(i); 
  float d=cos(j)  f=sin(j); 
  float g=cos(A)  e=sin(A); 
  float m=cos(B)  n=sin(B); 
  int N=8*((f*e-c*d*g)*m-c*d*e-f*g-l*d*n); 
  return N>0?N:0; 

How would we argue that it is allowable to do this in a real verification? If the target compiler implements IEEE 754 floating-point exactly then it doesn't matter what instructions are interspersed with the ones we are studying: they always give the same IEEE 754-mandated values. If the compiler only implements IEEE 754 approximately (say computing some intermediate values in higher precision and later double-rounding some of them at its whim) then we can analyze the function above with option -all-rounding-modes and argue that while the instructions around these ones may in the real program interfere with the computation all possible results of the real program are captured by the value analysis' predictions for this function.

I do not know whether this kind of argument would be accepted in an industrial context but on the other hand the difficulty here similarly to the case of variable o in the last post is artificially introduced by the program's obfuscation. So the point is probably moot: industrial code would not be allowed to mix up computations like this in the first place.

If the discussion above seems a bit abstract perhaps you have not read David Monniaux's report on the pitfalls of floating-point computations for static analysis. Consider it recommended.

Lastly modifying the program as below and invoking Frama-C's slicing plug-in with the command that follows was useful for extracting the function compute() above.

int          k;double sin() 
          cos();main(){float A= 
       0 B=0 i j z[1760];char b[ 
  ){memset(b 32 1760);memset(z 0 7040) 
 >i;i+=0.02){float c=sin(i) d=cos(j) e= 
 sin(A) f=sin(j) g=cos(A) h=d+2 D=1/(c* 
 h*e+f*g+5) l=cos      (i) m=cos(B) n=s\ 
in(B) t=c*h*g-f*        e;int x=40+30*D* 
(l*h*m-t*n) y=            12+15*D*(l*h*n 
+t*m) o=x+80*y           N=8*((f*e-c*d*g 
 )*m-c*d*e-f*g-l        *d*n); 
	  /*@ slice pragma expr N; */ 
 ". -~:;=!*#$@"[N>0?N:0];}}/*#****!!-*/ 
         . ~~;;;========;;;:~-. 
             .. -------- */ 
$ bin/toplevel.opt -val share/libc.c -slice-pragma main -slice-print donut.c 
/* Generated by Frama-C */ 
extern double sin(); 
extern double cos(); 
void main(void) 
  float A; 
  float B; 
  float i; 
  float j; 
  A = (float)0; 
  B = (float)0; 
  while (1) { 
    j = (float)0; 
    while (6.28 > (double)j) { 
      i = (float)0; 
      while (6.28 > (double)i) { 
        { float c; float d; float e; float f; float g; float l; float m; 
          float n; int N; c = (float)sin(i); d = (float)cos(j); 
          e = (float)sin(A); f = (float)sin(j); g = (float)cos(A); 
          l = (float)cos(i); m = (float)cos(B); n = (float)sin(B); 
          N = (int)((float)8 * ((((f * e - (c * d) * g) * m - (c * d) * e) -  
                                 f * g) - (l * d) * n)); 
          /*@ slice pragma expr N; */ ; 
        i = (float)((double)i + 0.02); 
      j = (float)((double)j + 0.07); 
    A = (float)((double)A + 0.04); 
    B = (float)((double)B + 0.02); 

Aside: in my first attempt to produce the slice above I made the mistake of including the approximative sin() and cos() functions written earlier:

#include "share/builtin.h" 
double sin(double x) 
  return Frama_C_double_interval(-1.0  1.0); 

The slicing plug-in always up for mischief noticed that sin() and cos() did not use their arguments and removed completely the computations of variables A and B and their declarations now useless. The values of these variables do not matter but we'll need to know that cos() and sin() are applied to the same variables i j A B if we want to have a chance to establish the post-condition. The slicing was removing this crucial piece of knowledge from the sliced program but it was only taking advantage of the information that we had given it with our function stubs.

Pascal Cuoq
7th Aug 2011