Helping the value analysis — part 2

Pascal Cuoq - 26th Mar 2011

How it started

In the last post we started a new series about helping the value analysis be more precise. We introduced an interpolation function with the goal of verifying that it is free of run-time errors. Having launched a first value analysis we got one alarm we need to take care of.

interp.c:28:[kernel] warning: float operation: assert \is_finite((double )((double )((double )(y1*(double )(x-x0))+(double ) (y0*(double )(x1-x)))/(double ) (x1-x0))); 

This alarm is emitted because the analyzer is unsure that the value computed by the function is not an infinite or NaN. The suspicion originates from the division by x1 - x0. Furthermore we informally deduced from x0 = t[i].x and x1 = t[i+1].x that x1 is always strictly greater than x0 and more specifically that x1 - x0 >= 0.125.

In this post we show how to eliminate the alarm formally. For a change of style we start with a few attempts that are natural but do not work.

Getting more precise

From the Value Analysis manual and former posts we know already that using the option -slevel with an adequate argument may (in some cases) help the analyzer already in being more precise. This is the first thing to try:

frama-c -val -slevel 10 -main interp interp.c 

We obtain the same results as before. The alarm does not disappear. The analyzer still does not know if the variables of x1 and x0 may contain the same value or (as they are double variables) if they may be very close to each other which would mean the subtraction x1 - x0 may be very close to zero.

We further investigate the output of the analysis:

[value] Values for function interp: 
          x0 ∈ [0. .. 5.] 
          x1 ∈ [0.125 .. 10.] 
          y0 ∈ [1. .. 74.2099485248] 
          y1 ∈ [1.00782267783 .. 11013.2329201] 
          i ∈ {0; 1; 2; 3; 4; 5; 6; } 
          __retres ∈ [-1.79769313486e+308 .. 1.79769313486e+308] 

We can see that the expected minimum and maximum return values are very large occupying the entire range allowed by a double. This is consistent with the fact that the value analysis is unable to guarantee there is no overflow in the computation of this result.

A natural first thought could be to write an assertion containing what we deduced already from looking at the code namely x1 - x0 >= 0.125. In fact let's modify the program so:

  y1 = ...; 
  //@ assert x1 - x0 >= 0.125; 
  Frama_C_show_each_diff(x1 - x0); 
  return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); 
} 

Inserting this assertion could be useful in two ways: the analyzer could tell us that it is true which would be reassuring; and it could take advantage of the information it contains to improve the precision of its results. The call to Frama_C_show_each... only helps to observe the improvements.

frama-c -val -slevel 10 -main interp interp.c 
... 
interp_try.c:29:[value] Assertion got status unknown. 
[value] Called Frama_C_show_each_diff([-4.875 .. 10.]) 

Here neither of the two ways work: the analyzer cannot decide if this assertion is correct and it is also unable to take advantage of the provided information to eliminate the alarm or for that matter to improve its prediction of the value of x1 - x0 which still contains zero and values close to zero.

But why? The states propagated by the value analysis assign values to individual variables such as x0 and x1. Looking again at the final values we see that x0 and x1 are both interpreted as the following floating-point ranges:

x0 ∈ [0. .. 5.] 
x1 ∈ [0.125 .. 10.] 

Providing information about x1 - x0 when x0 and x1 are already imprecise has the following drawbacks:

  • the analyzer cannot tell whether the assertion is true or false because it is already manipulating imprecise values for x0 and x1 that could make the formula true or false.
  • Even assuming it was strong at symbolic manipulations and able to transform x1 - x0 >= 0.125 into x1 >= x0 + 0.125 (or a similar formula that is actually a correct transformation in double-precision floating-point arithmetics) to improve the value of x1 it would be impaired by the fact that x0 is already imprecise so that the formula doesn't tell it anything new about the value of x1. And similarly when trying to take advantage of the formula to improve the value of x0.

The value analysis does not represent information about the relationship the values of x0 and x1 may have. Therefore it considers possibilities here that do not exist in a real execution. Since the states propagated by the analysis assign values to variables only the best way to provide information is to write assertions that directly tell about the values of variables:

  x0 = t[i].x; 
/*@ assert x0 == 0. || x0 == 0.125 || x0 == 0.25 || x0 == 0.5 ||  
           x0 == 1. || x0 == 2.    || x0 == 5. ;  */ 
  y0 = t[i].y; 
  x1 = t[i+1].x; 
/*@ assert x1 == 0.125 || x1 == 0.25 || x1 == 0.5 || x1 == 1. ||  
           x1 == 2.    || x1 == 5.   || x1 == 10. ;  */ 
  y1 = t[i+1].y; 
  Frama_C_show_each_diff(x1 - x0); 
  return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); 
} 

We augment the option -slevel since the two assertions multiply the number of possibilities:

frama-c -val -slevel 100 -main interp interp.c 
... 
[value] Called Frama_C_show_each_diff(-4.875) 
[value] Called Frama_C_show_each_diff(-4.75) 
[value] Called Frama_C_show_each_diff(-4.5) 
... 
[value] Called Frama_C_show_each_diff({0; }) 
... 
[value] Called Frama_C_show_each_diff(5.) 
[value] Called Frama_C_show_each_diff(10.) 
... 
interp_try.c:36:[kernel] warning: float operation: assert \is_finite((double )((double )((double )(y1*(double )(x-x0))+(double ) (y0*(double )(x1-x)))/(double ) (x1-x0))); 

The alarm remains. This was to be expected because the analyzer does not know that certain combinations of values for the pair (x0 x1) are impossible which include the cases where x0 == x1. To remove these cases we can instead provide an assertion about the possible pairs of x0 and x1:

/*@ assert x0 == 0.    && x1 == 0.125 ||  
           x0 == 0.125 && x1 == 0.25  ||  
           x0 == 0.25  && x1 == 0.5   ||  
           x0 == 0.5   && x1 == 1.    ||  
           x0 == 1.    && x1 == 2.    ||  
           x0 == 2.    && x1 == 5.    ||  
           x0 == 5.    && x1 == 10. ; 
*/ 

Looking at the output again we notice that the alarm has disappeared. The values displayed for x1 - x0 show that the assertion was taken into account. Finally the return value is in a more acceptable range now.

[value] Called Frama_C_show_each_diff(5.) 
[value] Called Frama_C_show_each_diff(3.) 
[value] Called Frama_C_show_each_diff(1.) 
[value] Called Frama_C_show_each_diff(0.5) 
[value] Called Frama_C_show_each_diff(0.25) 
[value] Called Frama_C_show_each_diff(0.125) 
[value] Called Frama_C_show_each_diff(0.125) 
... 
[value] Values for function interp: 
... 
          __retres ∈ [-16801.608905 .. 881132.843557] 

Still one problem remains. The assertion can not be verified by the value analysis:

interp_try.c:29:[value] Assertion got status unknown. 

The alarm is gone but only under the assumption that the information provided in the above assertion is correct. And as far as the analyzer is concerned it is unfortunately not guaranteed.

This attempt is only half-successful because we are trying to reduce and divide the propagated state too late at a point where the values for x0 and x1 have already been computed and the information about their relationship already lost. Ideally we would like to nudge the analyzer into finding out about the relationship by itself. Being non-relational it cannot represent the relationship inside a single state but it can represent it virtually by using several states. This is in fact what we have been forcing it to do with the last assertion.

How to improve the provided hint?

The computation of the formula as investigated so far is dependent on the possible values of x0 x1 y0 and y1. Looking again at the function we can see that the respective computations of x0 x1 y0 and y1 are dependent on the value of i immediately before. Due to the postcondition of the choose_segment function we know that only integer values between 0 and 6 might be assigned to i. This post-condition is taken into account by the value analysis and the information restituted in the final values displayed for interp:

i ∈ {0; 1; 2; 3; 4; 5; 6; } 

Depending on the value of i there are actually only seven possible cases.

x0 == 0.    && x1 == 0.125 && y0 == 1.                  && y1 == 1.00782267782571089 ||  
x0 == 0.125 && x1 == 0.25  && y0 == 1.00782267782571089 && y1 == 1.03141309987957319 ||  
x0 == 0.25  && x1 == 0.5   && y0 == 1.03141309987957319 && y1 == 1.1276259652063807  ||  
x0 == 0.5   && x1 == 1.    && y0 == 1.1276259652063807  && y1 == 1.54308063481524371 ||  
x0 == 1.    && x1 == 2.    && y0 == 1.54308063481524371 && y1 == 3.76219569108363139 ||  
x0 == 2.    && x1 == 5.    && y0 == 3.76219569108363139 && y1 == 74.2099485247878476 ||  
x0 == 5.    && x1 == 10.   && y0 == 74.2099485247878476 && y1 == 11013.2329201033244 

Now we want to perform a case analysis to make the analyzer propagate only those 7 specific cases. A new appoach could be to formulate an assertion to make the analysis be split before the computation of the four variables by using information the analyzer already has the range of values for i. The function could now look like this:

/*@ requires 0. <= x <= 10. ; */ 
double interp(double x) 
{ 
  double x0  x1  y0  y1; 
  int i; 
  i = choose_segment(x); 
  //@ assert i == 0 || i == 1 || i == 2 || i == 3 || i == 4 || i == 5 || i == 6 ;  
  x0 = t[i].x; 
  y0 = t[i].y; 
  x1 = t[i+1].x; 
  y1 = t[i+1].y; 
  Frama_C_show_each_x0_x1(x0  x1); 
  return (y1 * (x - x0) + y0 * (x1 - x)) / (x1 - x0); 
}  

Running Frama-C this time we achieve success. The alarm is gone and the hint assertion can at the same time be verified. The call to Frama_C_show_each_x0_x1(x0 x1) confirms the absence of any case in which x0 and x1 contain the same value. Thereby it is possible to show that the divisor is not zero or a value too close to zero.

frama-c -val -slevel 100 -main interp interp.c 
... 
interp.c:25:[value] Assertion got status valid. 
[value] Called Frama_C_show_each_x0_x1(5. 10.) 
[value] Called Frama_C_show_each_x0_x1(2. 5.) 
[value] Called Frama_C_show_each_x0_x1(1. 2.) 
[value] Called Frama_C_show_each_x0_x1(0.5 1.) 
[value] Called Frama_C_show_each_x0_x1(0.25 0.5) 
[value] Called Frama_C_show_each_x0_x1(0.125 0.25) 
[value] Called Frama_C_show_each_x0_x1({0; } 0.125) 
... 
[value] Values for function interp: 
... 
          __retres ∈ [-11013.2329201 .. 11161.6528172] 

Regarding the return value again we obtain a result in an acceptable range. The difference with the previous approach is that this time the analyzer is able to verify the assertion. Therefore this last approach is not only less verbose but also safer than the former.

In conclusion case analysis is a powerful tool to help Frama-C's value analysis especially when the information required to reach the verification goal is relational in nature. When this technique works the assertion to split the different cases shouldn't be inserted too late when the analyzer has no chance to prove that the cases offered cover all the possibilities. And it shouldn't be inserted too early lest the information be lost again the number of combinations become too big or the analyzer waste resources on irrelevant parts of the target program.

This post is co-authored by Kerstin Hartig.

Pascal Cuoq
26th Mar 2011