Value analysis tutorial, part 3: answering one quiz

Pascal Cuoq - 15th Oct 2010

This is another episode in the advanced value analysis tutorial. The first episode was here and the second one here.

There were a couple of questions left unanswered at the break. This post answers the first one.

Quiz 1: continuing the study of the first five calls to Skein_256_Update what used to cause the warning below and what made it disappear?

lib.c:18:kernel warning: out of bounds write. assert \valid(tmp);

Value analysis messages in the log are output in order. To understand what the analysis was doing when an alarm was emitted it is recommended to look at the surrounding entries in the log. Using the command-line frama-c -val -slevel 100 -slevel-function main:0 *.c > log on the version of main without unrolling we get the following information about the circumstances that cause the value analysis to worry about line lib.c:18 in function memset:

...
[value] Done for function Skein_256_Update
[value] computing for function Skein_256_Final <-main.
Called from main.c:44.
[value] computing for function memset <-Skein_256_Final <-main.
Called from skein.c:212.
lib.c:18:[kernel] warning: out of bounds write. assert \valid(tmp);
[value] Recording results for memset
[value] Done for function memset
...

Using frama-c-gui to observe the value of the troublesome pointer inside memset we see for the main with unrolling:

tmp ∈ {{ &skein_context + [24..87] ; &cfg + [0..31] ;}}

Repeating the procedure for the main without unrolling we see:

tmp ∈ {{ &skein_context + [24..88] ; &cfg + [0..31] ;}}

It seems that the pointer causing the problem is at an offset of 88 bytes inside skein_context. To confirm this still while in frama-c-gui we can select any statement of the function main and use "Evaluate expression" in the contextual menu to evaluate the expression sizeof(skein_context). The result is as follows:

[...] all the values taken by the expression sizeof(skein_context) are contained in {88; }

Note: not all C expressions can be evaluated this way but expressions of the form sizeof(v) can at a statement where the variable v is in scope. This is often useful to make sense of the offsets in bytes displayed by the value analysis.

So the problem is indeed that the value analysis thinks that the pointer may refer to offset 88 of skein_context when dereferenced inside memset. This address is out of range. According to the log snippet above this happens when memset is called from Skein_256_Final. The log points us to line 212 in file skein.c that is the first call in that function. This call is guarded by the following conditional and comment:

if (ctx->h.bCnt < SKEIN_256_BLOCK_BYTES)   /* zero pad b[] if necessary */

      memset(&ctx->b[ctx->h.bCnt] 0 SKEIN_256_BLOCK_BYTES - ctx->h.bCnt);

In that call the pointer argument is & ctx->b[ctx->h.bCnt] and the values recorded at the if for ctx->h.bCnt are {0; 16; 32; } whereas the values at the call are {0; 16; } (the analysis knows that in the case of 32 the call is not executed because the condition ctx->h.bCnt < SKEIN_256_BLOCK_BYTES is false).

If you still have the frama-c-gui with the results for the main with unrolling lying around you can check that in that case the values for ctx->h.bCnt are {16; 32; } at the if and { 16; } at the call.

This leads us to the following conclusion: the imprecision that leads to the false alarm at lib.c:18 is caused by analyzing the call to memset with imprecise arguments in function Sein_256_Final. By unrolling the first few calls to Skein_256_Update we also forced these calls to happen systematically and that made the analysis conditions for Skein_256_Final simpler. When completing the verification with separate analyses of the omitted cases the arguments to memset are precise again (and complementary) so the false alarm is omitted again.

In fact if we kept investigating in the same fashion we would find that the case ctx->h.bCnt==0 corresponds to the absence of any call to Skein_256_Update. This case is indeed covered by the while loop in the main analysis context without unrolling. The alarm we obtained when analyzing that loop does not mean there is a problem with the sequence Skein_256_Init(...); Skein_256_Final(...); only that some peculiar values happen in this case and that analyzing this case together with the others leads to false alarms. When verifying software that has already been tested and that its authors take pride on it makes sense to heuristically assume that alarms are false alarms when investigating them. One should however remain honest and continue to investigate until the alarm is understood—ideally until the value analysis can confirm that there was no alarm after all.

Next time the answer to the other quiz.

Pascal Cuoq
15th Oct 2010