Analysis scripts: helping automate case studies, part 2

André Maroneze - 15th Feb 2018

In the previous post, we used analysis-scripts to prepare our analysis of Recommender. In this post, we will run EVA on the code and see how the iterative refinement of the analysis can be done. We assume the reader has performed all of the steps in the previous post, and is currently in the directory containing the GNUmakefile created previously.

Running the first EVA analysis

Running make main.eva will run the EVA plug-in in the console, and then output some metrics information. This is useful for scripts and experienced users. For now, running make main.eva.gui is more useful: it will run the analysis and then open the results in the GUI.

A *.{parse,eva}.gui target in analysis-scripts only run parsing/EVA if necessary; if it has already been run, and no source files or analysis parameters were modified, it will directly open the GUI.

In the GUI, in the Properties panel (you have to click Refresh after loading the GUI) if we inspect only Unknown/Invalid properties, we will notice 148 of them, several related to finiteness of floating-point values, as well as many related to variable initialization.

Note that the results of each analysis are stored in the main.eva directory, and also copied to a timestamped directory in the form main_YYYY-MM-DD_hh-mm-ss.eva. Thus main.eva always contains the latest analysis, while the timestamped copies are useful to retrieve past results and compare them, e.g. using meld’s folder comparison view.

Iterating on EVA analyses

Let us use the Loop Analysis plug-in to obtain more precise results. Since analysis-scripts already created a Frama-C save file (extension .sav by default), we just need to load it and add -loop:

frama-c -load main.eva/framac.sav -loop > main.slevel

Open the main.slevel file with a text editor. It contains the output of the Loop Analysis plug-in: some informative text, and then a line that says:

[loop] Add this to your command line:

Followed by several command-line arguments related to the use of slevel. Let us erase everything up to and including the Add this ... line, and then include the file contents in the EVAFLAGS variable of our GNUmakefile:

EVAFLAGS    += $(shell cat main.slevel | tr '\\' ' ')

Note that we use the tr command-line utility to remove the backslashes from the output of Loop Analysis.

If we re-run make main.eva.gui, the dependencies managed by Make will re-run the analysis because its command-line changed, but not the parsing, since it was not impacted.

This time, we obtain 114 unknown/invalid properties.

Appeal to brute-force

Since this is a test case that runs very quickly, and because Loop Analysis’ heuristics are not perfect, we can allow ourselves to try a somewhat aggressive -slevel. Let us remove the heuristics given by Loop Analysis, and instead use a global slevel of 500:

EVAFLAGS    += -slevel 500

The reason why we remove Loop Analysis’ results is that, in some cases, it forces merging of states to avoid slowing down the analysis, and those settings take precedence over the global slevel.

The analysis will be much longer this time, but should still terminate in a few seconds. Given such a high slevel, we are lucky that it terminates at all. We will obtain 86 unknown/invalid properties this time.

Extra hypotheses for more precision

There are several options in EVA that allow a more precise analysis of less-than-ideal code that can be found in the wild. Most of them require knowing some details about the semantics of C, but others can be used more leniently, to help get a better idea of the code under analysis, before some serious analysis time is spent. One of them is -no-val-malloc-returns-null. This option is only useful when the code contains dynamic memory allocation, and what it does is to consider that memory allocation never fails.

By default, the memory allocation built-ins used by EVA suppose that memory allocation can fail, as stated in the standard (i.e. malloc can return NULL, for instance when there is not enough available memory). However, many code bases fail to test for such cases; which is admittedly a relatively rare situation, especially on a system that allows memory overcommitment. The -no-val-malloc-returns-null thus adds a new hypothesis to the underlying analysis (“… and considering that memory allocation never fails…”), in exchange for a more precise result.

In Recommender, we notice that there are some spots where malloc is tested for NULL (e.g. in src/learned_factors.c:47), but others where no such test is performed (e.g. in src/learned_factors.c:67). Thus adding this option should (hopefully) result in fewer alarms. Our EVAFLAGS line becomes:

EVAFLAGS    += -slevel 500 -no-val-malloc-returns-null

Indeed, running the analysis again results in only 56 unproved properties. However, it also reveals another issue.

Tracking red alarms

Opening in the GUI the result of the last iteration of the analysis, if we scroll through the main function, we see that the analysis has stopped (became red) before returning from the call to recommend_items. Jumping to its definition, we see a loop where j should go from 0 to 2 (the value contained in (estim_param->tset)->items_number)), but inspecting the value of j in the while condition shows {0; 1}. This means that the third iteration never took place, its trace cut short by some evaluation that resulted in a definitely invalid property.

There are several ways to proceed here:

  • One can follow the “red path” in the GUI, jumping inside reachable functions in the GUI until the limit of the red path is found;
  • Alternatively, in this case we have a warning (in the Messages tab) that says that all target addresses were invalid, which often indicates an issue with the evaluation of a state;
  • Finally, one can prioritize alarms in the Properties tab, by checking only Invalid properties at first, before including Unknown ones.

In this case, all three would end up around the same source line (src/binary_heap.c:153), though this is not always true.

Spoiler alert: the Chlorine 17 release of Frama-C will include an extra panel, Red alarms, which will provide another way to get to properties that warrant extra attention when starting an analysis.

Note that, whenever the C code contains calls to assert(), Frama-C generates a call to __FC_assert whose precondition is requires \false. This leads to an Invalid alarm in the Properties panel, however this does not indicate that the source code necessarily has undefined behavior: it may be due to an imprecision in the assertion condition, which if possibly true, leads to the call being made, which will itself definitely indicate an error. But if the call may not happen, then the assertion should be treated like an Unknown alarm.

Spoiler alert: in Frama-C Chlorine (17), the behavior of assert will be changed to more closely match the usual approach: an Invalid alarm will only be generated when the assertion is definitely violated.

Back to the Invalid alarm: if we inspect the source code in question, we will see that the only valid index for the buffer bheap->buffer is 0, due to the ACSL assertion inserted by EVA:

/*@ assert Value: index_bound: bheap->filled_elements < 1; */
bheap->buffer[bheap->filled_elements] = value;

However, filled_elements is 1 in this call, which seems like an error. But why is it there? While it is impossible to know exactly the intent of the author of the code, by inspecting the definition of the buffer, we can obtain some clues. In the Information panel, if we inspect bheap, we see that:

Variable bheap has type `binary_heap_t *'.
It is a formal parameter.
It is referenced and its address is not taken.

And then if we click binary_heap_t itself, we obtain its type information:

Type information for `binary_heap_t':
(sizeof 16, defined at src/binary_heap.h:53)
 struct binary_heap {
    size_t max_size ;
    size_t filled_elements ;
    int (*value_comparer)(void const *, void const *) ;
    void *buffer[1] ;

Finally, clicking the source line will take us to the definition in the binary_heap.h header, which contains some useful comments:

 * binary_heap:  The heap is tree based data structure
 * Members:
 *   max_size        The number of elements in the binary heap
 *   filled_elements The number of inserted elements so far
 *   buffer          The buffer is a dynamically allocated array
 *                   containing the heap's elements
typedef struct binary_heap
  size_t          max_size;
  size_t          filled_elements;
  bh_value_cmp    value_comparer;
  void*           buffer[1];
} binary_heap_t;

From the comment, it seems that the code is implementing the pattern of a flexible array member, where the last element of the binary_heap_t structure should be an incomplete array type. However, its current declaration corresponds to that of a fixed-size array, which according to the C standard cannot be accessed beyond its static bounds. We can fix this by using the proper flexible array member notation defined in the C99 standard.

Syntactic fixes

By replacing void* buffer[1]; with void* buffer[]; in src/binary_heap.h, we define a proper flexible array member. As a consequence, the code can access all elements of the dynamically-allocated array, as long as enough memory has been allocated for them.

You may notice that, after modifying the .h file and re-running make main.eva.gui, Make will not re-parse nor re-run the analysis, it will simply open the GUI again. This is because the .h file dependencies are not tracked by the Makefile rules: they were never included in the list of .c sources of main.parse (because such headers are directly included via preprocessor directives), so Make has no way to know about these implicit dependencies. You can remedy that by forcing Make to unconditionally make all targets, adding -B to the command line arguments.

After doing so, everything is recomputed, and we obtain … almost the same alarm.

This time, the alarm happens on the third iteration of the loop, instead of the second one: progress! … But not much. There is still a problem when accessing the last element of the array. To find out why, we can use the GUI again. However, this time the Invalid alarm is no longer present; the problem is more subtle. Still, the warning about all target addresses were invalid is still there. It points to this statement:

    /*@ assert
        Value: mem_access: \valid(&bheap->buffer[bheap->filled_elements]);
    bheap->buffer[bheap->filled_elements] = value;

By inspecting bheap->filled_elements, we see its possible values were {1; 2} before the statement, but {1} afterwards. This indicates that index 2 is not valid for this memory access. But why? We can check the location being accessed, to see its validity.

If we inspect the values for bheap, we see, in the Values tab:

bheap -> {{ (binary_heap_t *)&__malloc_init_binary_heap_l42 }}

Then, if we left-click on __malloc_init_binary_heap_l42, the Information panel will display some extra lines about this value in particular, with a clickable link pointing to its syntactic information. Clicking on that link will display the type of the dynamically allocated block:

Variable __malloc_init_binary_heap_l42 has type `char [23]'.

For dynamically allocated bases, EVA does not always have access to the original type intended for the allocated base, so it uses heuristics to restore that information. When such heuristics fail, it resorts to the “catch-all” char [].

“23” is a somewhat unusual size (not a multiple of the word size). Let us investigate where that number came from.

Using Studia

Right-clicking on __malloc_init_binary_heap_l42, in the Information tab, will display a context menu, the same one as if we had right-clicked some lvalue on the Cil code. We will use the Studia plug-in, in particular its Writes feature, to identify all points in the source code where this memory location was written prior to the statement indicated by the warning.

Studia will compute and highlight the statements in the code, and also add an extra column to the filetree display (upper left corner in the GUI), indicating all functions directly writing to the memory location (indicated by a ‘tick’ symbol in the Studia column) and the functions indirectly writing to it, that is, the callers of the former (indicated by an arrow symbol in the Studia column) as well as their own callers, recursively.

The functions writing to buffer are: balance_heap, init_binary_heap and insert_binary_heap. The first and the last write to fields in the buffer structure, but only init_binary_heap actually allocated the memory.

Well, I guess you could have directly inferred that from the variable name, which serves this exact purpose, but then I wouldn’t have had an excuse to shamelessly plug Studia, one of the new features in Frama-C 16 Sulfur, would I?

Inside init_binary_heap lies our answer: the computation of the size of the bheap includes a -1, which is due to the fact that the non-flexible array member notation already had a byte allocated for the array, but our proper C99-compatible version no longer includes this byte. So the malloc’ed memory was 1 byte shorter, which led to the invalid memory access. So all we have to do is to fix the computation here (and take note of the fact that modifying other people’s code without properly understanding it can lead to bugs).

At last … unable to progress

This time, we re-run the analysis (no need to use -B this time, since the .c file is tracked in the dependencies of our GNUmakefile), expecting to triumph over the syntactic distractions. However, we are greeted with a somewhat unpleasant surprise:

src/binary_heap.c:118:[value] warning: detected recursive call
    (balance_children <- balance_children :: src/binary_heap.c:135 <-
                         pop_binary_heap :: src/recommender.c:95 <-
                         recommend_items :: test/test.c:108 <-
    Use -val-ignore-recursive-calls to ignore (beware this will make the analysis
[value] user error: Degeneration occurred:
    results are not correct for lines of code that can be reached from the degeneration point.

We should have seen it coming: binary heaps are fond of recursion. Our coverage did improve from the previous analysis (from about 75% to about 85%), but we now hit a harder obstacle. To deal with this, we’ll have to stub the balance_children function, possibly over-approximating its behavior, or rewrite an equivalent, iterative version of the function. In either case, such transformations are out of the scope of this tutorial.

Conclusion and bonus features

In this tutorial, we showed how to use analysis-scripts, presenting most of its features in a quick usage scenario. There are still a few tricks up its sleeve which were not mentioned:

  • flamegraph.txt: this file is computed by option -val-flamegraph, and it produces a Flamegraph of the analysis performed by EVA. This graph is useful when profiling larger case studies, to quickly visualize where the analysis is spending most of its time. This can indicate functions where less slevel or ACSL stubs might help.
  • metrics.log: its output is displayed after running a .eva target. The coverage estimation is a quick indicator of progression in the analysis during the refining stage, i.e. sometimes when an analysis is made more precise, the number of alarms may actually increase, and the reason may be that the coverage has improved, so that more code is actually being analyzed, which might explain why now there are more alarms.
  • alarms.csv: a CSV report with the same contents of the Properties panel in the GUI, obtained via the Report plug-in. Very useful for scripts.
  • stats.txt: similar to the one present in the .parse directory; however, since parsing is usually very fast when compared to an EVA analysis, this version of the file is the one that is actually useful most of the time. In particular, if you want to compare the execution time of a new parameterization of the analysis, you just need to look at the user_time line. Very useful when you realize you forgot to type time make instead of make.

We hope this tutorial was useful to you and that the next Frama-C release will make things even easier! Don’t forget to check some usage examples in open-source-case-studies, and please consider proposing your own case studies via Github issues.

André Maroneze
15th Feb 2018