Frama-C-discuss mailing list archives

This page gathers the archives of the old Frama-C-discuss archives, that was hosted by Inria's gforge before its demise at the end of 2020. To search for mails newer than September 2020, please visit the page of the new mailing list on Renater.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Frama-c-discuss] Issues with analyzing the nginx webserver using the eva plugin



Hello Christian,


I'd just like to complement David's excellent answer:


- Eva was initially developed with embedded code in mind; we are 
continuously working towards general-purpose code, including dynamic 
allocation, recursion (in the future) and library calls. Dynamic 
allocation is supported, but due to its nature, a static analyzer is 
forced to compromise: the user often has to choose between analysis 
efficiency, precision and complexity of usage. By default, Eva minimizes 
precision, to keep it efficient and relative simple to use; with more 
precise knowledge of the code, and of Eva itself, it is possible to 
improve efficiency, precision, or both; but usually dynamic memory 
exacerbates small losses of precision elsewhere, even when it is not 
their direct cause.


- We are constantly adding C11 and POSIX library function specifications 
to help analyze code using them; but third-party libraries such as 
cryptographic components and other code still require some effort. Since 
Eva performs a "white-box" interprocedural analysis by default, external 
code needs to be brought into the analysis somehow; either by writing 
some stubs or by including the third-party code in the analysis.


- Recursion is one of our future targets, but it requires considerable 
effort to adapt the tool.


- In practice, especially for first-time users, David's remarks stand: 
nginx itself is too large and complex to be precisely analyzed by Eva. 
Spending hours on it is rarely (if ever) going to improve results. A 
more interesting approach might be to toy with option 
`-eva-stop-at-nth-alarm <N>`, which stops the analysis when the <N>-th 
alarm is found. You can then open the GUI and, in some cases, obtain 
some "red" (Invalid) alarms, which will hopefully indicate issues 
deserving more immediate attention. David's recommendation of using kill 
with the USR1 signal results in something similar: the analysis is 
halted and its state is saved in a file. Running "frama-c-gui -load" on 
that file will show the partial result of the analysis. The most common 
issue I've found in such case are initialization problems, since they 
happen quite early in the analysis.


- Another thing to take into account in large analyses is the generation 
of garbled mix: there is a special warning key, `eva:garbled-mix`, which 
warns when they are formed. In some cases this is inevitable, and in a 
few cases they are harmless; but in the majority of situations, they 
indicate the analysis has already lost precision and might snowball from 
there. In some cases, tracking them can help identify early cases of 
imprecise specifications or critical loops where extra precision is 
needed. Note that removing garbled mix often has a positive impact on 
the efficiency of the analysis.


- Remember that Eva, by default, runs a context akin to a "test case": 
the analyzed functions correspond to those reachable during execution of 
a 'main' function. If you have a single test file with dozens or 
hundreds of test functions, a useful approach is to split them into 
separate analyses, e.g. with #ifdefs and corresponding 
`-cpp-extra-args=-DTEST1`, `-cpp-extra-args=-DTEST2`, etc. This allows 
running analyses in parallel and prevents losses of precision from one 
part of the analysis to spread into the rest.


Overall, I'd say that your efforts to prepare the sources and ensure Eva 
can run are themselves already something remarkable. If by chance they 
are available somewhere, we'd be interested to take a look at them and 
see if we can provide some help into how to better parametrize Eva. If 
so, we'll work on the documentation to ensure such options become more 
prominent in the future.



Best regards,



On 03/05/2020 18:28, David MENTRÉ wrote:
> Hello Christian,
>
> In short, you are starting with a very complex project : nginx is big 
> and complex code and EVA is not designed to analyze such kind of code 
> (EVA target is safety critical embedded code, with very few library 
> calls, no dynamic memory allocation and no recursion). You might be 
> able to analyze and get result on some parts of nginx, although due to 
> the use of many library calls it might not be that easy.
>
> Nonetheless, here are some hints:
>
>   * At any time, you can stop an analysis using "kill -USR1
>     pid-of-frama-c-process". It saves the intermediate result in
>     analyzed.sav.errors file and you can then use "frama-c-gui -load
>     analyzed.sav.errors" to load the analyze and try to see what is
>     going on (not analyzed code appear as red unreachable code and the
>     degeneration points appear as yellow & orange)
>   * When you are seeing message like "Assigning imprecise value to
>     ...", it means EVA is losing precision on the way the program is
>     doing memory copies. Most of the time, it means that EVA has
>     completely lost track of what the program is doing and you won't
>     get any meaningful result afterwards
>   * In the same way, if EVA runs more than a few hours, most of the
>     time you won't get much of it
>   * As far as I know, the default parameters make the quickest
>     analysis, at the expense of lack of precision. You probably won't
>     get better than that
>   * EVA is inherently single core application. The only way to use
>     multiple core is to cut your program into smaller parts and
>     analyze separately each part
>
> If nginx has some parts of code that does not do dynamic memory 
> allocation and do not manipulate too much pointers, maybe you can use 
> EVA on them.
>
>
> Best regards,
> david
>
>
> Le 30/04/2020 à 14:50, Flach, Christian a écrit :
>> Hello everyone,
>>
>> I am trying to use Frama-C for analyzing the nginx webserver. I have 
>> not used Frama-C before.
>>
>> My goal is to be able to answer questions like "why would the 
>> webserver respond with status code 200, given an incoming request and 
>> the webserver configuration?". On a C-level, that would mean 
>> questions like: "what configuration struct fields will lead to the 
>> status field of the response struct being set to 200?" or "which 
>> configuration struct fields influence the response header field in 
>> the response struct".
>>
>> I would hope to be able to answer these questions by using static 
>> analysis of nginx. Frama-C (and its eva plugin) should be able to 
>> answer these questions, if I'm not mistaken.
>>
>> I was able to successfully parse the source code with Frama-C. I got 
>> most of it working by following this blog post [1] and copied the 
>> other necessary parts from header files on my system.
>>
>> I am stuck with running the eva plugin. I was able to successfully 
>> run it when I parsed only a subset of nginx's source files. Running 
>> eva took less than 3 hours and seemed to work. However, when I try to 
>> run the eva plugin after parsing a bigger set of nginx source files, 
>> it never finishes:
>>
>> I have let eva run for 40 hours on a 4,4GHz CPU and it didn't finish. 
>> While I see lot's of logging happening in the first ~3 hours, there 
>> is a gap between log messages of 15-60 minutes after that.
>>
>> My issue is that I have no idea
>> - whether the analysis is actually ongoing when running it that long 
>> or stuck in some sort of loop
>> - how close the analysis is to finishing / how much progress has been 
>> made
>> - whether there is anything I can configure to make the analysis 
>> finish faster or run on more than one core
>>
>> For reference, this is how I parse the code and execute the eva 
>> plugin [2]. I also see hundreds, of lines like this [3]. Maybe 
>> Frama-C is choking on nginx's heavy use of dynamic allocations and 
>> memory pooling?
>>
>> Do you have any advice for running eva on a big project like nginx? 
>> Are there command line options that I am missing that might help with 
>> figuring out if something is wrong? Are there any additional options 
>> to make the analysis faster by giving up some precision? I also want 
>> to make clear that I am not trying to proof or verify any nginx 
>> properties; approximations are fine to some degree.
>>
>>
>> Best
>> Christian
>> (apologies if you received this email twice - it appears like I first 
>> used the wrong email address)
>>
>>
>> [1] 
>> http://blog.frama-c.com/index.php?post/2018/07/06/Parsing-realistic-code-bases-with-Frama-C
>>
>>
>> [2]
>>
>> export FRAMA_C_MEMORY_FOOTPRINT=10
>>
>> # Parsing
>>
>> CPP_EXTRA_ARGS="-include../ext-include/__fc_stubs.h -I ../ext-include 
>> -I src/core -I src/event -I src/event/modules -I src/os/unix -I objs 
>> -I src/http -I src/http/modules -DNGX_HAVE_GMTOFF=0 
>> -DNGX_HAVE_O_PATH=0 -DNGX_ZLIB=0 -DNGX_HAVE_TCP_INFO=0 
>> -DNGX_HAVE_MEMALIGN=0 -DNGX_HAVE_POSIX_MEMALIGN=0 
>> -DNGX_HAVE_IP_PKTINFO=0 -DNGX_HAVE_GNU_CRYPT_R=0 -DNGX_HAVE_EPOLL=0 
>> -DNGX_HAVE_EPOLLEXCLUSIVE=0 -DNGX_HAVE_ACCEPT4=0"
>>
>> frama-c -metrics -cpp-extra-args="$CPP_EXTRA_ARGS" \
>>         -kernel-warn-key annot:missing-spec=abort \
>>         -kernel-warn-key typing:implicit-function-declaration=abort \
>> -save=$OUTPATH/parsed.sav $FILES | tee $OUTPATH/parse.log
>>
>> # EVA
>>
>> frama-c -load $OUTPATH/parsed.sav \
>>         -main main \
>>         -eva \
>> -eva-ignore-recursive-calls \
>> -eva-use-spec=ngx_log_error_core \
>>         -eva-warn-key builtins:missing-spec=abort \
>>         -eva-warn-key alarm=inactive \
>> -save=$OUTPATH/analyzed.sav | tee $OUTPATH/analyze.log
>>
>> [3]
>>
>> [eva] src/core/ngx_string.c:273:
>> Assigning imprecise value to *tmp_4
>>   (pointing to ngx_errlog_module with offsets [0..792],0%8;
>>   ngx_core_module with offsets [0..792],0%8;
>>   init_cycle with offsets [0..2488],0%8;
>>   ngx_conf_module with offsets [0..792],0%8;
>> ngx_http_header_filter_module with offsets [0..792],0%8;
>>   ngx_debug_points with offsets [0..280],0%8;
>>   ngx_core_commands with offsets [0..3800],0%8;
>>   ngx_core_module_ctx with offsets [0..120],0%8;
>>   ngx_errlog_commands with offsets [0..440],0%8;
>>   ngx_errlog_module_ctx with offsets [0..120],0%8;
>>   ngx_log with offsets [0..312],0%8; ngx_log_file with offsets 
>> [0..152],0%8;
>>   ngx_conf_commands with offsets [0..440],0%8;
>>   ngx_syslog_dummy_log with offsets [0..312],0%8;
>>   ngx_syslog_dummy_event with offsets [0..408],0%8;
>> ngx_http_header_filter_module_ctx with offsets [0..248],0%8;
>>   __malloc_ngx_alloc_l22 with offsets [0..8184];
>>   __malloc_ngx_alloc_l22_2 with offsets [0..408];
>>
>>    ... 100s to 1000s more malloc lines go here ...
>>
>> __malloc_ngx_alloc_l22_7196; __malloc_ngx_alloc_l22_7197;
>> __malloc_w_ngx_alloc_l22_7198; __malloc_w_ngx_alloc_l22_7199;
>> __malloc_w_ngx_alloc_l22_7200}
>>   because of Misaligned.
>>
>>
>>
>> _______________________________________________
>> Frama-c-discuss mailing list
>> Frama-c-discuss at lists.gforge.inria.fr
>> https://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
>
>
>
> _______________________________________________
> Frama-c-discuss mailing list
> Frama-c-discuss at lists.gforge.inria.fr
> https://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss

-- 
André Maroneze
Researcher/Engineer CEA/List
Software Reliability and Security Laboratory

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gforge.inria.fr/pipermail/frama-c-discuss/attachments/20200504/8877d2df/attachment-0001.html>