Frama-C-discuss mailing list archives

This page gathers the archives of the old Frama-C-discuss archives, that was hosted by Inria's gforge before its demise at the end of 2020. To search for mails newer than September 2020, please visit the page of the new mailing list on Renater.


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Frama-c-discuss] Issues with analyzing the nginx webserver using the eva plugin



Hello David and André,

thank you both for your detailed responses.
After reading your advice and tinkering around some more, I figured that
Frama-C probably is (in its current state and with its focus on
verification of small(er) embedded programs) not the ideal tool to analyze
nginx *for my specific use case*. Myself having barely any experience with
static analysis and the fact that nginx is full of dynamic allocations
doesn't help either ;)

Just in case it proofes useful, I have uploaded the exact commands and
custom include stubs on the frama-c branch in this repository:
https://github.com/cmfcmf/nginx/tree/frama-c (note that this is based on
nginx release-1.17.5)
You should be able to reproduce my attempts by executing the frama.sh
script.
I had to make changes to two nginx source files related to the itimerval
struct to successfully parse the source (see
https://github.com/cmfcmf/nginx/commit/d7eb963d40aec4e67d245fb6e0bbd997aa1b6d47).
I couldn't figure out why Frama-C wouldn't recognize the it_interval
and it_value fields.

Best regards
Christian

Am Mo., 4. Mai 2020 um 15:59 Uhr schrieb Andre Maroneze <
Andre.MARONEZE at cea.fr>:

> Hello Christian,
>
>
> I'd just like to complement David's excellent answer:
>
>
> - Eva was initially developed with embedded code in mind; we are
> continuously working towards general-purpose code, including dynamic
> allocation, recursion (in the future) and library calls. Dynamic allocation
> is supported, but due to its nature, a static analyzer is forced to
> compromise: the user often has to choose between analysis efficiency,
> precision and complexity of usage. By default, Eva minimizes precision, to
> keep it efficient and relative simple to use; with more precise knowledge
> of the code, and of Eva itself, it is possible to improve efficiency,
> precision, or both; but usually dynamic memory exacerbates small losses of
> precision elsewhere, even when it is not their direct cause.
>
>
> - We are constantly adding C11 and POSIX library function specifications
> to help analyze code using them; but third-party libraries such as
> cryptographic components and other code still require some effort. Since
> Eva performs a "white-box" interprocedural analysis by default, external
> code needs to be brought into the analysis somehow; either by writing some
> stubs or by including the third-party code in the analysis.
>
>
> - Recursion is one of our future targets, but it requires considerable
> effort to adapt the tool.
>
>
> - In practice, especially for first-time users, David's remarks stand:
> nginx itself is too large and complex to be precisely analyzed by Eva.
> Spending hours on it is rarely (if ever) going to improve results. A more
> interesting approach might be to toy with option `-eva-stop-at-nth-alarm
> <N>`, which stops the analysis when the <N>-th alarm is found. You can then
> open the GUI and, in some cases, obtain some "red" (Invalid) alarms, which
> will hopefully indicate issues deserving more immediate attention. David's
> recommendation of using kill with the USR1 signal results in something
> similar: the analysis is halted and its state is saved in a file. Running
> "frama-c-gui -load" on that file will show the partial result of the
> analysis. The most common issue I've found in such case are initialization
> problems, since they happen quite early in the analysis.
>
>
> - Another thing to take into account in large analyses is the generation
> of garbled mix: there is a special warning key, `eva:garbled-mix`, which
> warns when they are formed. In some cases this is inevitable, and in a few
> cases they are harmless; but in the majority of situations, they indicate
> the analysis has already lost precision and might snowball from there. In
> some cases, tracking them can help identify early cases of imprecise
> specifications or critical loops where extra precision is needed. Note that
> removing garbled mix often has a positive impact on the efficiency of the
> analysis.
>
> - Remember that Eva, by default, runs a context akin to a "test case": the
> analyzed functions correspond to those reachable during execution of a
> 'main' function. If you have a single test file with dozens or hundreds of
> test functions, a useful approach is to split them into separate analyses,
> e.g. with #ifdefs and corresponding `-cpp-extra-args=-DTEST1`,
> `-cpp-extra-args=-DTEST2`, etc. This allows running analyses in parallel
> and prevents losses of precision from one part of the analysis to spread
> into the rest.
>
>
> Overall, I'd say that your efforts to prepare the sources and ensure Eva
> can run are themselves already something remarkable. If by chance they are
> available somewhere, we'd be interested to take a look at them and see if
> we can provide some help into how to better parametrize Eva. If so, we'll
> work on the documentation to ensure such options become more prominent in
> the future.
>
>
>
> Best regards,
>
>
>
> On 03/05/2020 18:28, David MENTRÉ wrote:
>
> Hello Christian,
>
> In short, you are starting with a very complex project : nginx is big and
> complex code and EVA is not designed to analyze such kind of code (EVA
> target is safety critical embedded code, with very few library calls, no
> dynamic memory allocation and no recursion). You might be able to analyze
> and get result on some parts of nginx, although due to the use of many
> library calls it might not be that easy.
>
> Nonetheless, here are some hints:
>
>    - At any time, you can stop an analysis using "kill -USR1
>    pid-of-frama-c-process". It saves the intermediate result in
>    analyzed.sav.errors file and you can then use "frama-c-gui -load
>    analyzed.sav.errors" to load the analyze and try to see what is going on
>    (not analyzed code appear as red unreachable code and the degeneration
>    points appear as yellow & orange)
>    - When you are seeing message like "Assigning imprecise value to ...",
>    it means EVA is losing precision on the way the program is doing memory
>    copies. Most of the time, it means that EVA has completely lost track of
>    what the program is doing and you won't get any meaningful result afterwards
>    - In the same way, if EVA runs more than a few hours, most of the time
>    you won't get much of it
>    - As far as I know, the default parameters make the quickest analysis,
>    at the expense of lack of precision. You probably won't get better than that
>    - EVA is inherently single core application. The only way to use
>    multiple core is to cut your program into smaller parts and analyze
>    separately each part
>
> If nginx has some parts of code that does not do dynamic memory allocation
> and do not manipulate too much pointers, maybe you can use EVA on them.
>
>
> Best regards,
> david
>
> Le 30/04/2020 à 14:50, Flach, Christian a écrit :
>
> Hello everyone,
>
> I am trying to use Frama-C for analyzing the nginx webserver. I have not
> used Frama-C before.
>
> My goal is to be able to answer questions like "why would the webserver
> respond with status code 200, given an incoming request and the webserver
> configuration?". On a C-level, that would mean questions like: "what
> configuration struct fields will lead to the status field of the response
> struct being set to 200?" or "which configuration struct fields influence
> the response header field in the response struct".
>
> I would hope to be able to answer these questions by using static analysis
> of nginx. Frama-C (and its eva plugin) should be able to answer these
> questions, if I'm not mistaken.
>
> I was able to successfully parse the source code with Frama-C. I got most
> of it working by following this blog post [1] and copied the other
> necessary parts from header files on my system.
>
> I am stuck with running the eva plugin. I was able to successfully run it
> when I parsed only a subset of nginx's source files. Running eva took less
> than 3 hours and seemed to work. However, when I try to run the eva plugin
> after parsing a bigger set of nginx source files, it never finishes:
>
> I have let eva run for 40 hours on a 4,4GHz CPU and it didn't finish.
> While I see lot's of logging happening in the first ~3 hours, there is a
> gap between log messages of 15-60 minutes after that.
>
> My issue is that I have no idea
> - whether the analysis is actually ongoing when running it that long or
> stuck in some sort of loop
> - how close the analysis is to finishing / how much progress has been made
> - whether there is anything I can configure to make the analysis finish
> faster or run on more than one core
>
> For reference, this is how I parse the code and execute the eva plugin
> [2]. I also see hundreds, of lines like this [3]. Maybe Frama-C is choking
> on nginx's heavy use of dynamic allocations and memory pooling?
>
> Do you have any advice for running eva on a big project like nginx? Are
> there command line options that I am missing that might help with figuring
> out if something is wrong? Are there any additional options to make the
> analysis faster by giving up some precision? I also want to make clear that
> I am not trying to proof or verify any nginx properties; approximations are
> fine to some degree.
>
>
> Best
> Christian
> (apologies if you received this email twice - it appears like I first used
> the wrong email address)
>
>
> [1]
> http://blog.frama-c.com/index.php?post/2018/07/06/Parsing-realistic-code-bases-with-Frama-C
>
>
> [2]
>
> export FRAMA_C_MEMORY_FOOTPRINT=10
>
> # Parsing
>
> CPP_EXTRA_ARGS="-include../ext-include/__fc_stubs.h -I ../ext-include -I
> src/core -I src/event -I src/event/modules -I src/os/unix -I objs -I
> src/http -I src/http/modules -DNGX_HAVE_GMTOFF=0 -DNGX_HAVE_O_PATH=0
> -DNGX_ZLIB=0 -DNGX_HAVE_TCP_INFO=0 -DNGX_HAVE_MEMALIGN=0
> -DNGX_HAVE_POSIX_MEMALIGN=0 -DNGX_HAVE_IP_PKTINFO=0
> -DNGX_HAVE_GNU_CRYPT_R=0 -DNGX_HAVE_EPOLL=0 -DNGX_HAVE_EPOLLEXCLUSIVE=0
> -DNGX_HAVE_ACCEPT4=0"
>
> frama-c -metrics -cpp-extra-args="$CPP_EXTRA_ARGS" \
>         -kernel-warn-key annot:missing-spec=abort \
>         -kernel-warn-key typing:implicit-function-declaration=abort \
>         -save=$OUTPATH/parsed.sav $FILES | tee $OUTPATH/parse.log
>
> # EVA
>
> frama-c -load $OUTPATH/parsed.sav \
>         -main main \
>         -eva \
>         -eva-ignore-recursive-calls \
>         -eva-use-spec=ngx_log_error_core \
>         -eva-warn-key builtins:missing-spec=abort \
>         -eva-warn-key alarm=inactive \
>         -save=$OUTPATH/analyzed.sav | tee $OUTPATH/analyze.log
>
> [3]
>
> [eva] src/core/ngx_string.c:273:
> Assigning imprecise value to *tmp_4
>   (pointing to ngx_errlog_module with offsets [0..792],0%8;
>   ngx_core_module with offsets [0..792],0%8;
>   init_cycle with offsets [0..2488],0%8;
>   ngx_conf_module with offsets [0..792],0%8;
>   ngx_http_header_filter_module with offsets [0..792],0%8;
>   ngx_debug_points with offsets [0..280],0%8;
>   ngx_core_commands with offsets [0..3800],0%8;
>   ngx_core_module_ctx with offsets [0..120],0%8;
>   ngx_errlog_commands with offsets [0..440],0%8;
>   ngx_errlog_module_ctx with offsets [0..120],0%8;
>   ngx_log with offsets [0..312],0%8; ngx_log_file with offsets
> [0..152],0%8;
>   ngx_conf_commands with offsets [0..440],0%8;
>   ngx_syslog_dummy_log with offsets [0..312],0%8;
>   ngx_syslog_dummy_event with offsets [0..408],0%8;
>   ngx_http_header_filter_module_ctx with offsets [0..248],0%8;
>   __malloc_ngx_alloc_l22 with offsets [0..8184];
>   __malloc_ngx_alloc_l22_2 with offsets [0..408];
>
>    ... 100s to 1000s more malloc lines go here ...
>
>   __malloc_ngx_alloc_l22_7196; __malloc_ngx_alloc_l22_7197;
>   __malloc_w_ngx_alloc_l22_7198; __malloc_w_ngx_alloc_l22_7199;
>   __malloc_w_ngx_alloc_l22_7200}
>   because of Misaligned.
>
>
>
> _______________________________________________
> Frama-c-discuss mailing listFrama-c-discuss at lists.gforge.inria.frhttps://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
>
>
>
> _______________________________________________
> Frama-c-discuss mailing listFrama-c-discuss at lists.gforge.inria.frhttps://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
>
> --
> André Maroneze
> Researcher/Engineer CEA/List
> Software Reliability and Security Laboratory
>
> _______________________________________________
> Frama-c-discuss mailing list
> Frama-c-discuss at lists.gforge.inria.fr
> https://lists.gforge.inria.fr/mailman/listinfo/frama-c-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.gforge.inria.fr/pipermail/frama-c-discuss/attachments/20200510/7991193c/attachment-0001.html>