Never forget to sanitize your input

Virgile Prevosto - 19th Sep 2012

This post is a follow up of this one

A facetious colleague pointed out to me that the print_stmt function that is used to display the CFG in the post mentioned above behaves incorrectly when used over code that include string constants such as the one below:

void f(const char *); 
void test(int c) { 
  if ("A random string"[c]=='s') f("Hello  world"); else f("Goodbye"); 

Namely poor dot is a little bit confused by all these double quotes occurring in labels that are themselves delimited by double quotes. It manages to output something but the result is not really satisfying: cfg_string_cst.png

Thus it looks as though the first improvement to our little script will be to have a more robust pretty-printing function for statements. Basically we have to escape those pesky double quotes that might appear when pretty-printing a C expression. Fortunately OCaml's Printf (hence Format) module does know how to do this with the "%S" directive. There's one drawback though: this directive will also put (unescaped) double quotes at the beginning and end of the result that we'll have to get rid of. In the end the escaping function is the following:

let protect pretty chan v = 
  let s = Pretty_utils.sfprintf "%a" pretty v in 
  let s = Pretty_utils.sfprintf "%S" s in 
  Format.pp_print_string chan (String.sub s 1 (String.length s - 2)) 

First we create a string containing the the unescaped output. Pretty_utils.sfprintf is a Frama-C function that is analogous to Format.sprintf except that 1) it can be fed with the same pretty-printing functions as fprintf when there is a custom "%a" directive and 2) each call uses its own internal buffer avoiding subtle flushing issues. Second we escape all the inner double quotes. Finally we output the result except for the first and last characters that by definition of "%S" are double quotes.

We can now use protect in print_stmt each time there might be an issue (the complete file is available here):

 let print_stmt out = 
-   | Instr i -> !Ast_printer.d_instr out i 
+   | Instr i -> protect !Ast_printer.d_instr out i 
-   | If (e _ _ _) -> Format.fprintf out "if %a" !Ast_printer.d_exp e 
-   | Switch(e _ _ _) -> Format.fprintf out "switch %a" !Ast_printer.d_exp e 
+   | If (e _ _ _) -> Format.fprintf out "if %a" (protect !Ast_printer.d_exp) e 
+   | Switch(e _ _ _) -> 
+       Format.fprintf out "switch %a"  
+         (protect !Ast_printer.d_exp) e 

Using this new version of the script frama-c -load-script test.c && dot -Tpng -O cfg.out produces a better result: cfg_string_cst_correct.png

Virgile Prevosto
19th Sep 2012