On switch statements

Pascal Cuoq - 28th Feb 2011

In Carbon 20110201 and earlier versions of Frama-C, if you do not use precautions when analyzing a program with switch statements, you get imprecise results. Consider the example program below.

main(int argc, char **argv){ 
  switch(argc){ 
  case 1:  
    Frama_C_show_each_no_args(argc); 
    break; 
  case 2: 
    Frama_C_show_each_exactly_2(argc);     
    /* fall through */ 
  case 3: 
    Frama_C_show_each_2_or_3(argc); 
    break; 
  default: 
    Frama_C_show_each_else(argc); 
    break; 
  } 
  return argc; 
} 
$ frama-c -val sw.c 
... 
[value] Called Frama_C_show_each_exactly_2([-2147483648..2147483647]) 
[value] Called Frama_C_show_each_no_args([-2147483648..2147483647]) 
[value] Called Frama_C_show_each_else([-2147483648..2147483647]) 
[value] Called Frama_C_show_each_2_or_3([-2147483648..2147483647]) 
... 

This is correct, but disappointingly imprecise. Frama-C (the framework) can normalize the code to a large extent so that its plug-ins (here, the value analysis) do not need to handle redundant constructs. A switch construct can be encoded using if and goto. Frama-C option -simplify-cfg does exactly that, for the convenience of plug-in authors. As long as they have already programmed a precise analysis for these constructions, they can always get a precise result by requiring the program to be transformed thus.

Crediting where credit is due, the C code normalization philosophy is inherited from CIL which provided the basis of Frama-C's front-end.

The transformed program looks like this:

$ frama-c -simplify-cfg -print sw.c 
... 
  if (! (argc == 1)) { 
    if (argc == 2) { goto switch_0_2; } 
    else { 
      if (argc == 3) { goto switch_0_3; } 
      else { goto switch_0_default; goto switch_0_break; } 
    } 
  } 
  switch_0_1: /* internal */ Frama_C_show_each_no_args(argc); 
  goto switch_0_break; 
  switch_0_2: /* internal */ Frama_C_show_each_exactly_2(argc); 
  switch_0_3: /* internal */ Frama_C_show_each_2_or_3(argc); 
  goto switch_0_break; 
  switch_0_default: /* internal */ ; 
  Frama_C_show_each_else(argc); 
  goto switch_0_break; 
  switch_0_break: /* internal */ ; 
  return (argc); 
... 

I did not say the transformed code looked good. But usually you do not need to look at the generated code; the analysis results are what you are interested in:

$ frama-c -simplify-cfg -val sw.c 
... 
[value] Called Frama_C_show_each_no_args({1; }) 
[value] Called Frama_C_show_each_exactly_2({2; }) 
[value] Called Frama_C_show_each_2_or_3({2; 3; }) 
[value] Called Frama_C_show_each_else([-2147483648..2147483647]) 
... 

This time the value analysis' results show that each time the program point at which we placed the Frama_C_show_each_no_args call is traversed the value of argc is exactly 1. The program point with the call to Frama_C_show_each_exactly_2 is only traversed with argc==2 and so on. These are the expected results.

Sometimes however the value analysis is used as a means to a different end and option -simplify-cfg can conflict with the final objective. Let's say the example program is modified slightly:

int x  t[4]; 
main(int argc  char **argv){ 
  t[0] = 10; 
  t[1] = 11; 
  t[2] = 12; 
  t[3] = 13; 
  switch(argc){ 
  case 1:  
    Frama_C_show_each_no_args(argc); 
    break; 
  case 2: 
    Frama_C_show_each_exactly_2(argc); 
    /* fall through */ 
  case 3: 
    Frama_C_show_each_2_or_3(argc); 
    x = t[argc]; 
    break; 
  default: 
    Frama_C_show_each_else(argc); 
    break; 
  } 
  return x; 
} 

If the task we are actually interested in is obtaining a subset of the statements of the above program so that this subset when compiled and executed returns the same value as the original program option -simplify-cfg gets in the way. The options to instruct the slicing plug-in to compute this subset are -slice-return main -slice-print. We face the following dilemma:

  1. not use option -simplify-cfg give the slicing plug-in access to an unmodified abstract syntax tree but let it rely on imprecise values; or
  2. use option -simplify-cfg get precise values but have the slicing plug-in work on an abstract syntax tree representing a mangled version of the original program.

In the first case we get the sliced program below which is shaped like the original but retains several statements that could have been removed:

$ frama-c sw.c -slice-return main -slice-print  
... 
{ 
  t[0] = 10; 
  t[1] = 11; 
  t[2] = 12; 
  t[3] = 13; 
   switch (argc) { 
    case 1: ; 
    break; 
    case 2: ; 
    case 3: ; 
    /*@ assert (0 ≤ argc) ∧ (argc < 4); 
          // synthesized 
       */ 
    x = t[argc]; 
    break; 
    default: ; 
  } 
  return (x); 
} 

In the second case some useless initializations are removed but the transformation makes the sliced program harder to follow:

$ frama-c sw.c -slice-return main -slice-print -simplify-cfg 
... 
{ 
  t[2] = 12; 
  t[3] = 13; 
  if (! (argc == 1)) { 
    if (argc == 2) { goto switch_0_2; } 
    else { if (argc == 3) { goto switch_0_3; } } 
  } 
  goto switch_0_break; 
  switch_0_2: /* internal */ ; 
  switch_0_3: /* internal */ ; 
  x = t[argc]; 
  switch_0_break: /* internal */ ; 
  return (x); 
} 

Or... you can use the next version of Frama-C:

$ bin/toplevel.opt sw.c -slice-return main -slice-print  
... 
{ 
  t[2] = 12; 
  t[3] = 13; 
  switch (argc) { 
    case 1: ; 
    break; 
    case 2: ; 
    case 3: ; 
    x = t[argc]; 
    break; 
    default: ; 
  } 
  return (x); 
} 

As you can see on this example from the next release onwards option -simplify-cfg is no longer necessary to get precise results with the value analysis. This means better slicing of programs that use the switch construct — and one option that you may now forget about.

Many thanks to Boris Taker of our Own Value Analysis Medicine extraordinaire for implementing the new feature described in this post and to Miguel Ferreira for providing expertise in English idioms.

Pascal Cuoq
28th Feb 2011