What functions does a function use: option -users

Pascal Cuoq - 5th Nov 2011

Exploring unfamiliar code

Sometimes one has to explore unfamiliar code. In these circumstances, it is useful to know which functions a function f() uses. This sounds like something that can be computed from the callgraph, and plenty of tools out there can extract a callgraph from a C program, but the callgraph approach has two drawbacks:

  1. A static callgraph does not include calls made through function pointers. Therefore, you do not see all the functions that f() uses: the list omits the functions that were directly or indirectly called through a function pointer.
  2. The set of functions computed from the callgraph is over-approximated: if f() calls g() and g() may sometimes call h(), it does not necessarily follow that f() uses h(). Perhaps g() never calls h() when it is called from f(), but only when it is called from another function k().

Example

Here is an example that illustrates both issues.

enum op { ADD, MULT };
void copy_int(int *src, int *dst)
{
  *dst = *src;
}
int really_add(int u, int v)
{
  return u + v;
}
int really_mult(int u, int v)
{
  return u * v;
}
int do_op(enum op op, int u, int v)
{
  if (op == ADD)
    return really_add(u, v);
  else if (op == MULT)
    return really_mult(u, v);
  else
    return -1;
}
int add(int x, int y)
{
  int a, b, res;
  void (*fun_ptr)(int*, int*);
  fun_ptr = copy_int;
  (*fun_ptr)(&x, &a);
  (*fun_ptr)(&y, &b);
  res = do_op(ADD, a, b);
  return res;
}

Using a syntactic callgraph to compute the functions used by add(), one finds do_op(), really_add(), and really_mult(). This list is over-approximated because add() does not really use really_mult(). More importantly, the list omits function copy_int(), which is used by add().
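To make the difference concrete, here is a minimal sketch that instruments the example with call counters and runs add() once. The counters (n_copy_int and so on) and the helper trace_add() are hypothetical additions for illustration; they are not part of the original example. A single run only observes one execution, so this is a sanity check of the claim above, not a substitute for an analysis that covers all inputs.

```c
#include <assert.h>
#include <stdio.h>

enum op { ADD, MULT };

/* Hypothetical call counters, one per function of the example. */
static int n_copy_int, n_really_add, n_really_mult, n_do_op;

void copy_int(int *src, int *dst)
{
  n_copy_int++;
  *dst = *src;
}

int really_add(int u, int v)
{
  n_really_add++;
  return u + v;
}

int really_mult(int u, int v)
{
  n_really_mult++;
  return u * v;
}

int do_op(enum op op, int u, int v)
{
  n_do_op++;
  if (op == ADD)
    return really_add(u, v);
  else if (op == MULT)
    return really_mult(u, v);
  else
    return -1;
}

int add(int x, int y)
{
  int a, b, res;
  void (*fun_ptr)(int*, int*);
  fun_ptr = copy_int;          /* invisible to a syntactic callgraph */
  (*fun_ptr)(&x, &a);
  (*fun_ptr)(&y, &b);
  res = do_op(ADD, a, b);
  return res;
}

/* Resets the counters, runs add() once, and prints how often
   each function was reached during that run. */
int trace_add(int x, int y)
{
  int res;
  n_copy_int = n_really_add = n_really_mult = n_do_op = 0;
  res = add(x, y);
  assert(res == x + y);        /* add() should behave as an addition */
  printf("copy_int=%d do_op=%d really_add=%d really_mult=%d\n",
         n_copy_int, n_do_op, n_really_add, n_really_mult);
  return res;
}
```

Calling trace_add(2, 3) from a main() function shows copy_int() reached twice through the function pointer and really_mult() never reached, which is exactly where the syntactic callgraph goes wrong in both directions.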

Frama-C's users analysis

Frama-C's users analysis computes this list instead:

$ frama-c -users -lib-entry -main add example.c
...
[users] ====== DISPLAYING USERS ======
        do_op: really_add
        add: copy_int really_add do_op
        ====== END OF USERS ==========

The users analysis exploits the results of the value analysis, so its results hold for the initial conditions the value analysis was configured for. Here, the value analysis was instructed to study the function add() by itself. In these conditions, do_op() only calls really_add(), but if the analysis were run on a larger program, it would see that do_op() also sometimes calls really_mult(). The users analysis can tell that add() uses copy_int(), really_add(), and do_op(), and does not use really_mult().
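As a sketch of what such a "larger program" could look like, the code below adds a second caller of do_op(). The functions mult(), both(), and check_both() are hypothetical and not part of the original example. If the analysis were pointed at both() instead of add() (for instance by changing the argument of -main), one would expect do_op() to be reported as using really_mult() as well, while add() itself, which only ever passes ADD, should still not use it. No analysis output is shown here because none was produced for this variant.

```c
#include <assert.h>

enum op { ADD, MULT };

void copy_int(int *src, int *dst)
{
  *dst = *src;
}

int really_add(int u, int v)
{
  return u + v;
}

int really_mult(int u, int v)
{
  return u * v;
}

int do_op(enum op op, int u, int v)
{
  if (op == ADD)
    return really_add(u, v);
  else if (op == MULT)
    return really_mult(u, v);
  else
    return -1;
}

int add(int x, int y)
{
  int a, b, res;
  void (*fun_ptr)(int*, int*);
  fun_ptr = copy_int;
  (*fun_ptr)(&x, &a);
  (*fun_ptr)(&y, &b);
  res = do_op(ADD, a, b);
  return res;
}

/* Hypothetical second caller: reaches really_mult() through do_op(). */
int mult(int x, int y)
{
  return do_op(MULT, x, y);
}

/* Hypothetical entry point that exercises both callers of do_op(). */
int both(int x, int y)
{
  return add(x, y) + mult(x, y);
}

/* Sanity check: 2 + 3 and 2 * 3 sum to 11. */
void check_both(void)
{
  assert(both(2, 3) == 11);
}
```

The point of the variant is that the answer to "what does do_op() use" depends on the entry point and initial conditions chosen for the analysis, whereas a syntactic callgraph gives the same answer regardless of context.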

This kind of synthetic information is very useful when trying to get a grip on large programs, for instance, when trying to extract a useful function from a large codebase to make it a library. Unsurprisingly, plenty of tools already existed before Frama-C that tried to provide this sort of information. But having information on the dynamic behavior of the program can make a large difference in the value of the synthetic information computed.

My colleagues at Airbus Opérations SAS and Atos SA will present serious applications of the -users option at ERTS² 2012 (next February).
