# The problem with differential testing is that at least one of the compilers must get it right

Pascal Cuoq - 25th Sep 2013

A long time ago, John Regehr wrote a blog post about a 3-3 split vote that occurred while he was finding bugs in C compilers through differential testing. John could have included Frama-C's value analysis in his set of C implementations and then the vote would have been 4-3 for the correct interpretation (Frama-C's value analysis predicts the correct value on the particular C program that was the subject of the post). But self-congratulatory remarks are not the subject of today's post. Non-split votes in differential testing where all compilers get it wrong are.

## A simple program to find double-rounding examples

The program below looks for examples of harmful double-rounding in floating-point multiplication. Harmful double-rounding occurs when the product of two `double` operands differs depending on how it is rounded: directly to what fits the `double` format (double-precision multiplication), or first to extended-double and then again to `double` (extended-double multiplication). The second path involves two roundings because the mathematical product of two `double` numbers may not be exactly representable even in extended-double precision, and the second rounding can change the result.
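Before the program itself, the effect can be reproduced on a toy scale with integers. The sketch below is an illustration only, not the method used by the program: the hypothetical helper `round_nearest_even` rounds an integer to a multiple of a power of two with ties going to the even multiple, which stands in for rounding a significand to fewer bits.

```c
#include <stdint.h>

/* Hypothetical helper: round x to the nearest multiple of 2^drop,
   breaking ties toward the multiple whose quotient is even.
   Assumes 0 < drop < 64 and that the result does not overflow. */
uint64_t round_nearest_even(uint64_t x, unsigned drop)
{
    uint64_t unit = (uint64_t)1 << drop;   /* spacing of representable values */
    uint64_t half = unit >> 1;             /* distance to the midpoint */
    uint64_t rem  = x & (unit - 1);        /* part that gets rounded away */
    uint64_t down = x - rem;               /* nearest multiple at or below x */

    if (rem > half) return down + unit;    /* closer to the upper multiple */
    if (rem < half) return down;           /* closer to the lower multiple */
    /* Exact tie: keep the candidate with an even quotient. */
    return ((down >> drop) & 1) ? down + unit : down;
}
```

With `x = 23`, rounding directly to a multiple of 16 gives 16. Rounding to a multiple of 4 first gives 24, which sits exactly halfway between 16 and 32 and then goes to 32 under ties-to-even. The two-step result differs from the direct one, which is the integer analogue of a harmful double-rounding.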

```
$ cat dr.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>
#include <limits.h>
int main(){
  printf("%d %a %La\n", FLT_EVAL_METHOD, DBL_MAX, LDBL_MAX);
  while(1){
    double d1 = ((unsigned long)rand()<<32) +
                ((unsigned long)rand()<<16) + rand() ;
    double d2 = ((unsigned long)rand()<<32) +
                ((unsigned long)rand()<<16) + rand() ;
    long double ld1 = d1;
    long double ld2 = d2;
    if (d1 * d2 != (double)(ld1 * ld2))
      printf("%a*%a=%a but (double)((long double) %a * %a))=%a\n",
             d1, d2, d1*d2,
             d1, d2, (double)(ld1 * ld2));
  }
}
```

The program is platform-dependent, but if it starts by printing something like the line below, then a long list of double-rounding examples should immediately follow:

```
0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380
```

## Results

In my case what happened was:

```
$ gcc -v
Using built-in specs.
Target: i686-apple-darwin11
...
gcc version 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.11.00)
$ gcc -std=c99 -O2 -Wall dr.c && ./a.out
0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380
^C
```

I immediately blamed myself for miscalculating the probability of easily finding such examples, getting a conversion wrong, or following `while (1)` with a semicolon. But it turned out I had not done any of those things. I turned to Clang for a second opinion:

```
$ clang -v
Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.4.0
$ clang -std=c99 -O2 -Wall dr.c && ./a.out
0 0x1.fffffffffffffp+1023 0xf.fffffffffffffffp+16380
^C
```

## Conclusion

It became clear what had happened when looking at the assembly code:

```
$ clang -std=c99 -O2 -Wall -S dr.c && cat dr.s
...
mulsd	%xmm4, %xmm5
ucomisd	%xmm5, %xmm5
jnp	LBB0_1
...
```

Clang had compiled the test for deciding whether to call `printf()` into `if (xmm5 != xmm5)` for some register `xmm5`.
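Comparing a value with itself can only report a difference when that value is NaN, which explains the silence: `ucomisd` sets the parity flag for an unordered (NaN) comparison, and `jnp` takes the jump, presumably back to the top of the loop, whenever that flag is clear. The sketch below uses hypothetical helper names and assumes IEEE 754 semantics without `-ffast-math`-style flags; it shows what the folded test amounts to.

```c
#include <math.h>      /* isnan */
#include <stdbool.h>

/* Hypothetical helper mirroring the folded test `xmm5 != xmm5`. */
bool differs_from_itself(double x)
{
    return x != x;     /* true only when x is NaN */
}

/* Cross-check against the standard classification macro. */
bool agrees_with_isnan(double x)
{
    return differs_from_itself(x) == (isnan(x) != 0);
}
```

The operands in `dr.c` are converted from non-negative integers, so on a platform with 64-bit `unsigned long` their products should be finite and never NaN, and a test of this shape cannot trigger the `printf()` call.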

```
$ gcc -std=c99 -O2 -Wall -S dr.c && cat dr.s
...
mulsd	%xmm1, %xmm2
ucomisd	%xmm2, %xmm2
jnp	LBB1_1
...
```

And GCC had done the same. To be fair, the two compilers appear to use LLVM as their back end (the `gcc -v` output above mentions an LLVM build), so this could be the result of a single bug. But that would take all the salt out of the anecdote, so let us hope it isn't.

It is high time that someone used fuzz-testing to debug floating-point arithmetic in compilers. Hopefully one compiler will get it right sometimes, and we can work from there.
