Just a few more digits, please

Pascal Cuoq - 18th Nov 2011

Introduction: of Spivak pronouns

In this post, I really let fly the Spivak pronouns. They are not used for the first time in this blog or in documentation but they are abused here. I have no serious justification for this sudden peak. I have a couple of unserious ones: I started using Spivak pronouns a couple of years ago when I noticed that the EU expected research project proposals to explicitly state in which way the proposed project would close the gap between sexes in Europe. This is of course ridiculous. Safer and securer software verified faster and cheaper benefits everyone so we do not close any gap at all. The only way I thought to do our part here was to actively butcher the English language. Anything less would make us chauvinist pigs. So here I am.

Have you been paying attention?

If you were paying really close attention you may have noticed a new kind of log message in last post:

warning: Floating-point constant 1.01161128282547f is not represented exactly. 
  Will use 0x1.02f8f60000000p0. See documentation for option -warn-decimal-float 

This has been a subject of dispute between developers for a long while: should Frama-C have this warning or is it undesirable? The reasons for having it are obvious which does not prevent them from being strong and good reasons: the program does not do what the programmer wrote. Perhaps ey thought the processor would use base-10 arithmetics to accommodate eir distasteful habit. Perhaps ey thought the computer should represent the number that ey was thinking of. Perhaps the programmer even expected the computer to understand ey meant "π" when ey wrote 3.14 (key quote: "I entered [sin(3.14)]. Shouldn't that give me an answer of 0? It gave me 0.001593").

Well of course Frama-C does not aim at being useful for users who are quite as naive as that. This constitutes in fact most of the counter argument: many numbers whether finitely representable in decimal or otherwise useful are not finitely representable as binary floating-point numbers and the programmer should know and expect this. Why tell em something ey should already know? It's a difficult line to draw.

In truth one reason and it's not a good one for this warning to appear now is "because we can". Function strtod() and its single-precision and extended-precision colleagues do not tell whether the conversion they applied was exact or rounded. But when you write your own correctly rounded version of these functions it's very easy to keep the information if you want it.

There is another factor at play here: how conveniently can the programmer express in eir program that ey knows what ey is doing?

Example

Let us look at the implications of the new message on an example. Consider the programmer who wants eir variable d to be initialized to the nearest double-precision representation of 0.1. Ey writes:

double d = 0.1 ; 
main(){ 
  return 0; 
} 

This programmer knows what ey is doing. But Frama-C doesn't know that and just in case warns:

Floating-point constant 0.1 is not represented exactly. Will use 0x1.999999999999ap-4. 

The programmer could use 0x1.999999999999ap-4 as initializer of d but that would hardly improve the program's readability. Writing the decimal value inside a comment would still be bad style: the two constants are bound to desynchronize and then havoc will ensue. Comments are not intended for this. No the programmer may hope instead to demonstrate that ey knows what is going on by showing how many decimals he expects to be able to rely on:

double d = 0.10000000000000000 ; 

That doesn't work because 0.10000000000000000 is not represented exactly:

Floating-point constant 0.10000000000000000 is not represented exactly. Will use… 

Ah yes! thinks the programmer. I must show I understand that I can't count on decimals beyond these ones to be zeroes:

double d = 0.100000000000000005 ; 

But:

Floating-point constant 0.100000000000000005 is not represented exactly… 

Please reader allow me to make this story shorter that it could be. The programmer finally writes…

double d = 0.1000000000000000055511151231257827021181583404541015625 ; 
main(){ 
  return 0; 
} 

… and at last Frama-C accepts the program without a word.

Mitigation techniques

This is not a great user interface. The programmer must write more digits to demonstrate that ey knows exactly what the rounding error is than ey writes to unambiguously designate the double-precision floating-point ey wants. It wouldn't be so important if every compiler had to correctly round constants to the nearest floating-point representation but as discussed in last post the standard does not mandate it. Crucially by displaying the Will use 0x1.999999999999ap-4 part of the message Frama-C is also insuring itself against misunderstandings. The user cannot claim that because eir compiler does not round constants to the nearest Frama-C is unsound: it told em what values it would use.

"Caveat emptor" or the first word thereof would make a great name for a static analysis tool but unfortunately it's already taken (by one of the ancestors of Frama-C [removed dead link]).

I have tried to adjust the usability compromise thus:

  1. Frama-C by default only warns about the first non-exact constant and on that occasion gives you the name of the relevant command-line option. If you want to ignore it it's only one message to skip.
  2. The command-line option allows to activate warnings for all non-exact constants or for none of them if that's what you prefer.
  3. That still leaves one almost unavoidable almost always unwanted warning per analysis run as soon as the program uses floating-point. Here comes my usability masterstroke: I have removed another unavoidable unwanted warning that appeared as soon as the program used floating-point. Starting with the Oxygen release floating-point will cease to be considered experimental in the value analysis and the analyzer will no longer tell you that it is each time it gets the chance keeping the signal/noise ratio about the same as before.

Conclusion

In Frama-C Oxygen floating-point will no longer be considered experimental in the value analysis and the front-end will have a warning for inexact constants in the program. I'll bet you can't wait.

Pascal Cuoq
18th Nov 2011