Floating-point quiz
Pascal Cuoq - 8th Nov 2011
Here is a little quiz you can use to test your C floating-point expertise. I have tried to write the examples below so that the results do not depend too much on the platform and compiler. This is theoretically impossible since C99 does not mandate IEEE 754 floating-point semantics, but let us assume that the compiler at least tries. It could be a recent GCC on 8087-class hardware, for instance.
Question 1
What's the return code of this program?
int main(void){ return 0.1 == 0.1f; }
Answer: the program returns 0. Promotion and conversion rules mean that the comparison takes place between double numbers. The decimal number 0.1 is represented differently as a single-precision float and as a double-precision double (and neither of these two representations is exact), so when 0.1f is promoted to double, the result differs from the double representing 0.1 by about 1.5e-9.
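The gap between the two roundings of 0.1 can be made visible with a few lines of C. This is only a sketch, assuming IEEE 754 binary32 and binary64 formats and no excess precision in intermediate results; the function names `float_vs_double_gap` and `show_tenth` are made up for the illustration.

```c
#include <stdio.h>

/* Difference between 0.1f widened to double and the double constant 0.1.
   Assumes IEEE 754 float and double; the names here are hypothetical. */
double float_vs_double_gap(void)
{
    double from_float = 0.1f; /* float rounding of 0.1, widened exactly */
    double direct = 0.1;      /* double rounding of 0.1 */
    return from_float - direct;
}

/* Print both values with 17 significant digits to see where they diverge. */
void show_tenth(void)
{
    printf("0.1f as double: %.17g\n", (double)0.1f); /* 0.10000000149011612 */
    printf("0.1           : %.17g\n", 0.1);          /* 0.10000000000000001 */
}
```

Calling `show_tenth()` from `main` shows that the float rounding of 0.1 overshoots noticeably, which is why the comparison in Question 1 fails.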
Question 2
int main(void){ float f = 0.1; return f == 0.1f; }
Answer: the program returns 1. This time, the comparison takes place between float numbers. But first things first: variable f is initialized with the double representation of 0.1, but this number has to be converted to float to fit f. As a result, f ends up containing only those digits of 0.1 that fit into a float mantissa. When the contents of f are read back, they compare exactly to the 0.1f single-precision constant.
Question 3
int main(void){ float f = 0.1; double d = 0.1f; return f == d; }
Answer: the program returns 1. The comparison takes place between double numbers again. The left-hand side is the promotion to double of the single-precision representation of 0.1. The right-hand side is the contents of double-precision variable d, which has been initialized with the conversion to double of the single-precision representation of 0.1. The two sequences of operations produce the same result.
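The two conversion chains compared in this question can be written out as separate functions. This is a sketch under the same IEEE 754 assumption as the rest of the quiz, and the names `via_float_variable` and `via_float_constant` are hypothetical.

```c
/* Left-hand side: 0.1 -> double constant -> narrowed to float -> widened
   back to double for the comparison. Names are hypothetical. */
double via_float_variable(void)
{
    float f = 0.1; /* double constant converted to float on assignment */
    return f;      /* widening a float to double is exact */
}

/* Right-hand side: 0.1f -> float constant -> widened to double. */
double via_float_constant(void)
{
    double d = 0.1f;
    return d;
}
```

Both functions end up returning the float rounding of 0.1 stored in a double, so they compare equal to each other, and neither compares equal to the double constant 0.1.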
Question 4
int main(void){ double d1 = 1.01161128282547f; double d2 = 1.01161128282547; return d1 == d2; }
Answer: the program returns 0. The decimal number 1.01161128282547 is no more representable than 0.1, and again, its double representation in d2 has more digits than its float representation converted to double in d1.
For a fractional number to be representable as a (base 2) floating-point number, its decimal expansion has to end in 5, although the converse isn't true. Numbers 0.5 and 0.625 are representable as floating-point numbers, but 0.05, 0.1 and 1.01161128282547 aren't. A number may also have the same representation as float and double although neither of these two representations is exact: for this to happen, it suffices that the 29 additional binary digits available in the double format be all zeroes.
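One way to test whether the 29 extra bits of a double are all zeroes is to round-trip the value through float. The helper below is a sketch, not a library function: `fits_in_float` and `show_representable` are hypothetical names, and the code assumes IEEE 754 formats and a value within float range.

```c
#include <stdio.h>

/* Returns 1 if the double x survives a round trip through float unchanged,
   that is, if x is also a float. Hypothetical helper; assumes IEEE 754
   formats. Returns 0 for NaN, and values beyond float range are not handled. */
int fits_in_float(double x)
{
    float narrowed = (float)x;    /* drops the 29 extra significand bits */
    return (double)narrowed == x; /* widening back is exact */
}

/* Apply the check to the constants mentioned in the text. */
void show_representable(void)
{
    printf("0.5   -> %d\n", fits_in_float(0.5));
    printf("0.625 -> %d\n", fits_in_float(0.625));
    printf("0.05  -> %d\n", fits_in_float(0.05));
    printf("0.1   -> %d\n", fits_in_float(0.1));
}
```

Note that the check looks at the double nearest to the decimal constant, not at the decimal number itself, so it says that 0.5 and 0.625 fit in a float while the doubles nearest to 0.05 and 0.1 do not.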
Question 5
int main(void){ float f1 = 1.01161128282547f; float f2 = 1.01161128282547; return f1 == f2; }
Answer: if this looks like a trick question, it's because it is. The program returns 0. Variable f1 is initialized with the single-precision representation of 1.01161128282547. On the other hand, f2 receives the conversion to float of the double representation of this number. In this particular case, the two are not the same: the number 1.01161128282547 is actually very close to the midpoint of two successive single-precision floating-point numbers. When it is first rounded to double (when initializing f2), it is rounded to the midpoint itself (which happens to be representable as a double). When that double is rounded to a float, the applicable rounding rule (round to nearest, ties to even, in the default IEEE 754 mode) sends it to the float on the opposite side of the midpoint from the number we started with. On the other hand, when initializing f1, the original number is rounded directly to the nearest float.
~1.01161122                    ~1.01161128                    ~1.01161134
     +------------------------------+------------------------------+
     f2                             ^                              f1
                             original number
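The double rounding described above can be observed directly. The sketch below assumes IEEE 754 formats, round-to-nearest mode, and a compiler that rounds decimal constants correctly; `round_once`, `round_twice` and `show_double_rounding` are hypothetical names.

```c
#include <stdio.h>

/* Decimal constant rounded straight to float: a single rounding. */
float round_once(void)
{
    return 1.01161128282547f;
}

/* Decimal constant rounded to double, then to float: two roundings. */
float round_twice(void)
{
    double d = 1.01161128282547; /* lands on the midpoint of two floats */
    return (float)d;             /* the tie is broken towards the even float */
}

/* Print both results in decimal and in hexadecimal floating-point notation. */
void show_double_rounding(void)
{
    float f1 = round_once();
    float f2 = round_twice();
    printf("f1 = %.9g (%a)\n", f1, f1);
    printf("f2 = %.9g (%a)\n", f2, f2);
    printf("f1 - f2 = %g\n", (double)f1 - (double)f2);
}
```

Under those assumptions the two results are neighbouring floats, one unit in the last place (2^-23) apart, with f2 below the original number and f1 above it.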
I could make another series of questions, somewhat symmetrical to this one, where two different but standard-compliant compilers produce different results each time, but that wouldn't be as much fun. The examples here were relatively well defined. The rules that make them puzzling (or not) apply indiscriminately to most compilers, unless they do not even try to follow C99's guideline that recommends IEEE 754 arithmetic.