Non-expert floating-point-using developers need accurate floating-point libraries the most

Pascal Cuoq - 6th Apr 2013

Quotes on the Internet

In 2012, Lukas Mathis took a quote out of the context of a blog post by Marco Arment and ran with it. The result was a thought-provoking essay. Key quote:

This is a sentiment you often hear from people: casual users only need «entry-level» devices. Even casual users themselves perpetuate it: «Oh, I’m not doing much on my computer, so I always just go with the cheapest option.» And then they buy a horrid, underpowered netbook, find out that it has a tiny screen, is incredibly slow, the keyboard sucks, and they either never actually use it, or eventually come to the conclusion that they just hate computers.

In reality, it’s exactly backwards: proficient users can deal with a crappy computer, but casual users need as good a computer as possible.

Lukas fully develops the idea in his post, Crappy Computers. Go ahead and read it now if you haven't already. This blog will still be here when you come back.

Floating-point libraries

The idea expressed by Lukas Mathis applies identically in a much more specific setting: developing with floating-point computations. The developers who most need accurate floating-point libraries are those who least care about floating-point. These developers will themselves tell you that it is all the same to them. They do not know what an ULP (“unit in the last place”) is, so what difference does it make to them if they get two of them as error where they could have had one or half of one?

In this they are just as wrong as the casual computer users who pick horrid netbooks for themselves.

Floating-point-wise programming environments are not born equal

All recent processors for desktop computers provide basic operations + - * / and square root for IEEE 754 single- and double-precision floating-point numbers. Each operation has its assembly instruction and since the assembly instruction is the fastest way to implement the operation compilers have no opportunity to mess things up in a misguided attempt at optimizing for speed.

Who am I kidding? Of course compilers have plenty of opportunities to mess things up.

  1. It may seem to a compiler that a compile-time computation is even faster than the assembly instruction provided by the processor so that if the program computes x / 10.0 the compiler may compute 1 / 10.0 at compile-time and generate assembly code that multiplies x by this constant instead. This transformation causes the result to be less accurate in some rare cases.
  2. Or a compiler may simplify source-code expressions as if floating-point operations were associative when they aren't. It may for instance optimize a carefully crafted floating-point expression such as a + b - a - b into 0.

Nevertheless there has been much progress recently in standard compliance for compilers' implementations of floating-point. Overall for programs that only use the basic operators the situation has never been better.

The situation is not as bright-looking when it comes to mathematical libraries. These libraries provide conversion to and from decimal and transcendental elementary functions implemented on top of the basic operations. They are typically part of the operating system. Implementations vary wildly in quality from one operating system to the next.

Expert developers know exactly what compromise between accuracy and speed they need and they typically use their own functions instead of the operating system's. By way of illustration a famous super-fast pretty-accurate implementation of the inverse square root function is used in Quake III and has been much analyzed.

The casual developer of floating-point programs on the other hand will certainly use the functions provided by the system. Some of eir expectations may be naive or altogether impossible to reconcile with the constraints imposed by the IEEE 754 standard. Other expectations may be common sense such as a sin() function that does not return -2.76.

For such a developer the mathematical libraries should strive to be as accommodating and standard-compliant as possible because ey needs it regardless of what ey thinks.

An example

To illustrate I have written a string-to-long conversion function. It could have been an entry in John Regehr's contest but since the deadline has passed I have allowed the function to expect only positive numbers and to fail miserably when the input is ill-formed.

The function looks like this:

#include <string.h>  /* strlen */
#include <math.h>    /* pow */

long str2long(char *p)
{
  size_t l = strlen(p);
  long acc = 0;
  for (size_t i = 0; i < l; i++)
    {
      int digit = p[i] - '0';
      long pow10 = pow(10, l - 1U - i);
      acc += digit * pow10;
    }
  return acc;
}
Neat, huh?

I tested this function more than the last one. This time I compiled it and invoked it on a few strings:

  printf("%ld %ld %ld %ld\n",
         str2long("0"), str2long("123"),
         str2long("999"), str2long("123456789123456789"));

You can download the entire C code for yourself. If you run it you should get:

0 123 999 123456789123456789 

I wrote my function to work for all well-formed inputs that fit in a long (but I only tested it for four values, so do not embed it in your space shuttle, please). Some of the reasons why I expect it to work are implicit: for one, powers of ten up to 10^22 are exactly representable as double-precision floating-point numbers. Also, I happen to know that on the system I use, the mathematical library is one of the best available.

I am not in fact a floating-point expert. I could be completely illiterate with respect to floating-point and have written the exact same function. In fact this happened to StackOverflow user1257. (I am not saying that StackOverflow user1257 is illiterate with respect to floating-point either. Ey wrote a function similar to mine after all.)

User1257's function returned 122 when applied to the string "123" !!!

This was so troubling that user1257 suspected a compiler bug. The reality is more likely that on eir computer the statement long pow10 = pow(10, 2); sets variable pow10 to 99. The function pow() only needs to be inaccurate by 1ULP for this result to come up because of C's truncation behavior (towards zero) when converting from floating-point to integer.


My str2long() function would fail just the same if it were run in user1257's compilation environment. I still think that my function is correct and that I should be able to expect results to the ULP from the math library's pow() function. A floating-point expert would never even encounter the issue at all. I might be able to diagnose it and to cope. But the floating-point beginner simply needs an environment in which long pow10 = pow(10, 2); sets pow10 to 100.

If you program and if you use floating-point at all beware of relying on the math library equivalent of a crappy netbook.
