## Do you really know how your spreadsheet works?

Many people all over the world use spreadsheet calculation software on their computers. Only few of them, though, seem to be aware what is really happening beneath the GUI. Many effects that seem strange at a first glance can be understood when considering how most recent spreadsheet programs process numbers. It is reasonable to assume that at least the programming language the spreadsheet software is built in uses a number format called Double Precision. This has two important consequences: firstly, such a software (and the computer it is running on as well) represents numbers using the binary numeral system ( wikipedia has a more advanced description), and secondly, only a limited amount of *Double Precision (DP)* numbers is available to represent any number that is entered into a cell, e.g. The distance between a DP number and its two direct neighbours nearly always equals 2^-52 times the leading power of 2 that occurs in that number. The situation is different for the few numbers exactly equalling a power of 2: because the leading power of 2 changes at these numbers, the distance to their next smaller DP neighbour is half the distance to their next larger DP neighbour. The DP numbers are most dense around zero and most sparse near their maximum, which is about 10^307.

Consequently, software internally using DP numbers can only exactly represent a decimal input if that equals a binary fraction and as well contains a minimum and maximum power of 2 that do not differ by more than 52. For example, 4, 1, 0.5, 0.25, 1.25, and 0.6125 are exactly representable in the binary as well as decimal numeral system. But what about numbers like sqrt(2), 0.9, 0.1 &hell;? They are rounded to a DP number using a standardised algorithm. Thus, there is a double dilemma with the doubles. Firstly, most of the numbers humans want to work with cannot be exactly represented by a DP number but only by what is called a string. Secondly, most of the usual DP numbers a computer can deal with would require at maximum 751 decimal digits to represent them exactly. Fortunately, in case of IEEE 754 DP numbers 17 decimal digits uniquely identify a DP number (18, if false rounding is to be avoided).

Usually, spreadsheet software displays only 15 digits. But then you cannot spot that the result of 0.5-0.4 is rounded to a different DP number than the direct input of 0.1, as demonstrated in a previous post.

The joint consequences of different spacing of DP neighbours as well as rounding to a DP can lead to considerable differences between the results of mathematically equivalent formulae used to calculate a result. The following graph shows the DP differences between the DP results of y1=1-x⋅x and y2=(1-x)⋅(1+x) for 30,000 different values of DP x between (0;1).

The absolute value of the difference does not exceed 1.1⋅10^-16, but the smaller the results for y1 as well as y2 get as x approaches 1, the more different DP numbers exist to represent the difference between y1 and y2. The amount of available levels that have a non-zero value doubles at all the x values where y decreases below a power of 2, i.e., at about x≈0.707 (x>1/sqrt(2), y<1/2), x≈0.866 (x>sqrt(3)/2, y<1/4), x≈0.935 (x>sqrt(7/8), y<1/8) …

But unfortunately, even for x close to 1, the absolute values of the differences still nearly fill the interval between +/-6⋅10^-17, though the value of y gets very small there. Thus, the accuracy of the results is questionable and moreover, the question arises which of both methods is more reliable. These aspects will be treated in a follow-up post.

## Leave a Reply