Archive for the gnumeric Category

To SSE, or not to SSE, that is the question

Posted in binary, decimal, gnumeric, Maxima, Microsoft Excel, multi-precision calculations, Number processing, Number systems, Software tools, Spreadsheet, VBA, xNumbers on February 25, 2014 by giorgiomcmlx

Some time ago, I discovered that in a thirty year old floating-point (FP) library the value of the mathematical expression 1 – x2 was determined by calculating 2·(1-x)-(1-x)2 for 0.5 ≤ x ≤ 1. At a first glance, this method appeared rather peculiar to me, but then I realised that the Lemma of Sterbenz (see, e.g. page 45 of this slide collection on FP calculations) guarantees that 1-x is calculated in FP without any round-off error in the specified range. The multiplication by 2 does not introduce errors as well. Yet, there are still two sources of errors: firstly, in case (1-x) has more than 26 significant bits, (1-x)2 is not exactly representable in FP, and secondly, finally taking the difference might cause a round-off error as well.

In my tests though, this old formula outperformed all other versions on average, and so I shared it with developer friends. Surprisingly, some of them responded with a statement like "The results produced by this formula do not differ from those obtained using (1-x)·(1+x)." In the end, the reason for this paradox could be identified: all my tests were executed on SSE registers, calculations on which strictly adhere to the double-precision floating-point format, whereas the calculations not showing any improvement had harnessed the intermediate extended double-precision format of the x87 stack. Modern x87-compatible FPUs provide both of these two very different kinds of intermediate FP number storage.

The SSE have been invented in favour of fast parallel data processing, whereas the x87 double-extended stack provides both 11 additional mantissa bits and 4 additional exponent bits to intermediate calculations in order to try to maintain an overall accuracy of 53 bits and to reduce the probability of intermediate under-/over-flows as well. The decision whether a piece of software uses the x87 or the SSE instruction sets has to be made up at compile time, because the generated byte code is different. Contrary, both the rounding-mode and the exception-handling properties may be influenced at runtime by changing the so called floating-point control word (FPUCW). I could not find an article on Wikipedia, so please see e.g. this collection of or this introduction to the FPUCW world. I also have come across these imho nice summaries that are written from C-programmers points of view: The pitfalls of verifying floating-point computations as well as SciLab is not naive. Unfortunately, values of the FPUCW having an identical meaning differ across platforms. For example, a value of the MS-FPUCW of hex 8001F corresponds to a GNU-GCC-FPUCW value of hex 37F …  Fortunately, the ability to change the rounding mode during runtime enables programmers to harness interval arithmetic in order to compute reliable FP error estimates of their code for both the x87 and SSE instructions sets. The rounding mode can be chosen from the set UP, DOWN, CHOP, and TO NEAREST – TIES TO EVEN, the latter usually being the default setting. Only if the x87 instruction has been selected at compile time, the precision of the calculations may be changed during runtime as well.

The 11 additional significand bits of the x87 extDP format correspond to roughly 3 decimal figures, and thus are a valuable tool in order to minimise the effects of cancellation in intermediate calculations. The 4 additional exponent bits allow for an intermediate product of up to 16 huge (tiny) doubles without causing an over-(under-)flow. Yet, harnessing the extDP format bears one dirty little secret: when the result of an intermediate calculation is transferred from the x87 stack firstly into an extDP x87 register and finally into the DP main memory, the intermediate value is subject to a double-rounding procedure. In combination with the standard rounding mode "TO NEAREST – TIES TO EVEN" this can lead to a slightly wrong result in two situations: an intermediate calculation rounds up/down to an extDP value representing an exact DP tie that is again rounded up/down to a DP value. The following tables give an example:

decimal number    DP hex    comment
+18446744073709551616.0 43F0000000000000 exact DP Num1
+6143.0 40B7FF0000000000 exact DP Num2
+18446744073709557759.0 Num1 + Num2,
not representable in DP
+18446744073709555712.0 43F0000000000001 DP_lo: nearest DP
+18446744073709559808.0 43F0000000000002 DP_hi: next larger DP
-2047.0 error of DP_lo
+2049.0 error of DP_hi

When adding Num1 to Num2 on the x87 extDP stack, the hex representation of the final DP sum in memory will be 43F0000000000002 instead of the closest DP 43F0000000000001. This can be understood by analysing the binary representation of the numbers involved. In order to clearly show the rounding effects, some separators are used: a "V" indicates the border between any two DP half-bytes, each of which corresponds to a single character in the hex representation, an "X" shows where the DP bits end and the extDP bits begin, and finally a "Y" marks the end of the extDP bit sequence.

Num1
+IOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO.O
Num2
+IOIIIIIIIIIII.O
Num1 + Num2
+IOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOIOIIIIIIIIIII.O
same with separators
+IVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOIXOIIIIIIIIIIYI.O
1st step: round (up) to extDP
+IVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOIXIOOOOOOOOOOYO.O
2nd step: round DP tie (up) to even
+IVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOOOVOOIOXOOOOOOOOOOOYO.O

In this example, the addition creates a bit sequence, the mantissa of which exceeds the 64 bits of the extDP format. Thus, in a first step, the result is rounded to 64 bits. These extDP bits happen to represent a so-called tie (a number exactly half-way between the two adjacent DP numbers) of the DP format. In this example, the tie-breaking rule TO EVEN finally causes another rounding up procedure, so that the value stored in memory does not represent the DP number closest to the mathematical result of the summation, but the next larger one. Yet, it is still one of the two DP numbers bracketing the true value.

Only two different bit sequences among the 2^13 possible ones lead to this kind of error:
leadingbitsIXOIIIIIIIIIIYItrailingbits ++
leadingbitsOXIOOOOOOOOOOYOtrailingbits —

so that the odds ratio of this error should be expected to be close to 1/2^12 = 1/4096 ≈ 2.44E-4 . Of course, if the bit sequence of an intermediate result does not exceed the 64 bits of the extDP format, there will be only a single rounding step yielding a correct result. When analysing the results of 5.75E5 renormalisation steps of double-double numbers in VBA7, the sum of which is determined to exceed the 64 bit extDP limit, and that were introduced here at the end of this post, I found 1480 errors and thus an odds ratio of about 2.57E-4. Due to the relatively large sample size, I guess the deviation from the theoretical value is caused by non-randomness of the bit sequences involved.

Double-rounding is unavoidable when using the extDP registers of the x87 FPU, but it only slightly increases the representation error of about 0.5 ULPs by at maximum 0.05&percent;. Moreover, this only happens with an odds ratio of about 1/4096. Thus, this disadvantage of rarely occurring false double rounding is usually more than counterbalanced by the gain of 11 bits of intermediate precision in every calculation. However, there is a special kind of multi-precision software that cannot be run on the x87 stack due to the double rounding, i.e. all the double-double as well as quad-double software that use chained DP variables in order to store multi-precision values (see, e.g. this implementation). The double rounding on the x87 stack can lead to violations of the basic assumption of this kind of software, i.e., that the leading DP value always represents the DP value closest to the multi-precision value stored in a chain of doubles. Therefore, arbitrary precision tools being determined to use the x87 stack, for example Xnumbers, need to harness completely different algorithms for internal number storage. The numerical trick using a double-double representation of π/2 introduced in the aforementioned post works reliably, because the firstly calculated intermediate difference does not involve more than 64 significant bits.

The FP instruction set actually used by third-party numerical software is often unknown and needs to be determined by using test cases. In order to avoid problems with the digits chopping feature of most spreadsheet software, examples need to be created that use exact DP values corresponding to less than 14 significant decimal digits. The previously given example can be reformulated as sum of +1.8446744E+19 ≡ 43EFFFFFFDDAD230 and +73709557759 ≡ 4231296E97FF0000. In order to check the DP result for correctness without any debugging capabilities like those provided by Xnumbers for MS XL, e.g., both DP numbers that were summed up before now need to be successively subtracted from the sum, the bigger one at first, and the smaller one afterwards. It is important to assure that all intermediate differences are rounded to DP, which can most easily be achieved by assigning them to different DP variables, or different cells in a spreadsheet. Contrary to the first addition, the two final subtractions do not introduce any FP error, because they are subject to the lemma of Sterbenz. A final result of -2047 corresponds to calculations that have not been executed on the x87, whereas a value of +2049 indicates the opposite. Here are my PY3- and CAS-Maxima scripts:

Python 3     CAS-Maxima
num1 = +1.8446744E+19 +1.8446744E+19;
num2 = +73709557759.0 +73709557759.0;
sum12 = (num1+num2) %o1+%o2;
sum12sub1 = (sum12-num1) %-%o1;
sum12sub1sub2=(sum12sub1-num2) %-%o2;
print(sum12sub1sub2)

The following table summarises the results I found in some spreadsheet and scripting software running on MS WIN7x64prof. On Linux, all SW listed in the table that is available on that OS as well returns -2047.

spreadsheet SW   result        scripting SW   result
GNUmeric 1.12.9 +2049 PY2.7.3×64 -2047
GScalc 11.4.2 -2047 PY3.2.3×64 -2047
KSspread 9.1.0 +2049 R2.15.2×64 -2047
LOcalc 4.1.4.2 -2047 R3.0.0×32 +2049
LT SYM 3.0.1 -2047 R3.0.0×64 -2047
MS XL14/VBA7 +2049 FreeMat 4.2 -2047
SM PM 2012 -2047 Maxima x32 +2049

As you can see from the table, the situation is completely messed up. It seems to me as if on the Win32 platform the x87 with its extDP format was used by default, whereas on the Win64 platform the SSE part of the FPU was involved. Of course, any software can override the default setting, and MS XL64 as well as VBA7 indeed do so. Generally speaking, specifying the FPUCW as well as the kind of FPU during compilation is a very good idea in case of numerical libraries that were developed and optimised for only one of the two scenarios. In case these two setting are omitted and the unknown defaults are used at runtime, small deviations are to be expected only in case the software was designed not to harness the extDP precision of the x87 stack. Otherwise, SSE lacking the additional 11 bits of precision will very likely cause things like polynomial approximations to special functions to fail at runtime. In particular, source code of libraries originally designed for the x87 must not be re-compiled for SSE without closely inspecting the results. Libraries that were built before the advent of the x87 can be a very valuable source of inspiration for changes in the code that are likely to become necessary.

Advertisements

Decimal significant figures of IEEE754 DP numbers

Posted in ArPrec, binary, Computer, decimal, Fortran, gnumeric, Mathematics, Microsoft Excel, multi-precision calculations, Number processing, Number systems, Programming Languages, Python, Software, Software tools, Spreadsheet, VBA, xNumbers on January 27, 2012 by giorgiomcmlx

The amount of decimal figures used to show a binary DP number in decimal floating-point format varies among the spreadsheet programs as well as programming languages I use. For example, the following table shows the default amount of decimal figures used when displaying the square root of two in decimal notation. The abbreviations are explained at the end of this post.

sqrt(2): square root of two ( programming languge || spreadsheet software )

30: 1.41421356237309504880168872421b0 (MXM fpprec:30;sqrt(2b0);)
18: 1.41421356237309515 (F77RM for DOS as of 1992)
17: 1.4142135623730951 (C, F77OW, F99/95G, JS, PY || GNM )
16: 1.414213562373095 (MXM ev(sqrt(2),numer);)
15: 1.41421356237310 ( || MSXL, LOC)
12: 1.41421356237 (F95S)

Obviously, there is a contradiction between the numbers shown in the first and second lines, resp. The reason for that is Maxima not using the IEEE 754 DP number format. Thus, the first line shows the mathematically correct value of sqrt(2) rounded to 30 decimal significant figures (SFs), whereas all the other lines show rounded values of the very decimal number that represents the DP approximation to sqrt(2). Both the decimal and binary fractions corresponding to the exact value of sqrt(2) have infinitely many SFs, because it is an irrational number. According to the IEEE754 specification, the binary DP representation of sqrt(2) comprises at maximum 53 binary SFs. Thus, it is a rational number in the binary system. All DP numbers correspond to rational decimal numbers. This is easily seen when writing the sum of at maximum 53 different powers of two (Po2) as a product of a factor times the Po2 corresponding to the 53rd bit. If that Po2 is non-negative, a product of two integers is obtained; if it is negative, the result is a fraction of two integers. Either form describes a rational decimal number. The remaining question to answer is how many decimal SFs this number has.
If all the Po2 values of a DP number are non-negative, i.e. the leading hidden bit corresponds to a value of Po2 ≥ 52, we might expect a scaling with a factor of log10(2) = 0.30103, because that is the position of the leading decimal figure in decimal columnar number notation. The trailing figure is always found on position 1, because its value cycles through the sequence 1&rarrow;cycle(2&rarrow;4&rarrow;8&rarrow;6&rarrow;). If all the Po2 values are negative, i.e. the leading hidden bit corresponds to a Po2 < 0, the position of the leading decimal figure scales like -log10(2), and the position of the trailing decimal figure like -1, so that the number of SFs in this case can be expected to scale like -1 – (-0.30103) = -0.69897. The following figures show the practically determined amount of decimal SFs of DP numbers for two different values of the amount of binary SFs, i.e. 1 and 53, resp.

decimal significant figures of DP numbers; overview
decimal significant figures of DP numbers; details near leading power of two equals zero

A simple linear regression yields the following relations between the leading Po2 (LPo2) of a DP number and the amount of its decimal SFs:

pure power of two, only hidden bit set
SF_01Po2(LPo2≥0) = + 0.301023(31) ⋅ LPo2 + 0.504(18)
SF_01Po2(LPo2≤0) = - 0.698923(31) ⋅ LPo2 + 0.516(19)

full 53-bit resolution
SF_53Po2(LPo2≥52) = + 0.301034(33) ⋅ LPo2 + 0.496(20)
SF_53Po2(LPo2≤52) = - 0.698963(28) ⋅ LPo2 + 52.504(16)

Of course, the fractional values resulting from these calculations have to be rounded to an integer value and two has to be added for safety (if I will find the time I will plot the confidence intervals of these regression lines…). The values of scaling factors found in case of 53-bit resolution agree closely with our expectations. The smallest normalised DP numbers (corresponding to 53 binary SFs) can be correctly displayed in the decimal system only if the really huge number of 767 (yes, seven hundred and sixty seven) decimal places are used! If fixed point notation is to be used, the amount of decimal places even increases up to a value of 1022 + 52 + 2 + 1 = 1077 columns inclusive sign noted in the first and leading zero and decimal separator noted in the second and third columns, resp.

The graph giving the details of the relations near LPo2 = 0 shows that the minimum amount of decimal SFs needed to exactly display a 53-bit DP number is 16. Thus, displaying at maximum only 15 decimal figures like many popular spreadsheet software do prevents any 53-bit DP number from being displayed correctly in the decimal system! On the other hand, because the deviation between an arbitrary decimal fraction and the value of its DP approximation on average starts on the 16-th decimal SF, it does not make much sense to use more than 18 decimal SFs when displaying the final result of calculations performed with DP numbers.
However, if some intermediate results seem strange and are to be further analysed, extended-precision frameworks like xNumbers or ArPrec are mandatory in order to deal with the huge amounts of digits that are likely to occur. Alternatively, you may decide to sit down together with your children under the Christmas tree and write add-with-carry as well as subtract-with-borrow codes that can handle up to 1500 decimal figures. If you manage to obey the KISS principle during coding, your work will be highly portable to any programming language, but that’s another story.

List of abbreviations with download links, if applicable.
CAS: computer algebra system

FOSS: free and open source software

MXM: Maxima, a FOSS CAS

C: name of a programming language

gcc: gnu C compiler

F: abbreviation for FORTRAN, the name of a programming language

F77OW: OpenWatcam Fortran 77

F77RM: Ryan-McFarland Fortran 77 for DOS as of 1987, malware????legal????

F99/95G: the Fortran 90/95/99/… front end for the GNU C compiler

F95S: Salford® FTN95 compiler

F99I: Intel® Fortran compiler for Linux

JS: JavaScript(MS: JScript), name of a scripting language targeting at web browsers

PY: Python, name of a programming language

GNM: gnumeric spreadsheet software

LOC: LibreOffice Calc spreadsheet software

MSXL: Microsoft® Excel®

Intermezzo: how to escape the DP limitations?

Posted in Computer, gnumeric, Microsoft Excel, mpmath, multi-precision calculations, Programming Languages, Python, Software, Software tools, Spreadsheet, VBA, xNumbers on September 14, 2011 by giorgiomcmlx

In general, it is at least a very complicated task to prove that an algorithm is working as intended over the whole range of possible input parameters. When dealing with mathematical problems, it seems to be a good idea to check whether results obtained using a higher level of accuracy agree with those obtained using the standard level. Fortunately, for all major programming languages there are multi-precision libraries as well as automated conversion utilities that transform the source code automatically. The situation seems different in case of spreadsheets, though imho they are used for relatively complicated calculations as well nowadays. Unfortunately, there is much mindless advice on the web like “check the accuracy using addition theorems” or so. However, in case you hopefully read the previous posts on DP numbers (or even more professional matter), you know that you cannot test the accuracy of a numerical expression like 1 – x^2 in DP by adding x^2 and checking whether the result equals 1 in DP, e.g. Thus, there definitely is need of multiple-precision calculations in spreadsheets.

I use three spreadsheet environments, two of which are fundamentally different: two combinations of Microsoft®™ (MS) Excel®™ (XL) under MS Windows®™ (Win) as well as gnumeric (GNM) under Ubuntu Linux (UBL). Fortunately, multi-precision extensions are available more or less easily relying on free and open source software (FOSS) only. In case of XL, the software of my choice is a Visual Basic®™ for Applications (VBA) XL add-in called xNumbers (XN) (old version). XL is shipped with an Integrated Development Environment (IDE) for VBA. In case of GNM, I have decided to use its built-in Python plug-in and write a function suite myself that utilises the multi-precision capabilities of the FOSS PY module mpmath. I have not decided on an IDE for big python projects so far; the proof of concept function library provided here is so simple that I could write it using an editor capable of syntax highlighting. I wanted to figure out whether it is feasible to provide PY based cell formulae to GNM, which offer exactly the same functionality as their XN pendants in XL and thus do not require any change whatsoever in the XL file when it is opened with GNM in UBL. Of course, this GNM extension is totally independent of XL and works with the standard GNM file format as well.

The GNM PY plug-in under UBL requires two files to be located in a (subdirectory of a) specific folder, i.e. “/usr/lib/gnumeric/1.10.1/plugins” in case of the current GNM version 1.10.1x, in order to fully specify a user-defined function (UDF). Unfortunately, this folder belongs to the user “root” in UBL. However, very surprisingly there is no user “root” in a fresh UBL installation.

!!! DANGER – DO NOT TRY THIS AT HOME !!!
If this warning failed to prevent you from reading on, be warned: “root” can be medicine and poison as well … At least, if you are not a subject matter expert in computer security and secure internet router configuration, at first do all the necessary downloads and installations, then plug off your network cable or switch off your mobile network equipment, and only after that try to become the “root” user of UBL.

A search on the web immediately yielded the cure for the UBL-“root” problem: type >>sudo passwd<< in a terminal window, then twice enter an identical and very secure password, and voila, the user “root” is automatically created. Now exit the terminal, re-start and then login as “other”–>”root”. Now you can add a subfolder to the “plugins” directory, which in my case is called “fn-drgst” (GNM will automatically check all subfolders in the “plugins” folder). In this folder, at least two files have to be created: the module file that contains the executable PY code, which in my case is “drgsts_pyfuncs.py”, and an xml file having the mandatory name “plugin.xml” that describes the PY file. All the details of the naming conventions are explained in the section writing your own python xxx on the GNM website. Any PY UDF is referred to three times: firstly in the actual function definition line in PY file, secondly in a line belonging to a named function group in the PY file describing the kinds and numbers of parameters that all functions take, and thirdly in a line in the XML file.

For example, the multi-precision function which performs the addition of two real numbers in xN is named xADD and is invoked by typing “=xADD(C5;C6;30)”, e.g., into an XL cell. In this case, the result in XL will be the sum of the two cells C5 and C6 with an accuracy of 30 decimal digits. Of course, it cannot be an XL number but in fact is a string. Typically, C5 and C6 will therefore contain strings like >>’1.23456789012345678901234567891<<, too. My PY code defines exactly such a function and reads:

def xADD(myStr1,myStr2,myStr3):
# sum of the numbers contained in myStr1 and myStr2,
# i.e. myStr1 + myStr2, with an accuracy of myStr3 decimal places
from mpmath import mp
dps_int = int(myStr3);mp.dps = dps_int;mp.pretty = True
return str(mp.mpf(myStr1) + mp.mpf(myStr2))

Fortunately, it turned out that this function might be called using cell references instead of number strings as well. In the function-group statement, the corresponding line reads:

drgstspyfuncs_functions = {

'xADD': ('sss', 'myStr1,myStr2,myStr3', xADD),

}

It tells to the system that the function “xADD” takes three strings as parameters (‘sss’), how these parameters are named (‘myStr1,myStr2,myStr3’), and provides the internal (xADD) as well as external (‘xADD’) names of the function. In the XML file, the line

<plugin id="Gnumeric_drgstspyfuncs">

introduces the function group defined in the PY file, then the line

<attribute name="module_name" value="drgsts_pyfuncs" />

states the name of the PY module file that contains the aforementioned function group, and finally the line

<function name="xADD" />

tells to the system the external name of any function that is to appear in the GNM function GUI. Unfortunately, a bug in the GNM PY plug-in prevents the formal description of a function from appearing in the GUI (this is why I have omitted such a formal description like a PY “docstring” in the code). The number and kind of parameters are stated correctly in the GNM GUI, though.

As a proof of concept I have created 30 PY functions which correspond to those 28 functions in xN that are available in two or even three different kinds of accuracy, i.e. DP, quadruple precision, and multiple precision, and the two constants pi and e as well. You may download the code of the two defining *.py as well as *.xml files found as text in a PDF file and a spreadsheet in XL 97-2003 file format demonstrating all the functions and providing their results obtained with all my different spreadsheet environments. All software written by myself is licensed under the GPL v3 that is found in both files, too.