Archive for the Compiler Category

How to call a MSVCpp2010x64-DLL from XL14x64

Posted in C++, Compiler, Computer, Microsoft Excel, Programming Languages, Software, Software tools, Spreadsheet, VBA on December 20, 2012 by giorgiomcmlx

Belonging to the group of people that sociologists call "early adopters" is usually a burden when it comes to newly invented technologies, and unfortunately, I cannot avoid this situation all of the time… Fortunately, nowadays it is possible to search for suitable keywords on the Internet, and quite often this strategy solves a problem much more effectively than reading through manuals. Rarely, the result of such a search will be a thick searchable manual that covers all aspects one is interested in. A recent example of this situation occurred when I wanted to call functions in a self-compiled C++ DLL from VBA7, i.e., the macro language that comes with the 64bit version of Microsoft Office 2010.

The first thing I realised was that it is impossible to call a 32bit DLL from a 64bit application directly (and vice versa). Then it turned out that the so-called Express edition of Visual Studio 2010 does not contain 64bit versions of the compilers. Therefore, I had to follow the advice provided by MS and install the Windows SDK version 7.1 in order to enable building x64 DLLs. Unfortunately, I finally got this working only after having uninstalled all of the VS2010Express stuff present on my PC, because otherwise the installation of the WinSDK7.1 failed at an early stage (maybe due to some updates…). After having re-installed all the desired languages in VS2010Express and afterwards the WinSDK7.1 as well, I was happy to find VCpp2010Exp working as expected.

VCpp2010Exp provides a project template meant for creating DLLs: select file → new → project → Win32 → Win32-project, enter a name, and click ok. On the following welcome screen, do not click the finish button but the continue button. A properties window will then be displayed in which you select the DLL radio button and check the box labelled export symbols as well. Clicking the finish button will make VCpp2010Exp create a DLL project template that contains two files: the source code in *.cpp, and the header information in *.h. You’ll need to edit both of them later. Now the target platform has to be switched from Win32 to x64 in two steps. Firstly, select project → properties, change the content of the configuration box to all configurations, then select configuration properties → general in the list displayed on the left, change the entry for platform toolset in the general section of the list displayed on the right from v100 (or v90 or …) to Windows7.1SDK, and click the apply button. Secondly, after VCpp2010Exp has finished applying this change, click the configuration manager button and set the x64 platform (it does not matter whether the debug or release configuration is active) by clicking on platform, selecting new, changing the entry of new platform to x64, clicking the ok button, and finally clicking the close button of the configuration manager.

I decided to start with a very simple x64 DLL that only provides two different functions, i.e., a function of type integer (in VBA7, this corresponds to a function of type long), and another function of type void (corresponding to a sub in VBA7). Both functions calculate the product of two numbers provided as input variables; the integer function returns the result as its own value, whereas the void function returns it as value of a third variable, which consequently must be transferred by reference (in the declaration, an ampersand has to be noted directly in front of the variable’s name in Cpp, and the ByRef keyword in VBA7).

The C++ file named DLL_Test.cpp thus reads:

#include "stdafx.h"
#include "DLL_Test.h"

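// Returns the product as the function value (a Function in VBA7).
DLL_TEST_API int DLL_Mult_FNC(int a, int b)
{
    return a*b;
}

// Returns the product through the reference parameter c (a Sub in VBA7).
DLL_TEST_API void DLL_Mult_SUB(int a, int b, int &c)
{
    c = a*b;
    return;
}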

The meaning of DLL_TEST_API is defined in the header file DLL_Test.h together with some other very important settings:

#ifdef DLL_TEST_EXPORTS
#define DLL_TEST_API __declspec(dllexport)
#else
#define DLL_TEST_API __declspec(dllimport)
#endif

extern "C"
{
    DLL_TEST_API int DLL_Mult_FNC(int a, int b);

    DLL_TEST_API void DLL_Mult_SUB(int a, int b, int &c);
}

The block of macro-defining statements at the beginning has been added by VCpp2010Exp automatically, and saves a bit of typing if many functions are to be exported. The magic extern "C" block around the function declarations has been added manually after finding out that otherwise the names of the exported functions are so badly mangled that they cannot be called from VBA7 by their plain names. Imho, the tool Dependency Walker is quite valuable in such a situation, because it lists the names a DLL actually exports. The hint that name mangling could be the reason for my DLL functions not yielding any result on the first try came from a post on Stack Overflow, in which user arsane posted a link to a very comprehensive survey of calling conventions.

In VBA7, the functions need to be declared in a standard module. The name used in VBA7, which directly follows the Function or Sub keyword, may differ from the name exported by the (C++) DLL; in that case, the exported name has to be given directly after the Alias keyword, as is done here.

Option Explicit

Declare PtrSafe Sub SUB_AmultB Lib "insert-correct-path-to-dll-here\DLL_Test.dll" Alias "DLL_Mult_SUB" (ByVal ia&, ByVal ib&, ByRef ic&)
Declare PtrSafe Function FNC_AmultB& Lib "insert-correct-path-to-dll-here\DLL_Test.dll" Alias "DLL_Mult_FNC" (ByVal ia&, ByVal ib&)

Function WS_DLL_Mult_SUB&(ByVal ia&, ByVal ib&)
Dim ic&
Call SUB_AmultB(ia, ib, ic)
WS_DLL_Mult_SUB = ic
End Function

Function WS_DLL_Mult_FNC&(ByVal ia&, ByVal ib&)
WS_DLL_Mult_FNC = FNC_AmultB(ia, ib)
End Function

Do not get confused by the completely different meanings of ampersands in C++ and VBA7, resp.: in C++, a leading ampersand denotes a call by reference, whereas in VBA7, a trailing ampersand defines the variable to be of type long.

In case you find this recipe is not working for you, please let me know…


Mathematical constants in program code

Posted in binary, Compiler, Computer, decimal, hexadecimal, Mathematics, multi-precision calculations, Number systems, Software, Software tools, xNumbers on October 2, 2012 by giorgiomcmlx

This text is published under the Creative Commons’ non-commercial share-alike (CC-NC-SA) license V3.0. It was created by GiorgioMCMLX at carolomeetsbarolo on wordpress.com in October 2012.

In source code concerned with numerical subjects, the values of all frequently used expressions that do not change during execution time are usually precomputed externally and assigned to variables as a sequence of decimal figures. For example, in some fictitious programming language that lacks an internal constant corresponding to the mathematical constant π, such a statement could look like

Declare Constant pi_on_3 as Double Precision = 1.04719755119659775E+00_8B

The mathematically exact value of π/3 = 1.047197551196597746154214461… comprises infinitely many decimal figures. Thus, it was decided to round it to a precision of 18 decimal significant figures in the so-called scientific notation. The variable pi_on_3 is defined to be of the type double precision (DP) binary floating-point (FP) that uses 8 bytes = 64 bits for each value: 1 sign bit, 11 exponent bits, and 52 plus 1 implicit significand bits. This is the most widespread FP type used in numerical code nowadays. At first glance, the declaration seems to be quite natural, but it neglects the very property of π that led to its invention: π is a so-called irrational number; moreover, it even belongs to the set of transcendental numbers among them. Consequently, π and all of its rational multiples definitely cannot be represented exactly by any of the floating-point (FP) variable types used in computers, because all their values are binary rational numbers that all correspond to decimal rational numbers. Nearly all mathematical constants belong at least to the group of irrational numbers. Otherwise, their invention would not have made much sense…

From the point of view of a compiler, 1.04719755119659775E+00 is not a number but a string consisting of decimal characters. The transformation of such numeric strings found in code into values of numerical variables is an example of what is called parsing. When a compiler meets a statement like that given above, it will call its parser in order to determine a DP number that will substitute the value of the decimal string in the calculations. Parsing of decimals will almost always cause round-off errors (a special form of the more general quantisation errors), because the rational DP numbers form a sparse set of isolated points within the continuum of all the real numbers. As a consequence, all the uncountably many real numbers comprised by a finite interval need to be mapped onto a single DP number.

Usually, a parser tries to map a numerical string to the nearest DP. Then, the magnitude of the quantisation error is somewhere between zero and half the distance between a pair of consecutive DP numbers. This full distance is called a ULP (unit in the last place). It is a relative measure: its absolute value doubles at all the DP numbers that are exact powers of two. Each DP number DPval represents the interval of real numbers from DPval-0.5ULPs to DPval+0.5ULPs. Disregarding exact powers of two, the absolute distance between a pair of consecutive DP numbers is equal to the length of the interval that both DP numbers stand in for. Parsing a numerical string to the nearest DP thus limits the unavoidable quantisation error to ±0.5ULPs. The absolute value of a ULP depends on the exponent of the DP number it is associated with: 1 ULP = 2^(LPo2-52), where LPo2 denotes the true leading power of two that is encoded in biased form in the exponent. LPo2 equals zero for all the DP numbers found in the interval [1;2); in this range, 1 ULP = 1/2^52 ≅ 2.22…E-16. This special absolute value is called machine epsilon.

Consequently, all exact binary DP numbers can be uniquely identified if at least 17 decimal significant figures (SFs) are provided. Unfortunately, this theorem seems to be widely misunderstood. In particular, it does not mean that any given numerical string consisting of 17 decimal SFs uniquely identifies a binary DP number. The limit of 17 decimal SFs only holds for parsing numerical strings the values of which correspond to exact DPs, for example when printing the results of DP calculations. In the other direction, for example when parsing numerical strings to DP numbers in order to specify the input of DP calculations, at maximum about 770 (seven-seven-zero, no typo!) decimal SFs might be needed. Thus, the fictitious programmer, who might even have thought that specifying 18 decimal SFs is more than sufficient, might be terribly wrong.
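The "printing" direction of this rule is easy to try out. The sketch below prints a DP number with a chosen number of decimal SFs and parses the string back; the function name round_trips is made up for this illustration, and the check assumes that the C library's strtod rounds correctly to the nearest DP (which is the case for current mainstream libraries as far as I know).

```cpp
#include <cstdio>
#include <cstdlib>

// Prints x with the given number of decimal significant figures, parses the
// string back, and reports whether exactly the same DP number is recovered.
// round_trips is a hypothetical helper name.
bool round_trips(double x, int digits)
{
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*g", digits, x);
    return std::strtod(buf, nullptr) == x;
}
```

With 17 digits the round trip should succeed for any finite DP number, whereas with 15 digits it fails, for example, for the DP number written as 0.30000000000000004.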

For any parsing-to-nearest-DP process to succeed, it is necessary that the value of a given numerical string can be discriminated from the border between the quantisation intervals of a pair of consecutive DPs, which is called a "tie". Obviously, the rule "round/parse to nearest" has to be complemented by a tie-breaking instruction. Usually, this is "ties to even", which means that in case of a tie the DP number having an even binary representation is chosen, i.e., the one whose trailing bit is zero. Ties formally involve just one additional bit as compared to the pair of DP numbers they separate; if the power of two that bit corresponds to is negative, this one extra bit requires exactly one extra decimal SF. The approximate relation between the total amount of decimal significant figures SF_dec of an exact DP and its leading power of two LPo2 has already been estimated by means of linear regression in a previous post:

SF_dec ≅ - 0.699*LPo2 + 52.5  ; LPo2 ≤ 52
SF_dec ≅ + 0.301*LPo2 + 0.496 ; LPo2 ≥ 52

For the most negative normalised value of LPo2, i.e. -1022, this formula predicts SF_dec ≅ 767. Consequently, at maximum about 770 decimal SFs assure that any numerical string having a value within the range of normalised DP numbers can definitely be parsed to the nearest DP number. Unfortunately, there is no way of predicting the amount of SF_dec that are really needed for a given decimal string without parsing it. The maximum amount will only be needed if the number is extremely close to a tie. On the other hand, if the number is close to an exact DP, 17 digits would do the job. So the scaling law that governs the amount of SF_dec actually needed for parsing-to-nearest-DP of a given decimal number needs to be worked out.

The distance between a given decimal number and the nearest DP tie can be calculated harnessing multiple-precision software. All of the calculations presented here were done using the FOSS XL-Addin called xNumbers. A simple VBA-script was used to output the properties of the multi-precision results obtained when dividing π by all integers between 1 and 2E+7. In order to assure that the absolute value of the relative measure ULP remained constant, all results were scaled by a power of two so that all the final results are found in the range between one and two. This kind of scaling does not change the relative position of a decimal number with respect to the quantisation interval of the nearest DP, because these intervals just scale likewise! The scaled results were then ordered by decreasing distance to the nearest DP tie. In xNumbers, there is a built-in function "dgmat" that outputs the amount of matching digits of two xNs. Adding one to the result of "dgmat" thus yields a good estimate of the amount of significant decimal figures that would allow for correctly parsing each of the scaled results to the DP nearest to it. The following table summarises some of the results.

[Table: amount of decimal significant figures for selected multiples of pi]

It can be seen that 19 decimal SFs enable parsing π/3 to the nearest DP, whereas for π/59425 23 SFs are needed. Of course, you should be aware that a "stupid" parser might not use all the figures and thus yield an incorrect result! In order to find the scaling law that relates the distance dist_tie between a decimal number and its nearest DP tie to the amount SF_dec of decimal SFs that needs to be specified for enabling parsing to nearest DP, the following figure depicts the values found in the last two columns of the preceding table (orange rhombuses). The normalised quantity dist_tie takes values between zero (exact tie) and one (exact DP); its values are computed by removing the sign from the non-normalised values as well as by dividing them by 0.5ULPs.

[Figure: relation between distance to tie and amount of decimal significant figures for selected multiples of pi]

The line drawn in blue colour corresponds to the finally deduced scaling law:

SF_dec ≅ 17 - log10(dist_tie)
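Written as code, the scaling law is a one-liner. This is only a sketch of the formula above; the function name estimated_sf_dec is made up, and dist_tie is assumed to be the normalised distance in the range (0;1].

```cpp
#include <cmath>

// Estimated amount of decimal significant figures needed for parsing to the
// nearest DP, according to SF_dec ≅ 17 - log10(dist_tie).
// dist_tie: normalised distance to the nearest tie, 0 < dist_tie <= 1.
double estimated_sf_dec(double dist_tie)
{
    return 17.0 - std::log10(dist_tie);
}
```

A constant sitting exactly on a DP (dist_tie = 1) yields 17, and every factor of ten by which the constant moves closer to a tie costs one more decimal SF, so a dist_tie of about 1E-6 corresponds to about 23 SFs.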

Summarising, the amount of decimal significant figures to which a mathematical constant has to be specified in order to enable parsing-to-the-nearest-DP depends on the distance between this constant and its nearest DP tie. Unfortunately, this distance is not known until the constant has been parsed, so that this has to be done in the multi-precision environment in which the constant is calculated! This solution is mathematically correct but practically unsafe and stupid as well, because all the parsing work would have to be done at least twice: once when writing the code and once again during code execution. A simple improvement is obvious though: instead of using the probably very long numerical string corresponding to the mathematically exact value of a constant, simply the numerical string comprising the 17 decimal significant figures of the nearest DP is to be used.

This method can still be greatly improved by avoiding any parsing at run-time, which can be achieved by reading/writing bit patterns directly from/to the memory of a computer. All major programming languages provide means for this kind of manipulation. It usually converts a DP number into 16 hexadecimal figures of which each corresponds to 4 bits, and vice versa. Unfortunately, two different schemes co-exist that define the ordering of the 8 bytes of a DP number: little-endian and big-endian byte order. Within a byte, the bits always follow the big-endian scheme. Thus, most humans like to read these hexadecimal numbers in big-endian notation, and that is why it is used here. Of course, the endianness of stored hexadecimal strings should be fixed (e.g., to big-endian), and the conversion routines should account for the endianness of the machine they are executed on. Concluding, the statement given at the beginning of this post should be replaced by something like this:

Declare pi_on_3 as Double Precision
pi_on_3 = HEX2DP("3FF0C152382D7366")

where HEX2DP calls the function that converts the 16 hexadecimal figures into a DP number. The code of this function depends on the programming language as well as on the hardware platform.
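In C++, for example, such a conversion could be sketched as follows. The names hex2dp and dp2hex are made up for this illustration. The sketch assumes that double is the 64-bit binary format and that integers and doubles share the same byte order on the machine, which holds on the common platforms; under this assumption, going through a 64-bit integer takes care of the endianness of the machine, because the hexadecimal string is always read most significant figure first.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical counterpart of HEX2DP: interprets 16 hexadecimal figures in
// big-endian notation as the bit pattern of a DP number.
double hex2dp(const std::string& hex)
{
    std::uint64_t bits = std::stoull(hex, nullptr, 16);  // most significant figure first
    double value;
    static_assert(sizeof value == sizeof bits, "double must be 64 bits wide");
    std::memcpy(&value, &bits, sizeof value);            // copies bits, no numeric conversion
    return value;
}

// Inverse direction: the bit pattern of a DP number as 16 hexadecimal figures.
std::string dp2hex(double value)
{
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    char buf[17];
    std::snprintf(buf, sizeof buf, "%016llX", static_cast<unsigned long long>(bits));
    return buf;
}
```

With these helpers, hex2dp("3FF0000000000000") yields 1.0, and dp2hex can be used once, in the multi-precision environment or a small helper program, to produce the string that is then pasted into the source code.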

follow-up: using multiple versions of gcc on Windows

Posted in Compiler, Computer, Fortran, gcc, Programming Languages, Software on April 23, 2012 by giorgiomcmlx

I got some mails in which I was asked to describe the solution to the “liblto_plugin-0.dll not found” problem in a bit more detail. So let us assume that you installed (or plan to do that) all versions of gcc previously obtained from equation.com as subdirectories of the folder “X:\myCompilers” (myCompilers might be a path name!). Let us further assume, for example, that you decided to call these subdirectories “gcc461” and “gcc470” according to the version numbers of the gcc suites during the installations. At least, this convention will turn out to be a smart idea. After having firstly saved the environment variables to disk and secondly having erased all references to gcc from them, you might then want to create a DOS batch file called “go_gcc.bat”, for example, in the root directory of drive “X:”. Two lines of this batch file then set two environment variables to the values required by a specific gcc version obtained from equation.com:

set PATH=X:\myCompilers\gcc%1%2%3\bin;X:\myCompilers\gcc%1%2%3\libexec\gcc\i686-pc-mingw32\%1.%2.%3;%PATH%
set EQ_LIBRARY_PATH=X:\myCompilers\gcc%1%2%3\i686-pc-mingw32\lib

Of course, you have to use the saved information about the path names on your PC in order to correctly substitute all the symbolic path names used in this post. Finally, if you want to use gcc and thus start cmd, you only need to enter “X:\go_gcc 4 6 3” in order to enable version 4.6.3 of the gcc suite, for example. If you plan to install multiple versions of gcc, it is even smarter to use “gcc_x.y.z” as folder names, as you then only need to use a single batch variable that holds the value of “x.y.z”:

set PATH=X:\myCompilers\gcc_%1\bin;X:\myCompilers\gcc_%1\libexec\gcc\i686-pc-mingw32\%1;%PATH%
set EQ_LIBRARY_PATH=X:\myCompilers\gcc_%1\i686-pc-mingw32\lib

You would want to start this batch file typing in “X:\go_gcc 4.6.3”.

I hope things are obvious now.

strange error message by GCC on Windows platform

Posted in Compiler, Computer, Fortran, Programming Languages, Software on April 22, 2012 by giorgiomcmlx

After having installed a second version of the GCC suite on the MS-Windows platform of my PC, it turned out to be impossible to use the gfortran front end of the older one. I had downloaded both editions from the equation.com web page.

When trying to link a program using the older gcc/gfortran version, I always got an error message that looked like “fatal error: -fuse-linker-plugin, but liblto_plugin-0.dll not found”. Unfortunately, even a very extensive web search did not help. As I had not changed the position of the old files in the directory tree, it must have been the installation routine that created that mess. Indeed, the problem turned out to be caused by the installation routine adding the paths to two directories of the newer gcc version at the beginning of two environment variables, i.e., the user variable EQ_LIBRARY_PATH, and the system variable PATH. Because the name of the file is still the same in the newer version, the older gcc suite detects a wrong version of that file and raises the error message. Obviously, the error message should rather read “gcc found a wrong version of …”.

Thus, the cure to the problem is to edit these two environment variables manually. First save their current content to text files, e.g., by typing “set > set.txt” and “path > path.txt” on the command line; then erase all information related to the gcc versions from both variables. In order to change PATH permanently, you need to have admin privileges. After that, just use a batch file on the command line to set the content of both variables according to the version of gcc that you are going to use in that very session of the command interpreter and all is fine. At least, that’s how it worked for me.