Sunday, October 14, 2012

Updated Auto-Generated Test Header

When correcting the parser code, it was noticed that there were two constant arrays that contain text strings for the names of their associated enumerations, namely the token and data type enumerations.  These are used, along with the code name array (that is already auto-generated) in the parser test code to output the test results.

The issue is that the text strings must match the enumerations.  The code name array was already set up to be auto-generated (getting the text strings by scanning the table.cpp source file).  The token and data type enumerations are contained in the ibcp.h header file.

The awk script test_codes.awk was generating the contents (text string elements) of the code names array.  This script was renamed to test_names.awk and modified to generate a complete array definition.  It now also scans the ibcp.h header file for the token and data type enumerations and creates name arrays for these as well.  The generated file was changed from test_codes.h to test_names.h and CMakeLists.txt was modified accordingly.
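
To make the matching requirement concrete, here is a minimal sketch of what a generated name array could look like next to its enumeration.  The enumeration, its enumerators, the sizeof_Example count marker and the exampleName() helper are all made up for illustration; they are not the actual contents of ibcp.h or test_names.h.

```cpp
// Hypothetical stand-in for an enumeration in ibcp.h (the real enumerators differ).
enum ExampleType {
    Example_Double,
    Example_Integer,
    Example_String,
    sizeof_Example      // count marker only, not a real value
};

// What a generated name array might look like: one string per enumerator,
// in the same order as the enumeration.
static const char *const example_name[] = {
    "Example_Double",
    "Example_Integer",
    "Example_String"
};

// Compile-time check that the array and the enumeration have the same length.
static_assert(sizeof(example_name) / sizeof(example_name[0]) == sizeof_Example,
              "name array is out of step with the enumeration");

// Look up the text string for an enumerator.
inline const char *exampleName(ExampleType type)
{
    return example_name[type];
}
```

Generating the array from the header removes the need for a manual check like the static_assert above, but the check still shows the property the test code depends on: the two lists must have the same length and order.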

Several other minor changes were made along with this commit; for details, see the commit log on GitHub.  The remaining issue is to add the test programs to CMake; these were in the older make file, which has been removed.

Update (9:06): I should know better than to rush out a change without testing on both major platforms.  It turns out the updated test_names.awk script reported an error on Windows.  The problem was that the awk script was in DOS format and contained line continuations (a backslash at the end of the line).  The awk program saw the CR at the end of these continuation lines and got confused.  The line continuations were removed to eliminate the problem and GitHub was updated.

All Platform Parser Issues Resolved

On the Windows platform, the C format specifier "%g" always outputs three digits in the exponent, whereas on Linux only two digits are output unless three are needed.  The C99 standard basically says "at least two exponent digits" should be output, so by that criterion Windows is acceptable.  It also appears that Microsoft decided that three exponent digits should always be output, and the GNU compiler suite decided to follow this on Windows.  There is no way to control the number of exponent digits in the "%g" specifier.
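
A small sketch for observing the difference: the two helpers below (formatG and exponentDigits, both names invented here) format a value with "%g" and count the digits after the exponent sign.  For a value like 1.234e20 the count would be expected to come out as 2 on Linux and 3 on the Windows setup described above.

```cpp
#include <cstdio>
#include <string>

// Format a value with "%g" into a string (hypothetical helper).
std::string formatG(double value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%g", value);
    return buffer;
}

// Count the exponent digits in "%g" output; returns 0 if there is no exponent.
int exponentDigits(const std::string &text)
{
    std::string::size_type pos = text.find('e');
    if (pos == std::string::npos)
        return 0;
    // skip the 'e' and the sign that "%g" always writes after it
    return static_cast<int>(text.size() - pos - 2);
}
```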

The parser print token routine was modified to output two exponent digits if the first one is zero.  This was accomplished by first outputting the number into a temporary character buffer using the "%g" specifier.  Then it searches for an 'e' character in the string.  If the fourth character from the 'e' is not a zero-terminator, then there are three exponent digits.  If the second character from the 'e' is a '0', then the last two digits are moved on top of this zero.

With this change, two exponent digits are output whenever the first of three exponent digits is zero.  This code does nothing on the Linux platform.  The expected results for parser test #3 were updated to two exponent digits where applicable (five instances).
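
The buffer fix-up described above could be sketched roughly as follows.  This is an illustration of the steps, not the project's actual routine; the function name trimExponent is made up, and the length test uses strlen instead of looking directly at the fourth character so that shorter strings are handled safely.

```cpp
#include <cstring>

// Hypothetical helper: rewrite a three-digit exponent with a leading zero
// (for example "e+012") as a two-digit exponent ("e+12"), in place.
void trimExponent(char *text)
{
    char *e = std::strchr(text, 'e');
    if (!e)
        return;     // no exponent in the string, nothing to do

    // 'e', the sign and three digits make five characters; only then is
    // there a leading exponent digit that could be dropped
    if (std::strlen(e) == 5 && e[2] == '0')
    {
        // move the last two digits and the terminator on top of the zero
        std::memmove(e + 2, e + 3, 3);
    }
}
```

The buffer would be filled with the "%g" specifier first and then passed to the helper before being output.  A string that already has a two-digit exponent, or a three-digit exponent that does not start with a zero, is left unchanged.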

Since the major CMake issues have been resolved and the 0.1.16 release will be about the build system, the cmake0.1.16 branch was merged into the branch0.1.16 branch.  This last commit was also tagged as v0.1.16‑pre‑1.  On GitHub, a zip (or .tar.gz) file can be downloaded directly for a tag.  This file contains all the files at that tag, so it is not necessary to install git to retrieve them; however, it does not contain the git repository.  There are still a few more build issues to take care of.

Resolving Platform Parser Issues

The first parser issue is that on the Windows 7 (64-bit) platform, values as small as around 1e‑323 are accepted without a range error, whereas on Linux (64‑bit) and Windows XP (32‑bit), values smaller than around 1e-308 produce a range error.  This lower limit mirrors the upper limit of around 1e+308.

I thought possibly this had to do with Intel (what Windows 7 is running on) and AMD (what Linux and the Windows XP virtual machine are running on).  However, in testing the limits on a Scientific Linux 6.2 (equivalent to RedHat Enterprise 6.2) 64-bit virtual machine on the Intel, the limit was around 1e‑308 (though it was with the older GCC 4.4 vs. 4.6 on the others).

The float and long double conversions were also tested on all three platforms.  The float had a limit of around 1e‑38 and the long double had a limit of around 1e‑4931 on all three platforms.  So I have no explanation for why double conversions are different on Windows 7.  Both 7 and XP have identical versions of MinGW installed.

So instead of fighting this problem any longer, the 1.234e‑308 test value in parser test #3 was changed to 1.234e‑324, a value that will produce a range error on all three platforms.  Next is to deal with the differences in the number of exponent digits output.
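
A possible explanation, offered as a guess and not something verified on these machines: a double between roughly 4.9e-324 and 2.2e-308 is a subnormal (denormalized) number, and the C standard leaves it up to the implementation whether strtod sets ERANGE when a result underflows.  The sketch below, with made-up names (RangeClass, classifyConstant), separates what the converted value is from whether the library reported a range error.  It is only meant for poking at the limits and is not the parser's code.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>

// Hypothetical classification of a converted constant.
enum RangeClass { Range_Zero, Range_Subnormal, Range_Normal, Range_Overflow };

// Convert text with strtod, report the class of the resulting value and
// whether the library set ERANGE (which may differ between platforms for
// values that underflow).
RangeClass classifyConstant(const char *text, bool *rangeError)
{
    errno = 0;
    double value = std::strtod(text, NULL);
    *rangeError = (errno == ERANGE);

    switch (std::fpclassify(value))
    {
    case FP_ZERO:
        return Range_Zero;          // underflowed all the way to zero
    case FP_SUBNORMAL:
        return Range_Subnormal;     // below the smallest normalized double
    case FP_INFINITE:
        return Range_Overflow;      // too large for a double
    default:
        return Range_Normal;
    }
}
```

With a correctly rounding strtod, the old test value 1.234e-308 and 1e-323 should both classify as subnormal, while 1.234e-324 rounds to zero.  Whether rangeError is set in the subnormal cases is the part that can vary between C libraries.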