Sunday, November 2, 2014

Parsing Operators – Standard Library

The get operator function was changed to use a standard input stream (again using a temporary input string stream like the two previous functions).  This function uses one of the table search functions, which was modified to take a standard string.

The table search function used the compare function from the QString class with the case insensitive option.  There is no equivalent function in the standard string class.  The std::equal function is used instead by passing a no case comparison lambda function.  This is the same lambda function used in the Tester class, so this definition was moved to the main header file.  Since the name in the table is still a QString, it is temporarily converted to a standard string.  The std::equal function does not check that the two ranges are the same length, so the sizes of the strings are checked first.
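As a rough illustration of the comparison described above, here is a minimal sketch.  The names equalNoCase and noCaseCompare are made up for this example and are not the names used in the project, and plain standard strings are used on both sides instead of converting from a QString.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (name is an assumption, not the project's function).
inline bool equalNoCase(const std::string &a, const std::string &b)
{
    // No case comparison of two characters.  The cast to unsigned char
    // avoids undefined behavior in std::toupper for negative char values.
    auto noCaseCompare = [](char c1, char c2) {
        return std::toupper(static_cast<unsigned char>(c1))
            == std::toupper(static_cast<unsigned char>(c2));
    };

    // The three-iterator form of std::equal never looks at the length of
    // the second range, so the sizes are compared first.
    return a.size() == b.size()
        && std::equal(a.begin(), a.end(), b.begin(), noCaseCompare);
}
```

Checking the sizes first also means std::equal is never asked to read past the end of the shorter string.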

The token constructor for codes was changed to take a standard string, which defaults to an empty string.  For now, the standard string is converted for the QString member variable by obtaining a C-style string from it, which is implicitly converted to a QString.  The only caller of this constructor using this argument is the new token table function, which was also modified to take a standard string.  Callers of this function using the string argument were modified to pass a standard string.

[branch parser commit 27f06e8714]

Parsing Strings – Standard Library

The get string function was changed to use a standard input stream (temporarily putting the input string from the current position into a temporary local input string stream of the same name as the member variable to simulate the final parser code).  Looking at and obtaining the current character were changed as previously described.

Instead of incrementing the local position variable for each character in the string constant, this variable is just set to the current input position.  This will be changed to get the position within the input stream once the member variable is changed.  The current input position is incremented for each character.  After the change, the current input position member variable will not be needed.
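One possible way to get the position from the stream itself is the tellg function.  The sketch below is only an assumption about how that could look, and the helper name consumeAndTell is hypothetical.

```cpp
#include <sstream>

// Hypothetical helper: pull 'count' characters from the stream and then
// ask the stream where it is, instead of keeping a separate counter.
inline long consumeAndTell(std::istringstream &input, int count)
{
    for (int i = 0; i < count; ++i) {
        input.get();
    }
    // tellg() reports the offset of the next character to be read.
    return static_cast<long>(input.tellg());
}
```

Note that tellg returns -1 once the stream is in a failed state, so code relying on it would need to handle the end of the line separately.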

[branch parser commit ab3b1c08f18]

Parsing Numbers – Standard Library

When the parser is changed to use the standard library, instead of placing the input string into a string member variable, it will put the input into a standard input string stream from which the characters will be pulled.  A position into the input string will not need to be maintained during the processing of the line.

The get number function was the first to be changed to this model.  Temporarily, the input string from the current position (a substring) is transferred into a temporary local input string stream of the same name as the member variable to simulate the final parser code.  Two failed attempts were made to use standard library functions to parse and read numbers.

The first attempt used the stoi function to convert the number directly.  The problem was that it doesn't report the specifics of the error when the conversion fails, throwing only an invalid argument or out-of-range exception.  The type of error could be determined by a series of complex checks of the string.  A working solution was mostly achieved with one remaining issue.  When an out-of-range exception is thrown, there is no clue as to the length of the string that was processed (which is needed to properly highlight the error).
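The limitation can be seen with a small wrapper around stoi.  This is only a sketch; the StoiResult type and the tryStoi function are hypothetical names made up for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical result type for this example only.
enum class StoiResult { Ok, Invalid, OutOfRange };

// Hypothetical wrapper showing what std::stoi reports on failure.
inline StoiResult tryStoi(const std::string &text, int &value,
    std::size_t &length)
{
    length = 0;
    try {
        // On success, length receives the number of characters converted.
        value = std::stoi(text, &length);
        return StoiResult::Ok;
    } catch (const std::invalid_argument &) {
        // No conversion could be performed at all.
        return StoiResult::Invalid;
    } catch (const std::out_of_range &) {
        // The value did not fit, but length was never written, so the
        // extent of the offending number is unknown here.
        return StoiResult::OutOfRange;
    }
}
```

The processed length is only stored when stoi returns normally, which is why the out-of-range case gives nothing to highlight.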

The second attempt used the extraction operator (>>) to get the number directly into a double or integer variable.  Again, detecting an error and determining the type of error was difficult (and had not actually been achieved when this attempt was abandoned).  The resulting code was also complicated.

These attempts were made to try to eliminate the involved (but working) algorithm already in place to parse numbers.  The decision was made to keep the current algorithm, which was modified to read from a standard input stream.  The reading of the current character was changed to:
m_input[pos]      →      m_input.peek()
The original code incremented a local position when a character had been processed (one that will become part of the number string).  This position increment was replaced with pulling a character from the input stream and appending it to the local number string:
pos++;      →      number.push_back(m_input.get());
The various character type tests (to upper, is digit) were changed from the QChar functions to the standard ctype tests.  Once a possible valid number was parsed into the local string, if it didn't contain a decimal point or exponent, an attempt is made to convert it to an integer using the stoi function.  If successful, an integer number token is created from the local string and returned.  If an out-of-range exception is thrown, or if the string had a decimal point or exponent, an attempt is made to convert it to a double using the stod function.  If successful, a double number token is created from the local string and returned.  Another out-of-range exception results in a floating point out of range exception being thrown.
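The overall flow can be sketched as follows.  This is a heavily simplified stand-in, not the actual get number function: the scan only accepts digits and a single decimal point (no exponent, no error reporting), the NumberToken struct is a hypothetical replacement for the real token class, and std::range_error stands in for the parser's own floating point out of range error.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real token class.
struct NumberToken {
    bool isInteger;
    int intValue;
    double doubleValue;
    std::string text;
};

// Simplified sketch: an empty or malformed number lets the
// std::invalid_argument from stoi/stod propagate to the caller.
inline NumberToken getNumber(std::istream &input)
{
    std::string number;
    bool hasDecimalPoint = false;

    for (;;) {
        int c = input.peek();  // look at the current character
        if (std::isdigit(c)) {
            // pull the character from the stream and append it
            number.push_back(static_cast<char>(input.get()));
        } else if (c == '.' && !hasDecimalPoint) {
            hasDecimalPoint = true;
            number.push_back(static_cast<char>(input.get()));
        } else {
            break;  // not part of the number; leave it in the stream
        }
    }

    NumberToken token{false, 0, 0.0, number};
    if (!hasDecimalPoint) {
        try {
            token.intValue = std::stoi(number);
            token.isInteger = true;
            return token;
        } catch (const std::out_of_range &) {
            // too large for an integer; fall through and try a double
        }
    }
    try {
        token.doubleValue = std::stod(number);
    } catch (const std::out_of_range &) {
        throw std::range_error("floating point constant is out of range");
    }
    return token;
}
```

Because the characters are pulled from the stream as they are accepted, whatever follows the number is left in the stream for the next parsing function.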

The integer and double token constructors were modified to take standard string arguments.  For now, the standard string is converted for the QString member variable by obtaining a C-style string from it, which is implicitly converted to a QString.  This is temporary until the token string member is changed to a standard string.

[branch parser commit 8513875e33]