Saturday, October 25, 2014

Parser – Exceptions

When an error was detected, the parser set its internal error status to an enumerator for the error (either unknown token or a number error) and returned an error token with its column and length set to the location of the error.  To throw an exception instead, the status, column and length values need to be included in the exception thrown.  A simple Error structure was added to hold this information.  The parser routines were modified to throw an error exception when detecting an error, with the necessary values supplied for this structure:
throw Error {Status::Error, m_token->column(), m_token->length()};
The set error functions were removed along with the error status member and its access function.  Since the Error token type enumerator no longer indicates an error token, this enumerator was removed.  The table entries that used this enumerator were changed to the default token enumerator (which required the first enumerator to be set to 1).  A leftover check in the get identifier routine, which set the token string to "invalid two work command" for an error token type, was removed since this can no longer occur.
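
As a rough illustration of the idea, the following minimal sketch shows an error structure holding a status, column and length, and a routine that throws it.  The enumerator names and the toy scanDigits routine are assumptions made for this example and are not the project's actual code:

```cpp
#include <string>

// Hypothetical status enumerators; the real names may differ.
enum class Status { Good, UnknownToken, BadNumber };

// Simple aggregate carrying everything the caller needs to report an error.
struct Error {
    Status status;
    int column;
    int length;
};

// Toy routine (not the real parser): accepts only digits and throws an
// Error describing the first offending character.
int scanDigits(const std::string &input)
{
    int value = 0;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        char c = input[i];
        if (c < '0' || c > '9') {
            // aggregate initialization in the throw expression
            throw Error {Status::UnknownToken, static_cast<int>(i), 1};
        }
        value = value * 10 + (c - '0');
    }
    return value;
}
```

Because the structure is a plain aggregate, it can be built directly in the throw expression, and the caller receives all three values in a single object.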

The translator get token routine was modified to catch parser exceptions (using C++ try and catch blocks).  For no exception, the Good status enumerator is returned.  For an exception, an error token is created from the column and length in the error structure.  (The token constructor was modified to take an additional length argument and to use the C++ initializer syntax.)  The rest of the catch section remains the same except that the status in the error structure is returned.  The creation of an error token may be removed later if the translator is modified to use the exception model for handling errors.
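
The catch-and-convert pattern described above might look roughly like the following sketch.  The Token class, the parseOne stand-in for the parser, and the getToken signature are all hypothetical and simplified for illustration:

```cpp
#include <memory>
#include <string>

// Hypothetical status enumerators and error structure.
enum class Status { Good, UnknownToken, BadNumber };
struct Error { Status status; int column; int length; };

// Simplified token with a column and length constructor.
class Token {
public:
    Token(int column, int length) : m_column {column}, m_length {length} {}
    int column() const { return m_column; }
    int length() const { return m_length; }
private:
    int m_column;
    int m_length;
};

// Stand-in for the parser: throws for any input starting with '?'.
std::unique_ptr<Token> parseOne(const std::string &input)
{
    if (!input.empty() && input[0] == '?') {
        throw Error {Status::UnknownToken, 0, 1};
    }
    return std::unique_ptr<Token>(
        new Token {0, static_cast<int>(input.size())});
}

// Returns Good with the parsed token, or the error status with an error
// token built from the column and length carried by the exception.
Status getToken(const std::string &input, std::unique_ptr<Token> &token)
{
    try {
        token = parseOne(input);
        return Status::Good;
    }
    catch (const Error &error) {
        token.reset(new Token {error.column, error.length});
        return error.status;
    }
}
```

Callers that still expect a status and an error token keep working, which is why the exception can be confined to the boundary between parser and translator for now.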

The tester parse input routine was also modified to catch parser exceptions.  The while loop was replaced with a forever loop and the more flag removed.  For no exception, the print token routine is called.  The routine continues with the next token unless an End-of-Line token was returned.  For an exception, the print error routine is called directly, and the routine returns immediately.  The use of exceptions in the parse input routine allowed some simplifications in these print routines.
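
The loop structure described above could be sketched as follows.  The toy Parser (which splits words on spaces and throws at a '?' character), the token layout, and the output format are assumptions for this example only; the output is collected in a string so that it is easy to check:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical status enumerators and error structure.
enum class Status { Good, UnknownToken, BadNumber };
struct Error { Status status; int column; int length; };

enum class Type { Word, EndOfLine };
struct Token { Type type; std::string text; };

// Toy parser: returns space-separated words, throws at a '?' character.
class Parser {
public:
    explicit Parser(const std::string &input) : m_input {input}, m_pos {0} {}

    Token operator()()
    {
        while (m_pos < m_input.size() && m_input[m_pos] == ' ') {
            ++m_pos;
        }
        if (m_pos >= m_input.size()) {
            return Token {Type::EndOfLine, ""};
        }
        if (m_input[m_pos] == '?') {
            throw Error {Status::UnknownToken, static_cast<int>(m_pos), 1};
        }
        std::size_t start = m_pos;
        while (m_pos < m_input.size() && m_input[m_pos] != ' '
                && m_input[m_pos] != '?') {
            ++m_pos;
        }
        return Token {Type::Word, m_input.substr(start, m_pos - start)};
    }

private:
    std::string m_input;
    std::size_t m_pos;
};

// Forever loop with no "more" flag: the routine returns after printing an
// End-of-Line token or directly from the catch block after an error.
std::string parseInput(const std::string &input)
{
    std::ostringstream out;
    Parser parse {input};
    for (;;) {
        try {
            Token token = parse();
            if (token.type == Type::EndOfLine) {
                out << "[EOL]";
                return out.str();
            }
            out << "[" << token.text << "] ";
        }
        catch (const Error &error) {
            out << "error at column " << error.column
                << ", length " << error.length;
            return out.str();
        }
    }
}
```

With both exits expressed as return statements, no loop condition or flag variable is needed.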

The print token routine previously had an error status argument.  If the token type was an error, then the error status was passed to the print error routine with the token column and length.  This check and call were removed, and so was the error status argument.  The tab argument was always true, so it was also removed.  The column, length and status arguments of the print error function were replaced with an error structure reference argument.  This required creating a temporary error structure in the translate input and encode input routines (perhaps later the translator will be modified to throw an error and the program module modified to return an error structure).
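
A sketch of the changed interface, assuming a hypothetical caret-marker output format (the real print error routine's output is not shown here), with a caller that builds the temporary error structure from separate values:

```cpp
#include <cstddef>
#include <string>

// Hypothetical status enumerators and error structure.
enum class Status { Good, UnknownToken, BadNumber };
struct Error { Status status; int column; int length; };

// Takes one error structure reference instead of separate column, length
// and status arguments.  The marker format is invented for this example.
std::string printError(const Error &error)
{
    std::string marker(static_cast<std::size_t>(error.column), ' ');
    marker.append(static_cast<std::size_t>(error.length), '^');
    return marker;
}

// A caller that only has a status and token position creates a temporary
// error structure at the call site.
std::string reportTokenError(Status status, int column, int length)
{
    return printError(Error {status, column, length});
}
```

The temporary binds to the const reference parameter, so callers that do not yet use exceptions need no extra variable.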

It should be noted that the token created at the beginning of the parser function operator routine is left allocated when an exception is thrown.  Previously, when the token contained an error, it was moved to the caller in the same way as a good token.  This is not a problem because the parser will go out of scope once the translator handles the error and returns.  It may be possible to change the parser routines to only create a token when a good token is found.  This will be considered as the parser is changed to use the STL.
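
The ownership behavior can be illustrated with the following sketch, which assumes the token is held in a std::unique_ptr member (the actual member type is not stated above).  The live counter and the fail argument exist only to make the effect observable:

```cpp
#include <memory>
#include <utility>

// Hypothetical status enumerators and error structure.
enum class Status { Good, UnknownToken, BadNumber };
struct Error { Status status; int column; int length; };

// Token that counts live instances so the allocation can be observed.
struct Token {
    static int liveCount;
    Token() { ++liveCount; }
    ~Token() { --liveCount; }
    Token(const Token &) = delete;
    Token &operator=(const Token &) = delete;
};
int Token::liveCount = 0;

class Parser {
public:
    std::unique_ptr<Token> operator()(bool fail)
    {
        m_token.reset(new Token);   // token created up front
        if (fail) {
            // m_token is still owned by the parser at this point
            throw Error {Status::UnknownToken, 0, 1};
        }
        return std::move(m_token);  // good token is moved to the caller
    }
private:
    std::unique_ptr<Token> m_token;
};

// Returns the number of live tokens right after a failed parse, while the
// parser object still exists.
int tokensHeldAfterError()
{
    Parser parse;
    try {
        parse(true);
    }
    catch (const Error &) {
        return Token::liveCount;
    }
    return -1;
}
```

Under this assumption the token stays allocated only as long as the parser object does; it is released when the parser is destroyed, so nothing leaks.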

[branch parser commit 14265956f7]

Parser Errors – Removed Data Type

When the parser returned an error, it set the data type of the token to Double to indicate a number error or None for an unknown token.  This was necessary since a number error could be returned when the parser was in operator state.  This no longer occurs after the last change as the unknown token error is returned if the parser finds a character of a number when numbers are not allowed, so the error status alone can be used to determine the error type.  Setting of the token data type for errors was removed.  This reduces the amount of data to send back with an exception.

When the parser returned an error, the get token function of the translator returned the special Parser status enumerator.  The translator routines used this enumerator along with the token data type to determine if the error was an unknown token or a number error.  The parser error type can now be determined directly from the error status from the parser, so the get token function was changed to return this status instead of the Parser enumerator.
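
The difference between the two approaches can be sketched as follows.  The enumerator and function names are hypothetical; only the decision logic is taken from the description above:

```cpp
// Hypothetical enumerators for illustration.
enum class DataType { None, Double };
enum class OldStatus { Good, Parser };
enum class Status { Good, UnknownToken, BadNumber };

// Before: a generic Parser status had to be combined with the token data
// type (None for unknown token, Double for a number error).
Status classifyOld(OldStatus status, DataType dataType)
{
    if (status != OldStatus::Parser) {
        return Status::Good;
    }
    return dataType == DataType::None
        ? Status::UnknownToken : Status::BadNumber;
}

// After: the returned status is already specific, so one comparison is
// enough and no data type needs to accompany the error.
bool isUnknownToken(Status status)
{
    return status == Status::UnknownToken;
}
```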

The checks in the rest of the translator routines for the Parser enumerator and None data type were changed to just check for unknown token.  The check in the get operand routine to set the token length to 1 for non-references when there was a parser error was removed since this is the only possible error.  When getting the token after an argument in the process internal function routine, the check for a unary operator (an error) was moved to before the check for an error, which was changed to return all errors except unknown token.  When this token is not a comma or closing parenthesis, the appropriate error is determined for the error or bad token.

The special Parser status enumerator was no longer used, so it was removed.  This concludes all the prep work for changing parser errors into exceptions.

[branch parser commit ae9e97696e]