An issue was discovered where, if the token after an operand was a unary operator (this token should be a binary operator or a token terminating the expression), the process operator routine treated the unary operator as a binary operator, causing the code to malfunction.
A check was added after getting this token: if it is a unary operator, the "expected binary operator or end-of-statement" error is returned. This error, like the "expected operator or end-of-statement" error, will also need to be changed to an appropriate error by the caller. A test for this was added to expression test #1.
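The added check can be pictured with a small sketch. The names used here (TokenKind, Status, checkTokenAfterOperand) are hypothetical and are not the translator's actual identifiers; the sketch only illustrates rejecting a unary operator at the point where a binary operator or a terminating token is required.

```cpp
#include <cassert>

// Hypothetical token categories; the real translator has its own token codes.
enum class TokenKind { Operand, UnaryOperator, BinaryOperator, EndOfStatement };

// Hypothetical status codes standing in for the translator's error statuses.
enum class Status { Good, ExpectedBinaryOperatorOrEnd };

// Validate the token obtained right after an operand.  A unary operator is
// rejected here so that it never reaches the operator processing step, which
// assumes it is handed a binary operator.  Other invalid tokens are assumed
// to be caught elsewhere and are not modeled in this sketch.
Status checkTokenAfterOperand(TokenKind kind)
{
    if (kind == TokenKind::UnaryOperator) {
        return Status::ExpectedBinaryOperatorOrEnd;
    }
    return Status::Good;
}
```

The caller would then replace the returned status with one that fits its own context, as described above.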
[commit 261c9647df]
Tuesday, July 2, 2013
New Translator – Parser Errors
There are currently two types of parser errors: an unrecognizable character and an incorrect number constant (of which there are five different ones). The number constant errors may have an alternate column (for example, when there is an error in the exponent of a floating point number, the alternate column points to the error and the column points to the beginning of the number) or a length of more than one (for example, two consecutive decimal points at the beginning).
When the translator is expecting an operand, the number constant errors should be reported as is (for example, pointing to the bad exponent or both decimal points). However, when the translator is expecting some other token (like an operator), a different error appropriate for the situation (for example, an "expecting an operator or end-of-statement" error) should be reported, and it should point to only the first character of the bad token.
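As a rough illustration of where the error should point in each case, here is a minimal sketch. NumberError, Span, reportedSpan and marker are hypothetical names, and the use of -1 for "no alternate column" is an assumption made for this example, as is applying the length at the alternate column.

```cpp
#include <cassert>
#include <string>

// Hypothetical record of a number constant error.
struct NumberError {
    int column;     // start of the number constant
    int altColumn;  // position of the actual problem, or -1 if there is none
    int length;     // number of characters the error covers
};

// The part of the input line that the reported error points to.
struct Span {
    int column;
    int length;
};

// Choose what to point at: the error as is when an operand is expected,
// otherwise only the first character of the bad token.
Span reportedSpan(const NumberError &error, bool expectingOperand)
{
    if (!expectingOperand) {
        return {error.column, 1};
    }
    int column = error.altColumn >= 0 ? error.altColumn : error.column;
    return {column, error.length};
}

// Build a line of carets that can be printed under the input line.
std::string marker(const Span &span)
{
    return std::string(span.column, ' ') + std::string(span.length, '^');
}
```

For a number starting at column 4 with a bad exponent at column 8, this points at column 8 when an operand is expected and at column 4 otherwise.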
The get token routine was modified to take a desired data type argument instead of an operand flag. A none data type indicates that the token desired is not an operand. If the token obtained has an error, then it is marked as unused (it needs to be deleted). If not getting an operand token and the data type of the token is double (indicating a number constant error), then the token length is set to one (the error will point to only the first character of the token). As for the error returned: if getting an operand token and the data type of the token is not double (indicating a parser error), the error is set to the appropriate "expecting XX expression" error for the desired data type argument; otherwise a generic parser error is returned (so the caller will use the number constant error message in the token).
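The decision logic just described might look roughly like the following. This is only a sketch under stated assumptions: the Token fields, the Status values, and the function names are all invented for illustration, and the real routine also obtains the token from the parser, which is left out here.

```cpp
#include <cassert>

// Hypothetical data types; None means the desired token is not an operand.
enum class DataType { None, Double, Integer, String };

// Hypothetical status codes standing in for the translator's statuses.
enum class Status {
    Good,
    Parser,                     // generic: caller uses the message in the token
    ExpectedNumericExpression,
    ExpectedStringExpression
};

// Hypothetical, heavily reduced token.
struct Token {
    bool hasError;
    DataType dataType;  // Double on an error token means a number constant error
    int length;
    bool unused;        // set when the token must be deleted by the caller
};

// Map the desired data type to an "expecting XX expression" status.
Status expectedExpressionStatus(DataType desired)
{
    return desired == DataType::String ? Status::ExpectedStringExpression
                                       : Status::ExpectedNumericExpression;
}

// Post-process a token that was just obtained from the parser.
Status checkObtainedToken(Token &token, DataType desired)
{
    if (!token.hasError) {
        return Status::Good;
    }
    token.unused = true;  // error tokens are never put into the output

    const bool gettingOperand = desired != DataType::None;
    const bool numberError = token.dataType == DataType::Double;

    if (!gettingOperand && numberError) {
        token.length = 1;  // point to only the first character
    }
    if (gettingOperand && !numberError) {
        return expectedExpressionStatus(desired);
    }
    return Status::Parser;
}
```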
The loop in the get expression routine was modified to eliminate duplicate code before the loop and at the end of the loop. At the end of the loop, the token pointer is set to null to force getting a new token at the beginning of the loop (I wish there were a way to prevent having this extra check, but there isn't without resorting to a goto statement and a label). Each of the return statements was changed to a break statement, with the single final return placed after the loop.
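The shape of the restructured loop can be sketched as follows. This is a greatly simplified stand-in, not the actual routine: tokens are reduced to a hypothetical Kind value read from a vector, and the optional token parameter reflects an assumption that the caller may already hold the first token, which is why the null check at the top of the loop is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced token kinds and statuses.
enum class Kind { Operand, BinaryOperator, End };
enum class Status { Done, ExpectedOperand, ExpectedOperatorOrEnd };

// Return the next token, or an End token once the input is exhausted.
const Kind *getToken(const std::vector<Kind> &tokens, std::size_t &next)
{
    static const Kind end = Kind::End;
    return next < tokens.size() ? &tokens[next++] : &end;
}

Status getExpression(const std::vector<Kind> &tokens, std::size_t &next,
                     const Kind *token = nullptr)
{
    Status status = Status::Done;
    for (;;) {
        // Only fetch a token if one was not passed in or carried over.
        if (token == nullptr) {
            token = getToken(tokens, next);
        }
        if (*token != Kind::Operand) {
            status = Status::ExpectedOperand;
            break;  // was a return statement
        }
        // Get the token that follows the operand.
        token = getToken(tokens, next);
        if (*token == Kind::End) {
            status = Status::Done;
            break;
        }
        if (*token != Kind::BinaryOperator) {
            status = Status::ExpectedOperatorOrEnd;
            break;
        }
        token = nullptr;  // force getting a new token at the top of the loop
    }
    return status;  // single exit point after the loop
}
```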
Also in this routine, the statement for getting the operand and next token was broken into two so that the error can be changed to an "expecting operator or end-of-statement" error when get token returns an error (any parser error). This error will need to be changed to the appropriate error by the caller. For example, when an IF statement calls this routine to get its expression and this error occurs, the error needs to be changed to an "expected operator or THEN" error.
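A caller-side adjustment of this kind could be as small as the sketch below. The Status values and the function name are hypothetical; IF statement translation is not implemented in this post, so this only illustrates the intended remapping.

```cpp
#include <cassert>

// Hypothetical status codes standing in for the translator's statuses.
enum class Status {
    Good,
    ExpectedOperatorOrEnd,   // generic error set by the get expression routine
    ExpectedOperatorOrThen,  // error appropriate inside an IF statement
    ExpectedExpression
};

// Replace the generic expression error with the one that fits an IF
// statement; every other status is passed through unchanged.
Status adjustStatusForIf(Status expressionStatus)
{
    if (expressionStatus == Status::ExpectedOperatorOrEnd) {
        return Status::ExpectedOperatorOrThen;
    }
    return expressionStatus;
}
```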
In the new translate routine, when setting the error message in the RPN output list, the error string is obtained from the token for a parser error instead of from the error status. This also needed to be done before an unused token is deleted (tokens with parser errors are marked as unused).
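The ordering constraint (read the message from the token first, delete the token second) can be shown with a short sketch. All names and message strings here are placeholders invented for the example, and std::unique_ptr is used only to make the deletion explicit; the actual code may manage tokens differently.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical status codes and a heavily reduced token.
enum class Status { Parser, ExpectedOperatorOrEnd };

struct Token {
    std::string message;  // error text stored by the parser
    bool unused;          // true if the token must be deleted
};

// Placeholder message table; the real strings live elsewhere.
std::string statusMessage(Status status)
{
    switch (status) {
    case Status::ExpectedOperatorOrEnd:
        return "expected operator or end-of-statement";
    case Status::Parser:
        break;
    }
    return "parser error";
}

// Pick the error message, then delete the token if it is unused.  The
// message must be copied out of the token before the token is deleted.
std::string takeErrorMessage(Status status, std::unique_ptr<Token> &token)
{
    std::string message = (status == Status::Parser && token)
        ? token->message
        : statusMessage(status);
    if (token && token->unused) {
        token.reset();
    }
    return message;
}
```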
Finally, the set operand state function of the parser was removed. It was only ever called immediately before the token routine, so an operand state argument was added to the token routine instead. Several parser error tests were added to expression test #1. It was also discovered that the autoenums.h include file was not always being regenerated when the Token class source file was modified, which turned out to be caused by a wrong dependency listed in the CMake build file.
[commit a194cdbfb6] [commit 348ec2230b]