Wednesday, October 22, 2014

Parser – Operand State

When the translator is expecting an operator, both types of parser errors (unrecognizable character or numeric constant) are treated the same, and the error reported is appropriate for the situation (expecting an operator, comma, closing parenthesis, etc.).  However, when the translator is expecting a non-reference operand, the unrecognizable character error is reported as some type of expecting expression error, and numeric constant errors are reported as is (one of the six).  For reference operands, the translator reports the appropriate expecting variable error.

A while ago, an operand state was added to the parser so that negative constants would be interpreted correctly instead of as the unary negate operator followed by a positive constant.  The get number function checked the operand state when a '-' appeared at the beginning of the number.  In operator state, the function terminated at the '-', indicating that a numeric constant was not found, and the '-' would then be interpreted as an operator by the get operator function.

To simplify checking for parser errors in the translator and ease the transition to exceptions, the parser was modified to return a single error ("unknown token" replacing "unrecognizable character") when in operator state, since this error now includes unexpected numeric constants.  This error will not be seen by the user; it is used by the translator to report the appropriate error.  In operator state, there is no reason to parse a numeric constant.

The get number function is now only called for the operand state, so this function no longer needs to check for the operand state itself.  Since the operand state variable is now only used in the main operator function, it no longer needs to be a member.  The boolean argument was also changed to an enumeration class (named State with values of Operator and Operand), so the values are more explicit in calls to the parser operator function.
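The arrangement described above can be sketched roughly as follows.  This is a minimal illustration and not the project's actual code: only the State enumeration and its Operator and Operand values come from the description, while Kind, Token, getNumber, getOperator and getToken are hypothetical names, and the number scanner only handles plain integer digits.

```cpp
#include <cctype>
#include <string>

// From the post: an enumeration class replaces the old boolean argument.
enum class State { Operator, Operand };

// Hypothetical token representation, for illustration only.
enum class Kind { Constant, Operator, UnknownToken };

struct Token {
    Kind kind;
    std::string text;
};

// Scans an optional leading '-' followed by digits starting at pos.
// It is only called in operand state, so it needs no state check itself.
static bool getNumber(const std::string &input, std::size_t pos, Token &token)
{
    std::size_t end = pos;
    if (end < input.size() && input[end] == '-')
        ++end;
    const std::size_t firstDigit = end;
    while (end < input.size()
           && std::isdigit(static_cast<unsigned char>(input[end])))
        ++end;
    if (end == firstDigit)
        return false;  // no digits found, so this is not a number
    token = Token{Kind::Constant, input.substr(pos, end - pos)};
    return true;
}

// Recognizes a small, made-up set of single-character operators.
static bool getOperator(const std::string &input, std::size_t pos, Token &token)
{
    if (pos < input.size()
        && std::string("+-*/=").find(input[pos]) != std::string::npos) {
        token = Token{Kind::Operator, std::string(1, input[pos])};
        return true;
    }
    return false;
}

// The state argument has no default, so every caller must choose a state.
Token getToken(const std::string &input, std::size_t pos, State state)
{
    Token token{Kind::UnknownToken, ""};
    if (state == State::Operand && getNumber(input, pos, token))
        return token;  // numbers are only parsed in operand state
    if (getOperator(input, pos, token))
        return token;
    return Token{Kind::UnknownToken, ""};  // single error for everything else
}
```

With this sketch, getToken("-5", 0, State::Operand) yields the constant "-5", the same call with State::Operator yields the operator "-", and getToken("5", 0, State::Operator) yields the unknown token error.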

The default value for this state argument was also removed, which required an argument value to be added to the Tester class call that previously used the default (incorrectly, the operator state).  The operand state is now used there so that numerical constants are parsed (since numerical constants are no longer parsed in operator state).

These changes caused a problem in the translator where the wrong error was reported when assigning to a numeric constant (for example, 2=A).  Since the translator requested a command token in operator state, the parser was now returning an unknown token error for the numeric constant.  Previously, a valid non-command token (the numeric constant) was returned and passed to the LET translate routine, which reported the appropriate error ("expected item for assignment").  Now, with an error token returned, the "expected command" error was reported instead.

The get commands function was modified by adding the Any data type argument to the get token call, which puts the parser in operand state.  This works because any token that is not a command token is passed to the LET translate routine, which reports an error for any non-reference token.  The check for an error from this get token call is no longer necessary since error tokens are also passed to the LET translate routine (which reports the appropriate error).
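The effect of the state on the reported error can be illustrated with a toy model.  Everything here is an assumption made for illustration (the firstToken classifier, the translateLet and getCommand names, and the use of PRINT as the only command), and the state is passed directly instead of through a data type argument; only the two error messages and the 2=A example come from the description above.

```cpp
#include <cctype>
#include <string>

enum class State { Operator, Operand };
enum class Kind { Command, Variable, Constant, UnknownToken };

// Toy classifier for the first token of a line (hypothetical).
// A leading digit is only a constant when the parser is in operand state.
Kind firstToken(const std::string &line, State state)
{
    if (line.empty())
        return Kind::UnknownToken;
    const unsigned char c = static_cast<unsigned char>(line[0]);
    if (std::isdigit(c))
        return state == State::Operand ? Kind::Constant : Kind::UnknownToken;
    if (line.rfind("PRINT", 0) == 0)  // line starts with PRINT
        return Kind::Command;
    if (std::isalpha(c))
        return Kind::Variable;
    return Kind::UnknownToken;
}

// Stand-in for the LET translate routine: anything that is not a
// reference (here, a variable) cannot be assigned to.
std::string translateLet(Kind first)
{
    if (first != Kind::Variable)
        return "expected item for assignment";
    return "";  // no error
}

// Stand-in for the get commands function.  In operator state it models
// the behavior before the fix, where the error token was checked first.
std::string getCommand(const std::string &line, State state)
{
    const Kind first = firstToken(line, state);
    if (first == Kind::Command)
        return "";  // would dispatch to the command's translate routine
    if (state == State::Operator && first == Kind::UnknownToken)
        return "expected command";  // the wrong error for 2=A
    return translateLet(first);     // LET reports the appropriate error
}
```

In this model, getCommand("2=A", State::Operator) returns "expected command", while getCommand("2=A", State::Operand) returns "expected item for assignment".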

These changes affected a result for parser test #3 (number tests) with the '-2147483647' number, which was being parsed as a negate operator followed by a positive integer.  This test was intended to check for the maximum negative integer constant, which it now does since the operand state is being used for testing.  Two additional numbers were added to check one beyond both the maximum positive and negative constants (both are parsed as double constants).  The results for parser test #5 were also updated for the change in the "unknown error" status.
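The fallback from integer to double constants can be sketched as below.  This is an assumption about how such a check might look, not the parser's actual code: classify and NumberType are hypothetical names, and the limits are taken from a 32-bit two's complement integer, which may not match the exact limits the parser uses at the negative end.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

enum class NumberType { Integer, Double };

// Classifies the text of a numeric constant (hypothetical helper).
// Whole numbers that fit in 32 bits are integers; everything else,
// including whole numbers that are out of range, becomes a double.
NumberType classify(const std::string &text)
{
    errno = 0;
    char *end = nullptr;
    const long long value = std::strtoll(text.c_str(), &end, 10);
    const bool wholeNumber = end != text.c_str() && *end == '\0'
        && errno != ERANGE;
    if (wholeNumber
        && value >= std::numeric_limits<std::int32_t>::min()
        && value <= std::numeric_limits<std::int32_t>::max())
        return NumberType::Integer;
    return NumberType::Double;
}
```

Under these assumptions, "2147483647" and "-2147483647" classify as integers, while "2147483648" and "-2147483649" classify as doubles.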

[branch parser commit 04c4fea820]