Saturday, January 3, 2015

Parser – Parentheses Token Handling

With the forthcoming change from the token type and code enumerations to the code type enumeration, it will be advantageous if all similar codes have the same code type.  Table flags will be used for command, operator and function codes since some will need their own code type (for example the LET command and equal operator codes).

The codes being discussed are the codes with operands: constants, variables, arrays, defined functions and user functions.  Each has a separate code for each of the three data types, and each (except constants) has an additional three reference codes, one for each data type.  Currently only constants and variables are fully implemented.  For variables, there will be a single Variable code type for all six of its codes.  The translator will only need to check this single code type for a variable code.
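As a rough illustration of the single code type idea, the following C++ sketch maps several codes onto one code type.  All of the names here (Code, CodeType, codeType, isVariable and the enumerators) are made up for this example and are not the actual enumerators in the project.

```cpp
// Hypothetical enumerations; the real ones are not shown in this post.
enum class CodeType { Constant, Variable };

// Three constant codes (one per data type) and six variable codes
// (three value codes plus three reference codes), matching the
// counts described above.
enum class Code {
    ConstDbl, ConstInt, ConstStr,
    VarDbl, VarInt, VarStr,
    VarRefDbl, VarRefInt, VarRefStr
};

// Map each code to its code type so that a caller can test one
// value instead of comparing against every individual code.
inline CodeType codeType(Code code)
{
    switch (code) {
    case Code::ConstDbl:
    case Code::ConstInt:
    case Code::ConstStr:
        return CodeType::Constant;
    default:
        return CodeType::Variable;
    }
}

// The kind of single check the translator would make.
inline bool isVariable(Code code)
{
    return codeType(code) == CodeType::Variable;
}
```

With a mapping like this, the translator does not need to know which of the six variable codes it has, only that the code type is Variable.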

There was a distinction between token types with parentheses and without (internal functions, defined functions, and generic tokens).  For defined and internal functions, this distinction was removed (the token type enumerators were combined for each).  For internal functions, instead of checking the token type to determine if there are parentheses, the number of operands is now checked.  Eventually defined functions (and later user functions) will have a similar check as the number of operands will be stored in their associated dictionaries.
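The operand count check can be pictured with a small C++ sketch.  The FunctionEntry structure, the functionTable and expectsParentheses functions, and the two sample entries are assumptions made for illustration only; the real table has a different layout.

```cpp
#include <map>
#include <string>

// Hypothetical table entry: only the operand count matters here.
struct FunctionEntry {
    int operandCount;
};

// Illustrative entries only: one function with an operand and one
// without.  These are not taken from the actual table.
inline const std::map<std::string, FunctionEntry> &functionTable()
{
    static const std::map<std::string, FunctionEntry> table = {
        {"ABS", {1}},
        {"RND", {0}}
    };
    return table;
}

// Instead of a separate "with parentheses" token type, decide from
// the number of operands whether parentheses are expected.
inline bool expectsParentheses(const std::string &name)
{
    auto it = functionTable().find(name);
    return it != functionTable().end() && it->second.operandCount > 0;
}
```

The same test would apply to defined and user functions once their dictionaries hold the number of operands.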

The parser get word helper function was modified to only check if an open parenthesis is present and add it to the identifier string as before, but not take it from the input stream.  The parenthesis is still needed for functions when searching the table.  The get identifier function already removed the parenthesis if a table entry was not found, but was modified to remove the parenthesis from the input for internal functions only.  A new get parentheses access function was added to check if the next character in the stream is an open parenthesis, remove it if so, and return whether one was found.
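The peek versus consume behavior described above might look something like the following C++ sketch.  The Input class and its member names (getWord, peekParen, getParen, remaining) are hypothetical stand-ins, not the parser's actual interface.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for the parser's input handling.
class Input
{
public:
    explicit Input(std::string text) : m_text(std::move(text)), m_pos(0) {}

    // Report whether the next character is '(' without consuming it.
    bool peekParen() const
    {
        return m_pos < m_text.size() && m_text[m_pos] == '(';
    }

    // Consume a '(' if it is next; return whether one was found.
    bool getParen()
    {
        if (!peekParen())
            return false;
        ++m_pos;
        return true;
    }

    // Read an identifier.  A following '(' is appended to the word so
    // that a table search can use it, but it is left in the input.
    std::string getWord()
    {
        std::string word;
        while (m_pos < m_text.size()
                && std::isalnum(static_cast<unsigned char>(m_text[m_pos])))
            word += m_text[m_pos++];
        if (peekParen())
            word += '(';
        return word;
    }

    // Text that has not been consumed yet.
    std::string remaining() const { return m_text.substr(m_pos); }

private:
    std::string m_text;
    std::size_t m_pos;
};
```

For the input ABS(X), getWord in this sketch returns ABS( while (X) remains in the input, and a later getParen call removes the open parenthesis.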

The translator get operand function was modified accordingly.  For the single internal function token type, the process internal function routine is only called if the function has operands.  For the single defined function token type, the process parentheses token routine is only called if the next character in the parser is a parenthesis (the new get parentheses function is called to remove it).  For the generic parentheses token type (array or user function), the get parentheses function is called to remove the parenthesis.
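The three cases can be summarized in a C++ sketch of the dispatch.  Everything here is hypothetical: the TokenType, Token, ParserStub and Action names are invented for the example, and the step taken after removing the parenthesis for the generic case is an assumption.

```cpp
#include <string>

// Hypothetical token types after the enumerators were combined.
enum class TokenType { InternalFunction, DefinedFunction, Paren };

struct Token {
    TokenType type;
    int operandCount;  // only meaningful for internal functions here
};

// Minimal stand-in for the parser: consumes a leading '(' if present.
struct ParserStub {
    std::string input;

    bool getParen()
    {
        if (input.empty() || input[0] != '(')
            return false;
        input.erase(0, 1);
        return true;
    }
};

enum class Action { None, ProcessInternalFunction, ProcessParenToken };

// Decide which routine the get operand function would call next.
inline Action getOperandAction(const Token &token, ParserStub &parser)
{
    switch (token.type) {
    case TokenType::InternalFunction:
        // The parser already removed the '(' for internal functions.
        return token.operandCount > 0
            ? Action::ProcessInternalFunction : Action::None;
    case TokenType::DefinedFunction:
        // Only process operands if a '(' was present (and removed).
        return parser.getParen() ? Action::ProcessParenToken : Action::None;
    case TokenType::Paren:
        // Array or user function: remove the '('.  Processing it as a
        // parentheses token afterwards is an assumption of this sketch.
        parser.getParen();
        return Action::ProcessParenToken;
    }
    return Action::None;
}
```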

The two separate table entries for defined functions with and without parentheses remain for now with their associated code enumerators and are used to determine if parentheses are present.  Once defined functions are fully implemented, these two codes will be replaced with six codes (as described above) and the defined function dictionary will contain whether there should be parentheses (if there are operands).

The test token stream inserter and print token functions were modified for the change in token type enumerators.  The token has parentheses access function was only used by the print token function, which used a static map member.  The print token function was modified to not require this access function, so it and the static map member were removed.  There was also a static map member for precedence.  Since all tokens now have a code assigned, all precedences can be obtained from the table, so this static member and its access function were also removed.  The expected parser results for tests #2, #4 and #5 were updated for the change in the token types.

[branch table commit 3dd768f604]