Sunday, November 30, 2014

Translator – Better Token Handling

A common pattern in the translator functions was a token pointer reference argument through which a token was passed into and out of the function.  The token returned was the next token that the function could not process (a terminating token).  Several of the functions allowed an unset token pointer, in which case they would get a token themselves; otherwise they would use the token passed in.  This was a strange pattern and somewhat difficult to work with.

The translator functions were changed to a better pattern: a token pointer member was added to the translator class to hold the current token, and the translator functions were modified to use this member.  The token is moved out of this member variable when successfully processed (consumed).  The token pointer reference argument was removed from the translator functions.

The get token function was modified to put the token obtained from the parser into this new member, and the token argument was removed.  If the current token member already has a token, then no action is taken.

Once modified, the process command function was reduced to only a few lines, and since it was only called once, its code was moved into the get commands function.  The command token and token arguments were removed from the LET, PRINT and INPUT translate functions, which were modified to get the current token from the translator.  Several access functions for the current token were added to the translator: getting a constant reference to the token pointer member, resetting the token pointer member, and moving the token pointer out of the member.  The latter two force a new token to be obtained upon the next get token function call.
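The current token pattern described above can be sketched roughly as follows.  This is only a minimal illustration, not the actual translator code: the Token and Parser stand-ins and the names getToken, token, resetToken, moveToken and consumeAll are assumptions made for the example.

```cpp
#include <cassert>
#include <memory>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-ins; the real token and parser classes are richer.
struct Token {
    std::string text;
};
using TokenPtr = std::shared_ptr<Token>;

class Parser {
public:
    explicit Parser(std::queue<std::string> words) : m_words{std::move(words)} {}

    // Returns the next token, or an empty pointer when the input is exhausted.
    TokenPtr operator()() {
        if (m_words.empty()) {
            return TokenPtr{};
        }
        auto token = std::make_shared<Token>(Token{m_words.front()});
        m_words.pop();
        return token;
    }

private:
    std::queue<std::string> m_words;
};

class Translator {
public:
    explicit Translator(Parser parser) : m_parser{std::move(parser)} {}

    // Obtains a token from the parser only if no current token is held.
    void getToken() {
        if (!m_token) {
            m_token = m_parser();
        }
    }

    // The three access functions (names are assumed for this sketch).
    const TokenPtr &token() const { return m_token; }
    void resetToken() { m_token.reset(); }
    TokenPtr moveToken() { return std::move(m_token); }

private:
    Parser m_parser;
    TokenPtr m_token;  // current token; empty once consumed
};

// Consumes every token and joins the token text, separated by spaces.
std::string consumeAll(Translator &translator) {
    std::string text;
    for (;;) {
        translator.getToken();
        if (!translator.token()) {
            break;  // parser exhausted
        }
        TokenPtr token = translator.moveToken();
        assert(!translator.token());  // member is empty after the move
        text += token->text;
        text += ' ';
    }
    return text;
}
```

Calling getToken twice without consuming the token leaves the same token in place, while resetToken or moveToken empties the member so that the next getToken call goes back to the parser.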

The get operand function was modified to leave the current token pointer member empty if a valid operand was processed (added to the output list and pushed to the done stack).  This forces the next call to the get token function to obtain a new token from the parser.

For sub-string assignments, the get operand function previously set the reference flag of the token if the token was a sub-string function (LEFT$, MID$, or RIGHT$) and a reference was requested.  The process internal function routine used this flag to identify a sub-string assignment and to request a string variable for the first argument of the function.  Upon return, this reference flag was cleared.  This reference flag toggling was replaced by passing the reference enumerator as an argument.
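The idea of passing the reference request as an argument, instead of toggling a flag on the token, might look something like the sketch below.  The Reference enumerator values and the function names here are assumptions for illustration and do not necessarily match the actual code.

```cpp
#include <cassert>
#include <string>

// Hypothetical reference enumerator; the real one may have more values.
enum class Reference { None, Variable };

// True for the sub-string functions that may appear in an assignment.
bool isSubStringFunction(const std::string &name) {
    return name == "LEFT$" || name == "MID$" || name == "RIGHT$";
}

// Decides what to request for the first argument of an internal function.
// The request arrives as an argument, so no token state is set or cleared.
Reference firstArgumentReference(const std::string &functionName,
                                 Reference requested) {
    if (requested != Reference::None && isSubStringFunction(functionName)) {
        return Reference::Variable;  // sub-string assignment: need a variable
    }
    return Reference::None;  // ordinary expression argument
}

// Small demonstration of the three cases.
void demonstrateReferenceArgument() {
    assert(firstArgumentReference("MID$", Reference::Variable)
           == Reference::Variable);
    assert(firstArgumentReference("MID$", Reference::None)
           == Reference::None);
    assert(firstArgumentReference("LEN", Reference::Variable)
           == Reference::None);
}
```

Because the request travels with the call, there is nothing to clear upon return and the token itself is left untouched.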

There was some code in the get operand function that set the code of the token to the Define Function with Parentheses code enumerator.  These statements were moved to the process parentheses token function, where the similar statements for setting the Array and Function codes reside.

Some other minor changes were made, including changing the RPN list token access function to return a reference to the token pointer instead of the token pointer itself (to avoid copying the token pointer and updating the shared pointer use count), reorganizing some code in the various routines, renaming some local token variables, and updating comments for the changes made.
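The effect of returning a reference to the token pointer rather than a copy can be seen with a small sketch.  The RpnItem class and its member names below are assumed for the example (and it assumes the token pointer is a std::shared_ptr); only the use count behavior is the point.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Token {
    std::string text;
};
using TokenPtr = std::shared_ptr<Token>;

// Hypothetical RPN list item holding a shared token pointer.
class RpnItem {
public:
    explicit RpnItem(TokenPtr token) : m_token{std::move(token)} {}

    // Returns a copy: the shared pointer use count is incremented.
    TokenPtr tokenCopy() const { return m_token; }

    // Returns a reference: no copy is made and the use count is unchanged.
    const TokenPtr &token() const { return m_token; }

private:
    TokenPtr m_token;
};

// Use count observed while a copy of the token pointer is held.
long useCountWithCopy(const RpnItem &item) {
    TokenPtr copy = item.tokenCopy();
    return copy.use_count();
}

// Use count observed while only a reference to the token pointer is held.
long useCountWithReference(const RpnItem &item) {
    const TokenPtr &reference = item.token();
    return reference.use_count();
}

// Small demonstration with a single item owning its token.
void demonstrateUseCounts() {
    RpnItem item{std::make_shared<Token>(Token{"A"})};
    assert(useCountWithCopy(item) == 2);       // item plus the copy
    assert(useCountWithReference(item) == 1);  // item only
}
```

With the reference version, reading a token from the list touches no reference counts at all, which matters when the access function is called frequently.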

[branch misc-cpp-stl commit 828f210f18]