Friday, January 2, 2015

Token – Header File Refactoring

With the new table model, the token class will contain a pointer to the table entry of the code instead of the code enumerator (currently used as an index).  The table header file currently includes the token header file because many table functions have a token pointer argument.  This is no longer necessary since only token pointers and references to token pointers are used, for which forward declarations are sufficient.

When the token class contains a pointer to the table entry, access to the table entry members will be optimal if the entry access functions are defined in-line.  Similarly, the token access functions that reach the table members indirectly through this pointer should also be defined in-line.  This will require the table header file to be included in the token header file.
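As a rough illustration of the in-line access described above, here is a minimal sketch.  The class names, members and the DataType enumeration are hypothetical stand-ins and are not taken from the actual table and token headers.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical data type enumeration (assumed for this sketch)
enum class DataType { None, Double, Integer, String };

// Hypothetical table entry with in-line access functions
class TableEntry {
public:
    TableEntry(std::string name, DataType returnDataType)
        : m_name{std::move(name)}, m_returnDataType{returnDataType} {}

    const std::string &name() const { return m_name; }
    DataType returnDataType() const { return m_returnDataType; }

private:
    std::string m_name;
    DataType m_returnDataType;
};

// The token holds a pointer to its table entry; because the entry class
// is fully defined above, these forwarding functions can be in-line too
class Token {
public:
    explicit Token(const TableEntry *entry) : m_entry{entry}
    {
        assert(entry != nullptr);
    }

    const std::string &name() const { return m_entry->name(); }
    DataType returnDataType() const { return m_entry->returnDataType(); }

private:
    const TableEntry *m_entry;
};
```

Since both levels of access are defined in the headers, a compiler is free to reduce a call like token.returnDataType() to a pair of pointer dereferences, which is why the token header needs the full table entry definition.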

Therefore, some header file refactoring was performed: the token header include statements were moved from all the other header files to the associated source files.  The exception is the table header file, which uses the token type enumeration.  When this enumeration is replaced with the code type enumeration, this include statement can be removed.  Forward declarations were added as needed.
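The forward declaration pattern can be sketched as follows.  This is a single-file illustration with hypothetical names (hasText is invented for the example); in the project the three parts would live in separate header and source files.

```cpp
#include <cassert>
#include <string>
#include <utility>

// --- what a header like the table header needs ---
class Token;  // forward declaration; no token header include required

class Table {
public:
    // Only a pointer to Token appears here, so the incomplete type is enough
    static bool hasText(const Token *token);
};

// --- what the token header provides ---
class Token {
public:
    explicit Token(std::string text) : m_text{std::move(text)} {}
    const std::string &text() const { return m_text; }

private:
    std::string m_text;
};

// --- what the table source file needs ---
// The complete Token type is required here because members are accessed,
// so this is where the token header include would go
bool Table::hasText(const Token *token)
{
    assert(token != nullptr);
    return !token->text().empty();
}
```

Moving the include to the source file means that a change to the token header no longer forces every file that includes the other header to be recompiled.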

[branch table commit 0777959464]

Parser – Token Creation

A side effect of the last change was that tokens for codes without operands (operators, functions and commands) and tokens for codes with operands (constants, variables, arrays, defined functions and user functions) were using the same token constructor.  This token constructor searched through alternate codes for the code with the appropriate return data type.

This was unnecessary for operator, function and command codes.  The table new token function called for these codes passed in the return data type from the table entry of the code.  The token constructor then called the new table set token code function.  Since the data type matched the return data type (which was just passed in), no alternates were checked and the code, type and data type of the token were set.  All of this was unnecessary work.

A new token constructor was added for operator, function and command codes, which only requires arguments for the code, column, length and string.  The string argument is only used for the REM and REM operator codes.  This constructor replaces the table new token function.  It calls the table set code function, which just sets the code, type and data type of the token from the table entry of the code.  For consistency, the code argument was put first in the other token constructor for codes with operands.

While looking at the creation of tokens, I decided that using the standard unique pointer within the parser was unnecessary.  The parser can just allocate a token and return its pointer.  The translator can then put the allocated tokens into a standard shared pointer.  The parser was changed to use plain token pointers.  The translator routines were changed to use the new token constructor directly via the standard make shared function.  The translator get operand was changed to use the reset function to set the token member since a plain pointer cannot be assigned directly to a shared pointer.
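The ownership hand-off described above might look something like this sketch.  The Token class, the Operand structure and the function names are simplified, hypothetical stand-ins; only std::shared_ptr, its reset function and std::make_shared are standard library facilities.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified, hypothetical token (the real class also holds a code)
class Token {
public:
    Token(int column, int length, std::string text = {})
        : m_column{column}, m_length{length}, m_text{std::move(text)} {}

    int column() const { return m_column; }
    int length() const { return m_length; }
    const std::string &text() const { return m_text; }

private:
    int m_column;
    int m_length;
    std::string m_text;
};

using TokenPtr = std::shared_ptr<Token>;

// Parser side: allocate a token and return a plain pointer;
// the caller is expected to take ownership
Token *parseWord(int column, const std::string &word)
{
    return new Token{column, static_cast<int>(word.size()), word};
}

// Hypothetical holder for the token member set by get operand
struct Operand {
    TokenPtr token;
};

// Translator side: "operand.token = parseWord(...)" would not compile
// because the shared pointer constructor taking a plain pointer is
// explicit, so reset() is used to take ownership instead
void getOperand(Operand &operand, int column, const std::string &word)
{
    operand.token.reset(parseWord(column, word));
    assert(operand.token != nullptr);
}

// Translator side: tokens it creates itself go through make_shared,
// which forwards its arguments to the token constructor
TokenPtr makeOperatorToken(int column, int length)
{
    return std::make_shared<Token>(column, length);
}
```

One design consequence of returning plain pointers is that the parser's caller must always wrap the result promptly; a token that is returned but never placed in a shared pointer would leak.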

[branch table commit 1bfb76ae0a]

Parser – Codes With Operands

The last tokens not being set fully in the parser were those for codes with operands (constants, variables, arrays, defined functions and user functions).  Constant tokens were corrected with the last change.  Arrays, defined functions and user functions are not fully implemented and so did not need to be changed.  Variables, however, were only partially set in the parser (only to the base Variable or Variable Reference code) and weren't set for the data type of the variable until the translator.

The parser get identifier function was modified to set the data type to Double if the word obtained from the input does not have a type.  This applies to all identifiers not found in the table.  The token constructor for codes is used for commands, operators, functions and codes with operands.  The type argument was unnecessary since that is set from the table entry.  However, an issue was found with how codes were found in the table.
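The defaulting rule at the start of the paragraph above could be sketched as below.  The function name and the suffix characters ('%' for integer, '$' for string) are assumptions based on common BASIC conventions and are not confirmed by the project source.

```cpp
#include <cassert>
#include <string>

// Hypothetical data type enumeration (assumed for this sketch)
enum class DataType { None, Double, Integer, String };

// Determine the data type of an identifier word from its last character.
// ASSUMPTION: '%' marks an integer and '$' marks a string; a word with
// no type suffix defaults to Double, as the parser now does.
DataType identifierDataType(const std::string &word)
{
    assert(!word.empty());
    switch (word.back()) {
    case '%':
        return DataType::Integer;
    case '$':
        return DataType::String;
    default:
        return DataType::Double;
    }
}
```

With the default applied here, later stages such as the translator never see an identifier token with a data type of None.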

For operators and functions, the [return] data type of the token is set from the table entry.  (This issue doesn't affect commands since commands don't have a return data type.)  For codes with operands, the data type of the identifier is used to find the appropriate table entry (for example, Variable, Variable Integer, or Variable String) by looking at the data types of alternate codes.  The current table set token code function did not work correctly because it searches alternate codes by operand data type.  For this instance, the alternate codes need to be searched by return data type.

A new set token code function was added without an operand index argument to search by return data type.  If the data type (of the identifier) does not match the return data type of the code passed, then the alternate codes are searched for a matching return data type.  If there are no alternates or none were found, then the code passed is set in the token along with the token type of the code.  The data type is set to the data type of the identifier and not from the table entry (which may not match for codes like arrays that are not fully implemented yet).
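A minimal sketch of searching alternates by return data type follows.  The structures, code names and function signature are hypothetical simplifications (the token type is omitted for brevity), and the real table model is more involved.

```cpp
#include <cassert>
#include <vector>

// Hypothetical enumerations (assumed for this sketch)
enum class DataType { None, Double, Integer, String };
enum class Code { Var, VarInt, VarStr, Array };

// Hypothetical table entry with a list of alternate entries
struct TableEntry {
    Code code;
    DataType returnDataType;
    std::vector<const TableEntry *> alternates;
};

// Hypothetical token holding only a code and a data type
struct Token {
    Code code;
    DataType dataType;
};

// Set the token code from the entry, or from the alternate whose return
// data type matches the data type of the identifier
void setTokenCode(Token &token, const TableEntry &entry, DataType dataType)
{
    const TableEntry *selected = &entry;
    if (dataType != entry.returnDataType) {
        for (const TableEntry *alternate : entry.alternates) {
            assert(alternate != nullptr);
            if (alternate->returnDataType == dataType) {
                selected = alternate;
                break;
            }
        }
    }
    // No alternates, or none matched: the code passed in is kept
    token.code = selected->code;
    // The identifier's data type is used, not the entry's, since the two
    // may differ for codes that are not fully implemented
    token.dataType = dataType;
}
```

In this sketch, a string identifier looked up with the base variable entry ends up with the string variable code, while an entry with no alternates keeps its own code but still records the identifier's data type.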

The type argument was removed from the token constructor for codes.  Previously, the type from the table entry of the code was passed in (the new set token code function now sets this).  A call to the new set token code function was added to the body of the constructor (previously empty).

Since codes for constants, variables, and variable references were found in the table incorrectly by operand data type, these table entries contained operand data types so that it would work.  These codes do not have operands in the sense that operators and functions do within expressions (not to be confused with the operand index that these codes do have in the program).  These table entries were corrected with expression info instances containing no operands.

The translator get operand previously set the default data type of the token just obtained (set to Double if None and not a function).  This was removed since the parser now does this.  The token set default data type function called to do this was removed.  The call to set the code for a no parentheses (variable) token was also no longer needed.  With the parser now setting the default data type to Double, the expected results for the parser tests (#2, #3 and #5) needed to be updated.

[branch table commit acc37f0650]