Saturday, August 24, 2013

Table Initialization – Expected Data Type

While reviewing the various To-Do entries marked in the code - words NOTE, TODO, and FIXME in comments that QtCreator highlights when the Todo plugin is enabled (Help/Plugins under Utilities) - there was a FIXME "remove" on a check in the table setup and check routine called during initialization.  This check was for an unset expression information structure for an associated code.

This check was in a loop that scans all the table entries and, for each entry that contains operands (operators or internal functions), sets the expected data type for the last operand of an operator or the first operand of a non-operator.  It does this by scanning the main code and all its associated codes, recording the data type expected for each.  After recording all the data types, if both double and integer were found, the expected data type is set to number, or if all data types (double, integer and string) were found, the expected data type is set to any.
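The combining step can be sketched roughly as follows.  This is only an illustration: the DataType enumeration, the bit constants and the expectedDataType function are hypothetical names, not the project's actual definitions, and the sketch returns None for combinations not described above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical names for illustration; the real enumeration may differ.
enum class DataType { None, Double, Integer, String, Number, Any };

// One bit per basic data type recorded during the scan.
constexpr unsigned DoubleBit  = 1u << 0;
constexpr unsigned IntegerBit = 1u << 1;
constexpr unsigned StringBit  = 1u << 2;

// Combine the data types recorded for a main code and its associated
// codes into a single expected data type.
DataType expectedDataType(const std::vector<DataType> &recorded)
{
    unsigned found = 0;
    for (DataType type : recorded) {
        switch (type) {
        case DataType::Double:  found |= DoubleBit;  break;
        case DataType::Integer: found |= IntegerBit; break;
        case DataType::String:  found |= StringBit;  break;
        default:
            // assumption: only the three basic types are ever recorded
            assert(false && "only basic data types are recorded");
            break;
        }
    }

    switch (found) {
    case DoubleBit:                          return DataType::Double;
    case IntegerBit:                         return DataType::Integer;
    case StringBit:                          return DataType::String;
    case DoubleBit | IntegerBit:             return DataType::Number;
    case DoubleBit | IntegerBit | StringBit: return DataType::Any;
    default:                                 return DataType::None;  // not covered above
    }
}
```

Using one bit per data type keeps the decision independent of the order in which the associated codes are scanned.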

However, the associated codes for the sub-string functions, which are set to the related assign sub-string function codes, should not be searched and therefore have the second associated code index set to -1.  The loop was not checking for a -1 index and proceeded to search these codes when it should not have, so a check for this index was added.  The check with the FIXME for an unset expression information structure was in fact necessary because some associated codes do not have this structure, specifically the input parse type codes.
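The two guards might look something like the sketch below.  The ExprInfo and TableEntry layouts, the member names and the associatedOperandTypes function are all assumptions made for illustration and are much simpler than the real table structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified layout; the real table structures differ.
struct ExprInfo {
    int operandType;              // data type expected for the operand
    std::vector<int> associated;  // table indexes of the associated codes
    int secondIndex;              // where the search starts, or -1 for no search
};

struct TableEntry {
    const ExprInfo *exprInfo;     // null when the code has no expression information
};

// Collect the operand data types of the associated codes to be searched.
std::vector<int> associatedOperandTypes(const std::vector<TableEntry> &table,
                                        int index)
{
    std::vector<int> types;
    const ExprInfo *info = table.at(index).exprInfo;
    if (info == nullptr || info->secondIndex < 0) {
        return types;             // -1 index: associated codes are not searched
    }
    assert(info->secondIndex <= static_cast<int>(info->associated.size()));

    for (std::size_t i = static_cast<std::size_t>(info->secondIndex);
         i < info->associated.size(); ++i) {
        const ExprInfo *other = table.at(info->associated[i]).exprInfo;
        if (other == nullptr) {
            continue;             // e.g. an input parse type code
        }
        types.push_back(other->operandType);
    }
    return types;
}
```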

While studying this code, it was discovered that the AssignList code contained associated codes for AssignListInt and AssignListStr.  This was used by the old translator routines, but not for the new translator routines because the AssignListType codes are now associated codes of the AssignType codes.  The unnecessary associated codes were removed.

[commit 07ec1e4c4f]

Old Translator Removal – Reference Flag Cleanup

The process done stack top routine (formerly the find code routine, see post from August 17) contained a section for the first operand of a sub-string function used in an assignment or of an assignment internal code token (determined by whether the sub-string token had the reference flag set). In that case, if the item on top of the done stack did not have its reference flag set, an “expected assignment item” error was reported. If the reference flag of the token was not set, then the reference flag of the item on top of the done stack was cleared (a reference is not needed). At the end of the routine, if the data type of the done stack top item was not correct or could not be converted, the reference flag state was used to determine which error to return.

This reference flag functionality is no longer needed in this routine since the checking of references is handled elsewhere in the new translator routines (specifically by using the reference argument of the get operand routine and by the get internal function routine for sub-string assignments). This code was removed, and as a result, the INPUT and LET translate routines no longer need to set the reference flag before calling this routine (indirectly via the process final operand routine in the case of the INPUT translate routine) and clear it afterward.

To simplify the code a bit more with respect to the reference flag, a change was made for sub-string assignments. The reference flag is set for a sub-string function token in the get operand routine, and the get internal function routine uses it to determine if a string reference should be requested for the first operand. The reference flag is now cleared upon return since the reference status is no longer needed (a reference was already obtained).

Old Translator Removal – Process Final Operand

For the old translator, the process final operand routine handled the processing of the final operand of operators, internal functions, tokens with parentheses (arrays or functions) and internal codes (for example assign type, print type, and input assign type).  Internal functions and tokens with parentheses are now handled elsewhere by their respective get routines.  So for the new translator, this routine is only called for operators and internal codes.

The code for handling tokens with parentheses, which included the attaching of the operands from the done stack, was removed.  It turned out that it was no longer necessary to check for a reference token, which was used to determine if the item to be added to the RPN output list should also be pushed to the done stack.  Since no internal codes need to be pushed to the done stack, the routine now only pushes operator tokens to the done stack.  The PRINT translate routine was modified to only drop the done stack top item for the print only functions (TAB and SPC).

The process final operand routine was simplified after removing the unused code.  The process done stack top routine (formerly the find code routine, see post from August 17) is still called, which returns the first and last operands of the item that was on the done stack top (the item is popped before returning).  Afterward, the first operand is deleted if it is an open parenthesis.  For an operator token, the first operand is set to the operator token for a unary operator or the first operand of a binary operator, and the last operand is set from the last operand of the item that was on the done stack top.  For an internal code, the last operand from the done stack top item is deleted if it is a closing parenthesis.

[commit ca3b513ac9]

Old Translator – Removal (Unused Definitions)

There are two sub-code definitions that were only used by the old translator routines.  The semicolon sub-code was used for unnecessary semicolons that were entered.  Since this is no longer permitted, this sub-code was removed.  The end sub-code was used to mark the last input parse code in an INPUT statement.  Since the new INPUT translation uses the input begin code to mark the end of the input parse codes, this sub-code is not needed and was removed.

The values of the sub-codes were modified to close the bit gaps left by the semicolon and end sub-codes.  Also, several sub-codes that are only used by the translator (used, last, and unused) were given higher bit code values.  It will be convenient and desirable if the same sub-code definitions can be used by the translator and program code.  The program sub-code bit values must fit in the limited space of a 16-bit instruction word that will be shared with the code value.  It may also be possible to share bit values for sub-codes that will never be used with the same code.  For example, the question and keep sub-codes will never be used on the same code (input begin string vs. input and input-prompt) so the same bit values could be used.
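As a rough illustration of the idea, the sketch below packs a code value and sub-code bits into one 16-bit word.  The split of 10 bits for the code and 6 bits for sub-codes, the constant names and the helper functions are all assumptions; the actual bit layout has not been decided here.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: the low 10 bits hold the code value and the
// upper 6 bits hold the sub-code flags.
constexpr int CodeBits = 10;
constexpr std::uint16_t CodeMask = (1u << CodeBits) - 1;

// Hypothetical sub-code values.  Question and Keep share one bit because
// they are never used with the same code.
constexpr std::uint16_t QuestionSubCode = 1u << 10;  // input begin string only
constexpr std::uint16_t KeepSubCode     = 1u << 10;  // input and input-prompt only
constexpr std::uint16_t ParenSubCode    = 1u << 11;  // another program sub-code

// Pack a code value and its sub-code bits into one instruction word.
inline std::uint16_t makeWord(std::uint16_t code, std::uint16_t subCodes)
{
    assert(code <= CodeMask);            // code must fit in the low bits
    assert((subCodes & CodeMask) == 0);  // sub-codes must not overlap the code
    return static_cast<std::uint16_t>(code | subCodes);
}

// Extract the code value from an instruction word.
inline std::uint16_t codeOf(std::uint16_t word)
{
    return static_cast<std::uint16_t>(word & CodeMask);
}

// Extract the sub-code bits from an instruction word.
inline std::uint16_t subCodesOf(std::uint16_t word)
{
    return static_cast<std::uint16_t>(word & ~CodeMask);
}
```

With a layout like this, whether a shared bit means question or keep has to be decided from the code value extracted from the same word.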

There was also the end-expression flag on table entries that could signal the end of an expression (closing parenthesis, comma, semicolon, rem-operator, and end-of-line).  This flag was needed for the token-centric old translator, but is not used by the new translator, so it was removed.

[commit d356621d95]

Old Translator – Removal (Sub-String Data Type)

The sub-string data type was used by the old translator to identify the sub-string functions (LEFT$, MID$, and RIGHT$) that can be used to assign part of a string variable.  The idea was more appropriate with the original String class that would make handling during run-time easy.  The String class has since been replaced with the Qt QString class, which has different requirements during run-time and this has been accounted for with the new sub-string assignment translation scheme (see posts on new design and with multiple assignments).

The sub-string data type is not needed in the new translator routines and has been removed.  The return data type of the LEFT$, MID$ and RIGHT$ functions is now just a String like all of the other string functions.  For assignments with these functions, the new sub-string flag is used.  The AssignSubStr code was removed because it was replaced with the AssignLeft, AssignMid2, AssignMid3, and AssignRight codes, and the AssignListMix code was removed because the various AssignKeep codes replace its functionality (see posts referenced above).  See the commit log for other changes made to remove the sub-string data type.

[commit 71e97bffa2]

Old Translator – Removal (Step 1)

The old translator routines will be removed in steps, the first step being the largest.  All of the functions related to the old translator were removed along with the token handler and command handler functions.  Also removed were the translator variables only used by the old translator routines including the state, mode, count stack (used for arrays and functions) and command stack (used for commands).  These were only needed for the token-centric old translator.

The program model was changed temporarily to call the new main translate routine.  However, this will be changed back once the new translator routines are renamed (removing the "2" in their names that was added to avoid a conflict with the old translator routines).

The temporary test command line options to activate the new translator were also removed.  The original test command line options now use the new translator.  The old translator expected results files that were previously saved were removed.  The temporary test scripts for running the new translator were removed and the commands added to look for old expected results files were removed from the original test scripts.  All tests pass with the original test scripts using the new translator routines (Windows testing was not performed yet).

[commit f150fbeb3f]