Saturday, September 21, 2013

Translator – Assigning Variable Codes

In the get operand routine, the cases that fell through to the next case were replaced with the actual code needed for each case (in some cases, checks were performed that had already been done or were not necessary).  For now, the identifier with no parentheses token type is assumed to be a variable.  Eventually, the function dictionary will be checked to see if the identifier is a function before assuming it is a variable.  The token is assigned a variable code, or a variable reference code if a reference was requested.
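
As a rough illustration of the assignment described above, here is a minimal C++ sketch.  The names (Token, Code, getOperandCode) are hypothetical stand-ins and not the translator's actual identifiers, and only the no parentheses case is shown.

```cpp
#include <cassert>
#include <string>

// All names here are hypothetical stand-ins for the translator's own types.
enum class TokenType { NoParen, Paren, Constant };
enum class Code { None, Var, VarRef };

struct Token {
    TokenType type;
    std::string text;
    Code code;
};

// Each token type gets its own case with no fall-through between cases.
// 'reference' is true when the caller requested a reference.
bool getOperandCode(Token &token, bool reference)
{
    // Assumption of this sketch: operands arrive without a code assigned.
    assert(token.code == Code::None);

    switch (token.type) {
    case TokenType::NoParen:
        // The function dictionary would be consulted here eventually;
        // until then every such identifier is assumed to be a variable.
        token.code = reference ? Code::VarRef : Code::Var;
        return true;
    default:
        return false;  // other token types are outside this sketch
    }
}
```

Since the choice between the two codes is made from the reference argument alone, nothing later has to look at a reference flag on the token.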

Previously, when a reference was requested, the reference flag of the token was set.  This is no longer necessary since nothing downstream needs to check the reference flag (except the debug output code).  The reference flag was added to the variable reference code table entries so that the debug output code knows to output the reference indicator for this token type.

To validate that identifiers with no parentheses are being assigned a code by the end of translation, the token text routine was modified to output a question mark in front of these tokens if no code has been assigned.  When the with index flag argument is set (during encoder test output), these tokens output the code and operand (the string of the token) instead of just the string since these tokens will be encoded as two program words (the variable instruction and its index).  Also, variable tokens are no longer assigned a code in the output assign codes routine called at the end of translation.
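
The three output forms described above could look something like the following sketch.  The function names and the exact output format (code name, a space, then the operand) are assumptions made for illustration and are not taken from the actual token text routine.

```cpp
#include <cassert>
#include <string>

// Hypothetical names; the real routine and its output format may differ.
enum class Code { None, Var, VarRef };

struct Token {
    std::string text;
    Code code;
};

// Name of an assigned code (must not be called for Code::None).
const char *codeName(Code code)
{
    assert(code != Code::None);
    return code == Code::Var ? "Var" : "VarRef";
}

// Text for an identifier with no parentheses.
std::string tokenText(const Token &token, bool withIndex)
{
    if (token.code == Code::None) {
        return "?" + token.text;  // flags a token the translator missed
    }
    if (withIndex) {
        // Encoder test output: the instruction followed by its operand.
        return std::string(codeName(token.code)) + " " + token.text;
    }
    return token.text;  // translator test output: just the string
}
```

With this shape, a stray question mark in the expected test results immediately points at an identifier that reached the end of translation without a code.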

To make the variable output work correctly for both translator and encoder test output, the token type in the table entries for the six variable codes was changed from the internal function with no parentheses type to the no parentheses type.  This will make these tokens go through the no parentheses token type case in the token text routine.  This is inconsequential since the token type disappears once the tokens are encoded.

[commit db6795346f] [commit a1f413b2e1]

Translator – Assigning Constant Codes

In the get operand routine, if the token is a constant and the data type requested is a specific data type (double, integer or string), the token can be assigned one of the constant codes.  For the any, none or number data types, it is not yet known what type the constant needs to be, so the code assignment will be delayed until it is known.

This will generally take place when the find code routine is called, as the type of constant needed will be determined by the operator being processed.  A constant token will be assigned a code before returning unless an error is detected (wrong data type or a double constant that cannot be converted to an integer).
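
The assign-now-or-defer decision can be sketched as below.  All names are hypothetical, and the test for whether a double can be converted to an integer (a simple range check against int) is an assumption of this sketch and not necessarily the rule the translator uses.

```cpp
#include <cassert>
#include <limits>

// Hypothetical names; the real data types and codes may be spelled differently.
enum class DataType { Double, Integer, String, None, Number, Any };
enum class Code { None, Const, ConstInt, ConstStr };
enum class Status { Assigned, Deferred, Error };

struct Token {
    DataType dataType;  // type the constant was parsed as
    double value;       // numeric value (unused for strings)
    Code code;
};

bool isSpecific(DataType type)
{
    return type == DataType::Double || type == DataType::Integer
        || type == DataType::String;
}

// Assign the constant code for a specific data type (find code stage).
Status setConstantCode(Token &token, DataType needed)
{
    assert(isSpecific(needed));
    bool tokenIsString = token.dataType == DataType::String;
    bool neededIsString = needed == DataType::String;
    if (tokenIsString != neededIsString) {
        return Status::Error;  // wrong data type
    }
    // Assumption: a double is convertible only if it lies in the int range.
    if (needed == DataType::Integer && token.dataType == DataType::Double
        && (token.value < std::numeric_limits<int>::min()
            || token.value > std::numeric_limits<int>::max())) {
        return Status::Error;  // double cannot be converted to an integer
    }
    token.dataType = needed;
    token.code = needed == DataType::Double ? Code::Const
        : needed == DataType::Integer ? Code::ConstInt : Code::ConstStr;
    return Status::Assigned;
}

// Get operand stage: assign now only if the requested type is specific.
Status getOperandConstant(Token &token, DataType requested)
{
    if (!isSpecific(requested)) {
        return Status::Deferred;  // any, none or number: decided later
    }
    return setConstantCode(token, requested);
}
```

In this sketch a deferred token keeps Code::None until the operator being processed supplies the needed type, at which point setConstantCode is called directly.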

For most functions, constant arguments will be assigned a code by the get operand routine since the arguments are for a specific data type.  A few functions have multiple forms for double and integer arguments (ABS, SGN and STR$).  For these functions, a constant argument is assigned a code since no code has been assigned yet.  Integer constants containing a decimal or an exponent (the double sub-code is set) are first changed to the double data type.
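
For the multiple form functions, the handling of a constant argument might be sketched as follows.  The names are hypothetical, and representing the double sub-code as a plain bool is a simplification made for this sketch.

```cpp
#include <cassert>

// Hypothetical names; only numeric constants matter for this case.
enum class DataType { Double, Integer };
enum class Code { None, Const, ConstInt };

struct Token {
    DataType dataType;
    bool doubleSubCode;  // set when the constant had a decimal or exponent
    Code code;
};

// Constant argument of a function with both double and integer forms.
// Returns the data type that would select the form of the function.
DataType assignArgumentCode(Token &token)
{
    // Get operand could not pick a code since no specific type was requested.
    assert(token.code == Code::None);

    if (token.dataType == DataType::Integer && token.doubleSubCode) {
        // Written like a double, so treat it as one.
        token.dataType = DataType::Double;
    }
    token.code = token.dataType == DataType::Double
        ? Code::Const : Code::ConstInt;
    return token.dataType;
}
```

The effect is that a constant written with a decimal or exponent selects the double form of the function even though its value would fit in an integer.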

Previously, at the end of the get expression routine, the data type of a constant was forced to the expected data type passed.  This would be another place where a constant token needs to be assigned a code.  However, it turned out that this code was never executed because, by the end of this routine, the type of the constant had already been determined and a code assigned, so this code was removed.

To validate that constants are being assigned a code by the end of translation, the token text routine was modified to output a question mark in front of a constant token if no code has been assigned.  When the with index flag argument is set (during encoder test output), constant tokens output the code and operand (the string of the constant) instead of just the constant since these tokens will be encoded as two program words (the constant instruction and its index).  Also, constant tokens are no longer assigned a code in the output assign codes routine called at the end of translation.

To make the constant output work correctly for both translator and encoder test output, the token type in the table entries for the three constant codes was changed from the internal function with no parentheses type to the constant type.  This will make constant tokens go through the constant case in the token text routine.  This is inconsequential since the token type disappears once the tokens are encoded.

[commit fa0916f67b]

Table – Finding and Setting Token Codes

At the moment, the token codes are assigned to the constant and identifiers with no parentheses token types after the translation is complete.  This was initially the first step of encoding, but was moved into the translator.  Before moving this code assignment earlier in the translation, a simplified routine was needed that, given a token, a data type and a base code, finds a code for that data type (which will be either the base code or one of its associated codes).

The table find code routine was being called to perform this action, but this routine does a lot more, including converting a constant to the expected data type (so no hidden conversion code needs to be added), finding a conversion code if there was no associated code with the desired data type, and returning the expected data type when the data type cannot be converted (an error).

The find code routine contained the part for finding an associated code, so this part was moved into the new set token code routine.  This new routine just looks up an associated code if the specified data type does not match the data type the base code expects for a given operand.  If the base code or an associated code matches, then the token is set to the code and its type and data type are set to those of the code.  Otherwise it returns false.

A secondary simplified set token code routine was added for setting the code of a token for a base code for the data type already in the token.  The base codes that are used for this routine have only one operand, which currently only includes the Const, Var and VarRef codes.  This secondary routine just calls the main set token code with the data type of the token and for the first (and only) operand of the base code.
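
The two set token code routines described above can be sketched together.  The table layout, the associated code names (ConstInt, ConstStr, VarInt, VarStr) and the function signatures are all assumptions made for illustration, and the real table is certainly organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical names and table layout throughout.
enum class DataType { Double, Integer, String };
enum class TokenType { NoParen, Constant };
enum class Code { None, Const, ConstInt, ConstStr, Var, VarInt, VarStr };

struct TableEntry {
    TokenType type;
    DataType dataType;                   // data type of the code
    std::vector<DataType> operandTypes;  // expected data type per operand
    std::vector<Code> associated;        // alternatives (base codes only)
};

struct Token {
    TokenType type;
    DataType dataType;
    Code code;
};

const std::map<Code, TableEntry> &table()
{
    static const std::map<Code, TableEntry> entries = {
        {Code::Const, {TokenType::Constant, DataType::Double,
                       {DataType::Double}, {Code::ConstInt, Code::ConstStr}}},
        {Code::ConstInt, {TokenType::Constant, DataType::Integer,
                          {DataType::Integer}, {}}},
        {Code::ConstStr, {TokenType::Constant, DataType::String,
                          {DataType::String}, {}}},
        {Code::Var, {TokenType::NoParen, DataType::Double,
                     {DataType::Double}, {Code::VarInt, Code::VarStr}}},
        {Code::VarInt, {TokenType::NoParen, DataType::Integer,
                        {DataType::Integer}, {}}},
        {Code::VarStr, {TokenType::NoParen, DataType::String,
                        {DataType::String}, {}}},
    };
    return entries;
}

// Main routine: try the base code, then each of its associated codes.
bool setTokenCode(Token &token, Code baseCode, DataType dataType,
                  std::size_t operandIndex)
{
    const TableEntry &base = table().at(baseCode);
    assert(operandIndex < base.operandTypes.size());

    std::vector<Code> candidates{baseCode};
    candidates.insert(candidates.end(), base.associated.begin(),
                      base.associated.end());
    for (Code code : candidates) {
        const TableEntry &entry = table().at(code);
        if (entry.operandTypes.at(operandIndex) == dataType) {
            token.code = code;
            token.type = entry.type;
            token.dataType = entry.dataType;
            return true;
        }
    }
    return false;  // no code for this data type; the token is left unchanged
}

// Secondary routine: use the token's own data type and the only operand.
bool setTokenCode(Token &token, Code baseCode)
{
    return setTokenCode(token, baseCode, token.dataType, 0);
}
```

In this sketch an integer variable token passed with the Var base code ends up with the VarInt code, and no conversion of any kind is attempted, which is the work that stays in the find code routine.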

[commit 32eec6f3c0]