Wednesday, December 31, 2014

Constant Token Codes

The code and token type enumerations will be combined into a single code type enumeration.  Before proceeding, the parser needs to return every token with a code already assigned.  This has mostly been accomplished; the one exception is constant tokens, which were still being assigned codes in the translator.  Constants are complicated because the type of a numerical constant may not be known when the constant is parsed.  Consider these statements:
A = B + 5
A% = B% + 5
A% = B% + 5.4
For numerical constants, both the integer and double representations of the constant are stored in the constant dictionary, except where a double constant does not fit into a 32-bit signed integer.  Ideally, the required representation is used directly, without a hidden conversion code that needlessly converts the constant.  For the first statement above, the double value of the constant is used.  With the other two statements, the integer value of the constant is used.  Number constant tokens have three states:
  1. An integer (no decimal point or exponent; fits into 32 bits)
  2. A small double (has a decimal point or an exponent; fits into 32 bits when converted)
  3. A large double (does not fit into 32 bits; cannot be converted)
Previously, the token was set to the integer data type for an integer and a small double, and to the double data type for a large double.  For small doubles, the Double sub-code was set.  This required many [somewhat complicated] checks in the translator.  To simplify these checks, the data type is now set to integer for integers only and to double for all doubles.  For small doubles, a new Integer Constant sub-code is set (which does not survive past the translator and therefore does not use one of the available sub-code bits).
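The three states and the new data type rule can be sketched in C++ as follows.  This is only an illustration: the type and function names are hypothetical and are not taken from the project, and truncation is assumed for the stored integer value of a small double (the post does not say how that value is derived).

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical names for illustration only.
enum class DataType { Integer, Double };

struct NumberConstant {
    DataType dataType;      // Integer for integers only, Double for all doubles
    bool integerConstant;   // stands in for the Integer Constant sub-code
    double doubleValue;     // always available
    std::int32_t intValue;  // meaningful only when the value fits in 32 bits
};

// True when the value lies within the 32-bit signed integer range.
bool fitsInt32(double value)
{
    return value >= std::numeric_limits<std::int32_t>::min()
        && value <= std::numeric_limits<std::int32_t>::max();
}

NumberConstant classify(const std::string &text)
{
    const double value = std::strtod(text.c_str(), nullptr);
    // A decimal point or an exponent marks the constant as a double.
    const bool looksDouble = text.find_first_of(".eE") != std::string::npos;
    const bool fits = fitsInt32(value);

    NumberConstant result{};
    result.doubleValue = value;
    // Assumption: truncation; the real conversion rule may differ.
    result.intValue = fits ? static_cast<std::int32_t>(value) : 0;

    if (!looksDouble && fits) {
        // State 1: an integer.
        result.dataType = DataType::Integer;
        result.integerConstant = false;
    } else {
        // State 2 (small double) when it fits, state 3 (large double) otherwise.
        result.dataType = DataType::Double;
        result.integerConstant = fits;
    }
    return result;
}
```

With this sketch, "5" is classified as an integer, "5.4" as a double with the sub-code set, and "1e20" as a double without it.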

Instead of passing a flag indicating whether a number is allowed, the requested data type is now passed to the parser.  If the data type is integer or double, the code of a constant token is fixed (the Integer Constant sub-code is not needed and is cleared if set).  For other requested data types, either the Constant or Constant Integer code is set as described above, with the Integer Constant sub-code set for small doubles.  The parser also now sets the Constant String code for string constants.  The parser makes no attempt to report any errors for data type mismatches.
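A rough sketch of how the requested data type might drive the constant code is shown below.  All names are hypothetical, including the None value used here for "no specific type requested".  How the parser treats a large double when an integer is requested is an assumption: here the token is left as a double constant so that a later stage can report the mismatch.

```cpp
// Hypothetical names for illustration only.
enum class DataType { Integer, Double, String, None };
enum class Code { Constant, ConstantInteger, ConstantString };

struct Token {
    DataType dataType;     // data type assigned when the constant was parsed
    Code code;
    bool integerConstant;  // stands in for the Integer Constant sub-code
};

void setConstantCode(Token &token, DataType requested)
{
    if (token.dataType == DataType::String) {
        token.code = Code::ConstantString;
        return;
    }

    // An integer or a small double can take the integer representation.
    const bool canBeInteger = token.dataType == DataType::Integer
        || token.integerConstant;

    if (requested == DataType::Integer && canBeInteger) {
        // Type is known: fix the code and clear the sub-code.
        token.dataType = DataType::Integer;
        token.code = Code::ConstantInteger;
        token.integerConstant = false;
    } else if (requested == DataType::Double) {
        token.dataType = DataType::Double;
        token.code = Code::Constant;
        token.integerConstant = false;
    } else {
        // Type not yet known (or mismatched): set the code from the parsed
        // data type and keep the sub-code; no error is reported here.
        token.code = token.dataType == DataType::Integer
            ? Code::ConstantInteger : Code::Constant;
    }
}
```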

The decimal flag argument of the token constructor for double constants was removed, since the data type is set to Double and the Integer Constant sub-code is set if the value is within the integer range.  The convert code was cleaned up by making the desired data type the primary switch.  Secondary switches on the token data type are no longer needed, since only one of two data types needs to be checked for each desired data type.  A convert constant helper function was added to handle changing constant token codes.
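The shape of a convert routine that switches primarily on the desired data type might look like the sketch below.  The names, the result enumeration, and the treatment of a large double constant as a mismatch when an integer is desired are all assumptions made for illustration, not the project's actual code.

```cpp
// Hypothetical names for illustration only.
enum class DataType { Integer, Double, String };
enum class Conversion { None, ToInteger, ToDouble, Mismatch };

struct Token {
    DataType dataType;
    bool isConstant;
    bool integerConstant;  // stands in for the Integer Constant sub-code
};

// Helper: retype a numeric constant in place instead of adding a hidden
// conversion code.  Returns false when the token cannot be retyped.
bool convertConstant(Token &token, DataType desired)
{
    if (!token.isConstant) {
        return false;
    }
    if (desired == DataType::Integer && token.dataType == DataType::Double
            && !token.integerConstant) {
        return false;  // large double: no integer representation
    }
    token.dataType = desired;
    token.integerConstant = false;
    return true;
}

Conversion convertToken(Token &token, DataType desired)
{
    // The desired data type is the primary switch; each case only needs
    // to look at the one or two token data types that can be accepted.
    switch (desired) {
    case DataType::Double:
        if (token.dataType == DataType::Double) {
            token.integerConstant = false;  // final type known
            return Conversion::None;
        }
        if (token.dataType == DataType::Integer) {
            return convertConstant(token, desired)
                ? Conversion::None : Conversion::ToDouble;
        }
        return Conversion::Mismatch;
    case DataType::Integer:
        if (token.dataType == DataType::Integer) {
            return Conversion::None;
        }
        if (token.dataType == DataType::Double) {
            if (convertConstant(token, desired)) {
                return Conversion::None;
            }
            // Assumption: a large double constant is a mismatch.
            return token.isConstant
                ? Conversion::Mismatch : Conversion::ToInteger;
        }
        return Conversion::Mismatch;
    case DataType::String:
        return token.dataType == DataType::String
            ? Conversion::None : Conversion::Mismatch;
    }
    return Conversion::Mismatch;
}
```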

The table find code function was simplified due to the change in how constants are represented.  The first argument of the set token and set token code functions was changed from a standard shared token pointer reference to a plain token pointer so that they can be called from the parser (with a standard unique pointer), the translator (with a standard shared pointer) or a token member function (with just a pointer).  This simply required calling the get access function of the unique or shared pointers.
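The effect of taking a plain pointer can be shown with a small sketch.  The Token type and function names here are stand-ins, not the project's own declarations; only the get member functions of std::unique_ptr and std::shared_ptr are standard library calls.

```cpp
#include <memory>

// Stand-in token type for illustration only.
struct Token {
    int code = 0;
    void markConstant();
};

// Taking a plain pointer means the function does not care who owns the token.
void setTokenCode(Token *token, int code)
{
    token->code = code;
}

// Caller 1: a token member function, which only has 'this'.
void Token::markConstant()
{
    setTokenCode(this, 1);
}

// Caller 2: code holding a unique pointer (as the parser does).
int codeFromUnique()
{
    std::unique_ptr<Token> token{new Token};
    setTokenCode(token.get(), 2);
    return token->code;
}

// Caller 3: code holding a shared pointer (as the translator does).
int codeFromShared()
{
    std::shared_ptr<Token> token = std::make_shared<Token>();
    setTokenCode(token.get(), 3);
    return token->code;
}
```

Because the function neither stores the pointer nor takes ownership, passing the raw pointer from get is safe for the duration of the call.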

The translator get operand function no longer sets the code for constants.  The get expression function and the process internal function function no longer need to look for and set the codes for constants (though the latter needs to clear the Integer Constant sub-code for functions taking both number argument types, specifically ABS, SGN and STR$).  The get token function now only needs to pass the data type to the parser.  The token convert and table find code functions are used by the translator and will finalize constants not set by the parser once the final data type is known.

[branch table commit 3099f8850f]