Sunday, December 14, 2014

Table Alternate Codes – Operators/Functions (Use)

With the new alternate codes map implemented and partially filled with all the alternate codes for operators and internal functions, the translator was modified to start using this map instead of the associated codes array.

Two access functions were added to the table class: the alternate code function and the alternate code count function.  Both take code enumerator and operand index arguments.  These functions have temporary implementations.  When the new table model is fully implemented, these functions won't need the code argument as the this pointer will be used as the key to the map.  They will also return an entry pointer instead of a code enumerator.
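As a rough illustration of what such temporary access functions could look like, here is a minimal sketch.  The enumerator names, the Table layout, the non-static members, and the choice to return the first alternate (or Invalid_Code when there is none) are all assumptions made for this example and are not taken from the project source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical code enumerators, for illustration only.
enum Code { Neg_Code, Sub_Code, SubInt1_Code, SubInt2_Code, Invalid_Code };

// Hypothetical, stripped-down table entry.
struct TableEntry {
    Code code = Invalid_Code;
};

class Table {
public:
    Table() {
        // One entry per code, indexed by its enumerator value.
        for (std::size_t i = 0; i < m_entries.size(); ++i)
            m_entries[i].code = static_cast<Code>(i);
    }

    // Register 'alternate' under 'primary' for a zero-based operand index.
    void addAlternate(Code primary, std::size_t operandIndex, Code alternate) {
        assert(operandIndex < 3);
        m_alternate[entry(primary)][operandIndex].push_back(entry(alternate));
    }

    // Number of alternates for the operand (zero if the code has none).
    std::size_t alternateCodeCount(Code code, std::size_t operandIndex) const {
        const EntryVector *found = alternates(code, operandIndex);
        return found ? found->size() : 0;
    }

    // Assumption: returns the first alternate, or Invalid_Code if none.
    Code alternateCode(Code code, std::size_t operandIndex) const {
        const EntryVector *found = alternates(code, operandIndex);
        return found && !found->empty() ? found->front()->code : Invalid_Code;
    }

private:
    using EntryVector = std::vector<const TableEntry *>;

    // Temporary step: translate the code enumerator into the map key.
    const TableEntry *entry(Code code) const { return &m_entries[code]; }

    const EntryVector *alternates(Code code, std::size_t operandIndex) const {
        assert(operandIndex < 3);
        auto it = m_alternate.find(entry(code));
        return it == m_alternate.end() ? nullptr : &it->second[operandIndex];
    }

    std::array<TableEntry, Invalid_Code> m_entries;
    std::unordered_map<const TableEntry *, std::array<EntryVector, 3>> m_alternate;
};
```

Once the entry itself is the key, the entry(code) lookup step in this sketch would disappear, which is the reason the code argument becomes unnecessary.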

The binary operator check for a unary operator in the translator get expression routine was modified to use the new access functions.  In the process internal function routine, the new access function is used to get the alternate code for a function when an operand has a different data type than that of the primary function code, and when an extra argument is found for a function with multiple forms.

There was an issue with the subtract code table entries.  The current associated code arrays are still being used to process operands of operators.  The find code routine is still needed to get associated codes for codes that don't have entries in the new alternate map yet, so it could not be modified to use the new access functions.

The problem was caused by the change to make the main binary subtract code (two double operands) the second associated code of the negate code, which was made the primary code for the minus operator.  This subtract code was moved to after the subtract code with the first integer operand.  Since the subtract code with the first integer operand was now first, it was made the main binary code and the primary binary alternate to the negate code.  This code did not have the correct associated codes in the current associated code array, so hidden conversion operators were incorrectly added to the output list.

The order of the alternates in the table does not matter for alternate generation, but for the moment, the new alternate map and the associated code arrays need to agree.  This problem was corrected by moving the main subtract code to before the subtract with first integer operand.  Since the code enumeration is not automatic, the subtract enumerator also had to be moved to match the table.  This is a temporary situation.

[branch table commit 01db8002ba]

Table Alternate Codes – Operators/Functions

The alternate codes map will be implemented in a number of steps including adding alternate codes automatically for operators and internal functions, using the alternate map for operators and functions, removing the associated codes for operators and functions, manually adding alternate codes for internal codes, using the alternate map for internal codes, and adding additional alternate codes (to further reduce the need for code enumerators).

First, the definition for the alternate code static map member was added to the table class along with its instantiation in the table source file.  The standard array is new to the C++11 STL and is just as efficient as a built-in array with improvements.  Since the definition is quite lengthy, it was broken into two definitions.  There was no reason to define a constant for the 3, which indicates up to three operands or arguments, since this is the only place that it is needed:
using EntryVectorArray = std::array<std::vector<TableEntry *>, 3>;
static std::unordered_map<TableEntry *, EntryVectorArray> s_alternate;
The add function was modified to add an entry as an alternate code if appropriate.  Alternate codes are based on having the same name as another code.  If an entry name is newly added to the name to entry static map, then the routine returns immediately, in other words, the code is a primary code.  If the name is already in this map, has an expression information structure, has operands, and does not have its Reference flag set, then the entry can be added as an alternate code.

If the operand count of the entry is less than the count of the primary code, then it should be the primary code.  In this case, the pointer to the entry replaces the value in the name to entry map, the previous primary is made an alternate code of the entry by adding it to the array element for its operand count minus one, and the routine returns.  This was a better solution than reporting an error.

The routine does a series of comparisons between the operand data types of the entry and those of the primary to identify which primary code it should be added to as an alternate.  If all the operand data types of the primary code match those of the entry and the entry is an internal function with more operands, then it is made an alternate of the primary in the array element one less than the operand count.  This is a multiple argument entry (ASC, INSTR or MID$) so the Multiple flag is set on the primary code.  Otherwise, the entry has the same operands as the primary and an error is thrown.
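The decision steps described above can be sketched roughly as follows.  This is a simplified illustration under stated assumptions, not the actual routine: the TableEntry fields, the boolean stand-ins for the Reference and Multiple flags, and the rule of filing an entry under the first operand whose data type differs from the primary are all hypothetical, and the real series of comparisons is more involved.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified table entry; the real class has more fields.
struct TableEntry {
    std::string name;
    std::vector<char> operands;  // operand data types, e.g. 'D', 'I', 'S'
    bool isFunction;             // internal function rather than operator
    bool reference;              // stands in for the Reference flag
    bool multiple;               // stands in for the Multiple flag
};

using EntryVectorArray = std::array<std::vector<TableEntry *>, 3>;

struct Table {
    std::unordered_map<std::string, TableEntry *> nameToEntry;
    std::unordered_map<TableEntry *, EntryVectorArray> alternate;

    void add(TableEntry &entry) {
        assert(!entry.name.empty());
        auto result = nameToEntry.emplace(entry.name, &entry);
        if (result.second)
            return;  // name newly added: this entry is a primary code
        if (entry.operands.empty() || entry.reference)
            return;  // not eligible to be an alternate code

        TableEntry *primary = result.first->second;
        if (entry.operands.size() < primary->operands.size()) {
            // Fewer operands: the entry becomes the primary and the previous
            // primary becomes its alternate (at() throws beyond 3 operands).
            result.first->second = &entry;
            alternate[&entry].at(primary->operands.size() - 1).push_back(primary);
            return;
        }
        // Simplifying assumption: file the entry under the first operand
        // whose data type differs from that of the primary.
        for (std::size_t i = 0; i < primary->operands.size(); ++i) {
            if (entry.operands[i] != primary->operands[i]) {
                alternate[primary].at(i).push_back(&entry);
                return;
            }
        }
        if (entry.isFunction && entry.operands.size() > primary->operands.size()) {
            // Same leading operands plus extra arguments: multiple form.
            primary->multiple = true;
            alternate[primary].at(entry.operands.size() - 1).push_back(&entry);
            return;
        }
        throw std::logic_error("duplicate operands: " + entry.name);
    }
};
```

In this sketch, adding a two operand subtract, then a subtract with an integer first operand, then a one operand negate (all named "-") leaves negate as the primary with the two operand subtract in its second array element.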

Since the Multiple flag is set automatically, it no longer needs to be specified in the table entry array initialization.  Also, since this is automatic and the requirement that a multiple entry immediately follow its primary entry in the table was eliminated, the validation of multiple non-assignment entries was removed from the table constructor.

Using the debugger within QtCreator, the alternate map was verified to be set up correctly.  At this point however, this new alternate map is not being used.  This will be the subject of the next change, which will be to use the alternate map for the operators and internal functions instead of the associated code arrays.

[branch table commit b7021a78ae]

Table – Alternate Codes Map

Alternate codes (formerly known as associated codes) will be handled differently in the new table model.  In the current table model, the expression information structure of each code contains a single array of associated codes along with a count and an index to a second set of associated codes within the array.  In the new table model, these will be removed from the expression information structure.

The new table model will contain the information for a code in a single table entry instance, which will be handled by a pointer.  The table class will contain some static data members (members shared by all instances).  The alternate code information will be stored in one of these new static data members, specifically a map from a primary code table entry pointer (the key) to its alternate codes (the value).

The value of this alternate map will contain an array (a standard array will be used) of three elements.  Each element represents the alternate codes for a particular operand.  Generally, the first element (index of 0) will have alternate codes where the data type of the first operand is different from the primary code.  The second element will have alternate codes where the data type of the second operand is different from the primary code.  This was roughly the purpose of the second associated codes.

The third element of the array is applicable only for three argument internal functions, which is new.  There are currently no planned internal functions that have different data types in the third argument.  This third element will be used to associate three argument functions to their primary code with two arguments.  This applies to the MID$ and INSTR functions, which have two and three argument versions.  This similarly applies to the ASC function, but its second form has two arguments, so the second element of the array is used.

Each element of this array will contain a vector of alternate code table entry pointers.  A particular element may have an empty vector indicating no alternate codes with different data types for that operand or argument.  The first step will be to automatically generate this map from operator and internal function table entries from the operand data type information.
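To make the shape of this map concrete, here is a minimal sketch of the data structure with two small helper functions.  The TableEntry fields and the addAlternate and alternateCount helpers are hypothetical names used only for this illustration; the map is shown as an ordinary object rather than a static member.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, minimal stand-in for a table entry.
struct TableEntry {
    std::string name;
    std::vector<char> operandTypes;  // e.g. 'D' double, 'I' integer, 'S' string
};

// One vector of alternate entries per operand (up to three operands).
using EntryVectorArray = std::array<std::vector<TableEntry *>, 3>;

// Key: primary code entry pointer.  Value: its alternates per operand.
using AlternateMap = std::unordered_map<TableEntry *, EntryVectorArray>;

// Record 'alternate' under 'primary' for a zero-based operand index.
inline void addAlternate(AlternateMap &map, TableEntry *primary,
                         std::size_t operandIndex, TableEntry *alternate)
{
    assert(operandIndex < 3);
    map[primary][operandIndex].push_back(alternate);
}

// Number of alternates for an operand; zero when the vector is empty or
// the entry is not a key in the map at all.
inline std::size_t alternateCount(const AlternateMap &map, TableEntry *primary,
                                  std::size_t operandIndex)
{
    assert(operandIndex < 3);
    auto it = map.find(primary);
    return it == map.end() ? 0 : it->second[operandIndex].size();
}
```

With this layout, the three argument MID$ entry would be stored in the third element (index 2) of the two argument MID$ entry's array, while the other two elements stay empty.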