The two small collection structures (Base Information and Type Information) were moved from the operand source file to the table header file so that they can also be used by the PRINT and INPUT table entry classes. Unlike with the operand classes, the intermediate Print Item and Input Assign classes are still needed because they contain the common recreate virtual table function for the entry classes. The intermediate Input Parse class could be removed since these entry classes don't have a recreate function (the blank function in the Internal class is inherited).
As with the operand entry classes, Base Information literal constants were added for the print item, input assign and input parse entry classes, along with the necessary Type Information literal constants. The input parse table entries previously had a pointer to the Null expression info instance, but now point to the same instances as the input assign entries. The expression info is not used for these table entries, but the Type Information is used for the data type debug name. This was done to reduce the number of literal constant definitions needed.
A new table constructor was added that takes the Base Information and Type Information structures, which the Operand and Internal class constructors now call. This also pushed the call to the append alternate function up to this Table constructor, which eliminated the calls to this function in most of the derived classes. The one remaining call is in the Input Assign Double entry class, which is needed because this entry must also be assigned as an alternate to a second primary entry (Input and Input Prompt).
[branch table commit a52bd657b2]
Saturday, February 28, 2015
Thursday, February 26, 2015
Table Class Hierarchy – Refactoring
The class hierarchies for the operand classes and for the PRINT and INPUT commands were implemented somewhat differently. For the operand classes, the integer and string data type entry classes were based on the double data type entry class, whereas with PRINT and INPUT, intermediate classes were implemented with the three data type classes based on them. For their debug names, the operand instances specified theirs, whereas the PRINT and INPUT classes specified theirs in the constructors. There was also still an undesirable amount of duplication.
The main and data type parts of the debug names for the operand, PRINT and INPUT entries were separated (more details below). To reduce some of the duplication in the operand table entry classes, two small structures were added to hold non-common entry data:
Base Information - contains the code, name and table flags for the entry
Type Information - contains the second name (used for the debug data type name) and a pointer to an expression information instance
There is a Base Information structure constant for each main category of operand (constant, variable and variable reference) and a Type Information structure constant for each data type (double, integer and string). The appropriate constants are passed from the operand entry constructor to the Operand class constructor, where the values of these structures along with the No Name table flag (see below) and precedence value are passed to the base table constructor. These structures will probably be renamed as similar structures will be used for the operator and function entries. The updated, simplified class hierarchy is shown in this diagram (compared to initial diagram here):
Table
└── Operand
    ├── Rem [remCommand, remOperator]
    ├── ConstDbl [constDbl]
    ├── ConstInt [constInt]
    ├── ConstStr [constStr]
    ├── VarDbl [varDbl]
    ├── VarInt [varInt]
    ├── VarStr [varStr]
    ├── VarRefDbl [varRefDbl]
    ├── VarRefInt [varRefInt]
    └── VarRefStr [varRefStr]
As mentioned above, the debug names of these codes were separated into the first and second names. The data type debug names were changed from Int and Str to % and $ to be consistent with those of the operators and functions (the debug name for Dbl is no character). For example, VarInt is now Var%. A similar change was made to Print Item, Input Assign, and Input Parse (for example, PrintDbl and PrintStr are now PrintItem and PrintItem$). The expected test results were updated accordingly.
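As a rough illustration of the two structures, the following C++ sketch shows how one shared Base Information constant and several shared Type Information constants might combine into debug names. The struct layouts, the member names, the `debugName` helper and the placeholder `Code`, `TableFlag` and `ExprInfo` types are all assumptions made for this example, not the project's actual definitions.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the project's real types.
enum class Code { Const, Var, VarRef };
enum TableFlag : unsigned { NoFlags = 0, NoName = 1u << 0 };
struct ExprInfo {};  // placeholder for the expression information class

// Base Information: code, (first) name and table flags for an entry.
struct BaseInfo {
    Code code;
    const char *name;
    unsigned flags;
};

// Type Information: second name (debug data type name) and expression info.
struct TypeInfo {
    const char *name2;
    const ExprInfo *exprInfo;
};

// One constant per operand category and one per data type (assumed names).
static const ExprInfo dblExprInfo{}, intExprInfo{}, strExprInfo{};
static const BaseInfo varBase{Code::Var, "Var", NoFlags};
static const TypeInfo dblType{"", &dblExprInfo};
static const TypeInfo intType{"%", &intExprInfo};
static const TypeInfo strType{"$", &strExprInfo};

// The debug name is the main name followed by the data type name.
inline std::string debugName(const BaseInfo &base, const TypeInfo &type)
{
    return std::string(base.name) + type.name2;
}
```

With these constants, `debugName(varBase, intType)` yields `Var%`, while the double variant has no suffix, matching the naming described above.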
The Hidden table flag was renamed to the No Name flag. While the Hidden flag was set in the Convert Integer and Convert Double entries, it was not checked anywhere. The No Name flag is used during entry initialization to prevent entries with this flag from being added to the Name To Entry lookup map, since the name field of the entries with operands now contains their main debug name. Since the name field is no longer blank for these entries, the parser was changed from checking for a blank name to using the is code with operand access function.
[branch table commit 1598427319]
Saturday, February 21, 2015
Table Class Hierarchy – INPUT Entries
The INPUT related table entries were put into the new table class hierarchy. The diagram below shows the table sub-class hierarchy for the INPUT table entries:
Table
├── Command
│   └── InputCommand
│       ├── Input [input]
│       └── InputPrompt [inputPrompt]
└── Internal
    ├── InputBegin [inputBegin]
    ├── InputBeginStr [inputBeginStr]
    ├── InputAssign
    │   ├── InputAssignDbl [inputAssignDbl]
    │   ├── InputAssignInt [inputAssignInt]
    │   └── InputAssignStr [inputAssignStr]
    └── InputParse
        ├── InputParseDbl [inputParseDbl]
        ├── InputParseInt [inputParseInt]
        └── InputParseStr [inputParseStr]
The intermediate InputCommand, InputAssign and InputParse classes were added to contain the common initializations and functions for their associated entry classes. The constructor of the InputAssignDbl class required two alternate info arguments since its entry instance is assigned as an alternate to both the Input and InputPrompt entries. The contents of the INPUT translate and various INPUT related recreate functions were moved to the virtual functions of these classes (simply renamed).
With these INPUT related table entries created in the new table model, their corresponding entries in the old table entries array were removed along with their code index enumerators. The alternate initializers for these codes in the temporary alternate info initializer were also removed.
[branch table commit c575b478ac]
Friday, February 20, 2015
Table Class Hierarchy – PRINT Entries
The PRINT related table entries were put into the new table class hierarchy. The diagram below shows the table sub-class hierarchy for the PRINT table entries:
Table
├── Command
│   └── Print [print]
├── Internal
│   └── PrintItem
│       ├── PrintDbl [printDbl]
│       ├── PrintInt [printInt]
│       └── PrintStr [printStr]
└── SpecialOperator
    ├── Semicolon [semicolon]
    └── Comma [comma]
The Command class was already implemented for the table entries of commands that have not been implemented yet. The Internal class is for codes that are added to the program to support commands, but are not associated with tokens from the input. The SpecialOperator class is for special operator codes that do not contain arguments. The intermediate PrintItem class contains those items shared by each of the print data type entries, like their recreate function.
The contents of the PRINT translate and various print related recreate functions were moved to the virtual functions of these classes (essentially these functions were renamed). The current print item recreate function was left in place with the PrintItem recreate calling it. This function is still called by the print function recreate function, which hasn't been converted to the new table model yet. This is temporary until function entries are put into the new table model.
The print translate and various recreate functions are large and unwieldy and in need of refactoring. This will be done at a later time when the translator and recreator classes are refactored.
With these PRINT related table entries created in the new table model, their corresponding entries in the old table entries array were removed along with their code index enumerators. The alternate initializers for these codes in the temporary alternate info initializer were also removed.
[branch table commit e495f246cb]
Wednesday, February 18, 2015
Table Class Hierarchy – Operand Entries
The last several commits involved the virtual encode, operand text, and remove table functions. These virtual functions will only be used by codes that contain operands (a second program word with an index of a dictionary entry). The table class hierarchy was implemented for these table entries. The resulting implementation for the operand table sub-class hierarchy is shown in this diagram:
Table
└── Operand
    ├── Rem [remCommand, remOperator]
    ├── Const [constDbl]
    │   ├── ConstInt [constInt]
    │   └── ConstStr [constStr]
    └── Var [var]
        ├── VarInt [varInt]
        ├── VarStr [varStr]
        └── VarRef [varRef]
            ├── VarRefInt [varRefInt]
            └── VarRefStr [varRefStr]
The names in bold are the derived table entry classes used to instantiate the table entries (names shown in brackets). The italic-bold names are base classes and are not meant to be used to create instances. Eventually these will contain pure virtual functions preventing instantiation. Note that some of the derived table entry classes are also used as base classes for related derived table entry classes (Const, Var, and VarRef).
A separate derived table class is needed for each table entry because each will implement a unique run function for execution of that code at run-time. Unique classes are not necessary for the REM command and REM (') operator because their run functions will do the same thing at execution (which is nothing).
Since the virtual encode, operand text, and remove table functions are the same for all of these classes, they are implemented in the Operand sub-class and will be inherited by derived entry classes. The operand recreate function is also common to all of these classes and its contents were moved to the virtual recreate function of the Operand class.
For the primary derived table entry classes (Rem, Const, Var, and VarRef), only the constructor is different, containing the arguments necessary for the associated table entry members. The table entry instantiations then contain the initialization values unique to the entry. The arguments of each constructor were pushed up the hierarchy for those values that are common.
The constant sub-classes each share the same constructor since each needs the same entry member values. Normally a derived class does not inherit the constructors from its base class, but C++11 allows constructors to be inherited with the using syntax (for example, the ConstInt class contains a using Const::Const; statement to inherit the constructor from the Const class). This feature should only be used when the derived class contains no additional data members that would not get initialized (which is the case here).
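The inheriting-constructor syntax can be seen in this minimal sketch. The `Const` class here is a simplified stand-in with a single hypothetical name argument; the real entry constructors take other arguments.

```cpp
#include <cassert>
#include <string>

// Simplified, hypothetical stand-in for the Const entry class.
class Const {
public:
    explicit Const(const std::string &name) : m_name{name} {}
    virtual ~Const() = default;

    const std::string &name() const { return m_name; }

private:
    std::string m_name;
};

// ConstInt adds no data members, so inheriting the constructor leaves
// nothing uninitialized.
class ConstInt : public Const {
public:
    using Const::Const;  // inherit Const's constructors (C++11)
};
```

A `ConstInt` can then be constructed with exactly the arguments that `Const` accepts, without repeating a forwarding constructor in each derived class.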
The REM recreate function contents were moved to the Rem class virtual recreate function. The constant string recreate function contents were moved to the ConstStr class virtual recreate function. This function was refactored a bit by separating out a function that replaces each double quote with two (mainly to simplify the recreate function and remove the necessity of a comment that might become stale).
The variable classes were similarly implemented and only the Var and VarRef classes contain constructors with the others inheriting theirs. TODO comments were added to where the virtual run functions will be put when implemented for all of the derived table entry sub-classes.
Some of these table entries require alternate entries, so a mechanism was implemented to assign these alternates as they can't be automatic like with the operator table entries. The base Operand constructor contains a reference to an Alternate Item structure instance. This new simple structure contains a pointer to a primary table entry and an operand index that the entry should be assigned to.
A new table class append alternate function was implemented taking a reference to an Alternate Item structure and is called from the Operand class constructor. This append alternate function appends the entry pointer (this) to the alternate entry vector of the primary entry for the operand index if the primary entry pointer is not a null pointer (the default if the entry is not an alternate). This functionality will probably be moved up into the base Table class since it will be needed for other table entry classes.
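The alternate mechanism described in the last two paragraphs might look roughly like the sketch below. The member names, the nested vector used to hold alternates per operand index, and the `alternates` accessor are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Table;

// Names the primary entry and the operand index an entry is an alternate for.
struct AlternateItem {
    Table *primary;            // null pointer: the entry is not an alternate
    std::size_t operandIndex;
};

class Table {
public:
    virtual ~Table() = default;

    // Append this entry to the primary entry's alternates for the operand
    // index; does nothing if there is no primary entry.
    void appendAlternate(const AlternateItem &item)
    {
        if (item.primary == nullptr) {
            return;
        }
        auto &alternates = item.primary->m_alternates;
        if (alternates.size() <= item.operandIndex) {
            alternates.resize(item.operandIndex + 1);
        }
        alternates[item.operandIndex].push_back(this);
    }

    // Hypothetical accessor returning the alternates for an operand index.
    const std::vector<Table *> &alternates(std::size_t operandIndex) const
    {
        static const std::vector<Table *> none;
        return operandIndex < m_alternates.size()
            ? m_alternates[operandIndex] : none;
    }

private:
    std::vector<std::vector<Table *>> m_alternates;
};

class Operand : public Table {
public:
    explicit Operand(const AlternateItem &item = AlternateItem{nullptr, 0})
    {
        appendAlternate(item);
    }
};
```

An entry constructed without an Alternate Item is treated as a primary entry, while one constructed with `AlternateItem{&primaryEntry, 0}` registers itself with that primary entry during construction.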
These classes and their instances are defined in the operand.cpp source (previously named basic.cpp) since it contains only sub-classes and entries related to operand table entries. The sub-classes and instances do not need to be known outside of this source file (no header file is needed). This will improve compile time since other source files only need to include the main table header file. The pointers to the instances are added to the static table members via the base Table class constructor that gets called by each entry sub-class constructor.
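The self-registration arrangement can be sketched as follows. Everything here is hypothetical: the `entries` function, the name argument, and the use of a function-local static (chosen in this sketch to avoid depending on static initialization order across source files) are not taken from the project.

```cpp
#include <cassert>
#include <string>
#include <vector>

// --- table.h: the only header other source files would need ---
class Table {
public:
    explicit Table(const std::string &name) : m_name{name}
    {
        entries().push_back(this);  // each instance registers itself
    }
    virtual ~Table() = default;

    const std::string &name() const { return m_name; }

    // Function-local static: constructed on first use.
    static std::vector<Table *> &entries()
    {
        static std::vector<Table *> list;
        return list;
    }

private:
    std::string m_name;
};

// --- operand.cpp: classes and instances are private to this file ---
namespace {

class VarDbl : public Table {
public:
    using Table::Table;
};

class VarInt : public Table {
public:
    using Table::Table;
};

VarDbl varDbl{"Var"};
VarInt varInt{"Var%"};

}  // namespace
```

Because the sub-classes and instances live in an unnamed namespace, nothing outside the source file can refer to them by name; other code reaches the entries only through the base class.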
With these operand table entries now created in the new table model, their corresponding entries in the old table entries array were removed along with their code index enumerators. The alternate initializers for these in the temporary alternate info initializer were also removed.
[branch table commit d4aec5bf13]
Monday, February 16, 2015
Program Writer Class – Call Chain Removal
To write words into the program, a standard back insert iterator (back inserter) was passed to the encode functions (one in the token class and the others in the table virtual encode functions). The virtual table encode functions also need to add entries to a dictionary in a program unit, so a pointer to the program unit was also passed. Since the token encode calls a virtual table encode function, it is also passed a program unit pointer.
In the case of the program reader class (originally implemented because both an iterator and an end iterator needed to be carried together) there was already a container that could carry the program unit pointer to the operand text and remove functions. Therefore, a Program Writer class was implemented to carry a back inserter and a program unit pointer.
The program writer class was given a couple of member functions, including a write code function for inserting a program word with the code and sub-code, and a generate and write operand function for inserting a program word with an operand after the operand is obtained by adding a reference to a dictionary entry.
The back inserter and program unit pointer arguments to the encode functions were changed to a reference to a program writer instance, and these functions were modified to use it. The generate and write operand function calls a new program model generate operand from dictionary function. To create a program writer instance, the create program writer function was added to the program model class.
To be more consistent with the name of the generate and write operand function, the program reader get string for operand and remove reference to operand functions were renamed to read operand and get string and read operand and remove reference respectively.
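A minimal sketch of such a writer class is shown below. The word type, the single-dictionary `ProgramUnit` stand-in and all function signatures are assumptions; the real class obtains the operand through the program model for a given operand type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

using ProgramWord = std::uint16_t;             // assumed word type
using ProgramCode = std::vector<ProgramWord>;

// Minimal stand-in for a program unit owning a single dictionary.
class ProgramUnit {
public:
    // Add a reference to a dictionary entry and return its index (the operand).
    ProgramWord generateOperand(const std::string &text)
    {
        for (std::size_t i = 0; i < m_entries.size(); ++i) {
            if (m_entries[i] == text) {
                return static_cast<ProgramWord>(i);
            }
        }
        m_entries.push_back(text);
        return static_cast<ProgramWord>(m_entries.size() - 1);
    }

private:
    std::vector<std::string> m_entries;
};

// Carries the back inserter and the program unit pointer together.
class ProgramWriter {
public:
    ProgramWriter(std::back_insert_iterator<ProgramCode> inserter,
                  ProgramUnit *unit)
        : m_inserter{inserter}, m_unit{unit} {}

    // Insert a program word holding the code combined with the sub-code.
    void writeCode(ProgramWord code, ProgramWord subCode = 0)
    {
        *m_inserter = static_cast<ProgramWord>(code | subCode);
    }

    // Obtain the operand from the dictionary, then insert it as a word.
    void generateAndWriteOperand(const std::string &text)
    {
        *m_inserter = m_unit->generateOperand(text);
    }

private:
    std::back_insert_iterator<ProgramCode> m_inserter;
    ProgramUnit *m_unit;
};
```

An encode function then needs only a reference to the writer instead of separate back inserter and program unit arguments.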
[branch table commit 37a64d99c7]
Sunday, February 15, 2015
Program Reader Class – Call Chain Removal
As mentioned recently, having a chain of function calls is not good object-oriented design because it requires the initial caller to have knowledge of the internal members of each of the subsequent objects in the chain, which breaks encapsulation. The table operand text virtual function contained this call chain:
    return programLineReader.unit()->dictionary(m_operandType)->string(operand);
Before removing this call chain and the equivalent one in the table remove virtual function, some code rearrangement was performed. This included renaming the Program Line Reader class to simply Program Reader since it reads more than just from a program line (it also accesses a dictionary). The Program Reader class and its functions were also moved to their own header and source files.
The get string for operand function was added with an operand type argument. This eliminated the call chain in the operand text function, which now more clearly reads:
    return programReader.getStringForOperand(m_operandType);
To prevent a call chain in this new function (though with one less function call), the get string from dictionary function was added to the program model class with operand type and operand arguments. An internal read operand function was added for clarity because calling the function operator function internally, as this->operand()(), is not very self-explanatory.
Similarly, the remove reference to operand function with an operand type argument was added for the remove table function along with a remove reference from dictionary program model function with the operand type and operand arguments.
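The shape of this change can be sketched as below, where each class only talks to its immediate neighbor. The `getStringForOperand` and `getStringFromDictionary` names follow the text, while the class layouts, the reduced `OperandType` enumeration and the `add` helper are assumptions for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using ProgramWord = std::uint16_t;  // assumed word type
enum OperandType { ConstStr, VarDbl, NumberOfOperandTypes };

class ProgramModel {
public:
    // Hypothetical helper to populate a dictionary for the example.
    ProgramWord add(OperandType type, const std::string &text)
    {
        m_dictionaries[type].push_back(text);
        return static_cast<ProgramWord>(m_dictionaries[type].size() - 1);
    }

    // The caller never sees the dictionary object itself.
    std::string getStringFromDictionary(OperandType type,
                                        ProgramWord operand) const
    {
        return m_dictionaries[type].at(operand);
    }

private:
    std::vector<std::string> m_dictionaries[NumberOfOperandTypes];
};

class ProgramReader {
public:
    ProgramReader(const ProgramModel *unit,
                  const std::vector<ProgramWord> &words)
        : m_unit{unit}, m_iterator{words.begin()}, m_end{words.end()} {}

    bool hasMoreCode() const { return m_iterator != m_end; }

    // Reads the next word as an operand and returns its dictionary string.
    std::string getStringForOperand(OperandType type)
    {
        return m_unit->getStringFromDictionary(type, readOperand());
    }

private:
    ProgramWord readOperand() { return *m_iterator++; }

    const ProgramModel *m_unit;
    std::vector<ProgramWord>::const_iterator m_iterator;
    std::vector<ProgramWord>::const_iterator m_end;
};
```

The operand text function then only needs the reader and its own operand type, and no longer depends on how the program unit stores its dictionaries.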
The unit access function, which allowed the call chain in the first place, was no longer used and was removed. This takes care of the function call chains when reading from the program. The function call chain when writing to the program will be eliminated next.
[branch table commit 0e6751f023]
Table – Operand Text & Remove Virtual Functions
Like the encode functions, the twelve operand text and remove functions did the same things with different dictionaries. These were replaced with single operand text and remove functions. Both check if the code has an operand. The operand text function returns an empty string if the code does not have an operand, and the remove function does nothing.
Similar to the initialization of the table entries for the encode function pointers, the operand text and remove function pointer arguments of the secondary table constructor (used for codes with no operands) were changed to dummy void pointers so all the table entries didn't need to be modified.
The operand text and remove function pointer arguments were removed from the primary table constructor and the associated function pointers were removed from the table entries of codes with operands. The operand text function and remove function pointer members were removed from the table class along with their access functions. And the twelve individual operand text and remove functions were removed.
The tester class function operator function was modified to use the single dictionary pointer access with operand type arguments. The individual dictionary access functions were now no longer called and were removed.
[branch table commit c5cb6ed943]
Saturday, February 14, 2015
Table – Encode Virtual Function
The six encode functions did the same thing except with a different dictionary, a pointer that was obtained from one of six dictionary access functions. With the dictionary pointers in an array, a new single dictionary access function was added to the program model with an operand type argument.
A single encode function can be implemented but it needs to obtain the correct operand type for the code being encoded. The encode function is passed a token pointer and the token contains the pointer to the table entry for the code. Therefore, an operand type member was added to the table class, but using the token table entry to get the operand type is not the best way to get the operand type (it would be more of a calling chain that is being eliminated).
Since the encode function is virtual, once the new table class hierarchy is implemented, the code with operand class will have its own encode function. Being a member of the table class, the encode function will have direct access to the operand type member, so there is no need to go through the token anyway.
The table encode virtual function previously called the encode function contained in the encode function pointer member if it was set. For now this encode function was modified to do what the previous six encode functions did. If the code has an operand, then the dictionary for the operand type is called to add the entry in the token to the dictionary and the resulting entry index is inserted into the program. The function call chain is still present (which is in the process of being eliminated). For other codes, nothing is done, which is what the default encode virtual function will do.
The encode function pointer member was no longer used and was removed along with its access function and the six old encode functions. To avoid modifying all the table entries (except those with operands), the encode function argument of the table constructor was changed to a dummy void pointer (so the encode function pointer initializers in the entries are ignored) and the operand type member is initialized to the No enumerator.
For the codes with operands, a new table constructor was added with an operand type argument and without the encode function pointer argument. This constructor was made the main constructor with the original constructor calling it. Since the operand type member can be used to determine if the code has an operand, the Operand table flag was removed and the is code with operand access function was changed to use the new member instead of the flag.
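The two constructors and the access function might be arranged as in this sketch. The reduced enumeration, the name argument and the enumerator spellings are assumptions; only the idea of a main constructor with an operand type, a delegating original constructor with a dummy void pointer, and an access function based on the member comes from the text.

```cpp
#include <cassert>

// Reduced, hypothetical operand type enumeration.
enum OperandType {
    RemOperand, ConstNumOperand, VarDblOperand, NumberOfOperands, NoOperand
};

class Table {
public:
    // Main constructor, used for codes with operands.
    Table(const char *name, OperandType operandType)
        : m_name{name}, m_operandType{operandType} {}

    // Original constructor: the former encode function pointer argument is a
    // dummy void pointer that is ignored; it delegates to the main constructor.
    Table(const char *name, void *)
        : Table{name, NoOperand} {}

    const char *name() const { return m_name; }
    OperandType operandType() const { return m_operandType; }

    // Replaces the check of the removed Operand table flag.
    bool isCodeWithOperand() const { return m_operandType != NoOperand; }

private:
    const char *m_name;
    OperandType m_operandType;
};
```

Existing entries that still pass a null function pointer select the second constructor unchanged, which is what allows them to be left unmodified.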
[branch table commit f549771028]
Friday, February 13, 2015
Program – Dictionary Pointers
The individual dictionary pointers in the program model class were replaced with a standard array of dictionary pointers. The standard array was chosen (over a map) because it is the most efficient. The standard unique pointer was used for the dictionary pointers so no extra work is needed to clean up the dictionaries when a program model instance is deleted.
To index the dictionary pointer array, an Operand Type plain enumeration was defined whose enumerators are used as indexes. A plain enumeration was chosen since its enumerators can be used as integer indexes, whereas those of a C++11 enumeration class cannot without casting. The enumerators include Rem, Constant Number, Constant String, Variable Double, Variable Integer and Variable String for each of the current dictionaries.
After the dictionary enumerators, there is a Number Of enumerator that is used to dimension the dictionary pointers array. It is also used as the limit for the loop that clears the dictionaries in the program model destructor. There is a No enumerator at the end of the list that is used for codes that do not have operands. The default enumerator (with a value of zero) would normally be used for this purpose, but the first enumerator needs to be used to index the first dictionary.
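The enumeration and array arrangement described above could be sketched like this. The enumerator spellings, the simplified `Dictionary` class and the constructor that creates the dictionaries are assumptions for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Plain (unscoped) enumeration: enumerators convert implicitly to indexes.
enum OperandType {
    Rem_OperandType,
    ConstNum_OperandType,
    ConstStr_OperandType,
    VarDbl_OperandType,
    VarInt_OperandType,
    VarStr_OperandType,
    NumberOf_OperandType,  // count of dictionaries; dimensions the array
    No_OperandType         // for codes without operands
};

// Simplified stand-in for the dictionary base class.
class Dictionary {
public:
    virtual ~Dictionary() = default;

    std::size_t add(const std::string &text)
    {
        m_entries.push_back(text);
        return m_entries.size() - 1;
    }
    std::size_t size() const { return m_entries.size(); }

private:
    std::vector<std::string> m_entries;
};

class ProgramModel {
public:
    ProgramModel()
    {
        for (auto &dictionary : m_dictionaries) {
            dictionary.reset(new Dictionary);  // std::make_unique is C++14
        }
    }

    Dictionary *dictionary(OperandType type) const
    {
        return m_dictionaries[type].get();  // no cast needed for the index
    }

private:
    // Unique pointers release the dictionaries with the program model.
    std::array<std::unique_ptr<Dictionary>, NumberOf_OperandType>
        m_dictionaries;
};
```

With a scoped `enum class`, both the array dimension and every index expression would need a `static_cast`, which is the inconvenience the plain enumeration avoids.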
The dictionary pointer array contains the base dictionary class pointer. The constant number and string dictionaries were Info Dictionary class pointers, but with the last change, the base class pointer can be used for these dictionaries since only the base class virtual functions are used at the moment. Dynamic casting will eventually be needed to get to the information of these classes.
However, there was one issue where the destructor of the constant string dictionary was no longer being called (to free the string constants). This was resolved by making the Dictionary base class destructor a virtual function, which causes the destructors of the derived classes to be called.
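The destructor issue can be reproduced in miniature. In this sketch the class names and the counter are hypothetical; the counter exists only to make the derived destructor call observable.

```cpp
#include <cassert>
#include <memory>

static int stringDictionariesDestroyed = 0;  // for demonstration only

class Dictionary {
public:
    // Virtual so that deleting through a base class pointer runs the
    // derived destructor; without it the behavior is undefined and the
    // derived destructor is typically skipped.
    virtual ~Dictionary() = default;
};

class ConstStrDictionary : public Dictionary {
public:
    ~ConstStrDictionary() override
    {
        // The real class would free its string constants here.
        ++stringDictionariesDestroyed;
    }
};
```

Holding the object in a `std::unique_ptr<Dictionary>` and releasing it then runs `~ConstStrDictionary()` before the base class destructor.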
Each of the program model dictionary access functions was changed to use the new array and operand type enumerators. These functions will shortly be replaced with a single generic dictionary access function taking an operand type argument (as part of the effort to reduce the number of functions).
[branch table commit 61d82727f8]
Thursday, February 12, 2015
Dictionary – Minor Refactoring
Before continuing with the reduction of the virtual table functions for the codes with operands, some minor refactoring and clean up was performed on the dictionary classes. This included the removal of unnecessary comments (those that stated the obvious or had become inaccurate). The inline functions defined in the class definitions were moved outside of the class, cleaning up the class definition.
The add function was reimplemented so that the key map of the dictionary is not searched twice for new entries. Previously it searched for the key (using the find function) and then, if it wasn't found, added it to the key map (using the emplace function, which did a second search). The emplace function can cover both with one call. It returns a standard pair of the iterator for the key and a flag indicating whether the key was added or not. The value of the key is not affected if the key is already present.
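A minimal sketch of the single-call approach follows. The KeyMap class, the use of std::unordered_map and the way the next index is chosen are assumptions for illustration; the actual dictionary also handles reuse of freed indexes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

class KeyMap {
public:
    // Returns the index for the key and whether the key was newly added.
    std::pair<std::uint16_t, bool> add(const std::string &key) {
        assert(m_keyMap.size() <= 0xFFFF);  // index must fit in 16 bits
        auto nextIndex = static_cast<std::uint16_t>(m_keyMap.size());
        // One call to emplace(): inserts the key if it is absent, otherwise
        // leaves the existing value untouched and returns its iterator.
        auto result = m_keyMap.emplace(key, nextIndex);
        return {result.first->second, result.second};
    }
private:
    std::unordered_map<std::string, std::uint16_t> m_keyMap;
};
```

Adding "A", then "B", then "A" again would yield indexes 0, 1 and 0, with the added flag false only for the last call.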
[branch table commit a0243036df]
Wednesday, February 11, 2015
Dictionary – Proper Base Class
The Dictionary class was not designed to be an abstract base class in a dictionary class hierarchy. An abstract base class contains virtual functions, which the Dictionary class did not have. The four functions, clear, add, remove and string were changed to virtual functions.
The Info Dictionary class is derived from the Dictionary class. This is an intermediate class used as a base class for the Constant Number Dictionary and Constant String Dictionary classes. This class implemented its own clear, add and remove. They did their action before or after calling the base Dictionary class function. The add and remove functions had different function signatures from the base class, so these needed to be modified to be the same so that they can override the base class functions.
The Dictionary class add function returned the index of the entry and contained an optional argument for returning the entry type (New, Reused or Exists). Only the Info Dictionary class add function used this argument. It is not a good practice to use an argument for output. The return value of the base add function was modified to return a standard pair of the index and entry type, and the optional output argument was removed. The Info Dictionary class add function signature was made to match and was modified to use the standard pair return value instead of the output argument.
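The revised add signature might look roughly like the sketch below. The member names and the string key argument are assumptions, and the Reused case (which depends on the free stack) is left out to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

enum class EntryType { New, Reused, Exists };

class Dictionary {
public:
    virtual ~Dictionary() = default;
    // Returns both results as a pair instead of using an output argument.
    virtual std::pair<std::uint16_t, EntryType> add(const std::string &key) {
        auto index = static_cast<std::uint16_t>(m_keyMap.size());
        auto result = m_keyMap.emplace(key, index);
        return {result.first->second,
                result.second ? EntryType::New : EntryType::Exists};
    }
private:
    std::unordered_map<std::string, std::uint16_t> m_keyMap;
};

class InfoDictionary : public Dictionary {
public:
    // Same signature as the base class, so this overrides it.
    std::pair<std::uint16_t, EntryType> add(const std::string &key) override {
        auto result = Dictionary::add(key);
        if (result.second == EntryType::New) {
            m_info.push_back(key);   // hypothetical per-entry information
        }
        return result;
    }
    std::size_t infoCount() const { return m_info.size(); }
private:
    std::vector<std::string> m_info;
};
```

The override keyword makes the compiler reject the derived function if its signature ever drifts from the base class again.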
The return value of the Dictionary base class remove function was corrected from integer to boolean since that is the type of value actually returned. The Info Dictionary class remove function signature was made to match and was modified to return a value (previously it did not have a return value), though no caller actually uses this return value.
The six table encode functions were modified to use the first value of the pair returned by the add function. This is temporary until the changes described in the last post are fully implemented. The dereference operator used with the back insert iterator was removed as it is not necessary since the assignment operator of these iterators is also defined to insert to the back of the container.
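The back insert iterator behavior can be shown with a short sketch. The function name and the ProgramWord alias are hypothetical; the point is only that assigning to a std::back_insert_iterator appends to the container, and that dereferencing it first simply returns the iterator itself.

```cpp
#include <cstdint>
#include <iterator>
#include <vector>

using ProgramWord = std::uint16_t;
using BackInserter = std::back_insert_iterator<std::vector<ProgramWord>>;

// Appends an operand index to the program code.
void encodeOperand(BackInserter backInserter, ProgramWord index) {
    backInserter = index;   // same effect as *backInserter = index
}
```

A call such as encodeOperand(std::back_inserter(code), 42) leaves code one word longer, with 42 as its last element.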
A small correction was made to the Dictionary class clear function in how the free stack is cleared. Previously an empty initializer {} was used. However, this gave an error with GCC 4.9.2 (the latest available in the tool chains repository; see the post on August 5 for how to install or upgrade to GCC 4.9, substituting "4.9" for "4.8" in the instructions). The empty initializer was changed to std::stack<uint16_t>{} to eliminate the error, which also works with GCC 4.8 (both versions of GCC will still be supported).
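As a rough sketch of the corrected clear (the FreeStack wrapper class is hypothetical; only the assignment form comes from the text above):

```cpp
#include <cstddef>
#include <cstdint>
#include <stack>

class FreeStack {
public:
    void push(std::uint16_t index) { m_freeStack.push(index); }
    std::size_t size() const { return m_freeStack.size(); }
    void clear() {
        // Assign an explicitly typed empty stack rather than a bare {}.
        m_freeStack = std::stack<std::uint16_t>{};
    }
private:
    std::stack<std::uint16_t> m_freeStack;
};
```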
[branch table commit c3f9335697]
Tuesday, February 10, 2015
Program – Dictionary Class Problem
While investigating how to clean up the chain calling mentioned in the previous post, it was discovered that the dictionary class hierarchy with respect to the base class was not implemented correctly. The following is the thought process on refactoring the call chain and how this dictionary problem was discovered. Consider the current complete REM operand text function:
const std::string remOperandText(ProgramLineReader &programLineReader)
{
    auto operand = programLineReader();
    return programLineReader.unit()->remDictionary()->string(operand);
}
The encode and remove functions are similar except they have no return value and different dictionary functions are called. The encode function has a back inserter iterator argument instead of a program reader and still has a program unit pointer argument. To remove the call chain and make the code as readable as possible, the function could be changed to something like:
const std::string remOperandText(ProgramReader &programReader)
{
    return programReader.getStringFromRemDictionary();
}
Here the name of the program line reader class would be simplified to Program Reader (the "Line" part of the name added nothing significant and didn't seem appropriate since the program reader would now be reading from a dictionary in addition to a program line). There would be similar functions for adding and removing from the dictionary for the encode and remove functions.
The problem with this scheme is the number of functions needed. Each of the other five codes has nearly identical table functions, except that different dictionaries are used, for a total of 18 functions. The Program Reader class will need 18 access functions, one for each table function. Continuing this pattern will require many more table and access functions when the other codes are implemented (array, defined functions and user functions).
The way to reduce the number of Program Reader class access functions to three (add to a dictionary, get string from a dictionary, and remove from a dictionary) is to implement an enumeration for the dictionary types (remarks, constant numbers, constant strings, double variables, integer variables, string variables, and the rest of the unimplemented codes) and pass this to the access function. The REM function would look something like:
const std::string remOperandText(ProgramReader &programReader)
{
    return programReader.getStringFrom(Rem_Dictionary);
}
For this implementation, the dictionaries in the program model will need to be placed into a container that can be indexed by enumerator. The program model will provide an access function to the dictionary by enumerator (there is currently one access function for each dictionary, for six access functions).
The indexable container will need to contain pointers to the dictionaries. This requires base class pointers to the dictionaries. This is where the problem was discovered: the dictionary class does not have virtual functions. This needs to be corrected next.
To reduce the number of table functions to three (encode, operand text, and remove), the dictionary enumerator will be placed in the table entry. The three virtual table functions will pass the dictionary enumerator from the table entry to the program reader function. The intermediate Code With Operand table entry class will define these functions and the individual code classes (REM, constants, variables, etc.) will inherit them. Now, the common operand text function will look something like:
const std::string CodeWithOperand::operandText(ProgramReader &programReader)
{
    return programReader.getStringFrom(m_dictionaryType);
}
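To show how the proposed pieces could fit together, here is a self-contained sketch of the design. Everything beyond the names getStringFrom, CodeWithOperand, m_dictionaryType and Rem_Dictionary is an assumption made for illustration (the two-entry enumeration, the simplified Dictionary class and the way the reader holds the dictionary pointers); it is not the eventual implementation.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using ProgramWord = std::uint16_t;

// Hypothetical dictionary enumeration (only two dictionaries for brevity).
enum DictionaryType { Rem_Dictionary, VarDbl_Dictionary, NumberOf_Dictionary };

// Minimal stand-in for the dictionary base class with virtual functions.
class Dictionary {
public:
    virtual ~Dictionary() = default;
    virtual ProgramWord add(const std::string &text) {
        m_strings.push_back(text);
        return static_cast<ProgramWord>(m_strings.size() - 1);
    }
    virtual std::string string(ProgramWord index) const {
        return m_strings.at(index);
    }
private:
    std::vector<std::string> m_strings;
};

// Container of base class pointers indexed by dictionary enumerator.
using Dictionaries = std::array<Dictionary *, NumberOf_Dictionary>;

class ProgramReader {
public:
    ProgramReader(const std::vector<ProgramWord> &code,
                  const Dictionaries &dictionaries)
        : m_iterator {code.begin()}, m_dictionaries(dictionaries) {}

    // Reads the operand word and looks it up in the selected dictionary.
    std::string getStringFrom(DictionaryType type) {
        assert(type < NumberOf_Dictionary);
        ProgramWord operand = *m_iterator++;
        return m_dictionaries[type]->string(operand);
    }
private:
    std::vector<ProgramWord>::const_iterator m_iterator;
    Dictionaries m_dictionaries;
};

// Intermediate table entry class; individual codes inherit operandText().
class CodeWithOperand {
public:
    explicit CodeWithOperand(DictionaryType type) : m_dictionaryType {type} {}
    virtual ~CodeWithOperand() = default;
    virtual std::string operandText(ProgramReader &programReader) const {
        return programReader.getStringFrom(m_dictionaryType);
    }
private:
    DictionaryType m_dictionaryType;
};
```

With this arrangement the table entry only knows which dictionary it uses, and the reader hides both the program word access and the dictionary lookup.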
Sunday, February 8, 2015
Program Model Access Refactoring (Part 2)
The next observation was that the program line reader argument of the operand text and remove virtual table functions was accompanied by the program unit pointer argument (needed to access the dictionaries). Therefore, the program line reader instance can carry the program unit pointer to these virtual functions.
A pointer to the program unit was added to the program line reader class. The new program model create program line reader function was changed to pass the pointer to itself to the program line reader constructor. An access function was added for this pointer. The program unit pointer argument was removed from all of the operand text and remove functions. The same change was made to the token constructor for program words, which also has a program line reader argument.
Unfortunately, the resulting operand text and remove functions have an undesirable calling chain, which existed before but was now extended by another level. This is undesirable because it breaks encapsulation, meaning that these functions have knowledge of the internals of each class in the chain. Consider the statements of the REM operand text function:
auto operand = programLineReader();
return programLineReader.unit()->remDictionary()->string(operand);
This function has knowledge that the program line reader has a unit pointer member, that the unit pointer member has a REM dictionary member, and that the dictionary has the string function. These statements are also unusual in that a program word is read from the program line reader, and then passed back to the program line reader through a chain of calls. This issue will be rectified with the next change.
[branch table commit 66a0db8247]
Saturday, February 7, 2015
Program Model Access Refactoring
Upon reviewing the most recent changes made to the encode, operand text, and remove virtual table functions, I decided that some refactoring was in order to clean up how the program model is accessed from these functions and how the program line reader is utilized.
I observed that the three locations that created a program line reader instance were identical. The program line reader constructor took three arguments: the beginning of the code vector, the offset of the line within the code vector and the size of the line. For code readability, the number of arguments in a function call should be as few as possible, and three is a little too many.
In all three constructions, the offset and size of the line were obtained from a line info structure. One way to reduce the number of arguments by one would have been to pass just a reference to a line info structure. This would have meant that the program code header file, where the program line reader class is located, would need access to the line info structure, which is an internal class of the program model.
The more header files included, the longer compilation takes, so this was not a desirable solution, plus there would still be two arguments. This would be an improvement, but there was an alternative to eliminate the other argument - the begin code vector iterator.
Since the program model contains the code vector and the line info structure definition, and includes the program code header file containing the program line reader class definition, a new create program line reader function was added to the program model class. This function takes a single argument, a reference to a line info structure, and creates a program line reader instance from this argument and the begin code vector iterator.
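A sketch of such a create function is shown below. The member names, the LineInfo fields and the appendLine helper are hypothetical and exist only to make the example self-contained; the reader class is reduced to the minimum needed here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using ProgramWord = std::uint16_t;
using ProgramCode = std::vector<ProgramWord>;

// Reduced reader: knows nothing about the line info structure.
class ProgramLineReader {
public:
    ProgramLineReader(ProgramCode::const_iterator begin, int offset, int size)
        : m_iterator {begin + offset}, m_end {m_iterator + size} {}
    ProgramWord operator()() { return *m_iterator++; }
    bool hasMoreWords() const { return m_iterator != m_end; }
private:
    ProgramCode::const_iterator m_iterator;
    ProgramCode::const_iterator m_end;
};

class ProgramModel {
public:
    // Internal structure holding where a line is within the code vector.
    struct LineInfo {
        int offset;
        int size;
    };

    // Hypothetical helper that stores a line and returns its line info.
    LineInfo appendLine(const ProgramCode &words) {
        LineInfo lineInfo {static_cast<int>(m_code.size()),
                           static_cast<int>(words.size())};
        m_code.insert(m_code.end(), words.begin(), words.end());
        return lineInfo;
    }

    // Single-argument factory: callers no longer pass begin, offset and size.
    ProgramLineReader createProgramLineReader(const LineInfo &lineInfo) const {
        assert(lineInfo.offset + lineInfo.size
               <= static_cast<int>(m_code.size()));
        return ProgramLineReader {m_code.begin(), lineInfo.offset,
                                  lineInfo.size};
    }
private:
    ProgramCode m_code;
};
```

Keeping the three-argument constructor inside the factory means the program code header never needs to see the line info structure.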
[branch table commit ba6e5ee46d]
Table – Remove Virtual Function
The remove function is called to remove a reference to a dictionary entry for the operand of a code, possibly removing the entry if there are no more references to it. This function is used by the program model dereference function called when program lines are removed or replaced. This function had to check if the code had an operand (using the is code with operand access function) before the virtual function could be called.
The program model dereference function was modified to use a program line reader instance. The remove function was modified to check that the remove function pointer is not null before calling the remove function of the table entry. By default, for codes without operands, the remove function will do nothing.
The operand argument of the remove virtual functions was changed to a reference to a program line reader instance. The remove functions were modified to use the program line reader instance to obtain the operand to pass to the desired dictionary to remove the reference to the item.
[branch table commit b980c95369]
Wednesday, February 4, 2015
Table – Operand Text Virtual Function
The operand text function is called to obtain the text for the operand of a code (for example, the name of a variable or the value of a numeric constant). It is used when the program code for a line is decoded into a list of tokens (and then back to the original text of the line). It is also used for test output. In both cases it had to check if the code has an operand (using the is code with operand access function) before calling the virtual function.
Similar to the encode functions that control the back insert iterator as needed when the code has an operand, the operand text function should control the advancing of the iterator being used to iterate over the program line if the code has an operand. The default operand text function does nothing and does not need to advance this iterator. For now the operand text function was modified to check if the operand text function pointer member is set and, if so, to call the function; otherwise a blank string is returned.
Program Line Reader Class
A regular iterator could have been used with the operand text functions, but the code would look messy (dereference and increment operators would be required). To make the code more readable and the iterator easier to use, it was encapsulated within a new small Program Line Reader function operator class.
The Program Line Reader class contains two members, the iterator itself and an end iterator used to check for the end of the line. The constructor takes three arguments: the begin iterator of the program code vector, the offset for the start of the program line, and the size of the line. These values are used to calculate the initial value of the iterator (begin plus offset) and end iterator (iterator plus size).
This class contains three access functions. The function operator function itself returns the program word pointed to by the iterator and advances the iterator. The has more words function returns whether there are more words in the program line (iterator is not equal to the end iterator). And there is a previous function for returning the previous word obtained (word before the iterator), which is used by the program model debug text function to get the raw value of the operand to output when there is an operand (the operand text function advances the iterator past this word).
The operand text functions were modified to take a reference to the program line reader instance instead of the operand itself so the codes with operands can advance the iterator as needed (and the other codes without operands do nothing).
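The class described above could be sketched as follows. The type aliases, the exact member names and the variableOperandText function (which looks up a name in a plain vector instead of a dictionary) are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using ProgramWord = std::uint16_t;
using ProgramCode = std::vector<ProgramWord>;

class ProgramLineReader {
public:
    ProgramLineReader(ProgramCode::const_iterator begin, int offset, int size)
        : m_iterator {begin + offset}, m_end {m_iterator + size} {}

    // Returns the word at the iterator and advances the iterator.
    ProgramWord operator()() {
        assert(hasMoreWords());
        return *m_iterator++;
    }
    // Returns whether the end of the line has not been reached yet.
    bool hasMoreWords() const { return m_iterator != m_end; }
    // Returns the word most recently obtained (the one before the iterator).
    ProgramWord previous() const { return *(m_iterator - 1); }
private:
    ProgramCode::const_iterator m_iterator;
    ProgramCode::const_iterator m_end;
};

// Hypothetical operand text function for a code with an operand: it reads
// the operand word itself, advancing the reader past it.
std::string variableOperandText(ProgramLineReader &programLineReader,
                                const std::vector<std::string> &names) {
    ProgramWord operand = programLineReader();
    return names.at(operand);
}
```

A caller that needs the raw operand value afterwards, such as debug output, can obtain it with previous() since the operand text function has already advanced past that word.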
Operand Text Sub-Code Argument
The operand text functions for the variable codes previously added the data type character to the string. This was only required for test output and was inhibited for normal operation when a program line was decoded into tokens (this character is added later when the tokens are converted into text using the data type of the code table entry). The sub-code was passed as an argument and when decoding a line, the sub-code was set to a special No Data Type Character sub-code. The reverse logic made the code less readable.
Eventually the debug/test code will be removed (as much as possible) from the regular application into separate unit test programs. Adding the data type character was moved from the operand text functions of the variable codes to the program model debug text function. This was the only use of the sub-code argument, so it was removed from the operand text functions. The special sub-code enumerator was also removed.
Decoding Into Tokens
The program model decode function, a user of the operand text function, was modified to use the new Program Line Reader class. It previously created a token from the table entry of the code in the program word. It then added the sub-code to the token and if the code had an operand, called the operand text function with the No Data Type Character sub-code to prevent adding the data type character and put the resulting string into the string of the token.
Since this section made heavy use of the token, a new token constructor was added to do this work. The arguments for this new constructor are a pointer to the program unit (for code with operands that need to access the dictionaries) and a reference to the program line reader instance used for iterating over the program line.
The new constructor obtains a program word from the program line reader instance. The entry member is set to the table entry for the code of the word. The sub-code member is set to the sub-code of the word. The operand text function for the entry is called passing the program unit pointer and program line reader. The string returned is put into the string member. The string is blank for codes that do not have an operand.
Encoder Tests
The encoder tests did not contain a test where the double data type # character was tested on a variable. One line in encoder test #1 was modified to contain a variable with the # character on both the left side (reference) and right side (value) of an assignment. The expected results for this test were updated accordingly.
[branch table commit ddc2efabbd]