Sunday, September 22, 2013

Condensing Sub-Code Bits

The program will consist of instruction words and operand words.  Each instruction word will hold the instruction code and its sub-codes.  The sub-codes will generally only be used to recreate the original program text and will not be used during execution (though there will be a few exceptions).  The code will reside in the lower 10 bits of the 16-bit instruction word, which allows for 1,024 different codes (which should be sufficient).  This leaves the upper 6 bits for the sub-codes.
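As a rough illustration of this layout, the packing and unpacking of an instruction word could be sketched as below.  Only the 10-bit/6-bit split comes from the description above; the constant and function names are made up for this example and are not taken from the project source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; only the 10/6 bit split is from the text.
constexpr unsigned CodeBits = 10;
constexpr std::uint16_t CodeMask = (1u << CodeBits) - 1;  // 0x03FF: lower 10 bits
constexpr std::uint16_t SubCodeMask =
    static_cast<std::uint16_t>(~CodeMask);                // 0xFC00: upper 6 bits

// Combine a code (0..1023) with sub-code bits already placed in the upper 6 bits.
inline std::uint16_t makeInstruction(std::uint16_t code, std::uint16_t subCodes)
{
    assert(code <= CodeMask);            // code must fit in 10 bits
    assert((subCodes & CodeMask) == 0);  // sub-codes must not overlap the code
    return static_cast<std::uint16_t>(code | subCodes);
}

// Extract the instruction code from an instruction word.
inline std::uint16_t instructionCode(std::uint16_t word)
{
    return static_cast<std::uint16_t>(word & CodeMask);
}

// Extract the sub-code bits from an instruction word.
inline std::uint16_t instructionSubCodes(std::uint16_t word)
{
    return static_cast<std::uint16_t>(word & SubCodeMask);
}
```

Execution would only need `instructionCode()`, while recreating the program text would also look at `instructionSubCodes()`.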

There are already five sub-codes that need to be stored in the instruction word: the Parentheses, Colon, LET, Keep and Question sub-codes.  This leaves only one bit for another sub-code, and there are a lot more commands to implement, so the sub-code bit usage needed to be reduced.  As it so happens, the LET, Keep and Question sub-codes will never be on the same code, so only one bit is really required for these three sub-codes.

To accomplish this, these three sub-codes were replaced with the new Option sub-code.  The LET and INPUT translate routines now set this sub-code instead of the individual sub-codes that were removed.  This new sub-code also can be reused for other commands requiring an individual sub-code.

The bits of the sub-codes were also rearranged so that the sub-codes that will be used in the instruction words (Parentheses, Colon, and Option) are in the same bit positions in both the tokens and the instruction words.  This avoids having two sets of sub-code definitions.  A new Program Mask sub-code definition was added that will be used to mask the program sub-codes from the other token sub-codes when the instruction is created.
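The masking step might look something like the following sketch.  The bit values and the token-only sub-code are assumptions chosen for illustration (the actual definitions are in the commit); the point is only that the program sub-codes share bit positions with the instruction word, so a single mask is enough.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit assignments: program sub-codes sit in the upper 6 bits,
// in the same positions they will occupy in the instruction word.
constexpr std::uint16_t Paren_SubCode   = 0x0400;
constexpr std::uint16_t Colon_SubCode   = 0x0800;
constexpr std::uint16_t Option_SubCode  = 0x1000;  // replaces LET, Keep and Question
// Hypothetical sub-code used only by tokens, never stored in the program.
constexpr std::uint16_t TokenOnly_SubCode = 0x0001;

// Mask selecting only the sub-codes that are stored in the program.
constexpr std::uint16_t ProgramMask_SubCode =
    Paren_SubCode | Colon_SubCode | Option_SubCode;

// Build an instruction word from a token's code and its full set of sub-codes.
inline std::uint16_t encodeInstruction(std::uint16_t code, std::uint16_t tokenSubCodes)
{
    assert(code < 1024);  // code must fit in the lower 10 bits
    return static_cast<std::uint16_t>(code | (tokenSubCodes & ProgramMask_SubCode));
}
```

Because the mask drops everything except the three program sub-codes, token-only bits can never leak into the code field of the instruction word.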

So that the proper sub-code can be output during testing, an option name variable was added to the table entries.  Commands that use the Option sub-code will have the text name of the sub-code in the option name.  In the token text routine, if a token has the Option sub-code, the option name is output (or the string "BUG" if the code does not have an option name).
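The lookup could be sketched roughly as follows.  This uses std::string in place of QString to keep the example self-contained, and the structure names, bit value and table contents are all assumptions for illustration rather than the actual table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint16_t Option_SubCode = 0x1000;  // hypothetical bit value

// Simplified stand-in for a table entry; only the option name matters here.
struct TableEntry
{
    std::string name;
    std::string optionName;  // empty if the code has no option sub-code
};

struct Token
{
    int code;                // index into the table
    std::uint16_t subCodes;
};

// Return the text to output for the Option sub-code of a token:
// empty if the token does not have the sub-code, the option name if the
// code has one, or "BUG" if the sub-code was set on a code without a name.
std::string optionText(const std::vector<TableEntry> &table, const Token &token)
{
    assert(token.code >= 0
        && static_cast<std::size_t>(token.code) < table.size());
    if (!(token.subCodes & Option_SubCode))
        return std::string();
    const std::string &name = table[token.code].optionName;
    return name.empty() ? std::string("BUG") : name;
}
```

The "BUG" fallback makes a mistake in the table (or in a translate routine setting the sub-code on the wrong code) show up immediately in the test output.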

[commit e721a3e3ae]

Encoding – First Phase - Revisited (Tagged)

The tokenCodes0.5 branch of development was successful so it was merged back into branch0.5 (a fast forward merge meaning the branch0.5 pointer was simply moved).

Since the commands.h header file will contain definitions for other things (like routines for constants), it was renamed to the more generic basic.h, which was also added as a dependency to the application binary in the CMake build file so that it appears in the Project source file list within QtCreator.

In the table entry initialization, it turns out that using "" (blank string) and NULL (pointer) as an initializer for a constant QString variable has the same effect.  Therefore, all "" were replaced with NULL in the table entries.

The first phase required for encoding is complete (again), except that it is now performed by the translator.  This is a good point to tag version v0.5.1.  Work will now begin again on encoding the tokens into the internal program code format to be stored in the program model.

[commit b461229bcd] [commit f64d4401d2]

Translator – Determine Code Size

The output assign codes routine contained two loops: the first assigned codes to tokens without codes, and the second assigned program word index values and determined the encoded size of the tokens.  Since the token codes are now assigned within the translator routines, the only job left for the first loop was looking for unimplemented token types.  This can now be accomplished in the second loop by simply checking whether the token has a valid code.

While modifying this routine, I realized that it, along with the output append and output insert routines, should be a member of the RPN List class and not the Translator class, since these routines only deal with the RPN list and do not use any of the translator member variables (with one exception).  Therefore, these functions were moved to the RPN List class.  The append and insert routines remain in the translator class as output list access functions for the command translate routines; however, the translator routines were changed to access the output list functions directly.

The output assign codes routine was renamed to the set code size routine, since that is essentially what it does, aside from setting program word indexes (which is related) and checking for token types not yet supported (needed only during development).  The reset code size and increment code size access functions were removed since the code size member variable can now be accessed directly.  The routine now returns a boolean value, with false meaning that an unimplemented token was found (that token is also returned).  It also requires a new argument, a reference to the table instance, to access the flags for codes.
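A simplified sketch of what such a single-loop routine could look like is shown below.  The class layout, the operand flag, the invalid code value and the way the offending token is handed back (through a pointer argument) are all assumptions for illustration; the real routine is in the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int InvalidCode = -1;  // hypothetical marker for "no code assigned"

// Stand-in for the table: one flag per code saying whether the code is
// followed by an operand word in the program.
struct Table
{
    std::vector<bool> operandFlags;
    bool hasOperand(int code) const
    {
        assert(code >= 0 && static_cast<std::size_t>(code) < operandFlags.size());
        return operandFlags[code];
    }
};

struct Token
{
    int code;   // InvalidCode if the token type is not implemented yet
    int index;  // program word index, set by setCodeSize()
};

class RpnList
{
public:
    std::vector<Token> tokens;
    int codeSize = 0;  // accessed directly, no reset/increment functions

    // Set each token's program word index and total up the code size.
    // Returns false (and sets badToken) if a token without a valid code is found.
    bool setCodeSize(const Table &table, const Token *&badToken)
    {
        codeSize = 0;
        for (Token &token : tokens)
        {
            if (token.code == InvalidCode)
            {
                badToken = &token;  // unimplemented token type
                return false;
            }
            token.index = codeSize;
            // one instruction word, plus one operand word if flagged
            codeSize += table.hasOperand(token.code) ? 2 : 1;
        }
        return true;
    }
};
```

With the check for a valid code folded into the same loop, a separate first pass over the list is no longer needed.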

[commit d0e39b01f2]