Saturday, March 20, 2010

Translator – Adding Token Errors

Several errors can occur when calling the add_token() function. Each of these errors needs to be tested. These errors are:
Expected Operand – Occurs if the token contains a binary operator when the state of the Translator is Operand indicating an operand or a unary operator was expected.
Expected Operator – Occurs if the token contains an operand when the state of the Translator is Operator indicating an operator was expected.
Expected Binary Operator – Occurs if the token contains a unary operator that cannot be a binary operator when the state of the Translator is Operator indicating a binary operator was expected.
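The three error conditions above can be sketched as a small decision function. This is only an illustration: the enum names and the check_token() helper are hypothetical and are not taken from the actual Translator source, which works on full tokens and table entries.

```cpp
#include <cassert>

// Hypothetical names; the real Translator's enums and token layout may differ.
enum class State { Operand, Operator };
enum class Kind { Operand, UnaryOnly, BinaryOnly, UnaryOrBinary };
enum class Status { Good, ExpectedOperand, ExpectedOperator, ExpectedBinaryOperator };

// Decide which status would be reported for a kind of token in a given state.
Status check_token(State state, Kind kind)
{
    if (state == State::Operand) {
        // An operand or anything usable as a unary operator is acceptable here.
        return kind == Kind::BinaryOnly ? Status::ExpectedOperand : Status::Good;
    }
    // state == State::Operator: a binary operator is required.
    switch (kind) {
    case Kind::Operand:
        return Status::ExpectedOperator;
    case Kind::UnaryOnly:
        return Status::ExpectedBinaryOperator;
    default:
        return Status::Good;
    }
}
```

A table-driven test could then walk every state and kind combination and compare the result against the expected status.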
During development of the Translator, several other diagnostic errors can occur. These errors are the result of checks put into the code to verify that the Translator is working correctly; in other words, they report on situations that should not occur. These checks will be removed once the code is debugged and tested. These errors are:

Translator – Simple Expressions (Implemented)

Handling of simple expressions has been implemented and appears to be working. Not all the functionality or error handling has been tested yet. There is a bug in the code where the program is not terminating correctly (must hit Ctrl+C to abort the program; or crashes when using the debugger). I suspect there is a memory allocation error causing some sort of memory corruption. I wanted to release what there is since these kinds of problems take a while to figure out.

The Parser routines have been updated to set the length for non-string tokens – these are tokens for operators, commands and internal functions. I thought this would be helpful in highlighting the entire token when outputting errors.

Other changes include the Table class where the unary_code and precedence fields have been added with the associated initializer data. More access functions have been added to the Table class for the two new members and also for the name members (so that these can be output).

The translator test code has also been updated. Instead of printing all the tokens as was done when testing the parser, I decided to just output the RPN list as a list of tokens separated by spaces. The outputting of the full tokens will be added as a debug option.

The ibcp_0.1.4-dev-2.zip file has been uploaded at Sourceforge IBCP Project along with the binary for the program (test stack files have been removed). The ibcp.exe program doesn't terminate properly when done testing the Translator, but appears to work. Not everything has been tested and it only includes three simple expressions.

Translator – Adding Operators

Adding operator tokens will be handled by the separate function add_operator(). Some extra processing must be performed in addition to adding the token to the output list and pushing the output list's element pointer onto the done stack as is performed for simple operands.

Eventually this function will verify that the operand(s) of the operator have the correct data types. Also for the case of operators taking two operands of the double data type, if either of the operands is an integer, then the special hidden CvtDbl operator must be inserted into the output list after the operand. This is the purpose of having the done stack and why it contains pointers to the output list elements. The pointers will be used to insert tokens after an element when necessary.
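To illustrate why the done stack holds pointers into the output list, here is a minimal sketch that uses std::list iterators as the element pointers. The Token layout, the DataType enum, and the helper names are all assumptions made for the example; the project uses its own list and token classes.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <stack>
#include <string>

// Hypothetical token layout, for illustration only.
enum class DataType { Integer, Double };

struct Token {
    std::string text;
    DataType type;
};

using OutputList = std::list<Token>;
using DoneStack = std::stack<OutputList::iterator>;

// Insert a hidden CvtDbl token right after the element that `operand`
// points to, but only if that element is an integer.
void convert_if_integer(OutputList &output, OutputList::iterator operand)
{
    if (operand->type == DataType::Integer) {
        output.insert(std::next(operand), Token{"CvtDbl", DataType::Double});
    }
}

// Pop both operands of a double-typed binary operator and convert as needed.
void fix_double_operands(OutputList &output, DoneStack &done)
{
    assert(done.size() >= 2);  // diagnostic check: should never fail
    OutputList::iterator right = done.top();
    done.pop();
    OutputList::iterator left = done.top();
    done.pop();
    convert_if_integer(output, left);
    convert_if_integer(output, right);
}

// Render the output list as space-separated token text.
std::string to_text(const OutputList &output)
{
    std::string result;
    for (const Token &token : output) {
        if (!result.empty()) {
            result += ' ';
        }
        result += token.text;
    }
    return result;
}
```

A linked list suits this job because inserting an element does not invalidate the iterators already saved on the done stack.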

For now, the operands will simply be popped off of the done stack. For unary operators, only one operand is popped off of the done stack. This function will return the status of adding the operator, which will be an error or Good upon success.
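The interim behavior might look roughly like the following sketch. The signature, the DoneStackEmpty status, and the use of plain strings as tokens are assumptions for illustration. Popping the operands before pushing the operator's own element is also an assumption about the ordering.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <stack>
#include <string>

// Hypothetical status values; DoneStackEmpty stands in for a diagnostic error.
enum class Status { Good, DoneStackEmpty };

using OutputList = std::list<std::string>;
using DoneStack = std::stack<OutputList::iterator>;

// Pop the operator's operands off the done stack (one for a unary operator,
// two for a binary operator), append the operator to the output list, and
// push the new element's pointer onto the done stack.
Status add_operator(OutputList &output, DoneStack &done,
                    const std::string &op, bool is_unary)
{
    const int operand_count = is_unary ? 1 : 2;
    for (int i = 0; i < operand_count; ++i) {
        if (done.empty()) {
            return Status::DoneStackEmpty;  // should not occur in correct code
        }
        done.pop();
    }
    output.push_back(op);
    done.push(std::prev(output.end()));
    return Status::Good;
}
```

After a successful call, the operator's element sits on top of the done stack, where it can serve as an operand of the next operator.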

Translator – Adding Tokens

Adding tokens to the output RPN list for simple expressions consists of four major parts. But upon the first call, the master element for the output list needs to be allocated and the special Null token pushed onto the hold stack (to prevent popping tokens past the bottom of the stack). The state of the Translator is then changed from Initial to Operand.

When the state is Operand, for simple operand tokens (tokens that are not operators and do not have a parenthesis), the token is appended to the end of the output list and the pointer to the output list's element is pushed onto the done stack. The state is changed to BinOp since a binary operator is expected next. A Good status is returned.

Also when the state is Operand, the token may contain a unary operator. The token is determined to contain a unary operator if the operator's table entry's unary_code is not the Null code. The token's index is replaced with the table index for the unary operator code.

When the state is Operator, the token must be an operator and must not be a unary operator. An operator could be only a unary operator, for example the NOT operator. The operator is determined to be only a unary operator if the operator's table entry has the same code and unary_code.
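The two table checks described above can be sketched as follows. The code values, the Neg_Code name for unary minus, and the sample entries are hypothetical; only the code and unary_code fields and the comparison rules come from the description.

```cpp
#include <cassert>

// Hypothetical subset of operator codes; Neg_Code is an assumed name.
enum Code { Null_Code, Add_Code, Sub_Code, Neg_Code, Not_Code };

// Hypothetical table entry holding only the fields needed here.
struct TableEntry {
    Code code;        // code of the operator as written
    Code unary_code;  // code to use as a unary operator, or Null_Code
    const char *name;
};

// The operator may be used as a unary operator if unary_code is not Null.
bool can_be_unary(const TableEntry &entry)
{
    return entry.unary_code != Null_Code;
}

// The operator is only a unary operator if code and unary_code are the same.
bool is_unary_only(const TableEntry &entry)
{
    return entry.code == entry.unary_code;
}

// Sample entries, invented for the example.
const TableEntry add_entry = {Add_Code, Null_Code, "+"};
const TableEntry sub_entry = {Sub_Code, Neg_Code, "-"};
const TableEntry not_entry = {Not_Code, Not_Code, "NOT"};
```

With these sample entries, "-" is accepted in either state, "+" only when the state is Operator, and NOT only when the state is Operand.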

For all operators, all tokens of higher or the same precedence on the hold stack are popped and added to the output list via another function. Once the precedence of the token on the top of the hold stack is less than that of the current operator, if the token is not the EOL (end of line) token, the operator token is pushed onto the hold stack. The state is set to Operand since an operand is expected next (for a unary operator, the state stays at Operand). A Good status is returned.

If the token is the EOL token, then end of line processing is performed. The hold stack has already been emptied because the EOL token has lower precedence than all the other operators, except of course for the special Null token that was put on the bottom of the stack. The Null token is popped off the stack. For simple expressions, nothing else needs to be done so a Done status is returned.
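Putting the parts together, a stripped-down sketch of the whole flow could look like this. It uses plain strings as tokens, a made-up precedence table, and binary operators only; unary operators, the done stack, and the diagnostic checks are left out. None of the names or values are taken from the actual ibcp source.

```cpp
#include <cassert>
#include <map>
#include <stack>
#include <string>
#include <vector>

enum class Status { Good, Done, ExpectedOperand, ExpectedOperator };

class Translator {
public:
    // Feed one token; "EOL" marks the end of the line.
    Status add_token(const std::string &token)
    {
        if (state_ == State::Initial) {
            hold_.push("Null");  // guards the bottom of the hold stack
            state_ = State::Operand;
        }
        const int prec = precedence_of(token);
        if (state_ == State::Operand) {
            if (prec > 0) {
                return Status::ExpectedOperand;
            }
            output_.push_back(token);
            state_ = State::Operator;
            return Status::Good;
        }
        // State is Operator: only an operator or EOL is acceptable.
        if (prec == 0) {
            return Status::ExpectedOperator;
        }
        // Pop everything of higher or the same precedence to the output.
        while (precedence_of(hold_.top()) >= prec) {
            output_.push_back(hold_.top());
            hold_.pop();
        }
        if (token == "EOL") {
            hold_.pop();  // remove the Null token
            state_ = State::Initial;
            return Status::Done;
        }
        hold_.push(token);
        state_ = State::Operand;
        return Status::Good;
    }

    // The RPN output as tokens separated by spaces.
    std::string rpn() const
    {
        std::string result;
        for (const std::string &token : output_) {
            if (!result.empty()) {
                result += ' ';
            }
            result += token;
        }
        return result;
    }

private:
    enum class State { Initial, Operand, Operator };

    // Made-up precedence values; 0 means an operand or the Null token.
    static int precedence_of(const std::string &token)
    {
        static const std::map<std::string, int> table = {
            {"EOL", 1}, {"+", 2}, {"-", 2}, {"*", 3}, {"/", 3}};
        const auto it = table.find(token);
        return it == table.end() ? 0 : it->second;
    }

    State state_ = State::Initial;
    std::stack<std::string> hold_;
    std::vector<std::string> output_;
};
```

Feeding the tokens A, +, B, *, C and EOL to this sketch yields the RPN list "A B C * +". Because the Null token has the lowest precedence, the pop loop always stops before running off the bottom of the hold stack.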