Sunday, June 30, 2013

New Translator – Get Expression Design

Five new functions were implemented to support simple expressions (simple operands, unary and binary operators) in the new translator, and two existing translator functions are utilized.  The names of new translator routines that conflict with existing functions are temporarily suffixed with a '2' character.  These functions will replace the existing functions once the new translator is fully implemented and working.

The new routines consist of the top level translate routine (which currently only supports expression mode); the get expression routine for getting an expression (to be utilized by any command needing an expression), which returns the token terminating the expression; the process operator routine for handling the precedence of operators; the get operand routine for getting an operand (simple operands to start); and the get token routine for getting a token from the parser.  More details of each of these new routines follow.

Translate

The new translate routine starts by setting the input line in the parser, creating a new output RPN list, pushing the null blocker token onto the hold stack, setting the local token pointer to null and calling the new get expression routine for any data type.  Upon successful return, the terminating token is checked: if it is not an end-of-line token, an "expected operator or end-of-line" error is returned.  For an end-of-line token, the null token is popped from the hold stack, the result is popped from the done stack (not used here), and both stacks are checked to make sure they are empty.  For an error, the error information is set in the output list.  This routine currently only supports the expression test mode.
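The bookkeeping described above can be sketched roughly as follows.  This is only an illustration in Python, not the translator's actual code: the names RpnList, State, NULL, EOL and the status strings are all made up for the sketch, the get expression routine is passed in as a parameter, and single_operand is a trivial stand-in that accepts exactly one operand.

```python
from dataclasses import dataclass, field

NULL = "<null>"  # hypothetical null blocker token
EOL = "<eol>"    # hypothetical end-of-line token


@dataclass
class RpnList:
    items: list = field(default_factory=list)
    error: str = ""        # empty string means no error
    error_token: str = ""


@dataclass
class State:
    line: str
    output: RpnList = field(default_factory=RpnList)
    hold: list = field(default_factory=list)
    done: list = field(default_factory=list)


def translate(line, get_expression):
    """Top level routine (expression test mode only in this sketch)."""
    state = State(line)          # input line plus a new, empty output list
    state.hold.append(NULL)      # push the null blocker
    token = None                 # no token obtained yet
    status, token = get_expression(state, token, "any")
    if status == "good":
        if token != EOL:
            status = "expected operator or end-of-line"
        else:
            state.hold.pop()     # the null blocker
            state.done.pop()     # the result, not used here
            if state.hold or state.done:
                status = "bug: stacks not empty"
    if status != "good":
        state.output.error = status
        state.output.error_token = token
    return state.output


def single_operand(state, token, data_type):
    """Stand-in get expression: one operand, then a terminating token."""
    words = state.line.split() + [EOL]
    if words[0] == EOL:
        return f"expected {data_type} expression", EOL
    state.output.items.append(words[0])
    state.done.append(words[0])
    return "good", words[1]
```

With this stand-in, translate("A", single_operand) yields an output list holding just the operand, while translate("A B", single_operand) sets the "expected operator or end-of-line" error at the token B.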

Get Expression

The new get expression routine starts in an operand state, expecting either a unary operator or an operand.  If the token is not a unary operator, it gets the operand and then gets the token that follows, which will be either a binary operator or the end of the expression (any unrecognized token, whose validity is left to the caller to determine).  The unary or binary operator is then processed.  The expected data type of the operator is obtained and the get expression routine is called recursively to process the rest of the expression for the expected data type.

Because of this flow, no state variable is needed to keep track of whether the translator is in operand state, operator state, operator-or-end state or any other state.  The get expression routine has two arguments: a reference to the token through which the token that terminated the expression is returned, and the desired data type of the expression.  This routine starts by getting a token if none was passed in (eventually a caller may pass in a token that has already been obtained).
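To show how the recursion removes the need for a state variable, here is a rough Python sketch of this flow.  Everything in it is an assumption made for illustration: tokens are plain whitespace-separated strings, the operator tables and precedence values are invented, unary operators are simply pushed to the hold stack, and the data type is passed through unchanged rather than taken from the operator.

```python
BINARY = {"+": 2, "-": 2, "*": 3, "/": 3}  # invented precedence values
UNARY = {"-": "neg"}                       # unary minus becomes "neg"
NULL = "<null>"                            # hypothetical null blocker
EOL = "<eol>"                              # hypothetical end-of-line token


def precedence(op):
    if op == NULL:
        return 0
    if op == "neg":
        return 4
    return BINARY.get(op, 1)  # anything else ends the expression


class Translator:
    def __init__(self, line):
        self.tokens = iter(line.split() + [EOL])
        self.output = []
        self.hold = [NULL]

    def process_operator(self, token):
        # clear the hold stack of higher (or equal) precedence operators
        while precedence(self.hold[-1]) >= precedence(token):
            self.output.append(self.hold.pop())
        if token not in BINARY:
            return "done"  # token terminates the expression
        self.hold.append(token)
        return "good"

    def get_expression(self, token=None, data_type="any"):
        """Return (status, terminating or offending token)."""
        if token is None:
            token = next(self.tokens)
        if token in UNARY:
            self.hold.append(UNARY[token])
        else:
            if not token.isalnum():  # simple operands only
                return f"expected {data_type} expression", token
            self.output.append(token)
            token = next(self.tokens)
            if self.process_operator(token) == "done":
                return "good", token
        # an operator was processed, so an operand is expected again;
        # the real routine would use the operator's expected data type
        return self.get_expression(None, data_type)
```

For example, Translator("a + b * c").get_expression() leaves a b c * + in the output and returns the end-of-line token, while for "a + b )" it returns the ")" token and leaves it to the caller to decide whether that token is valid there.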

Process Operator

The new process operator routine is similar to the existing process operator routine, which starts by clearing the hold stack of higher precedence operators.  For each operator popped from the hold stack, the existing process final operand routine is called.  Once the hold stack is cleared, the existing function proceeds to call a token handler to process the token, which for operators calls the general operator token handler.

Token handlers are not necessary with the new translator since the caller will check and process terminating tokens.  If the token ends the expression, the new process operator routine returns immediately with a done status.  Otherwise, only the general operator token handler functionality is needed.  This handler did some error checking (based on the current state and mode of the translator) and then called the process first operand function.  The error checking is no longer needed since the caller now performs it, so only the existing process first operand function needs to be called.
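A minimal sketch of the hold stack handling, in Python, might look like the following.  It makes several assumptions not stated above: operators of equal precedence are also popped (which gives left-to-right evaluation), any non-operator token ranks just above the null blocker so that it empties the stack down to the blocker, and appending to the output and pushing to the hold stack stand in for the process final operand and process first operand routines.  The names and precedence values are made up, and unary operators are left out.

```python
PRECEDENCE = {"+": 2, "-": 2, "*": 3, "/": 3}  # invented values
NULL = "<null>"                                # hypothetical null blocker


def precedence(token):
    if token == NULL:
        return 0                       # never popped by an operator
    return PRECEDENCE.get(token, 1)    # non-operators end the expression


def process_operator(token, hold_stack, output):
    # clear the hold stack of higher (or equal) precedence operators
    while precedence(hold_stack[-1]) >= precedence(token):
        output.append(hold_stack.pop())  # stand-in: process final operand
    if token not in PRECEDENCE:
        return "done"                    # caller handles this token
    hold_stack.append(token)             # stand-in: process first operand
    return "good"


# walking through  a + b * c <eol>  by hand
hold, out = [NULL], []
out.append("a")
process_operator("+", hold, out)
out.append("b")
process_operator("*", hold, out)
out.append("c")
status = process_operator("<eol>", hold, out)
```

After the walk-through, status is "done", the output holds a b c * + and only the null blocker remains on the hold stack.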

Get Operand

The new get operand routine takes a reference to a token, through which the caller may pass in an already obtained token and through which a token can be returned if an error is detected.  If no token is passed, then a token is obtained.  A data type is also passed in, which will only be used for an unacceptable token to report the appropriate "expected XX expression" error.  For now, only constant tokens and identifier tokens with no parentheses are recognized.  For both, the token is appended to the output list and pushed to the done stack.
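As a rough Python sketch of this routine (the Token class, the kind names and the status strings are all invented for the illustration, and an iterator stands in for the get token routine):

```python
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # hypothetical kinds: "const", "ident", "ident_paren", "op"
    text: str


def get_operand(tokens, token, data_type, output, done_stack):
    """Return (status, token); the token matters to the caller on error."""
    if token is None:
        token = next(tokens)  # stand-in for the get token routine
    if token.kind in ("const", "ident"):
        output.append(token)      # append to the output list
        done_stack.append(token)  # push to the done stack
        return "good", token
    # anything else, including identifiers with parentheses for now
    return f"expected {data_type} expression", token
```

Passing in Token("const", "5") appends it to both lists and returns a good status; an operator token such as Token("op", "*") with a data type of "numeric" returns "expected numeric expression" along with the offending token.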

Get Token

At the bottom of the new translator functions is the new get token routine.  This routine has a reference to a token pointer argument to hold the token being returned and a flag argument indicating whether the token could be an operand.  The flag is used to set the operand state of the parser (for correctly detecting negative constants).  If the token from the parser has an error, the new parser error is returned, which the caller will need to handle.
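The reason for the operand flag is easier to see with a toy example.  The Python sketch below is not the real parser: ToyParser, its token kinds and the status strings are invented, and it only knows integers, simple identifiers and four operators.  The point is that a leading minus sign is taken as part of a constant only when the parser is in operand state.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str  # "const", "ident", "op", "eol" or "error"
    text: str


class ToyParser:
    """Hypothetical stand-in for the real parser."""

    def __init__(self, line):
        self.line = line
        self.pos = 0
        self.operand_state = False

    def token(self):
        while self.pos < len(self.line) and self.line[self.pos] == " ":
            self.pos += 1
        rest = self.line[self.pos:]
        if not rest:
            return Token("eol", "")
        # a leading '-' belongs to a constant only in operand state
        number = r"-?\d+" if self.operand_state else r"\d+"
        patterns = (("const", number),
                    ("ident", r"[A-Za-z]\w*"),
                    ("op", r"[-+*/]"))
        for kind, pattern in patterns:
            match = re.match(pattern, rest)
            if match:
                self.pos += match.end()
                return Token(kind, match.group())
        return Token("error", rest[0])


def get_token(parser, operand):
    """Return (status, token); operand says an operand could come next."""
    parser.operand_state = operand
    token = parser.token()
    if token.kind == "error":
        return "parser error", token  # left for the caller to handle
    return "good", token
```

For the line "-5 - 3", asking for a token with the flag set gives the constant -5, asking next with the flag clear gives the minus operator, and asking again with the flag set gives the constant 3.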
