Thursday, March 11, 2010

Translator

The job of the Translator is to convert the order of the tokens parsed from an input line into reverse polish notation (RPN), which will make for optimal execution of the program. See the post Internal Language for an explanation.

Essentially, the Translator will obtain tokens from the Parser expecting either an operand or an operator token type. Operands will be added directly to the RPN list, and operators will be added on an operator stack. Before putting an operator token on the operator stack, any operators on the operator stack that have higher precedence than the current operator are pulled from the stack and added to the RPN list. After the higher precedence operators are pulled from the stack, the current operator token is pushed on the stack. At the end, all operators from the operator stack are pulled and added to the RPN list.

Take the simple example A + B * C. The A token is added directly to the RPN list. The + token is pushed onto the empty operator stack. The B token is added directly to the RPN list (now set to A B). The + token on the operator stack is not high precedence than the * token, so the * token is pushed on the operator stack. The C token is added directly in the RPN list (now set to A B C). At the end, the operator stack is cleared, so the * token and then the + token is added to the RPM list (now set to A B C * +). When this is executed, A, B, C is pushed onto a stack. The * operator pops C and B off of the stack, multiplies them and pushes the result back onto the stack.  The + operator pops B*C and A off of the stack, adds them and pushes the result back onto the stack.

Next take the simple example A * B + C. The A token is added, the * token is pushed onto the operator stack, and the B token is added. Now with the + token, the * token on the top of the stack is high precedence than +, therefore the * token is popped from the stack and added (now set to A B *). The + token is pushed on the operator stack. The C token is added (now set to A B * C). At the end, the + token is popped and added (now set to A B * C +).  Execution follow similarly except that A and B are multiplied and pushed back to the stack before C is pushed onto the stack for the + operator.

The process of translating is much more involved than this, because other things need to be handled like unary operators that may precede operands, parentheses including arrays and functions, commands and conversions of data types like if operands of an operator are different but acceptable (e.g. integer and double).

No comments:

Post a Comment

All comments and feedback welcomed, whether positive or negative.
(Anonymous comments are allowed, but comments with URL links or unrelated comments will be removed.)