Sunday, March 27, 2011

Translator – A Unary Operator Curiosity

One of the test statements created for testing the unary operator fix was:
A = -B^NOT C% + -D*NOT E%
The intention of this statement was to test the NOT unary operator in front of the second operand of both the exponentiation and multiplication operators. The translation of this statement was expected to be the following, where B C% NOT ^* Neg is the first operand of the addition and D Neg E% NOT *%2 is the second:
A<ref> B C% NOT ^* Neg D Neg E% NOT *%2 + Assign
However, what was produced was this unexpected translation:
A<ref> B C% D Neg E% NOT *%2 +%1 Cvtint NOT ^* Neg Assign
Upon reviewing the precedence of the operators (see Translator – Operator Precedence) and the code, it turns out that this translation was correct. ADD has higher precedence than NOT, so the operands of the ADD are C% and the -D*NOT E% expression. MUL (*%2) in turn has higher precedence than ADD, so its operands are -D and NOT E%.

So while exponentiation has the highest precedence, the low precedence of NOT allowed the ADD to bind the rest of the expression to the NOT. The resulting NOT expression (C% D Neg E% NOT *%2 +%1 Cvtint NOT in the translation above) becomes the second operand of the exponentiation, with the first negation being the final operator. In C/C++, the not (!) and negation (-) operators have very high precedence (and there is no exponentiation operator). But here, NOT was given a low precedence, just above the other logical operators but below the math operators (see Translator – Operator Precedence for the reasoning). In practice, the NOT operator would probably not be used in the same expression as exponentiation the way it is above.
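The grouping is easier to see when both translations are converted back into fully parenthesized infix form. The following is a minimal sketch in Python, not the Translator's code: the function name rpn_to_infix is made up for illustration, and the data type suffixes (%1, %2, the * on ^*) and the Cvtint conversion are stripped from the translations to keep the sketch short.

```python
# Hypothetical helper for illustration only; not part of the Translator.
UNARY = {"Neg": "-", "NOT": "NOT "}
BINARY = {"^", "*", "+"}


def rpn_to_infix(rpn):
    """Rebuild a fully parenthesized expression from an RPN token string."""
    stack = []
    for tok in rpn.split():
        if tok in UNARY:
            operand = stack.pop()
            stack.append(f"({UNARY[tok]}{operand})")
        elif tok in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(f"({left} {tok} {right})")
        else:
            stack.append(tok)  # plain operand
    if len(stack) != 1:
        raise ValueError("malformed RPN string")
    return stack[0]


# Simplified forms of the two translations (type suffixes and Cvtint removed).
expected = "B C% NOT ^ Neg D Neg E% NOT * +"
produced = "B C% D Neg E% NOT * + NOT ^ Neg"

print(rpn_to_infix(expected))
# ((-(B ^ (NOT C%))) + ((-D) * (NOT E%)))
print(rpn_to_infix(produced))
# (-(B ^ (NOT (C% + ((-D) * (NOT E%))))))
```

In the second form the whole addition sits under the first NOT, which is exactly what the low precedence of NOT asks for.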

Translator – Unary Operator Problem

While testing the negative constant changes, a new problem was discovered with unary operators, specifically this statement:
A = ---B
This produced a “done stack empty” bug error at the first negation token. The problem occurred because the second negation operator forced the first negation operator from the hold stack, since the operator on the hold stack was of greater or equal precedence, and when the operand of the first negation operator was checked, there was nothing on the done stack. Here are some additional examples:
A = B*-C
A = B^-C
The first statement translated correctly because negation has higher precedence than multiplication, leaving multiplication on the hold stack. However, the second statement failed because negation has lower precedence than exponentiation, which forced exponentiation from the hold stack with only one operand on the done stack, generating the “done stack empty” bug error. A new rule was needed for unary operators.

Basically, unary operators should not force any tokens (unary operators, binary operators, arrays, or functions) from the hold stack regardless of their precedence, because those tokens have not received all of their operands yet (the unary operator together with its operand will be one of their operands, and it has not been fully received yet). As currently implemented, non-unary operators should still force unary operators from the hold stack if the unary operator has higher precedence.

The check to force tokens from the hold stack was changed to require two conditions: the precedence of the operator on the hold stack is higher than that of the current operator, and the current token is not a unary operator. Unary operators will now not force other tokens from the hold stack, but other tokens will still force unary operators from the hold stack if the unary operator is higher in precedence. While testing this change, a curious result was produced from one of the test statements...
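The rule can be tried out with a small hold stack/done stack translator. This is a rough sketch in Python under several assumptions, not the actual Translator code: the names translate, pop_to_done and unary_rule are made up, the precedence numbers are invented (only their relative order follows the posts), the comparison uses greater-or-equal as in the description of the original failure, and only the expression to the right of the assignment is handled.

```python
import re

# Assumed precedence values; only the relative order is taken from the posts.
PRECEDENCE = {"^": 6, "Neg": 5, "*": 4, "+": 3, "-": 3, "NOT": 2}
UNARY = {"Neg", "NOT"}


def pop_to_done(hold, done):
    """Move the top operator of the hold stack, with its operands, to the done stack."""
    op = hold.pop()
    count = 1 if op in UNARY else 2
    if len(done) < count:
        raise RuntimeError("done stack empty")
    operands = done[-count:]
    del done[-count:]
    done.append(" ".join(operands + [op]))


def translate(expression, unary_rule=True):
    """Translate an infix expression to RPN; unary_rule=False mimics the old check."""
    tokens = re.findall(r"[A-Za-z]+%?|[-+*^]", expression)
    hold, done = [], []
    operand_state = True
    for tok in tokens:
        if tok == "-" and operand_state:
            tok = "Neg"  # a minus in operand state is a negation
        if tok not in PRECEDENCE:
            done.append(tok)  # operand
            operand_state = False
            continue
        # New rule: a unary operator never forces tokens from the hold stack.
        if not (unary_rule and tok in UNARY):
            while hold and PRECEDENCE[hold[-1]] >= PRECEDENCE[tok]:
                pop_to_done(hold, done)
        hold.append(tok)
        operand_state = True
    while hold:
        pop_to_done(hold, done)
    return done[0]


print(translate("B^-C"))   # B C Neg ^
print(translate("---B"))   # B Neg Neg Neg
```

With unary_rule=False the sketch raises the "done stack empty" error for both B^-C and ---B, while B*-C translates the same way under either rule. Feeding it -B^NOT C% + -D*NOT E% gives B C% D Neg E% NOT * + NOT ^ Neg.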

Parser – Negative Constants

Negative constants were previously not considered by the Parser, which interpreted a minus as the subtract operator. The Translator then changed it to a negate operator when it appeared in the operand state. Consider these two examples (along with their current translations):
A = B-1.5      A B 1.5 Sub Assign
A = -1.5+B     A 1.5 Neg B Add Assign
The reason the Parser did not look for signs on numerical constants can be seen in the first example. If the Parser produced the four tokens A = B -1.5, the Translator would generate an “expected operator” error at the -1.5 token, since a second operand token was received when it was expecting a binary operator. The second example produces an unnecessary negate token after the constant. While perfectly valid, this is not desirable.

In order for the Parser to correctly interpret negative signs on numerical constants, it needs to be aware of whether the Translator is in operand state or not. If in operand state, the Parser can look for a negative sign in front of a numerical constant; otherwise a minus should be interpreted as an operator.
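A rough illustration of this idea in Python, under loud assumptions: the tokenize function and the regular expressions are made up for this sketch, the operand state is tracked locally as a stand-in for the Translator's state, and only simple names, plain decimal numbers and a handful of operators are recognized.

```python
import re

# Hypothetical patterns; the real Parser handles many more token types.
SIGNED_NUMBER = re.compile(r"-?(?:\d+\.?\d*|\.\d+)")
UNSIGNED_NUMBER = re.compile(r"\d+\.?\d*|\.\d+")
NAME = re.compile(r"[A-Za-z]\w*[%$]?")
OPERATORS = set("=+-*/^")


def tokenize(statement):
    """Split a statement into tokens, allowing a signed number only in operand state."""
    tokens = []
    pos = 0
    operand_state = True  # stand-in for the Translator's operand state
    while pos < len(statement):
        if statement[pos].isspace():
            pos += 1
            continue
        number = SIGNED_NUMBER if operand_state else UNSIGNED_NUMBER
        match = number.match(statement, pos) or NAME.match(statement, pos)
        if match:
            tokens.append(match.group())
            pos = match.end()
            operand_state = False  # an operator is expected next
        elif statement[pos] in OPERATORS:
            tokens.append(statement[pos])
            pos += 1
            operand_state = True  # an operand is expected next
        else:
            raise ValueError(f"unexpected character at column {pos}")
    return tokens


print(tokenize("A = B-1.5"))   # ['A', '=', 'B', '-', '1.5']
print(tokenize("A = -1.5+B"))  # ['A', '=', '-1.5', '+', 'B']
```

A minus in front of a variable, as in A = -B, still comes out as a separate token in this sketch, leaving it to the Translator to treat it as a negate operator.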

A new operand state flag was added to the Parser, with an access function to set its value (it is initialized to off). The Parser get number routine was modified to use a new sign flag that records whether a negative sign was found first. This flag also prevents multiple negative signs. However, the routine will only check for a negative sign if no digits or decimal point have been seen yet and if the new operand state flag is on.
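The flag logic described above might look roughly like the following Python sketch. The class layout and the names Parser, set_operand_state and get_number are assumptions made for illustration, and exponents and other number forms are left out.

```python
class Parser:
    """Hypothetical, stripped-down parser holding only the number routine."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.operand_state = False  # initialized to off

    def set_operand_state(self, state):
        """Access function used to pass in the Translator's operand state."""
        self.operand_state = state

    def get_number(self):
        """Return the number text at the current position, or None if there is none."""
        pos = self.pos
        digits = False   # at least one digit seen
        decimal = False  # decimal point seen
        sign = False     # negative sign seen
        while pos < len(self.text):
            ch = self.text[pos]
            if ch.isdigit():
                digits = True
            elif ch == "." and not decimal:
                decimal = True
            elif (ch == "-" and self.operand_state
                    and not (sign or digits or decimal)):
                sign = True  # only one sign, and only before the number starts
            else:
                break
            pos += 1
        if not digits:
            return None  # not a number; the position is left unchanged
        token = self.text[self.pos:pos]
        self.pos = pos
        return token


parser = Parser("-1.5+B")
parser.set_operand_state(True)
print(parser.get_number())  # -1.5
```

With the operand state flag left off, the same input returns None, so the minus is left to be parsed as an operator. A second negative sign, as in --1.5, is also rejected by this sketch.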

An access function was added to the Translator to get the current operand state (either operand or operand-or-end state). Before calling the Parser get token routine, the Parser operand state is set from the Translator's current operand state. While testing, a problem was discovered with unary operators...