Sunday, June 27, 2010

Translator – New Token Modes

Translating INPUT statements will require additional token modes. The current token modes are:
Command – Translator is expecting a command token (or start of an assignment)
Assignment – Translator is expecting an item for an assignment statement
EqualAssignment – An equal token was received when the mode was Command or Assignment; another equal token would indicate a multiple assignment statement (commas are not permitted)
CommaAssignment – A comma token was received when the mode was Command or Assignment; another comma token would indicate continuation of a multiple assignment statement (an equal token would indicate the end of the list and the beginning of the expression)
Expression – Translator is expecting operands or operators depending on the current state
When a semicolon appears at the end of an INPUT statement, no further tokens should be received except for an end of statement token (EOL, colon, ELSE, or ENDIF). A new mode is required so the Translator can make sure that only an end of statement token follows:
EOS – Translator is expecting an end of statement token only
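The mode list above, with the new EOS mode added, can be sketched as an enumeration plus a small check. This is only an illustration in Python, not the Translator's actual code: the names TokenMode, END_OF_STATEMENT and check_eos, the string token representation, and the error text are all assumptions made for this sketch.

```python
from enum import Enum, auto


class TokenMode(Enum):
    """Token modes described in the post (names are illustrative)."""
    COMMAND = auto()
    ASSIGNMENT = auto()
    EQUAL_ASSIGNMENT = auto()
    COMMA_ASSIGNMENT = auto()
    EXPRESSION = auto()
    EOS = auto()  # new: only an end of statement token is acceptable


# End of statement tokens listed in the post, represented here as strings.
END_OF_STATEMENT = {"EOL", ":", "ELSE", "ENDIF"}


def check_eos(mode, token):
    """Return an error message if the token is not allowed in EOS mode.

    Returns None when the token is acceptable (or the mode is not EOS).
    The message text is a placeholder, not the Translator's real wording.
    """
    if mode is TokenMode.EOS and token not in END_OF_STATEMENT:
        return "expected end of statement"
    return None
```

With this sketch, any token after the trailing semicolon other than EOL, colon, ELSE or ENDIF produces an error, while an end of statement token passes through.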
An INPUT statement contains the variable(s) that are to be input. Expressions are not allowed (except for the string expression after the PROMPT keyword, or within subscripts of array variables). The INPUT translation could be implemented to check if the token on top of the done stack has the reference flag set, and if not, report an “expected variable” error. But that could lead to strange errors being reported. Consider this example (with the translation of the expression):
INPUT A*B+C  A B * C +
The + will be on top of the done stack after this expression is translated (being the result of the translated expression). The + token will not have the reference flag set since it is an operator. The INPUT is expecting a reference, so it would report “expected variable” pointing to the + token. This would be very confusing: why would a variable be expected at the +? The correct error should be “expecting comma or end of statement” pointing to the * token. A new mode is required so the Translator can make sure no operators (except for the end of expression operators comma, semicolon and EOL) are received:
Reference – Translator is only expecting reference tokens and end of expression operator
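To illustrate why checking during translation gives the better error, here is a minimal Python sketch of a Reference mode check over a flat token list. It is a simplification under stated assumptions: tokens are plain strings, the OPERATORS set and the function name check_reference_mode are hypothetical, subscripts and the PROMPT expression are ignored, and EOL is the only end of statement token handled.

```python
# Illustrative operator set; the real Translator's operator table differs.
OPERATORS = {"+", "-", "*", "/", "^", "="}
# End of expression operators named in the post.
END_OF_EXPRESSION = {",", ";", "EOL"}


def check_reference_mode(tokens):
    """Scan INPUT operands; return (message, index) for the first bad token.

    Returns None if the token list is acceptable. The index points at the
    offending token, which is where the error would be reported.
    """
    expect_reference = True  # a variable must come first and after a comma
    eos_only = False         # set after a semicolon (the EOS mode)
    for index, token in enumerate(tokens):
        if eos_only:
            if token != "EOL":
                return ("expected end of statement", index)
            return None
        if expect_reference:
            if token in OPERATORS or token in END_OF_EXPRESSION:
                return ("expected variable", index)
            expect_reference = False
        elif token == ",":
            expect_reference = True
        elif token == ";":
            eos_only = True
        elif token == "EOL":
            return None
        else:
            # Anything else after a reference, such as an operator.
            return ("expecting comma or end of statement", index)
    return None
```

For the INPUT A*B+C example, check_reference_mode(["A", "*", "B", "+", "C", "EOL"]) stops at index 1, the * token, instead of reporting a missing variable at the +.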
Reference tokens include tokens without parentheses and tokens with parentheses. However, these types of tokens could be variables, arrays or functions, and the Translator is not able to determine which. Therefore, the Encoder could still find errors if a user function was placed in an INPUT statement. Lastly, sub-string functions (while valid in assignment statements) are not valid in INPUT statements, so the Reference mode needs to check for these and report an error.
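The sub-string check could look something like the following Python sketch. The function names in SUB_STRING_FUNCTIONS are an assumption (the usual BASIC names; the post does not list them), and check_input_reference and its error text are hypothetical as well.

```python
# Assumed sub-string function names; the post does not list them.
SUB_STRING_FUNCTIONS = {"LEFT$", "MID$", "RIGHT$"}


def check_input_reference(name, has_paren):
    """Return an error message if the token cannot be an INPUT reference.

    name      -- identifier text without the opening parenthesis
    has_paren -- True if the token was followed by an opening parenthesis

    A token with parentheses that is not a sub-string function is accepted
    here, since an array cannot be told apart from a user function at this
    stage; that error is left for the Encoder.
    """
    if has_paren and name.upper() in SUB_STRING_FUNCTIONS:
        return "expected variable"  # placeholder message
    return None
```

Under these assumptions, check_input_reference("MID$", True) reports an error, while check_input_reference("A", True) is accepted whether A turns out to be an array or a user function.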
