Sunday, January 3, 2010

Parsing Identifiers

Once a letter is found, the parser will continue collecting letters, digits and underscores. When no more are found, the next character is checked for a data type symbol (“%”, “$” or “#”), and if one is found the data type (initially set to none) will be set to integer, string or double, respectively. Finally, the next character is checked for an opening parenthesis, and if found a parenthesis flag is set. If the first two characters are FN, then the token type is set to Defined Function, and the string is returned along with the data type and parenthesis flag.
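The scanning steps above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the names scan_identifier, Identifier and TYPE_SYMBOLS are hypothetical, and the sketch assumes that the FN check ignores case and that the opening parenthesis is consumed as part of the token, neither of which is stated above.

```python
import string
from typing import NamedTuple, Optional

# Hypothetical names throughout; the post does not name the parser's routines.
TYPE_SYMBOLS = {"%": "integer", "$": "string", "#": "double"}
LETTERS = set(string.ascii_letters)
NAME_CHARS = LETTERS | set(string.digits) | {"_"}


class Identifier(NamedTuple):
    name: str                  # the letters, digits and underscores collected
    data_type: Optional[str]   # None until a data type symbol is seen
    has_paren: bool            # True if "(" directly follows the identifier
    is_defined_function: bool  # True if the name starts with FN
    end: int                   # index of the first character after the token


def scan_identifier(text: str, pos: int) -> Optional[Identifier]:
    """Scan an identifier starting at text[pos]; return None if no letter is there."""
    if pos >= len(text) or text[pos] not in LETTERS:
        return None

    # Collect letters, digits and underscores.
    end = pos
    while end < len(text) and text[end] in NAME_CHARS:
        end += 1
    name = text[pos:end]

    # Check the next character for a data type symbol.
    data_type = None
    if end < len(text) and text[end] in TYPE_SYMBOLS:
        data_type = TYPE_SYMBOLS[text[end]]
        end += 1

    # Check the next character for an opening parenthesis.
    has_paren = end < len(text) and text[end] == "("
    if has_paren:
        end += 1  # assumption: the parenthesis is consumed with the identifier

    # Assumption: the FN prefix is matched without regard to case.
    is_defined_function = name[:2].upper() == "FN"
    return Identifier(name, data_type, has_paren, is_defined_function, end)
```

For example, scan_identifier("A$(1)", 0) would report the name A with a string data type and the parenthesis flag set, leaving the position at the 1.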

The Operator Table will then be searched for the identifier. If the identifier is found in the table and the entry is not flagged as part of a two-word command, then the internal code for the command, operator or internal function will be returned for the token, along with the data type (internal functions only). No string from the input needs to be returned.

If the table entry is flagged as part of a two-word command, then a second identifier is read from the input after any white space. The second identifier will not have a data type symbol or parenthesis. The Operator Table is searched again for the two-word identifier (with one space between the words). If it is found, then the internal code for the two-word command will be returned. If it is not found and the first word by itself is a valid command, then the internal code for the first word is returned; otherwise a syntax error is reported at the second word.
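The table search, including the two-word case, might look something like the sketch below. Everything in it is made up for illustration: the table contents, the internal codes, and the names Entry, TABLE, ParseError and lookup_command are hypothetical, and the lookup is assumed to ignore case. Data types for internal functions are left out to keep it short.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Entry(NamedTuple):
    code: Optional[str]  # None: the word is not a valid command by itself
    two_word: bool       # True: the word may start a two-word command


# Hypothetical table contents, for illustration only.
TABLE = {
    "PRINT": Entry("Print", False),
    "END": Entry("End", True),
    "END IF": Entry("EndIf", False),
    "LINE": Entry(None, True),
    "LINE INPUT": Entry("LineInput", False),
}

WORD = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


class ParseError(Exception):
    def __init__(self, column: int, message: str):
        super().__init__(message)
        self.column = column


def lookup_command(text: str, pos: int = 0) -> Optional[Tuple[str, int]]:
    """Return (internal code, end position), or None if the word is not in the table."""
    first = WORD.match(text, pos)
    if first is None:
        return None
    entry = TABLE.get(first.group().upper())
    if entry is None:
        return None  # not in the table: a variable, array, function or subroutine name
    if not entry.two_word:
        return entry.code, first.end()

    # Flagged entry: skip white space and try to read a second word.
    second_pos = first.end()
    while second_pos < len(text) and text[second_pos].isspace():
        second_pos += 1
    second = WORD.match(text, second_pos)
    if second is not None:
        # Search again with exactly one space between the two words.
        key = first.group().upper() + " " + second.group().upper()
        combined = TABLE.get(key)
        if combined is not None:
            return combined.code, second.end()

    # No two-word match: fall back to the first word if it is valid by itself.
    if entry.code is not None:
        return entry.code, first.end()
    raise ParseError(second_pos, "syntax error")
```

With this table, "END IF" returns the two-word code, "END" followed by anything else falls back to the one-word command, and "LINE x" raises an error at the column of the second word.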

If the identifier was not found in the table, then it could be a Variable, Array, Generic Function or Subroutine name. At this point, the Parser has insufficient information to determine which of these types the identifier is. The global or local dictionaries would need to be checked, and this is beyond the scope of the Parser (this will be handled by the Translator or Encoder). The Parser will simply return a type of No Parenthesis or Parenthesis.
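In code, this fallback is little more than a choice based on the parenthesis flag. A minimal sketch, with the hypothetical names TokenType and classify_unknown:

```python
from enum import Enum


class TokenType(Enum):
    # Hypothetical enumeration; only the two type names come from the text above.
    NO_PARENTHESIS = "No Parenthesis"
    PARENTHESIS = "Parenthesis"


def classify_unknown(has_paren: bool) -> TokenType:
    """Type for an identifier that was not found in the Operator Table.

    The Parser cannot tell a variable from an array, function or subroutine,
    so it only records whether an opening parenthesis followed the name.
    """
    return TokenType.PARENTHESIS if has_paren else TokenType.NO_PARENTHESIS
```

Deciding what the name actually refers to is left to the later stage that has access to the dictionaries.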
