Friday, January 29, 2010

Parser – Implementation Notes

The get_token() function always allocates a new Token and sets its column to the current position, computed by subtracting the pointer to the beginning of the line (i.e., input) from the current pointer. A constructor was therefore added that accepts an integer value (the column) and also initializes the string pointer to NULL. A destructor was also added to delete the string if it is not NULL.
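As a rough illustration of the constructor and destructor described above, here is a minimal sketch. The member names, the use of std::string for the token text, and the new_token_at() helper are assumptions made for this example; the notes only say that the Token holds a column and a string pointer that starts out as NULL.

```cpp
#include <cstddef>
#include <string>

// Hypothetical Token layout: only the column and the nullable string
// pointer are taken from the notes; everything else is assumed.
struct Token {
    int column;           // zero-based offset of the token within the line
    std::string *string;  // owned token text, or NULL if there is none

    // Constructor taking the column; the string pointer starts as NULL.
    explicit Token(int col) : column(col), string(NULL) {}

    // Destructor releases the string. Deleting a null pointer is a no-op,
    // so no explicit NULL check is required here.
    ~Token() { delete string; }

    // Copying is disabled so two tokens can never delete the same string.
    Token(const Token &) = delete;
    Token &operator=(const Token &) = delete;
};

// Hypothetical helper showing how the column can be derived by
// subtracting the start-of-line pointer from the current position.
Token *new_token_at(const char *input, const char *pos)
{
    return new Token(static_cast<int>(pos - input));
}
```

With the line "LET A = 5", calling new_token_at(line, line + 4) would produce a token whose column is 4.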

The get_identifier() function calls scan_word() to get the first word and then checks whether the string begins with “FN” (in either upper or lower case), which marks a one-line user function; there is no reason to search the table first in that case. Otherwise, the table is searched. If the word is not found, then the identifier is returned, including whether there was an opening parenthesis and the data type, if any.
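The case-insensitive “FN” check could look something like the sketch below. The function name is_user_function_name() and the use of std::string are assumptions for illustration only; the notes do not show how the check is actually coded.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: returns true if the scanned word begins with "FN"
// in any mix of upper and lower case.
bool is_user_function_name(const std::string &word)
{
    // The casts to unsigned char keep std::toupper well defined for
    // characters outside the basic ASCII range.
    return word.size() >= 2
        && std::toupper(static_cast<unsigned char>(word[0])) == 'F'
        && std::toupper(static_cast<unsigned char>(word[1])) == 'N';
}
```

Because this test needs only the first two characters, it can run before any table search.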

If the word was found in the table and can only be a single word, the command, internal function, or word operator is returned. I decided to change the Boolean twoflag in the table to an enumeration named Multiple so that three-word commands and three-character operators could be supported later. For now, only two-word commands and two-character operators are supported.
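One possible shape for the Multiple enumeration is sketched here. Only the name Multiple comes from the notes; the enumerator names, the TableEntry struct, and the may_continue() helper are hypothetical.

```cpp
// Replaces a Boolean "two" flag so that a third state can be added later.
// Enumerator names are assumptions made for this sketch.
enum Multiple {
    OneWord,   // entry is complete as a single word or character
    TwoWord,   // entry may be followed by a second word or character
    ThreeWord  // reserved for three-part entries (not supported yet)
};

// Hypothetical table entry carrying the flag.
struct TableEntry {
    const char *name;
    Multiple multiple;
};

// True if the parser should look for a further word or character.
inline bool may_continue(const TableEntry &entry)
{
    return entry.multiple != OneWord;
}
```

A Boolean could only say "may be two parts or not"; the enumeration leaves room for a three-part case without changing the table layout again.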

If the command can be two words, white space is skipped and scan_word() is called again to get the second word. If the second word has a data type or a parenthesis, then it is not a valid second word of a command; in that case, if the first word is valid as a single word, it is returned as the token (the second word is held for the next token). Otherwise, the table is searched again for the two words. If the pair is found, the two-word command is returned as the token. If it is not found, the first word is returned as the token if it is valid as a single word; otherwise an error token is returned. Support for three-word commands was not implemented at this time.
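The decision steps for two-word commands can be sketched as follows. The table contents, the Entry layout, and the names find() and resolve() are all made up for this example, and words are assumed to be upper case already. The notes also do not say what happens when the second word has a data type or parenthesis and the first word is not valid alone; this sketch assumes an error in that case.

```cpp
#include <cstddef>
#include <string>

// Hypothetical table: word2 is NULL for single-word entries.
struct Entry {
    const char *word1;
    const char *word2;
};

static const Entry table[] = {
    {"END", NULL}, {"END", "IF"}, {"LINE", "INPUT"}, {"PRINT", NULL}
};

// Finds an entry matching one word (w2 == NULL) or a pair of words.
const Entry *find(const std::string &w1, const std::string *w2)
{
    for (std::size_t i = 0; i < sizeof(table) / sizeof(table[0]); ++i) {
        const Entry &e = table[i];
        if (w1 != e.word1)
            continue;
        if (w2 == NULL ? e.word2 == NULL
                       : (e.word2 != NULL && *w2 == e.word2))
            return &e;
    }
    return NULL;
}

enum Result { SingleWord, TwoWords, Error };

// second_is_plain is false when the second word carries a data type or
// parenthesis, which rules it out as the second word of a command.
Result resolve(const std::string &first, const std::string &second,
               bool second_is_plain)
{
    if (second_is_plain && find(first, &second) != NULL)
        return TwoWords;  // the pair is a two-word command
    // Fall back to the first word alone; the second word would be held
    // for the next token. Returning Error here when the second word is
    // not plain is an assumption, as noted above.
    return find(first, NULL) != NULL ? SingleWord : Error;
}
```

With this made-up table, resolve("END", "IF", true) yields TwoWords, resolve("END", "X", true) yields SingleWord, and resolve("LINE", "X", true) yields Error because LINE is not valid on its own.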

The implementation of skip_whitespace(), scan_word(), and get_number() was straightforward. The implementation of get_operator() was similar to get_identifier(), except that only single characters are involved (no scanning necessary). Support for three-character operators will not be implemented at this time.
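To illustrate how one-character and two-character operators can be told apart without any scanning, here is a small sketch. The operator sets and the function name operator_length() are assumptions, and the sketch tries the two-character match first rather than going through a table lookup as get_operator() is described as doing.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the length of the operator starting at p
// (2, 1, or 0 if no operator starts there). The operator lists are
// invented for this example.
int operator_length(const char *p)
{
    static const char *const two_char[] = {"<=", ">=", "<>"};
    static const char one_char[] = "+-*/^=<>(),;:";

    if (p[0] == '\0')
        return 0;  // end of input; also keeps strchr from matching '\0'

    // Prefer the longer match so that "<=" is not split into "<" and "=".
    for (std::size_t i = 0; i < sizeof(two_char) / sizeof(two_char[0]); ++i) {
        if (p[0] == two_char[i][0] && p[1] == two_char[i][1])
            return 2;
    }
    return std::strchr(one_char, p[0]) != NULL ? 1 : 0;
}
```

Reading p[1] is safe here because p[0] is known not to be the terminator, so p[1] is at worst the terminating null character.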

I decided that searching the table should not be part of the Parser class, but should be part of another class for the Table. By the way, the Operator Table will now be known as simply the Table, since it holds more than just operators.
