Saturday, March 26, 2011

Parser – Tokens With Parentheses

While correcting issues with define function tokens, it was noticed that it is not necessary to also store the opening parentheses in the string field of the token. This also includes the generic tokens with parentheses. The parentheses is not necessary because there are separate token types to identify tokens with and without parentheses.

It is also advantageous to not store the parentheses so that an array or define function name can be found in the dictionary. For define functions, take the code snippet:
DEF FNHypot(X,Y)
    FNHypot=SQR(X*X+Y*Y)
END DEF
Z = FNHypot(3,4)
Both the FNHypot( and FNHypot tokens appear, which represent the same function. If the parentheses was stored in the dictionary for the function name, it would require complicated string comparisons to figure out that the FNHypot token is the same function. This same issue applies to regular function names.

A similar issue also applies to arrays. When functions and subroutines are implemented, there will be a feature to allow an entire array to be passed to a function or subroutine. Exactly how this will work has not been defined yet, but this code snippet shows how it would look:
DIM Array(10)
CALL subroutine(A)
The changes for removing the parentheses from these tokens were very simple. In the Parser get identifier routine, when creating the string for these tokens, the length provided to the string constructor was changed to be one less than the actual length so the parentheses would not be included. In the Translator when reporting an error at the open parentheses of a define function with parentheses token, the minus one was removed since the length is now one less. Finally, in the token test output routines, an open parentheses was added to the output of these tokens.

Project – Tokens Status Enumeration

Each time a new error (token status) is added, renamed or deleted, two changes were needed. Both the token status enumeration (include file) and the message array (source file) needed to be changed. The correct changes were needed or the two files will be out of sync. To check for problems, code had been added to initialization to check for duplicates and missing entries.

Similar to automatic generation of the code enumeration from the table entries, the token status enumeration will also be generated automatically. Each message array element was a structure containing a token status value and a pointer to a message string. The elements were changed to just the message string with the name of the token status in a comment at the end of the line.

The codes awk script was renamed to enums and was modified to also read the token status message array to generate the token status enumeration automatically be reading the name in the comment after the message string. The name of the output file was changed from codes.h to autoenums.h to be more generic and allow for additional automatic enumerations.

During initialization, in addition to checking for duplicates and missing entries, a translation index array was built to translate from token status value to index. Both the checking and the translation array were removed since they are no longer necessary. The awk script will check for duplicates and the token status is now the same as the index into the message array.

The error type template class is no longer needed for token status errors (but still for the table entry erros). Also, the duplicate and missing were no longer needed and were removed. Some problems were found in this template class for table entries where the range error was not working because the wrong constructor was being called. Instead of storing indexes to the errors, the variables were changed to be the type of the template. This made the range error constructor unique.