Thursday, January 7, 2010

Parser - Tokens

The Parser will return tokens, each held in a structure with the following fields:

    type – the type of token
    column – the column of the input at which the token started
    code – the internal code for the token
    string – pointer to the string of the token from the input
    data type – the data type of the token
    value – union of an integer and double value
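
As a rough sketch only, the field list above might map onto a C++ structure like the following. All of the identifiers here (Token, TokenType, DataType, intValue, dblValue) are hypothetical names chosen for illustration; the enumerators are taken from the token type and data type lists further down in this post.

```cpp
// Hypothetical names; the post only describes the fields, not their spelling.
enum class TokenType {
    Command, Operator, IntFunc, Remark, Constant, DefFunc, NoParen, Paren
};

enum class DataType { None, Double, Integer, String };

struct Token {
    TokenType type;      // the type of token (always set)
    int column;          // column of the input where the token started (always set)
    int code;            // the internal code for the token
    const char *string;  // pointer to the string of the token from the input
    DataType dataType;   // the data type of the token
    union {
        int intValue;    // value of an integer constant
        double dblValue; // value of a double constant
    } value;
};
```

Only one member of the union is meaningful at a time, so the data type field would tell the reader of a token which one to use.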

Not all fields will be used for all token types, but type and column will always be set.  The token types that will be returned by the parser along with the other fields that will be set are:

    COMMAND (code)
    OPERATOR (code)
    INTFUNC (code, data type)
    REMARK (code, string)
    CONSTANT (code, string, data type, and value for numeric constants)
    DEFFUNC (string; data type may be set)
    NOPAREN (string; data type may be set)
    PAREN (string; data type may be set)
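
One possible way to record the list above in code is a small lookup from token type to the fields that are always set for it. This is only a sketch: the flag names and the requiredFields function are hypothetical, and the "may be set" data type and the value of numeric constants are deliberately left out because they are not set for every token of the type.

```cpp
// Hypothetical names, for illustration only.
enum class TokenType {
    Command, Operator, IntFunc, Remark, Constant, DefFunc, NoParen, Paren
};

// Bit flags for the optional fields (type and column are always set).
enum FieldFlag : unsigned {
    FieldCode = 1u,
    FieldString = 2u,
    FieldDataType = 4u,
    FieldValue = 8u
};

// Returns the fields that are set for every token of the given type.
inline unsigned requiredFields(TokenType type) {
    switch (type) {
    case TokenType::Command:
    case TokenType::Operator:
        return FieldCode;
    case TokenType::IntFunc:
        return FieldCode | FieldDataType;
    case TokenType::Remark:
        return FieldCode | FieldString;
    case TokenType::Constant:
        // The value is set in addition, but only for numeric constants.
        return FieldCode | FieldString | FieldDataType;
    case TokenType::DefFunc:
    case TokenType::NoParen:
    case TokenType::Paren:
        // The data type may also be set, depending on the identifier.
        return FieldString;
    }
    return 0u;
}
```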

The string field will be set for numeric constants, and this string will be saved for later recreation of the source so that the user gets back the constant as it was entered.  For example, 10.0e2, 10e2, 1000 and 1e3 all represent the same value, and it would be nice if what the user enters is what is shown when the program is listed.  To conserve memory, identical constant strings will be saved only once, along with a used count.  The same applies to the strings of remark commands.
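
The idea of saving each distinct string once with a used count could be sketched as below. This is an assumption about one way to do it, not the actual design: the StringPool class and its member names are hypothetical, and a standard hash map stands in for whatever storage is eventually used.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical pool: each distinct constant or remark string is stored once,
// together with a count of how many tokens currently refer to it.
class StringPool {
public:
    // Returns a pointer to the single stored copy and increments its used count.
    const std::string *add(const std::string &text) {
        auto it = entries_.emplace(text, 0).first;  // no-op if already present
        ++it->second;
        return &it->first;
    }

    // Drops one use; the string is removed once nothing refers to it.
    void release(const std::string &text) {
        auto it = entries_.find(text);
        if (it == entries_.end())
            return;
        if (--it->second == 0)
            entries_.erase(it);
    }

    // Number of uses currently recorded for the string (0 if not stored).
    int usedCount(const std::string &text) const {
        auto it = entries_.find(text);
        return it == entries_.end() ? 0 : it->second;
    }

    // Number of distinct strings stored.
    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<std::string, int> entries_;
};
```

With this sketch, entering 10e2 twice stores the text once with a used count of 2, while 1000 is stored as a separate string even though it has the same numeric value, which is what allows the listing to show each constant as it was typed.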

The data type field will be set to one of the following, based on which data type symbol was present in the identifier or on the type of the constant:

    NONE
    DOUBLE
    INTEGER
    STRING
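
For identifiers, the mapping from data type symbol to data type might look like the sketch below. The post does not say which symbols are used, so the suffixes here ('%' for integer, '$' for string, '#' for double) are an assumption borrowed from common BASIC dialects, and the dataTypeOf function is a hypothetical name. It also assumes the symbol is the last character of the identifier string.

```cpp
#include <string>

enum class DataType { None, Double, Integer, String };

// Assumed suffix symbols (conventional in many BASIC dialects, not stated
// in the post): '%' integer, '$' string, '#' double.
inline DataType dataTypeOf(const std::string &identifier) {
    if (identifier.empty())
        return DataType::None;
    switch (identifier.back()) {
    case '%': return DataType::Integer;
    case '$': return DataType::String;
    case '#': return DataType::Double;
    default:  return DataType::None;  // no data type symbol present
    }
}
```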

There will be a special token returned when the end of the input line is reached.  This will be a command with its own code, which will be used by the Translator but will probably not be stored in the internal code.
