Wednesday, January 13, 2010

String Class

The Token structure has a string field that will hold a string constant, identifier, numeric constant, or comment string. Eventually, at least for a string constant, that string will need to be used as a BASIC-style string.

A BASIC string is a specific data type in the language, whereas C/C++ has no built-in string type and instead uses character arrays terminated by a null character (a character with the value zero). A BASIC string has a length value associated with it, and in BASIC a zero character, i.e. CHR$(0), is allowed in a string, so a character value of zero cannot be used as a terminator.
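To make the terminator problem concrete, here is a small sketch. The names kData, c_style_length, and basic_style_length are made up for illustration. It takes the five characters of "AB" + CHR$(0) + "CD" and measures them two ways: with the C library's strlen(), which stops at the first zero character, and with a length kept alongside the characters.

```cpp
#include <cstddef>
#include <cstring>

// The BASIC value "AB" + CHR$(0) + "CD": five characters, one of them zero.
// (Hypothetical example data, stored without a terminator.)
static const char kData[] = {'A', 'B', '\0', 'C', 'D'};
static const std::size_t kDataLen = sizeof(kData);

// Length as a C string routine sees it: strlen() counts up to the first
// zero character. A terminator is appended only so the call is safe.
std::size_t c_style_length()
{
    char buf[kDataLen + 1];
    std::memcpy(buf, kData, kDataLen);
    buf[kDataLen] = '\0';
    return std::strlen(buf);
}

// Length as BASIC needs it: stored with the characters, so an embedded
// zero is just another character.
std::size_t basic_style_length()
{
    return kDataLen;
}
```

Here c_style_length() reports 2 while basic_style_length() reports 5, so the "CD" part is lost to any routine that relies on the terminator.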

Therefore, to work with BASIC-style strings, a String class is needed that will contain a char array pointer (a char array will be allocated for a given length) and a length value. When breaking up an input line, each token will be put into one of these String objects. This will save the extra work of extracting strings from the input line and then adding terminators to them. These Strings will also be used throughout for identifiers, numeric constants, and comment strings.
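A minimal sketch of what such a class might look like follows. The member names (m_ptr, m_len, length, data) are assumptions for illustration, not the final design. The constructor copies a given number of characters straight out of a buffer such as the input line, and no terminator is added.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch of a BASIC-style string: an allocated char array
// plus an explicit length, with no terminating zero character.
class String {
public:
    String() : m_ptr(nullptr), m_len(0) {}

    // Copy len characters starting at p, e.g. a token inside an input line.
    String(const char *p, std::size_t len)
        : m_ptr(len ? new char[len] : nullptr), m_len(len)
    {
        if (len) std::memcpy(m_ptr, p, len);
    }

    String(const String &other)
        : m_ptr(other.m_len ? new char[other.m_len] : nullptr),
          m_len(other.m_len)
    {
        if (m_len) std::memcpy(m_ptr, other.m_ptr, m_len);
    }

    String &operator=(const String &other)
    {
        if (this != &other) {
            String tmp(other);          // copy first, then swap members
            char *p = m_ptr;  m_ptr = tmp.m_ptr;  tmp.m_ptr = p;
            std::size_t n = m_len;  m_len = tmp.m_len;  tmp.m_len = n;
        }
        return *this;
    }

    ~String() { delete[] m_ptr; }

    std::size_t length() const { return m_len; }
    const char *data() const { return m_ptr; }

    // No bounds check: the caller must keep i below length().
    char operator[](std::size_t i) const { return m_ptr[i]; }

private:
    char *m_ptr;        // allocated array of m_len characters
    std::size_t m_len;  // number of characters, zeros included
};
```

With this shape, a token is captured with something like String tok(line + start, len), where start and len come from wherever the token was found in the line.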

The String class will contain the necessary functions for working with BASIC strings: object creation, concatenation, comparison, and the operations needed for string functions such as MID$, LEFT$, and RIGHT$.
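As a rough sketch of how these operations behave on length-counted strings, the functions below use std::string purely as stand-in storage, since it also carries its own length and permits embedded zero characters; the String class described above would hold its own char array instead. The function names, the 1-based start position for mid, and the clamping of out-of-range arguments are all assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

typedef std::string Str;  // stand-in for a length-counted BASIC string

// Concatenation: lengths add, embedded zero characters are preserved.
Str concat(const Str &a, const Str &b) { return a + b; }

// Comparison: negative, zero, or positive; uses the stored lengths,
// so a zero character does not end the comparison early.
int compare(const Str &a, const Str &b) { return a.compare(b); }

// LEFT$(s, n): first n characters (fewer if s is shorter).
Str left(const Str &s, std::size_t n) { return s.substr(0, n); }

// RIGHT$(s, n): last n characters (fewer if s is shorter).
Str right(const Str &s, std::size_t n)
{
    n = std::min(n, s.size());
    return s.substr(s.size() - n);
}

// MID$(s, start, n): n characters beginning at 1-based position start.
// An out-of-range start yields an empty string (assumed behavior).
Str mid(const Str &s, std::size_t start, std::size_t n)
{
    if (start == 0 || start > s.size()) return Str();
    return s.substr(start - 1, n);
}
```

For example, with the value HELLO, left with 2 gives HE, right with 3 gives LLO, and mid with start 2 and length 3 gives ELL.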