Saturday, November 8, 2014

Parser – Standard Input Stream

The parser functions have been modified to use a standard input string stream, so the input member could now be changed to the std::istringstream class, and the input position member removed.  The current input position can be obtained directly from the input stream using the tellg member function.  The skip white space function was removed since this can be done directly on the input stream (in other words, extract white space):
m_input >> std::ws;
In several places, a temporary integer variable is used to hold the current position (tellg) or length (length) because these functions return pos_type and size_type values respectively, which are 64-bit values.  The token constructors and error structure only accept integers (32-bit).  There is no reason to change the member variable types to 64-bit integers, as there will never be input lines or strings long enough to require them.
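
As a rough illustration of the two points above (skipping white space on the stream and narrowing the position to an integer), here is a minimal sketch.  The helper names currentColumn and skipSpaceAndGetColumn are hypothetical and do not come from the parser source:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: the current stream position narrowed to a 32-bit int.
// Input lines are short, so the narrowing cannot lose information in practice.
// Yields -1 when the stream is in a failed state (tellg reports pos_type(-1)).
int currentColumn(std::istringstream &input)
{
    return static_cast<int>(input.tellg());
}

// Hypothetical helper: extracts leading white space directly from the
// stream, then reports the column where the next token begins.
int skipSpaceAndGetColumn(std::istringstream &input)
{
    input >> std::ws;
    return currentColumn(input);
}
```

For an input of three spaces followed by PRINT, skipSpaceAndGetColumn would report column 3.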

When using an input stream, care must be taken when using the tellg function to obtain the current input position.  This function returns an EOF value (-1) once the input stream has been read past the end.  So this function can't be used if a previous operation could have read past the end.
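
One way to work around this, sketched below under assumptions, is to capture the position before any operation that might reach the end of the input.  The Word structure and readWord function are hypothetical names used only for this illustration:

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical token-like result: where the word starts and its text.
struct Word {
    int column;
    std::string text;
};

// Reads one alphabetic word.  The column is captured BEFORE scanning,
// because peek() sets the EOF flag when the word ends the input and
// tellg() would then report -1.
Word readWord(std::istringstream &input)
{
    input >> std::ws;
    Word word;
    word.column = static_cast<int>(input.tellg());
    while (std::isalpha(input.peek())) {
        word.text += static_cast<char>(input.get());
    }
    return word;
}
```

If the word is the last thing on the line, the end position can still be derived from the starting column plus the length of the text, with no second call to tellg.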

There are times when the input position must be reset (like when the second word of a possibly two-word command is not valid).  The seekg function is used to reset the input position.  However, this function does not work once the input stream has been read past the end, because the EOF flag is set.  To clear this condition, the clear function needs to be called first, so this call precedes all seekg function calls except one where an EOF cannot have occurred.
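
The clear-then-seekg pattern might look something like the following sketch.  The functions resetPosition and acceptSecondWord are hypothetical, and the real parser's two-word command handling is certainly more involved:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: rewinds the stream to a previously saved position.
// clear() drops the EOF (and fail) flags first so that seekg can take effect.
void resetPosition(std::istringstream &input, int position)
{
    input.clear();
    input.seekg(position);
}

// Hypothetical helper: tries to read the expected second word of a
// two-word command.  If a different word is found, the input position is
// restored so the word can be parsed as something else.
// Assumes the caller has not already read past the end of the input.
bool acceptSecondWord(std::istringstream &input, const std::string &expected)
{
    int start = static_cast<int>(input.tellg());
    std::string word;
    input >> word;  // may reach the end of the input and set the EOF flag
    if (word == expected) {
        return true;
    }
    resetPosition(input, start);
    return false;
}
```

With an input of LOAD FOO, reading LOAD and then calling acceptSecondWord with an expected word of FILE reads FOO, hits the end of the input, and then restores the position so FOO can be read again.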

In the get string function, the characters read are counted so that the length of the string in the input is known when the token is constructed and returned (pairs of double quotes count as one character in the string, but take two characters in the input string, so must be counted as two).  The ending input position cannot be used to determine the length in the input string because the position is not valid if the string constant is at the end of the line (see issue with tellg function above).
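
The counting described above could be sketched roughly as follows.  This is not the parser's actual get string function: the StringConstant structure is hypothetical, and it is an assumption here that the input length includes the enclosing quotes and that a string may be left unterminated at the end of the line:

```cpp
#include <sstream>
#include <string>

// Hypothetical result: the string value plus the number of characters
// it occupied in the input.
struct StringConstant {
    std::string value;  // contents, with each pair of quotes collapsed to one
    int length;         // characters consumed from the input (assumed to
                        // include the enclosing quotes)
};

// Assumes the stream is positioned on the opening double quote.
StringConstant getString(std::istringstream &input)
{
    const int eof = std::istringstream::traits_type::eof();
    StringConstant result{std::string(), 0};
    input.get();  // consume the opening quote
    result.length = 1;
    int c;
    while ((c = input.get()) != eof) {
        ++result.length;
        if (c == '"') {
            if (input.peek() != '"') {
                break;  // closing quote; peek may have set the EOF flag
            }
            input.get();  // second quote of a pair: one character in the
            ++result.length;  // value, but two characters in the input
        }
        result.value += static_cast<char>(c);
    }
    return result;
}
```

Because every character is counted as it is read, the length remains correct even when the closing quote is the last character on the line and tellg would return -1.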

The constructor of the parser was changed to take a standard string input.  Both callers were modified accordingly - the tester class already had a standard string, but the translator needs to convert from its QString to a standard string (until the translator is modified).  Dependency on Qt has almost been removed from the parser except for one call to obtain the name for the REM command (which will be handled when the table is modified).

[branch parser commit 8e71a71fd5]

One outstanding item remains - the token string member is still a QString, though its constructors have been modified to take standard string arguments.  This will not be a trivial change since many users of the token string still expect a QString.  Therefore, this work will take place in a new development branch.  This concludes work on the parser, so the parser branch was merged to the develop branch and deleted.

[branch develop merge commit 2cafb22a8e]
