Tuesday, August 6, 2013

Extra Token Delete Detection

The other type of token allocation error that can occur is deleting a token that has already been deleted.  This is a less common problem, but it can still occur nonetheless.  Extra token delete detection was also previously implemented, but removed.  The original implementation kept a list of tokens that were deleted extra times, appending a copy of the token to the list for each extra delete.

This time around, the text of the token (obtained from the text access function), along with its index (as part of the string), is saved in a list of strings of deleted tokens.  When it comes time to report these errors, each string in the list is output.  In the reimplemented delete operation function, a check is made before the token is marked as unused in the used token vector (its pointer set to null): if the token is already marked as unused, its index and text are appended to the deleted list instead, and the token is not pushed onto the free token stack (it is already there).
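The check in the delete operation can be sketched roughly as follows.  This is only an illustration under stated assumptions, not the actual code: the Token struct, the TokenPool struct, the release function and the "index: text" string format are hypothetical names and choices, and std::vector and std::string stand in for the Qt containers.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the Token class.
struct Token {
    int index;          // position of this token in the used vector
    std::string text;   // stand-in for the result of the text access function
};

// Hypothetical holder for the three containers described in the post.
struct TokenPool {
    std::vector<Token *> used;         // null element = token not in use
    std::vector<Token *> freeStack;    // tokens available for reuse
    std::vector<std::string> deleted;  // one entry per extra delete

    // Release a token, recording an extra delete instead of freeing it twice.
    void release(Token *token) {
        if (used[token->index] == nullptr) {
            // Already unused: save index and text, leave the free stack alone
            // (the token is already on it).
            deleted.push_back(std::to_string(token->index) + ": " + token->text);
            return;
        }
        used[token->index] = nullptr;   // mark the token as unused
        freeStack.push_back(token);     // make it available for reuse
    }
};
```

Releasing the same token twice with this sketch leaves a single entry on the free stack and adds one string to the deleted list.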

A new private DeletedList class was implemented inside the Token class.  This class contains a report errors function that does the work of outputting the deleted list to the standard error output stream and clearing the list afterward.  The destructor, which is called automatically upon termination of the application, calls this function so that any extra token deletes are reported.

Again, a separate function was implemented so that it can be called at the end of each line translated in test mode and extra token deletes for that specific line are reported with the line.  For non-test (GUI) mode, any extra token deletes are reported when the application terminates.

Token (Memory) Leak Detection

While implementing the token caching, I thought it would be extremely helpful to put back some token leak detection, since token leaks were a common, time-consuming problem during debugging of the new translator routines.  Leak detection was previously implemented, but removed during the Qt transition.  According to the post on October 28, 2012, this was due to obscure compiler errors.  Since valgrind was being used for memory leak detection by then, this code was removed.  The log for this commit did not provide any additional clues to the issues other than to say Qt does not interface well to self-implemented new and delete operators (?).

The original token leak detection consisted of a static list of used tokens, and each token contained a pointer into this list.  This was easy to implement with how the original List class was designed, but the QList class had no easy equivalent (using an iterator did not work).  I suspect this partly caused the problem, which may also have had something to do with the way members are initialized by the constructor when a new instance is created, with a Qt member complicating things.  Anyway, a new, easier method of keeping track of used tokens was implemented this time around that did not cause any problems.

A plain integer index member was added to the Token class.  Every token allocated gets a unique index number, which the token will have through the life of the application.  The index values assigned start at zero and each additional token allocated gets the next index value.  This index is used to index into a QVector of token pointers.

When a new token is allocated, the index is set to the size of this used token vector, and then the token is appended to the vector.  In other words, the index points to its corresponding element in this vector.  When a token pointer is popped from the free token stack, the element corresponding to the token is set to the token pointer.  When a token is deleted, the element corresponding to the token is set to a null pointer to indicate the token is not currently used (it is in the free token stack).
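The bookkeeping described above can be sketched as follows.  This is an illustration only: the Token struct and the TokenAllocator class with its allocate, release, inUse and allocatedCount functions are hypothetical, std::vector stands in for QVector and for the free token stack, and the check for a token that is deleted twice is left out to keep the sketch short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the Token class.
struct Token {
    int index = -1;     // assigned once, kept for the life of the application
    std::string text;
};

// Hypothetical allocator combining the used vector and the free stack.
class TokenAllocator {
public:
    TokenAllocator() = default;
    TokenAllocator(const TokenAllocator &) = delete;
    TokenAllocator &operator=(const TokenAllocator &) = delete;

    ~TokenAllocator() {
        for (Token *token : m_used) delete token;  // deleting null is a no-op
        for (Token *token : m_free) delete token;
    }

    Token *allocate() {
        if (!m_free.empty()) {
            // Reuse a cached token and mark its element as used again.
            Token *token = m_free.back();
            m_free.pop_back();
            m_used[token->index] = token;
            return token;
        }
        // Brand new token: its index is the current size of the used vector.
        Token *token = new Token;
        token->index = static_cast<int>(m_used.size());
        m_used.push_back(token);
        return token;
    }

    void release(Token *token) {
        m_used[token->index] = nullptr;  // not in use (now on the free stack)
        m_free.push_back(token);
    }

    bool inUse(int index) const { return m_used[index] != nullptr; }
    std::size_t allocatedCount() const { return m_used.size(); }

private:
    std::vector<Token *> m_used;  // element i belongs to the token with index i
    std::vector<Token *> m_free;  // stack of tokens available for reuse
};
```

Note that the used vector only grows when a brand new token is created, so its size is the total number of tokens ever allocated, and a reused token keeps the index it was first given.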

A token pointer was used for this vector instead of a simple boolean flag (for indicating a token was used) so that, when it comes time to check for tokens that have not been freed (elements in the vector containing a non-null token pointer), the token pointer can be used to print information about the token (type, code, string, etc.).

A new private UsedVector class was implemented inside the Token class, similar to the FreeStack private class.  This class contains a report errors function that does the work of scanning the vector, reporting any token leak errors found to the standard error output stream and deleting the leaked tokens so that valgrind does not also report memory leaks.  The destructor, which is called automatically upon termination of the application, calls this function so that any token leaks are reported.

A separate function was implemented so that it can be called at the end of each line translated in test mode and token leaks for that specific line are reported with the line.  This does not interfere with the test scripts since the errors are output to the standard error stream, which is not captured in the output that gets compared with the expected results files.  Once a test is seen to contain errors, it can be rerun from the command line to the standard output to see which test statements caused the errors.  Having the errors output with the test lines saves the debugging time otherwise spent identifying which test statement caused the error.

For non-test (GUI) mode, any token leaks are reported when the application terminates, though if not run from the command line, these errors will not be seen.  This should not be an issue because any token leaks should have been eliminated by this time.