Interactive BASIC Compiler Project: September 2014

Tuesday, September 30, 2014

RPN List – Put Stream Operator

There are a number of member functions in various classes that create text from an instance for outputting while running tests. A better implementation of this is to overload the put stream operator (<<). The RPN List class was the first class changed. An RPN list instance is now output like this:

std::cout << rpnList;

The text member function was moved from the RPN List class source file to the Tester class source file (the only current caller) and renamed to the put operator (operator<<). The return value and first argument were changed to an output stream reference. A second argument was added for a constant reference to the RPN list instance. Normally these put stream operator functions are made friend functions of the class so that private members can be accessed, but the RPN List class already provides the necessary access functions.

A few changes were made to the new function. Output is written to the output stream argument instead of the local string stream variable, which was removed. The local index variable (used to create an RPN item pointer to index map) is used to detect the first item in the list instead of the number of characters written to the output stream. And the output stream argument is returned instead of the contents of the local string stream.
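A minimal sketch of such a put stream operator follows. The RpnList class here is a hypothetical stand-in (the real class holds RPN items, not strings), and the items() access function and toString() helper are assumed names for illustration only:

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the RPN List class. Only public access
// functions are used, so the operator does not need to be a friend.
class RpnList
{
public:
    void append(const std::string &text) { m_items.push_back(text); }
    const std::vector<std::string> &items() const { return m_items; }
private:
    std::vector<std::string> m_items;
};

// Put stream operator: writes the items separated by single spaces and
// returns the stream so that output calls can be chained.
std::ostream &operator<<(std::ostream &os, const RpnList &rpnList)
{
    int index {0};
    for (const std::string &item : rpnList.items()) {
        if (index++ > 0) {  // not the first item, so output a separator
            os << ' ';
        }
        os << item;
    }
    return os;
}

// Hypothetical helper for a caller that needs a string instead of a stream.
std::string toString(const RpnList &rpnList)
{
    std::ostringstream oss;
    oss << rpnList;
    return oss.str();
}
```

Because the operator returns the stream it was given, it can be mixed freely with other output in a single statement.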

If there is a future need for this operator beyond the Tester class, the function can be moved and a function prototype provided in a header file. If this future need is for a string, the string can be obtained by using an output string stream and getting its string:

std::ostringstream oss;
oss << rpnList;
std::string string = oss.str();

[branch stl commit 365d20b2b]

Monday, September 29, 2014

Tester – Standard Output Stream

The non-GUI classes use many strings that will be changed from QString to std::string. In the Tester class, these strings are output using Qt text output streams. The next step in the STL transition was to change the Tester class to use std::ostream instead of QTextStream. This also required changes to the Command Line class, which provides the tester instance with the output stream to use for output.

The text stream member of the Command Line class (where either stdout or stderr was opened) was changed to a pointer to an std::ostream (which is now set to a pointer to either std::cout or std::cerr). The cout() member function was changed to return a reference to the output stream and to allow the output stream member to be set from its argument (defaulting to std::cout). Since nothing is opened, nothing needs to be closed, so the coutClose() member function and the destructor were removed.

Wherever a QString is output to the output stream, a toStdString() call was added since QString is not supported by std::ostream. This is temporary until most of the QString instances are changed to std::string. The usage string member was changed to a local std::string since no other code used it, and its access function was removed.

The text stream member of the Tester class was changed to an output stream reference. A toStdString() call was added to QString instances (temporary). All uses of endl were changed to the new line character ('\n'). The endl manipulator (both Qt and STL) not only outputs the new line character, it also flushes the stream, and this flush was not needed.

[branch stl commit 5cc2438dd0]

Saturday, September 27, 2014

Error List – Use Standard Vector

Like other classes, the Error List class was publicly derived from the QList class. It was changed to contain the list as a private std::vector member. Several access functions were added for access to the list (since the list is no longer public) including the bracket operator (constant and non-constant), constant at, clear and count. The at and count names were used so that callers didn't need to be changed.

A binary search was previously used to find an error by a line number or the closest error not greater than the line number. The std::lower_bound() function does a similar operation except that it returns an iterator to an element instead of an index. There are two forms of this function, one that assumes the elements being searched have the less than operator defined, and the other where a function object is passed defining the comparison.

The Error Item class does not have a less than operator defined so the second form of std::lower_bound() was used. A C++11 lambda function was defined for comparing the line numbers of error items. This lambda function definition and the call to std::lower_bound() were put into a new private find iterator function, which returns the resulting iterator. Since only line numbers are used for searching, a new constructor was added to the Error Item class for initializing just the line number member to pass to std::lower_bound().

The find function returns the index of the error item found for a line number or the closest error to the line number. This index is used as an insert point for a new error and is also passed to the edit box instance for maintaining the extra selections list (used for highlighting the errors). This function was modified to call the new find iterator function and to convert the returned iterator to an index by subtracting the begin iterator of the vector.
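As a rough sketch of the search, assuming a simplified Error Item with only a line number and column (the member and function names are illustrative, not the project's actual ones):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Simplified error item; the constructor allows an item to be built from
// just a line number for use as the search value.
struct ErrorItem
{
    explicit ErrorItem(int lineNumber, int column = 0) :
        lineNumber {lineNumber}, column {column} {}
    int lineNumber;
    int column;
};

class ErrorList
{
public:
    void append(const ErrorItem &item) { m_list.push_back(item); }
    std::size_t count() const { return m_list.size(); }

    // Returns the index of the first error whose line number is not less
    // than lineNumber, or count() if there is no such error.
    std::size_t find(int lineNumber) const
    {
        return findIterator(lineNumber) - m_list.begin();
    }
private:
    // Second form of std::lower_bound(): the comparison is supplied as a
    // lambda since ErrorItem has no less than operator.
    std::vector<ErrorItem>::const_iterator findIterator(int lineNumber) const
    {
        return std::lower_bound(m_list.begin(), m_list.end(),
            ErrorItem {lineNumber},
            [](const ErrorItem &a, const ErrorItem &b) {
                return a.lineNumber < b.lineNumber;
            });
    }

    std::vector<ErrorItem> m_list;  // assumed to be kept sorted by line number
};
```

Note that std::lower_bound() requires the range to already be ordered by the same comparison, which holds here as long as new errors are inserted at the index returned by find().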

In several places in the code, the error index is compared to the size of the error list. Unlike with the Qt classes, the size of STL classes is returned as a size_t type, which is an unsigned integer. Throughout the code, the type of the error index variable was changed from an integer to either size_t or where possible auto.

The find index function is used only to return the index of an error item for a line number, or a value indicating the line does not have an error (a -1 value was used). This function was also modified to call the new function, but a -1 value could not be used since the index returned is unsigned. Instead, an index one past the end (in other words, the size of the vector) is returned to indicate a line number without an error. The callers of this function were modified accordingly.

The std::vector class uses iterators instead of indexes to insert and remove items from the vector. Similar to converting an iterator to an index, an index is converted to an iterator by adding it to the begin iterator of the vector. The insert and remove at access functions were modified accordingly.
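The index handling can be sketched as follows. To keep the example short, the list holds plain line numbers instead of error items, and the class name LineErrors is hypothetical:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class LineErrors
{
public:
    // An index is converted to an iterator by adding it to begin().
    void insert(std::size_t index, int lineNumber)
    {
        m_lines.insert(m_lines.begin() + index, lineNumber);
    }
    void removeAt(std::size_t index)
    {
        m_lines.erase(m_lines.begin() + index);
    }

    // Returns the index of lineNumber, or count() (one past the end) if
    // the line has no error, since an unsigned index cannot hold -1.
    std::size_t findIndex(int lineNumber) const
    {
        auto it = std::lower_bound(m_lines.begin(), m_lines.end(), lineNumber);
        if (it == m_lines.end() || *it != lineNumber) {
            return m_lines.size();
        }
        return it - m_lines.begin();
    }

    std::size_t count() const { return m_lines.size(); }
private:
    std::vector<int> m_lines;  // sorted line numbers that have errors
};
```

A caller then tests the result against count() instead of testing for a negative value.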

[branch stl commit 22e8013fb4]

Wednesday, September 24, 2014

New Status Message Class

The final lower-level class with translate functions was the token class, which contained a static function for converting a status code to an error message. This function was moved to a new status message class as there was no other logical class to put it in. A plain function could not be used since the easiest way to use the translate functions is to wrap them inside of a class using the Q_DECLARE_TR_FUNCTIONS() macro.

This new status message class only contains this lone static text function. To prevent an instance of this class from being created, the default constructor was deleted using the C++11 delete feature (previous to C++11, this was accomplished by making the default constructor private). To prevent this class from being used as a base class, the C++11 final keyword was added to the class definition:

class StatusMessage final
{
    Q_DECLARE_TR_FUNCTIONS(StatusMessage)

    StatusMessage() = delete;
public:
    static const QString text(Status status);
};

The three callers of the text function are the main window class (by the status bar update slot function), the program model class (by the debug text function used by the temporary program view widget), and the tester class (by the print error function). Each of these was changed to use the new status message text function.

Currently no translation is loaded, so no translation occurs. When translation gets added, the tester class should not do any translation, otherwise the expected results will not match (because they are in the default English). Therefore, when translation does get added, if a test option is selected from the command line, no translations will be loaded.

After making these changes, two minor issues were found in the table class header file. The first was that this header was relying on the token header file to include the Qt core application header (which contains the translate functions macro), so this include was added. The second was that the argument to this macro incorrectly contained the context name Test instead of Table. This context is used by the translate utility to identify what the translatable strings belong to. This did not cause a compile error, but was corrected. The table class will be redesigned to not require the translation functions (used for a few error messages).

[branch err-msgs commit 1047df7065]

This concludes the changes to use status codes throughout until an actual error message is needed. The err-msgs branch was merged into the develop branch and deleted. A new branch will be created for the next set of C++11 related changes, which will be the replacement of more Qt with the STL in the non-GUI classes.

[branch develop merge commit 6d5ea9367f]

Tuesday, September 23, 2014

Error Item – Error Messages

The error item contains information about an error, which includes the type (none, input or code), line number, column, length and previously the error message. Continuing with the change from error messages to error status codes, the error message member was changed to an error status code. This required a few additional changes.

The cursor changed signal is emitted from the edit box instance and is connected to the status bar update slot of the main window instance. Previously the argument of this signal and slot was the error message string. These arguments were changed to the status code. The slot was changed to obtain the error message for the error status code. If the status code was the default (same as the good status), a blank message is displayed on the status line.

The cursor changed signal previously obtained the error message from the error message access function of the program model. This function obtained the error message for a line number by accessing the item in the error list in the program model. If there was no error on the line a blank string was returned. Since the items in the error list now contain error status codes, the status code is returned. If the line does not have an error, the default status code (good) is returned.

Finally in the tester class, the error string argument of the print error function was changed to an error status code since that's what all the callers now have. The error message is now obtained from the status code argument in the print error function instead of by each of the callers obtaining the error message.

[branch err-msgs commit 0050184ac7]

Sunday, September 21, 2014

RPN List – Error Messages

If a line being translated contains an error, the error column and length members of the RPN list were set to the error. Previously the translator also set the error message member to the error message. Continuing with the change from error messages to error status codes, this error message was changed to an error status code. The translator was modified to set the error status instead of the error message. The receivers of an RPN list, the program model and tester classes, were modified to take the error status code from the RPN list and get the error message for the status code.

[branch err-msgs commit c9bad9f36b]

Parser – Error Messages

Previously the parser returned detected parse errors by setting the token type to an error and setting the string member (normally used to hold the string of the token) to the error message. Since these messages need to be translated, the parser required the Qt translate functions. The parser was modified to use status error codes instead of messages.

The token does not have a status member, and only the parser would need to use it for errors. Instead, an error status member was added to the parser class with an access function. If a token has an error, the caller obtains the status error code by calling the access function. The set error access functions of the token class were also moved to the parser class, since only the parser class uses them. These were modified to set the parser error status code instead of setting the token string to an error message.

The parser error messages were changed to status codes and appropriate enumerators were added to the status enumeration, and the messages were added to the switch statement in the static token error message function. The result is that the parser class is no longer dependent on the Qt translate functions.

The two users of the parser class, the translator and the tester classes, now retrieve the error status from the parser when the token has an error and this error status is used to get the error message. Previously they obtained the error message from the token where the parser put it.
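The arrangement can be sketched with a toy parser that accepts only digits. The enumerators, the parseNumber() routine, and the statusMessage() function are all made up for this illustration, and the message function is left untranslated:

```cpp
#include <string>

// Hypothetical status codes; the real enumeration has many more.
enum class Status { Good, ExpectedDigits, UnexpectedCharacter };

struct Token
{
    bool error {false};
    std::string text;
};

class Parser
{
public:
    // Toy parse routine: accepts a string of digits only.
    Token parseNumber(const std::string &input)
    {
        Token token;
        m_errorStatus = Status::Good;
        if (input.empty()) {
            setError(token, Status::ExpectedDigits);
            return token;
        }
        for (char c : input) {
            if (c < '0' || c > '9') {
                setError(token, Status::UnexpectedCharacter);
                return token;
            }
        }
        token.text = input;
        return token;
    }

    // Access function used by callers when the token has an error.
    Status errorStatus() const { return m_errorStatus; }
private:
    // The token is only marked as an error; the code is kept in the parser.
    void setError(Token &token, Status status)
    {
        token.error = true;
        m_errorStatus = status;
    }

    Status m_errorStatus {Status::Good};
};

// The caller converts the status code to a message only when needed.
std::string statusMessage(Status status)
{
    switch (status) {
    case Status::Good: return "";
    case Status::ExpectedDigits: return "expected digits";
    case Status::UnexpectedCharacter: return "unexpected character";
    }
    return "";
}
```

With this split, the parser itself has no need for message strings or translation functions.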

[branch err-msgs commit 84c52f0e9b]

Saturday, September 20, 2014

Error Status And Message Refactoring

The next effort is to replace the Qt classes and Qt dependencies with STL classes in the lower level non-GUI related classes (Token, Parser, Translator, Recreator, RPN List, Error List, Table, and Dictionary). Besides using Qt classes for some of their members, the Token and Parser classes also use the locale translate function. These are used for error messages. (The Table class also uses the translate function for error messages, but this class will be handled separately, and may not be necessary when this class is redesigned.)

In order to remove the dependency on the translate function, the Token, Parser, RPN List, and Error List classes will be modified to only use status codes instead of holding error messages. It will be the responsibility of the user of these classes to convert the status code to an error message when needed. The Translator class is the user of the Token and Parser classes and it will pass the status code along in the RPN list that it returns.

Currently, the Token class contains a static member function for converting a status code to an error message. There is no reason for this function, along with the status enumeration, to be part of the Token class (which doesn't have a status member variable).

The Parser class does not currently use status codes to return errors. Instead, it sets the token type to Error and the string member to the error message. This mechanism needs to be changed so that the Parser uses status codes. This will remove the dependency on the translate function.

The first step was to move the Status enumeration from the Token class to the main application header file, since the Status enumeration shouldn't be part of the Token class. This main header file does not have any dependencies on Qt. The error status and error message refactoring will take place on the new err-msgs branch. The goal of this branch is to remove direct handling of error messages by these lower classes.

[branch err-msgs commit 837f085074]

Thursday, September 18, 2014

End of the Initial C++11 Changes

First, a comment about the single remaining naked new and delete operations in the constant string dictionary mentioned in the last post. An attempt was made to use a vector of standard unique pointers, but it appears that the QString and QVector classes don't play nice with std::unique_ptr (or perhaps the problem is unrelated to the Qt classes). This issue will be revisited when the dictionaries are transitioned to STL classes.

To end the initial C++11 transition, a number of additional changes were made. Each was minor, though there were many of them. These included:

Replacing all uses of the untyped NULL macro with the C++11 nullptr typed null pointer.
Replacing tests against the untyped NULL macro with testing the pointer directly as described recently (though for QString instances, the isNull() function was required to check if the instance contained a null string, which is not the same as an empty string).
Replacing unnamed enumerators used to define integer constants with C++11 constexpr declarations.
Removing empty constructors and destructors (the compiler generates these by default).
Moving empty constructors that have only member initializations to the header file.
Removing unnecessary include statements (to no longer used Qt classes).
Changing to the C++11 universal initializer list syntax throughout (except for when a specific constructor needs to be called, like giving a size to a container, or with a reference variable, which is apparently not allowed to be initialized this way).
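A few of the items above can be illustrated with a short sketch. All of the names here (MaxDepth, Node, and the functions) are made up for the example:

```cpp
#include <cstddef>
#include <vector>

// Integer constant previously written as: enum { MaxDepth = 8 };
constexpr int MaxDepth {8};

struct Node
{
    Node *next {nullptr};  // typed null pointer instead of the NULL macro
    int value {0};
};

// The pointer is tested directly instead of being compared against NULL.
int valueOrDefault(const Node *node)
{
    return node ? node->value : -1;
}

// Parentheses call the constructor taking a size (MaxDepth elements).
std::size_t sizedCount()
{
    std::vector<int> values(MaxDepth);
    return values.size();
}

// Braces select the initializer list constructor instead (one element with
// the value MaxDepth), which is why the size case keeps the parentheses.
std::size_t listCount()
{
    std::vector<int> values {MaxDepth};
    return values.size();
}
```

The last two functions show why a container size is one of the exceptions to using the brace syntax everywhere.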

With the conclusion of the initial C++11 changes, the cpp11 branch was merged into the develop branch and deleted. A new branch will be created for the next set of C++11 related changes.

[branch cpp11 commit 299f71ab5c]
[branch develop merge commit 53175d69b5]

Tuesday, September 16, 2014

Unique Smart Pointers

Another C++11 STL smart pointer class is the unique pointer (std::unique_ptr). This smart pointer is for a single scope and deletes its resource when it goes out of scope (like when a function returns). If used for a class member variable, it deletes its resource when the class instance is deleted or goes out of scope.

Most of the rest of the allocated resources were changed to these unique pointers. Two of these were in local blocks of functions. The rest were members of various classes. For all the classes involved, after the delete operators were removed from the destructors, there was no code left, so the destructors were removed. A default destructor is now generated, which calls the destructors for the unique pointers, causing their resources to be deleted.
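The effect on a class can be sketched like this. The Dictionary and ProgramUnit names are placeholders, and the live instance counter exists only so the automatic deletion can be observed:

```cpp
#include <memory>

// Placeholder resource that counts its live instances.
struct Dictionary
{
    static int liveCount;
    Dictionary() { ++liveCount; }
    ~Dictionary() { --liveCount; }
};
int Dictionary::liveCount = 0;

class ProgramUnit
{
public:
    // C++11 has no std::make_unique, so new appears here, but ownership
    // is handed to the unique pointer immediately.
    ProgramUnit() : m_dictionary {new Dictionary} {}

    // No destructor is declared: the generated one destroys the unique
    // pointer member, which deletes the dictionary.
private:
    std::unique_ptr<Dictionary> m_dictionary;
};
```

A side effect worth noting is that a class with a unique pointer member can no longer be copied by default, only moved.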

One drawback to using unique pointers is that forward declarations alone were no longer sufficient in the header files for the classes involved. Forward declarations could previously be used since only a pointer to the class was used. However, with unique pointers and compiler-generated destructors, the complete class definition is needed wherever the destructor is generated (so that the resource can be deleted). Therefore, the forward declarations were replaced with include statements for the headers of the classes.

These changes covered all the remaining naked new and delete operations except for one. The remaining naked resource is the set of string (QString) pointers kept in a vector (QVector) for the constant string information used in the constant string dictionary (that holds the constant strings in the BASIC program). There is probably a better way to implement this, so it will be handled separately.

[branch cpp11 commit 95c680e107]

Sunday, September 14, 2014

Token – With Shared Pointers

Fortunately there were no major problems this time around, but quite some time was required to go through each translator function to make sure token pointers were copied or transferred (moved) appropriately. The token pointer alias temporarily set to a plain pointer was changed to a shared pointer.

By using shared pointers, it is no longer necessary to track the use of tokens, keeping them when they are still used, and deleting them when they are no longer needed. When a shared token pointer goes out of scope or is in a container that gets deleted, it will be automatically deleted if it is no longer used. This also eliminates the need to mark closing parentheses tokens as used or unused and doing special handling to delete parentheses tokens that are in the first and last token of items on the done stack.

Only two minor problems occurred with the changes. The first was in the LET translate routine where the hidden option sub-code (indicating the LET keyword is not present) was set in the assignment token after the token had been moved when appended to the RPN output list. This was resolved by setting the sub-code before appending. The other was that two PRINT statement errors were reversed due to a check for a null token pointer being changed incorrectly.
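The difference between copying and moving a shared pointer, and the reason the sub-code had to be set before appending, can be shown with a small sketch. The Token structure, its hidden flag, and the appendHidden() function are simplified assumptions, not the project's code:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Token
{
    explicit Token(std::string text) : text(std::move(text)) {}
    std::string text;
    bool hidden {false};  // stands in for the hidden option sub-code
};

using TokenPtr = std::shared_ptr<Token>;
using RpnOutput = std::vector<TokenPtr>;

// The flag is set before the pointer is moved into the output list. After
// the move, the local pointer is empty and can no longer reach the token.
void appendHidden(RpnOutput &output, TokenPtr token)
{
    token->hidden = true;
    output.push_back(std::move(token));
}
```

Copying a shared pointer adds a user of the token, while moving one transfers the use and leaves the source pointer null. The token itself is deleted when the last user goes away.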

[branch cpp11 commit fe37432066]

Saturday, September 13, 2014

Token – Shared Pointer Preparation

The first attempt to change token pointers to shared pointers failed (after the change, the code had many memory errors and crashes). The problems may have been caused by the various containers holding token values being passed as pointers (RPN items in the RPN output list, and items on the translator hold and done stacks). These containers have now been updated so that their contents are destructed properly and automatically.

The change to shared pointers is now being approached in small steps. The next step was to replace all references to Token* (pointers to tokens) with the alias TokenPtr:

using TokenPtr = Token *;

This will allow for an easy change to shared pointers. The changes made during the previous failed attempt were extensive, and it was hard to determine exactly which changes caused the problems. So the alias change will be separate from the shared pointer change. The alias change was made first and thoroughly tested (using the memory test script).

Another preparatory change was to pass a constant reference to a token pointer to functions that don't modify the token passed. This doesn't have much effect for simple pointers, but for shared pointers, it will prevent a copy from being made of the pointer (which increments its use count, only for it to be decremented when the argument goes out of scope at the end of the function).
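The effect on the use count can be seen in a small sketch (shown here with the alias already set to a shared pointer; the two functions exist only to report the count seen inside the call):

```cpp
#include <memory>

struct Token
{
    int code {0};
};

using TokenPtr = std::shared_ptr<Token>;

// By value: the argument is a copy of the caller's pointer, so the use
// count is one higher for the duration of the call.
long useCountByValue(TokenPtr token)
{
    return token.use_count();
}

// By constant reference: no copy is made and the use count is unchanged.
long useCountByReference(const TokenPtr &token)
{
    return token.use_count();
}
```

With a single owner in the caller, the first function reports a count of two and the second reports one.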

The use of the NULL definition for checking if a token pointer is unset or to assign it to an unset value was changed to the C++11 nullptr, an actual null pointer. For token pointers initialized to a null pointer, the {} C++11 initializer syntax was used.

[branch cpp11 commit 233a8f4017]

Wednesday, September 10, 2014

Translator – Done Stack As Standard Stack

There were no surprises with changing the done stack to a standard stack, which went like that of the hold stack. The Done Stack class was replaced with a single alias to std::stack and the functions were changed accordingly.

With the change of the push calls to emplace calls, a constructor was required in the Done Item class. The push function contained arguments for the RPN item pointer, first and last token pointers, where the first and last pointers were optional (defaulting to a null pointer). The use patterns for the push function were:

RPN item pointer only
RPN item pointer and last token pointer
RPN item pointer, first and last token pointers

A similar constructor with the three arguments and the same optional arguments could have been added, but this would continue to require a null pointer for the middle use pattern. The first and last token pointers could have been switched, but this could lead to confusion. Instead, three specific constructors were added for the above use patterns with no optional arguments.
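A sketch of what the three constructors might look like, with placeholder RpnItem and Token structures (tokens are still plain pointers at this stage, so they are shown that way):

```cpp
#include <memory>
#include <stack>
#include <utility>

struct RpnItem { int code {0}; };
struct Token { int column {0}; };

using RpnItemPtr = std::shared_ptr<RpnItem>;

struct DoneItem
{
    // RPN item pointer only
    explicit DoneItem(RpnItemPtr rpnItem) : rpnItem {std::move(rpnItem)} {}

    // RPN item pointer and last token pointer
    DoneItem(RpnItemPtr rpnItem, Token *last) :
        rpnItem {std::move(rpnItem)}, last {last} {}

    // RPN item pointer, first and last token pointers
    DoneItem(RpnItemPtr rpnItem, Token *first, Token *last) :
        rpnItem {std::move(rpnItem)}, first {first}, last {last} {}

    RpnItemPtr rpnItem;
    Token *first {nullptr};
    Token *last {nullptr};
};

// The stack class is reduced to an alias.
using DoneStack = std::stack<DoneItem>;
```

Each emplace call forwards its arguments to whichever constructor matches, so none of the callers needs to pass a placeholder null pointer.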

[branch cpp11 commit ca85a259fa]

Tuesday, September 9, 2014

Translator – Done Stack Realignment

The done stack of the translator is used for temporarily holding RPN items appended to the RPN output list to be consumed as operands by operators, functions and commands. This will be the next stack changed to std::stack. But first some realignment was required to move functionality from the Done Stack class to the Done Item structure. Like the Hold Stack, the Done Stack also has QStack as its base class with functionality added. Unlike the Hold Stack class, this added functionality did more than just add stack convenience functions.

Each item on the done stack contains a shared pointer to the RPN item plus token pointers to the first and last token of the expression of the RPN item, which may contain open and closing parentheses tokens that will eventually be dropped (in most cases). These are for reporting errors to an entire expression including open and closing parentheses. The parentheses tokens are deleted when they are no longer needed. A closing parentheses token is not deleted for extra parentheses that need to be recreated.

The non-stack functionality added was for handling these first and last tokens. The added pop function returned only the RPN item of the done item popped from the stack, but also deleted the first and last tokens if they contained parentheses (checking if unused for the last token). Similarly, the drop function also deleted the first and last tokens if they contained parentheses. These called functions in the Done Item structure to do the check and delete of the tokens. The final non-stack function replaced the first and last tokens of the done item on top of the stack.

This non-stack functionality was moved to the Done Item structure. The replace top first last function of the done stack called the replace first and replace last functions of the done item (the only caller to these functions). Each replace function deleted the token being replaced if a parentheses token. These functions were combined into a single replace first last function, which is now called directly for the top item on the stack and the done stack function was removed.

The added pop and drop functions both deleted the first and last tokens if parentheses. These functions also decreased the size of the stack (by a stack pop or resize down one element). This would cause the destructor of the done item to be called. No destructor was declared, but since the RPN item was changed to a shared pointer, the compiler generated a default destructor to call the destructor of the shared pointer. The last and first token pointers were plain old data, so they were not affected. The delete parentheses calls were removed from the pop and drop functions, and added to a new done item destructor (the RPN item destructor is still handled automatically). However, this caused a curious problem in the pop function (a multiple deleted closing parentheses token), which now contained the single statement:

return QStack::pop().rpnItem;

The pop() returns a copy of the done item on top of the stack. The item on top of the stack is then removed, which calls the new done item destructor for the item (deleting a closing parentheses in the last token pointer). The copy is used to get the RPN item, and then the copy goes out of scope, which calls the done item destructor again. Since the copy has the same last parentheses token as the original, the token gets erroneously deleted twice. The copy was also being made previously, but since the [default] destructor did not affect the first and last tokens, no double delete occurred. This was corrected by replacing the above statement with:

RpnItemPtr rpnItem = top().rpnItem;
drop();
return rpnItem;

Now a temporary of only the RPN item is made. The drop() call resizes the stack down one item, causing the new destructor to be called, deleting the last token holding an unused closing parentheses. The RPN item is then returned. This also matches the operations that will be required for std::stack. These specialized delete parentheses token functions won't be needed once shared pointers are used for token pointers.

[branch cpp11 commit c23603ea2e]

Monday, September 8, 2014

Translator – Hold Stack As Standard Stack

Continuing with the transition to STL classes and for the preparation of changing token pointers to a shared smart pointer, the hold stack, which holds token pointers, was changed from a QStack to a std::stack. This stack contains Hold Item structures, each with a pointer to a token and a pointer to the token of the first operand if the token is an operator. The Hold Stack class was created to add two functions to the QStack used as its base class.

The added push function contained arguments for the pointer to a token and an optional pointer to the first operand token. The stack was resized up by one element and the new top element was set to the token pointers provided. This was an attempt to optimize the push operation. The default constructor would be called for the stack item added by the resize, but since the Hold Item structure only contained plain old data members (pointers), there was no constructor.

The added drop function resized the stack down one element. The destructor for the element dropped would be called, but again, since the element was a simple structure, there was no destructor.

The functions of std::stack work differently. With C++11, the std::stack gains the emplace function, which as previously described, does an in-place construction of the item being pushed. This is exactly what the added push function was implemented to achieve, so all push calls were changed to emplace although this required that the Hold Item have a constructor with arguments for the token pointer and optional first token pointer (like the added push function).

The pop function of the std::stack class does not return the popped item like with QStack; it only pops the top item off of the stack (there is no return value). Therefore, all pop calls had to be changed to top calls (to get the value) followed by a pop call. However, the pop function is identical to the added drop function, so the drop calls were changed to pop calls. Finally, the isEmpty() calls were changed to the standard equivalent empty() and the Hold Stack class was removed.
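The resulting usage pattern looks roughly like this sketch. The Hold Item is simplified, and countFirstOperands() is a made-up function that exists only to exercise the emplace, top, pop and empty calls:

```cpp
#include <stack>

struct Token { int code {0}; };

struct HoldItem
{
    // Constructor required by emplace; the first operand token is optional.
    HoldItem(Token *token, Token *first = nullptr) :
        token {token}, first {first} {}
    Token *token;
    Token *first;
};

using HoldStack = std::stack<HoldItem>;

// Empties the stack, returning how many items had a first operand token.
int countFirstOperands(HoldStack &stack)
{
    int count {0};
    while (!stack.empty()) {          // empty() replaces isEmpty()
        HoldItem item = stack.top();  // top() reads the item...
        stack.pop();                  // ...and pop() removes it (no value)
        if (item.first) {
            ++count;
        }
    }
    return count;
}
```

Items are added with calls like stack.emplace(&token) or stack.emplace(&token, &first), which construct the Hold Item in place on the stack.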

An attempt was made to modify the Hold Stack class to combine the top and pop functions, but the resulting code was slightly larger and slower than using std::stack directly (determined by using a small test program with many iterations). This is probably why these functions were implemented the way they were. Therefore, these functions will be used as is.

The std::stack class is an adapter for the underlying class, which by default is the std::deque ("deck") class that works like both a vector (array) and doubly linked list (sounds expensive). The std::stack class can work with any class that supports empty, size, back, push back and pop back. The class is specified as the second template parameter after the stack element type. Using a test program, a std::vector and std::list were also tried, but the default std::deque class was the fastest (though not the smallest). The std::forward_list class, a simple singly linked list, was also tried as a stack, though it can't be used with the std::stack class. And while std::forward_list is the simplest, it was also the slowest (by over four times). Therefore, the default std::stack will be used.
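Selecting the underlying container looks like the sketch below. The alias names and the pushAndSum() function are illustrative only; this is not the timing program that was used, and it makes no claims about speed:

```cpp
#include <list>
#include <stack>
#include <vector>

using DequeStack = std::stack<int>;                     // std::deque by default
using VectorStack = std::stack<int, std::vector<int>>;  // second parameter
using ListStack = std::stack<int, std::list<int>>;

// Pushes 1 through count and pops everything back off, summing the values.
// Works with any of the stack types above since the interface is the same.
template <typename Stack>
int pushAndSum(int count)
{
    Stack stack;
    for (int i = 1; i <= count; ++i) {
        stack.push(i);
    }
    int sum {0};
    while (!stack.empty()) {
        sum += stack.top();
        stack.pop();
    }
    return sum;
}
```

Because only the alias changes, different containers can be compared without touching the code that uses the stack.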

[branch cpp11 commit e4d568e43f]

Sunday, September 7, 2014

RPN List – Pass By Value

The decode and translate routines now return RPN lists by value, either from a local variable or member variable as described in recent posts. For the functions that had an RPN list pointer as an argument, the argument was changed to a constant reference (constant because the RPN list is not modified). The callers that receive RPN lists are no longer responsible for deleting the instance, which will occur automatically when the variable goes out of scope, so all the delete calls were removed.

One issue that was discovered during compiling of functions taking a constant reference to an RPN list was an error when using the code size access function. This function only returned the code size member and did not modify the object, so it should have been declared as a constant function.

The translate input routine in the Tester class previously returned a null pointer to indicate an error. Since a pointer is no longer returned, the RPN list is cleared and an empty list is returned to indicate an error. An empty list could also mean an empty BASIC line (which is allowed), but empty lines are already being ignored and are not translated during testing.

The null pointer initializer for the output list member was removed from the constructor of the Translator class since it is no longer a pointer - it is now initialized using the default constructor for the RPN List class. All uses accessing the pointer member were changed to regular member (dot) operators.

[branch cpp11 commit 5b9245cf0a]

RPN List – Default Move Not Sufficient

The RPN list class contains other members in addition to the list including the code size, error column, error length and error message members. Since the list member is a standard list with a move operation, it is cleared by the default generated move constructor when calling the standard move in the return statement of the translate routine. All but the error message are integers with no default move operation so these members are not affected. The error message is currently a QString, which also does not have a move operation (the Qt5 classes do) and was also not affected.

When a new RPN list instance was allocated at the beginning of the translate routine, these other members were initialized according to the constructor (though only the error members were initialized). Now, at the return statement, the list of the RPN list member is cleared from the default move, but with no move operations for the other members, these members retained their values.

This was detected while running the tests. Once a line with an error was reported, that same error was reported for every line afterward (at least until there was a different error). This occurred because the error column member (used to test if the RPN list has an error) was not initialized for the next translation.

One solution would be to reset these other members at the beginning of the routine, where the allocation was. The better solution of course is to add move constructor and assignment functions. The move constructor initializes each of the members from the other RPN list instance. In the function body, each of the members of the other RPN list instance is set to an initialized state. The list and error message members are cleared and the integer variables are set to appropriate default values. The move assignment swaps all the members before returning a reference to the instance.
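
A minimal sketch of the two functions just described follows. The class, its member names and the use of -1 as the no-error column are assumptions made for illustration, and std::string stands in for the QString error message:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-in for the RPN List class.
class ItemList
{
public:
    ItemList() = default;

    // Move constructor: take the other instance's values, then put the
    // other instance back into its initial state.
    ItemList(ItemList &&other) :
        m_list{std::move(other.m_list)},
        m_codeSize{other.m_codeSize},
        m_errorColumn{other.m_errorColumn},
        m_errorMessage{std::move(other.m_errorMessage)}
    {
        other.m_list.clear();
        other.m_codeSize = 0;
        other.m_errorColumn = -1;
        other.m_errorMessage.clear();
    }

    // Move assignment: swap every member with the other instance.
    ItemList &operator=(ItemList &&other)
    {
        std::swap(m_list, other.m_list);
        std::swap(m_codeSize, other.m_codeSize);
        std::swap(m_errorColumn, other.m_errorColumn);
        std::swap(m_errorMessage, other.m_errorMessage);
        return *this;
    }

    void append(int item) { m_list.push_back(item); ++m_codeSize; }
    void setError(int column, const std::string &message)
    {
        m_errorColumn = column;
        m_errorMessage = message;
    }
    bool hasError() const { return m_errorColumn != -1; }
    bool empty() const { return m_list.empty(); }
    int codeSize() const { return m_codeSize; }

private:
    std::list<int> m_list;
    int m_codeSize {0};
    int m_errorColumn {-1};          // -1 means no error (assumed convention)
    std::string m_errorMessage;
};

void demoMoveResetsSource()
{
    ItemList source;
    source.append(1);
    source.setError(3, "expected operand");

    ItemList target(std::move(source));
    assert(target.hasError());
    assert(!source.hasError());      // the stale error no longer lingers
    assert(source.empty() && source.codeSize() == 0);
}
```

With only the default generated move constructor, the integer members of the source would keep their values, which is the lingering error described above.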

RPN List – Return Non-Local Variable

Both the decode and translate routines generate an RPN list and, instead of returning a pointer to the list, will return the list by value, preferably by a move operation. The decode routine (of the Program Model class) contains a local RPN list variable, and so the value of the variable is returned by a move operation since the variable is going out of scope. However, the translate routine builds the RPN list in a member variable of the Translator class (so all translator functions have access to the list). If this variable is returned by value, the value is copied because the variable is not going out of scope.

With the pointer implementation when a pointer was returned, since the translate routine no longer needed the list, it made a temporary copy of the member list pointer, set the member pointer to null and returned the temporary copy of the pointer (this transferred the list to the caller who was then responsible for deleting the list):

RpnList *output = m_output;
m_output = NULL;
return output;

The Translator class RPN list member will be the actual list, not a pointer to a list. When returning from the translate routine, the RPN list is no longer needed, so it can be transferred to the caller with a move operation. This move does not occur by just returning the variable because it is not going out of scope. The move operation can be forced by using the standard move (which resets the member variable to an empty list and transfers its value to the receiving variable of the caller):

return std::move(m_output);
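
A self-contained sketch of this pattern is shown here. The Translator class below is a hypothetical stand-in with an int element type. The check that the member is empty afterward reflects what the common standard library implementations do; the standard itself only promises that a moved-from list is left in a valid state:

```cpp
#include <cassert>
#include <list>
#include <utility>

// Hypothetical stand-in for the Translator class.
class Translator
{
public:
    std::list<int> translate(int count)
    {
        for (int i = 0; i < count; ++i) {
            m_output.push_back(i);
        }
        // m_output is a member, so it is not going out of scope here and a
        // plain 'return m_output;' would copy it.  std::move requests a move.
        return std::move(m_output);
    }

    bool outputEmpty() const { return m_output.empty(); }

private:
    std::list<int> m_output;
};

void demoReturnMember()
{
    Translator translator;
    std::list<int> first {translator.translate(3)};
    assert(first.size() == 3);
    assert(translator.outputEmpty());   // member was transferred, not copied

    // The member is ready for the next translation.
    std::list<int> second {translator.translate(2)};
    assert(second.size() == 2);
}
```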

By default, C++ provides a default constructor, copy constructor, copy assignment, move constructor, move assignment and destructor for a class if a program uses them. If a constructor is declared, a default constructor is not generated. If a copy operation, move operation or destructor is declared, then no default copy operation, move operation or destructor is generated.

Defining a destructor indicates that something other than the default operation needs to be done for a copy or move (for example, dealing with an allocated resource). However, these rules are not completely enforced. For backward compatibility, default copy operations are still generated when a destructor is defined, and GCC does so without a warning.

The upstream problem mentioned a few posts ago http://interactivebasiccompilerproject.blogspot.com/2014/09/rpn-list-base-class-to-member.html was thought to have occurred because the destructor of the list base class was not being called, since the above return move statement was not clearing the list in the member variable. The problem was finally identified as being caused by the destructor declaration (for a destructor that wasn't doing anything), which was then removed. With the destructor, there were no move operations, so the standard move performed a copy. Removing the destructor allowed the default move operations to be generated. A default move operation can also be requested explicitly using the new C++11 syntax:

ClassName(ClassName &&other) = default;

This solved the issue of the member list not being cleared. However, the default move operations were not sufficient as there was another problem...
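
The effect of a declared destructor can be reproduced with a small experiment. The three structures below are invented for illustration. The copy case is guaranteed by the language rules, while the empty source after a move is what common standard library implementations produce:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <utility>

// A user-declared destructor (even an empty one) suppresses the implicit
// move operations, so std::move falls back to the copy constructor.
struct WithDestructor
{
    std::list<int> items;
    ~WithDestructor() {}
};

// No destructor declared: the implicit move constructor is generated.
struct WithoutDestructor
{
    std::list<int> items;
};

// Destructor declared, but the move constructor is explicitly defaulted.
struct DefaultedMove
{
    std::list<int> items;
    DefaultedMove() = default;
    DefaultedMove(DefaultedMove &&other) = default;
    ~DefaultedMove() {}
};

// Returns how many items remain in the source after constructing a new
// instance from std::move(source).
template <typename T>
std::size_t itemsLeftAfterMove()
{
    T source;
    source.items = {1, 2, 3};
    T target(std::move(source));
    assert(target.items.size() == 3);
    return source.items.size();
}

void demoDestructorSuppressesMove()
{
    assert(itemsLeftAfterMove<WithDestructor>() == 3);     // copied
    assert(itemsLeftAfterMove<WithoutDestructor>() == 0);  // moved
    assert(itemsLeftAfterMove<DefaultedMove>() == 0);      // moved
}
```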

C++11 Move Constructors and Assignments

The actual instances of RPN lists will be returned from functions instead of a pointer to an allocated instance that requires the receiver to delete the instance when it is done with it (or a memory leak occurs). The new C++11 move constructors and assignments feature makes this efficient.

When a function is called that has a return value, temporary memory (usually on the stack) needs to be set aside to hold the return value, because the variable holding the return value in the called function goes out of scope and will be destroyed (destructor called) if it is not a plain old data (POD) type. Prior to C++11, the return value was copied from the temporary into a receiving variable using a copy constructor if not a POD type. If the variable was a large container like a list or vector, this was not efficient. This is why RPN lists were returned by a pointer.

However, with a C++11 move constructor, the return value is moved directly into the variable being constructed or assigned, each having a slightly different mechanism. For a container class like a list or vector, the only thing that is moved is the pointer to the data, not the actual data itself. Consider this example of a function that returns a list and code that calls the function:

std::list<SomeType> someFunction()
{
    std::list<SomeType> localList;
    // ... list created here ...
    return localList;
}
// ...
std::list<SomeType> list {someFunction()};     // caller

The new C++11 initializer syntax is used to show that the move constructor is called (but the old assignment syntax does the same thing). Without a move constructor, the local list variable destructor would be called after the return value was copied to a temporary. With the move constructor, the return value is moved directly into the receiving variable and the local variable is reset, and when the destructor for the local is called, it has nothing to do. With a container class, minimal values are copied (essentially a pointer with a little other possible housekeeping like a size). A move constructor has this basic layout:

ClassName(ClassName &&other) :
    member1{std::move(other.member1)},
    member2{std::move(other.member2)},
    ...
{
    other.member1 = {};
    other.member2 = {};
    ...
}

First the members of the receiving variable are initialized with another instance (the return variable). In the body of the move constructor, the other members are initialized to defaults. If one of the members was a pointer to allocated data, the pointer is transferred, and the pointer in the return variable that is about to go out of scope is set to null, so when its destructor is called, it has nothing to delete.
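
The pointer case described in the last sentence can be sketched with a small class that owns an allocated array. The Buffer class is hypothetical and exists only to illustrate the transfer:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class owning an allocated array of integers.
class Buffer
{
public:
    explicit Buffer(std::size_t size) :
        m_data{new int[size]()},
        m_size{size}
    {
    }

    // Move constructor: take over the allocation and leave the other
    // instance with nothing to delete.
    Buffer(Buffer &&other) noexcept :
        m_data{other.m_data},
        m_size{other.m_size}
    {
        other.m_data = nullptr;
        other.m_size = 0;
    }

    // Copying is disabled to keep the sketch short.
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    ~Buffer() { delete[] m_data; }   // deleting a null pointer does nothing

    const int *data() const { return m_data; }
    std::size_t size() const { return m_size; }

private:
    int *m_data;
    std::size_t m_size;
};

void demoPointerTransfer()
{
    Buffer source(8);
    const int *allocation = source.data();

    Buffer target(std::move(source));
    assert(target.data() == allocation);   // same allocation, no copy
    assert(source.data() == nullptr);      // nothing left to delete
}
```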

A move assignment is similar, but is used when the receiving variable (that already exists) is being assigned a return value, where the caller looks like this:

std::list<SomeType> list;
...
list = someFunction();

In this case, the value in the receiving variable must be destroyed first and then the return value moved in. Without a move assignment, this is handled by the copy assignment, which destroys the old value and copies the new value. The destructor for the local variable is then called. A move assignment has this basic layout:

ClassName &operator=(ClassName &&other)
{
    std::swap(member1, other.member1);
    std::swap(member2, other.member2);
    ...
    return *this;
}

For each member, the values of the members are swapped. The standard swap function does this efficiently with move operations where a member is moved into a temporary variable, the other member is moved into the current member, and the temporary is moved into the other member. No copies are made for any of the moves. Here, the other instance is the local variable of the function that is going out of scope. This moves the value of the receiving variable into the local variable, which then gets destroyed. As with any assignment, a reference to the object is returned.
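
The three moves performed by a swap can be written out by hand. The swapByMoves function below is only an illustration of the idea (real code should call std::swap), and the Counted class is a hypothetical type that counts copies so that the absence of copying can be checked:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type that counts how many times it is copied.
struct Counted
{
    static int copies;
    std::string text;

    explicit Counted(std::string value) : text(std::move(value)) {}
    Counted(const Counted &other) : text(other.text) { ++copies; }
    Counted(Counted &&other) noexcept : text(std::move(other.text)) {}
    Counted &operator=(const Counted &other)
    {
        text = other.text;
        ++copies;
        return *this;
    }
    Counted &operator=(Counted &&other) noexcept
    {
        text = std::move(other.text);
        return *this;
    }
};

int Counted::copies = 0;

// A swap written as three moves: into a temporary, across, and back.
template <typename T>
void swapByMoves(T &a, T &b)
{
    T temporary(std::move(a));
    a = std::move(b);
    b = std::move(temporary);
}

void demoSwapByMoves()
{
    Counted left("left");
    Counted right("right");
    swapByMoves(left, right);
    assert(left.text == "right");
    assert(right.text == "left");
    assert(Counted::copies == 0);   // only moves were used
}
```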

The above descriptions may not be exactly what occurs internally, but it is my interpretation and explains the basic concept behind the move mechanisms. One final mention is that if the object does not have a move constructor or assignment, then the old copy through a temporary mechanism (copy constructor) is used. There are reasons why an object may not have move operations, which will be covered in the next post.

One last note. The implicit sharing feature of the container classes of Qt is another mechanism to make returning instances more efficient. This feature acts like a shared smart pointer. The return value is copied to the temporary, and the use count is incremented. When the local variable is destroyed, the use count is decremented and nothing needs to be destroyed. Shared smart pointers are another mechanism (as already used for RPN item pointers).
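
The use count behavior can be observed directly with std::shared_ptr. This sketch shows only the shared pointer mechanism, not the Qt implicit sharing itself, and the names are made up for illustration:

```cpp
#include <cassert>
#include <memory>
#include <vector>

using SharedVector = std::shared_ptr<std::vector<int>>;

// Makes a copy of the shared pointer and reports the use count while the
// copy is alive.  Only the pointer is copied, never the vector's data.
long useCountWhileCopied(const SharedVector &original)
{
    SharedVector copy {original};
    assert(copy.get() == original.get());   // both refer to the same data
    return original.use_count();
}   // 'copy' is destroyed here and the use count drops again

void demoUseCount()
{
    SharedVector data {std::make_shared<std::vector<int>>(1000, 7)};
    assert(data.use_count() == 1);
    assert(useCountWhileCopied(data) == 2);
    assert(data.use_count() == 1);          // the data itself was untouched
    assert(data->size() == 1000);
}
```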

Saturday, September 6, 2014

RPN List – To Standard List

The RPN List class list member was changed from QList to std::list. The largest part of this change is due to standard lists not being indexed by a simple integer like a QList (which is implemented as an array internally), but with iterators as the standard list is a double linked list internally.

The significant parts of this change include iterating over the items in the list (changing regular indexed for loops to C++11 range-for loops) where the get item at index function was removed and access to the list's constant begin and end iterators was added; updating the RPN List class access functions for adding to or inserting into the list; and updating the INPUT command translate function to handle an iterator for inserting input parse codes into the list instead of an index.
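
The two kinds of changes (range-for iteration and inserting at an iterator) look roughly like the following sketch, which uses a list of strings rather than the project's RPN items:

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Joins the items with spaces using a range-for loop, which uses the
// list's begin and end iterators instead of an integer index.
std::string toText(const std::list<std::string> &items)
{
    std::string text;
    bool first = true;
    for (const std::string &item : items) {
        if (!first) {
            text += ' ';
        }
        first = false;
        text += item;
    }
    return text;
}

void demoIteratorInsert()
{
    std::list<std::string> items {"A", "B", "D"};

    // An iterator marks the insertion point; std::list::insert places the
    // new item before it and returns an iterator to the inserted item.
    auto position = std::prev(items.end());
    auto inserted = items.insert(position, "C");

    assert(*inserted == "C");
    assert(*position == "D");    // existing iterators remain valid
    assert(toText(items) == "A B C D");
}
```

Unlike an index into an array, an iterator into a std::list stays valid when other items are inserted, so nothing needs to be adjusted after the insert.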

[branch cpp11 commit 9bca14b4b5]


RPN List – Base Class To Member

The next step is to replace the use of pointers to RPN lists. Currently, an RPN list is allocated in the translator and decoder, a pointer to which is returned. It is then the caller's responsibility to delete the RPN list. This is the so-called naked new and delete that can be problematic (causing memory leaks if not careful). This could be another use for a shared pointer. However, for this case, there is another, better C++11 solution. This change involves several distinct parts.

During these changes, a problem was found upstream with the RPN List class that was misinterpreted as being caused by using a list class as a base class. The list classes (QList or std::list) were not designed to be used as base classes. The problem was perceived to be due to these classes not having virtual destructors, but this was not the case. A bigger issue is that having the list class as a public base class opens up all of the public list class members for access to users of the RPN List class, which is not good object-oriented design.

Therefore, the first part of the next set of changes was to change from using the list class as a base class to having a list member variable. Access functions to the list member were added, which includes getting the list size, checking if the list is empty, getting the token of the last item, getting an item at an index, and clearing the list. There are already access functions for appending items to the list, comparing lists (equality operators), inserting into the list, and converting the list to text (for test output).

The Translator class also contained access functions to the output list (for the various external translate functions) and these were checked to ensure that they call the RPN List class access functions instead of the list directly. Only the output last token function needed to be modified since it was accessing the list directly.

Finally, the RPN List class destructor was calling the clear function of the list, but this is no longer necessary as this will happen automatically when each member (including the new list member) is destroyed. This was necessary previously because of the problem mentioned above that the base list class destructor was not defined as virtual and therefore was not being called (the RPN List class destructor overrode the list class destructor).

[branch cpp11 commit 32ab971b45]

Thursday, September 4, 2014

Token Text – Standard String

Continuing with the change to the STL, the text function of the Token class was also modified to build the string into a standard string stream. Like with the text function of the RPN List class, the with indexes option argument was not being utilized, so it was removed.

For the Token class, the return value of the text function was changed to a standard string. Callers were modified accordingly for this change. Only one caller remains that expects a QString value, the delete operator function. This function will not be needed once smart (shared) pointers are used for token pointers.

The way the private text operand helper function was used in the text function was changed, making a separate function unnecessary. This helper function surrounded the token text with vertical bars, and only two call locations were left once the with indexes feature was removed. Instead of setting the second string, the string was changed to a flag and at the end of the function, if set, the vertical bars and the token string are added to the string stream.

[branch cpp11 commit 47766746a1]

Wednesday, September 3, 2014

RPN Item – Index Member and Text Function

The next step will be to use smart pointers for RPN list pointers. In keeping with the change to STL classes, the RPN list will be changed from a QList to a std::list. Even though QList is like std::vector internally (an allocated array), a double linked list container is more appropriate because there is one instance so far (the INPUT command) where RPN items are inserted into the list. Inserting into an array requires moving all the elements from the insertion point.

While reviewing the code for these changes, it was noticed that the index member of the RPN item was only being used for test output. This index is set when an RPN item is appended to the list. If an item is inserted into the list, the index of every item after the inserted item needs to be incremented. This index is only output on attached items as a check to make sure the correct item is attached.

Currently, attached RPN items always occur before the item that they are attached to, which means the indexes could be assigned to an item as it is converted to text. Therefore the index of each item will be assigned and temporarily held in an unordered map (RPN item pointer to index) instead of assigning an index when the item is appended (and incremented later for inserts). The index was removed from the RPN Item class.

The text function of the RPN List class was modified to handle the conversion of each item to text using access functions of the RPN Item class so that the indexes can be added (from the temporary map) instead of calling the text function of the RPN item (which would not have the temporary indexes). Since the text function of the RPN Item class is no longer needed, it was removed. Also, the with indexes option argument was not being utilized and so this feature was also removed.

The text function was also modified to build the text of the RPN List into a standard string stream (std::stringstream, which works like an output stream but puts the result into a string). This is easier for building a string than using the standard string, which is feature limited. For now, when the final standard string is obtained from the string stream, it is converted to a QString (since that is how the caller uses it, for now). Also, the text from the token needs to be converted from a QString to a standard string before adding to the string stream.
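
The temporary index map and the string stream can be sketched together as follows. The Item structure, the listText function and the output format are all invented for illustration (the real test output format is not shown here), and std::ostringstream is used in place of std::stringstream since only output is needed:

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an RPN item: some text plus attached items.
struct Item
{
    std::string text;
    std::vector<std::shared_ptr<Item>> attached;
};

using ItemPtr = std::shared_ptr<Item>;

// Converts the list to text, assigning each item an index as it is
// converted.  Attached items are looked up in the temporary map, which
// relies on attached items appearing earlier in the list.
std::string listText(const std::list<ItemPtr> &items)
{
    std::ostringstream stream;
    std::unordered_map<const Item *, int> indexOf;   // item pointer to index
    int index {0};

    for (const ItemPtr &item : items) {
        if (index > 0) {
            stream << ' ';
        }
        indexOf[item.get()] = index++;
        stream << item->text;

        if (!item->attached.empty()) {
            stream << '[';
            bool first = true;
            for (const ItemPtr &attachedItem : item->attached) {
                if (!first) {
                    stream << ',';
                }
                first = false;
                stream << indexOf.at(attachedItem.get()) << ':'
                    << attachedItem->text;
            }
            stream << ']';
        }
    }
    return stream.str();
}

void demoListText()
{
    ItemPtr a {std::make_shared<Item>(Item{"A", {}})};
    ItemPtr b {std::make_shared<Item>(Item{"B", {}})};
    ItemPtr plus {std::make_shared<Item>(Item{"+", {a, b}})};
    std::list<ItemPtr> items {a, b, plus};
    assert(listText(items) == "A B +[0:A,1:B]");
}
```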

[branch cpp11 commit da4fcfbb23]

Monday, September 1, 2014

RPN Item Pointers As Shared Pointers

An attempt was made to replace all the token pointers with a shared pointer class (QSharedPointer). A lot of changes were required that created many compile issues to correct. Worse, when running the first time, there were many crashes that were not easy to resolve. I believe some of the problems were related to the RPN items (that hold token pointers) stored in the RPN output list. So, these changes were temporarily stashed.

Instead of starting with the token pointers, the simpler RPN item pointers were changed to use QSharedPointer. This was successful; however, I decided to try using the STL std::shared_ptr class (C++11 only) instead. After some testing, it appears that STL classes are faster than the equivalent Qt classes.

Originally, STL classes were avoided in favor of Qt classes, since this is a Qt based GUI application. With the recent desire to use C++11 and considering that the STL classes were enhanced and optimized to use C++11, STL classes will now be used as much as possible except when dealing specifically with GUI elements requiring Qt classes or when Qt classes have features not available in the STL classes (for example, the STL std::string class does not have a case insensitive comparison, but the QString class does).

To start the changes for the RPN item pointers, all instances of RpnItem* were changed to RpnItemPtr, which was defined as an alias (this using syntax is new to C++11 and is just an easier form of the typedef statement):

using RpnItemPtr = QSharedPointer<RpnItem>;

Many functions that had an RPN item pointer as an argument were changed to a reference to a RpnItemPtr so that the use count of the pointer isn't unnecessarily incremented when a copy is made for the function call argument (and then decremented when the function returns). If the item is copied inside the function, the use count will get incremented. RPN items no longer need to be deleted as this will now be done automatically when they go out of scope or their container is deleted or goes out of scope.
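
The difference in the use count can be checked with a small experiment using std::shared_ptr. The Item structure and function names are hypothetical, and a constant reference is assumed for the argument:

```cpp
#include <cassert>
#include <memory>

struct Item
{
    int code;
};

using ItemPtr = std::shared_ptr<Item>;

// By value: the argument is a copy of the shared pointer, so the use
// count is one higher for the duration of the call.
long useCountByValue(ItemPtr item)
{
    return item.use_count();
}

// By reference: no copy of the shared pointer is made, so the use count
// is unchanged.
long useCountByReference(const ItemPtr &item)
{
    return item.use_count();
}

void demoUseCounts()
{
    ItemPtr item {std::make_shared<Item>(Item{42})};
    assert(useCountByValue(item) == 2);
    assert(useCountByReference(item) == 1);
    assert(item.use_count() == 1);   // back to one after both calls
}
```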

The RPN item constructor was changed to use the new initializer syntax instead of setting the member inside the constructor body. This is more efficient because otherwise, the default constructor is first called for each non-plain member and then a value is copied into the member. The RPN list class also no longer needed a clear function to delete the RPN items (the items get deleted automatically).
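
The extra work done when a member is set in the constructor body can be made visible with a member type that counts its operations. Everything in this sketch (the Tracked type and the two holder classes) is invented for illustration:

```cpp
#include <cassert>

// Hypothetical member type that counts how it is constructed and assigned.
struct Tracked
{
    static int defaultConstructed;
    static int copyConstructed;
    static int copyAssigned;

    Tracked() { ++defaultConstructed; }
    Tracked(const Tracked &) { ++copyConstructed; }
    Tracked &operator=(const Tracked &) { ++copyAssigned; return *this; }

    static void resetCounts()
    {
        defaultConstructed = copyConstructed = copyAssigned = 0;
    }
};

int Tracked::defaultConstructed = 0;
int Tracked::copyConstructed = 0;
int Tracked::copyAssigned = 0;

// Member set in the body: default constructed first, then assigned.
class AssignedInBody
{
public:
    explicit AssignedInBody(const Tracked &member) { m_member = member; }
private:
    Tracked m_member;
};

// Member set by an initializer: constructed directly from the argument.
class Initialized
{
public:
    explicit Initialized(const Tracked &member) : m_member{member} {}
private:
    Tracked m_member;
};

void demoInitializerSyntax()
{
    Tracked source;

    Tracked::resetCounts();
    AssignedInBody assigned(source);
    assert(Tracked::defaultConstructed == 1 && Tracked::copyAssigned == 1);

    Tracked::resetCounts();
    Initialized initialized(source);
    assert(Tracked::defaultConstructed == 0 && Tracked::copyConstructed == 1);
    assert(Tracked::copyAssigned == 0);
}
```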

The attached array (with a count) in the RPN item class was naively implemented as a plain C array, which required naked new and delete operations and was changed to a QList. This change eliminates the need for a count variable (QList maintains its size) and also eliminates the need to delete the attached RPN items (automatic when the RPN item is deleted automatically).

When the RPN item pointer was changed to the STL std::shared_ptr class, the changes with QSharedPointer were put on a branch and the branch abandoned. The change in shared pointer class only required minor changes, which included changing the alias above and adding an alias for RPN item pointer vector:

using RpnItemPtr = std::shared_ptr<RpnItem>;
using RpnItemPtrVector = std::vector<RpnItemPtr>;

The latter alias is used for the attached member, which had just been changed to a QList and was now changed to a std::vector (which is more similar to QList than std::list is, as QList is an allocated array and std::list is a double linked list). Also, the size of STL classes is accessed using the size member function, compared to the count member functions of the Qt classes.

[branch cpp11 commit a2069aae24]
[branch rpnitem-qt-sharedptr commit 7840954d5e]