Sunday, September 7, 2014

RPN List – Pass By Value

The decode and translate routines now return RPN lists by value, either from a local variable or a member variable as described in recent posts.  For the functions that had an RPN list pointer as an argument, the argument was changed to a constant reference (constant because the RPN list is not modified).  Callers that receive an RPN list are no longer responsible for deleting the instance.  Deletion now occurs automatically when the receiving variable goes out of scope, so all the delete calls were removed.

One issue discovered while compiling the functions that take a constant reference to an RPN list was an error when using the code size access function.  This function only returns the code size member and does not modify the object, so it should have been declared as a constant function.
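As a minimal sketch of that compile error and its fix, assuming a stripped-down stand-in for the RPN list class (the names RpnListSketch, append, codeSize and reportSize are hypothetical and are not the project's actual declarations):

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for the RPN list class; only the parts needed here.
class RpnListSketch
{
public:
    // Adds a token and accumulates its (made-up) code size.
    void append(const std::string &token, int size)
    {
        m_list.push_back(token);
        m_codeSize += size;
    }

    // Declared const because it only reads a member.  Without the const
    // qualifier, calling it through a constant reference does not compile.
    int codeSize() const
    {
        return m_codeSize;
    }

private:
    std::list<std::string> m_list;
    int m_codeSize {0};
};

// Takes the list by constant reference: no copy is made and the function
// cannot modify the caller's list.
int reportSize(const RpnListSketch &rpnList)
{
    return rpnList.codeSize();
}
```

Removing the const after codeSize() should reproduce the kind of error described above, since reportSize only has constant access to the list.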

The translate input routine in the Tester class previously returned a null pointer to indicate an error.  Since a pointer is no longer returned, the RPN list is cleared and an empty list is returned to indicate an error.  An empty list could also mean an empty BASIC line (which is allowed), but empty lines are already being ignored and are not translated during testing.
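The empty-list convention could look roughly like the sketch below.  Everything in it is hypothetical (translateSketch, testLine, the Result values and treating '?' as a syntax error are made up for illustration).  It only shows why skipping blank lines first keeps an empty result unambiguous:

```cpp
#include <list>
#include <string>

// Hypothetical translate routine: returns the tokens of the line, or an
// empty list if the line contains a '?' (a pretend syntax error).
std::list<std::string> translateSketch(const std::string &line)
{
    std::list<std::string> output;
    std::string token;
    for (char c : line) {
        if (c == '?') {
            output.clear();     // error: discard what was built so far
            return output;
        }
        if (c == ' ') {
            if (!token.empty()) {
                output.push_back(token);
                token.clear();
            }
        } else {
            token += c;
        }
    }
    if (!token.empty()) {
        output.push_back(token);
    }
    return output;
}

enum class Result { Skipped, Error, Ok };

// Hypothetical tester routine: blank lines are skipped before translating,
// so an empty list coming back can only mean an error.
Result testLine(const std::string &line)
{
    if (line.find_first_not_of(' ') == std::string::npos) {
        return Result::Skipped;
    }
    std::list<std::string> rpnList = translateSketch(line);
    return rpnList.empty() ? Result::Error : Result::Ok;
}
```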

The null pointer initializer for the output list member was removed from the constructor of the Translator class since the member is no longer a pointer; it is now initialized using the default constructor of the RPN List class.  All accesses through the pointer member were changed from the arrow operator to the regular member (dot) operator.

[branch cpp11 commit 5b9245cf0a]

RPN List – Default Move Not Sufficient

The RPN list class contains other members in addition to the list including the code size, error column, error length and error message members.  Since the list member is a standard list with a move operation, it is cleared by the default generated move constructor when calling the standard move in the return statement of the translate routine.  All but the error message are integers with no default move operation so these members are not affected.  The error message is currently a QString, which also does not have a move operation (the Qt5 classes do) and was also not affected.

When a new RPN list instance was allocated at the beginning of the translate routine, these other members were initialized according to the constructor (though only the error members were initialized).  Now, at the return statement, the list inside the RPN list member is cleared by the default move, but with no move operations for the other members, these members retain their values.

This was detected while running the tests.  Once a line with an error was reported, that same error was reported for every line afterward (at least until there was a different error).  This occurred because the error column member (used to test if the RPN list has an error) was not initialized for the next translation.

One solution would be to reset these other members at the beginning of the translate routine, where the allocation used to be.  The better solution of course is to add move constructor and move assignment functions.  The move constructor initializes each of the members from the other RPN list instance.  In the function body, each of the members of the other RPN list instance is set to an initialized state.  The list and error message members are cleared and the integer variables are set to appropriate default values.  The move assignment swaps all the members before returning a reference to the instance.
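A sketch of what such move operations might look like is shown here.  It is not the project's actual class: RpnListSketch and its member names are hypothetical, std::string stands in for QString, and -1 as the "no error" column is an assumption.

```cpp
#include <list>
#include <string>
#include <utility>

// Hypothetical stand-in for the RPN list class.
class RpnListSketch
{
public:
    RpnListSketch() = default;

    // Move constructor: take each member from the other instance, then
    // put the other instance back into its initialized state.
    RpnListSketch(RpnListSketch &&other) :
        m_list {std::move(other.m_list)},
        m_codeSize {other.m_codeSize},
        m_errorColumn {other.m_errorColumn},
        m_errorLength {other.m_errorLength},
        m_errorMessage {std::move(other.m_errorMessage)}
    {
        other.m_list.clear();
        other.m_codeSize = 0;
        other.m_errorColumn = -1;   // assumed "no error" value
        other.m_errorLength = 0;
        other.m_errorMessage.clear();
    }

    // Move assignment: swap every member with the other instance.
    RpnListSketch &operator=(RpnListSketch &&other)
    {
        std::swap(m_list, other.m_list);
        std::swap(m_codeSize, other.m_codeSize);
        std::swap(m_errorColumn, other.m_errorColumn);
        std::swap(m_errorLength, other.m_errorLength);
        std::swap(m_errorMessage, other.m_errorMessage);
        return *this;
    }

    void append(const std::string &token)
    {
        m_list.push_back(token);
        ++m_codeSize;
    }

    void setError(int column, int length, const std::string &message)
    {
        m_errorColumn = column;
        m_errorLength = length;
        m_errorMessage = message;
    }

    bool hasError() const { return m_errorColumn != -1; }
    int codeSize() const { return m_codeSize; }
    bool empty() const { return m_list.empty(); }

private:
    std::list<std::string> m_list;
    int m_codeSize {0};
    int m_errorColumn {-1};
    int m_errorLength {0};
    std::string m_errorMessage;
};
```

With these in place, moving from an instance that holds an error leaves it without an error, which is the behavior the repeated-error symptom described above was missing.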

RPN List – Return Non-Local Variable

Both the decode and translate routines generate an RPN list and, instead of returning a pointer to the list, will return the list by value, preferably by a move operation.  The decode routine (of the Program Model class) contains a local RPN list variable, and so the value of the variable is returned by a move operation since the variable is going out of scope.  However, the translate routine builds the RPN list in a member variable of the Translator class (so all translator functions have access to the list).  If this variable is returned by value, the value is copied because the variable is not going out of scope.

With the pointer implementation when a pointer was returned, since the translate routine no longer needed the list, it made a temporary copy of the member list pointer, set the member pointer to null and returned the temporary copy of the pointer (this transferred the list to the caller who was then responsible for deleting the list):
RpnList *output = m_output;
m_output = NULL;
return output;
The Translator class RPN list member will be the actual list, not a pointer to a list.  When returning from the translate routine, the RPN list is no longer needed, so it can be transferred to the caller with a move operation.  This move does not occur by just returning the variable because it is not going out of scope.  The move operation can be forced by using the standard move (which resets the member variable to an empty list and transfers its value to the receiving variable of the caller):
return std::move(m_output);
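To illustrate the forced move on a member variable, here is a small sketch.  TranslatorSketch and its functions are hypothetical, and a plain standard list of strings stands in for the RPN list:

```cpp
#include <list>
#include <string>
#include <utility>

// Hypothetical translator that builds its output in a member variable.
class TranslatorSketch
{
public:
    // Builds one list entry per character of the input (a made-up
    // "translation") and hands the list over to the caller.
    std::list<std::string> translate(const std::string &input)
    {
        for (char c : input) {
            m_output.push_back(std::string(1, c));
        }
        // m_output is a member, not a local, so a plain "return m_output;"
        // would copy it.  std::move requests the move constructor instead,
        // which transfers the elements and leaves the member list empty.
        return std::move(m_output);
    }

    bool outputEmpty() const
    {
        return m_output.empty();
    }

private:
    std::list<std::string> m_output;
};
```

Because the member is left empty after each call, a second call starts from a clean list without any explicit reset.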
By default, C++ provides a default constructor, copy constructor, copy assignment, move constructor, move assignment and destructor for a class if a program uses them.  If a constructor is declared, a default constructor is not generated.  If a copy operation, move operation or destructor is declared, then no default copy operation, move operation or destructor is generated.

Defining a destructor indicates that something other than the default operation needs to be done for a copy or move (for example, dealing with an allocated resource).  However, these rules are not completely enforced.  For backward compatibility, the default copy operations are still generated when a destructor is defined, and GCC does this without issuing a warning.  The default move operations, however, are not generated.

The upstream problem mentioned a few posts ago (http://interactivebasiccompilerproject.blogspot.com/2014/09/rpn-list-base-class-to-member.html) was thought to have occurred because the destructor of the list base class was not being called, since the above return move statement was not clearing the list in the member variable.  The problem was finally traced to the declaration of the destructor (which wasn't doing anything), and it went away when that declaration was removed.  With the destructor declared, there were no move operations, so the standard move performed a copy.  Removing the destructor allowed the default move operations to be generated.  A default move operation can also be requested explicitly using the new C++11 syntax:
ClassName(ClassName &&other) = default;
This solved the issue of the member list not being cleared.  However, the default move operations were not sufficient as there was another problem...
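The effect of a declared destructor on the generated move operations can be observed with a small probe.  This is only a sketch, and all of the type names in it are made up for the demonstration:

```cpp
#include <utility>

// Member type that records how it was constructed.
struct Probe
{
    enum class How { Default, Copied, Moved };
    How how {How::Default};

    Probe() = default;
    Probe(const Probe &) : how {How::Copied} {}
    Probe(Probe &&) : how {How::Moved} {}
};

// A destructor is declared, so no move constructor is generated and
// std::move falls back to the generated copy constructor.
struct WithDestructor
{
    Probe probe;
    ~WithDestructor() {}
};

// Nothing is declared, so the default move constructor is generated.
struct WithoutDestructor
{
    Probe probe;
};

// A destructor is declared, but the move constructor is requested
// explicitly with the "= default" syntax.
struct WithDefaultedMove
{
    Probe probe;
    WithDefaultedMove() = default;
    WithDefaultedMove(WithDefaultedMove &&other) = default;
    ~WithDefaultedMove() {}
};

// Constructs a T from a moved T and reports how the member was built.
template <typename T>
Probe::How howConstructed()
{
    T source;
    T target(std::move(source));
    return target.probe.how;
}
```

Here howConstructed reports a copy for WithDestructor and a move for the other two types, which matches the behavior described above.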

C++11 Move Constructors and Assignments

The actual instances of RPN lists will be returned from functions instead of a pointer to an allocated instance that requires the receiver to delete the instance when it is done with it (or a memory leak occurs).  The new C++11 move constructors and assignments feature makes this efficient.

When a function that has a return value is called, temporary memory (usually on the stack) needs to be set aside to hold the return value because the variable holding the return value in the called function goes out of scope and will be destroyed (destructor called) if it is not a plain old data (POD) type.  Prior to C++11, the return value was copied from the temporary into a receiving variable using a copy constructor if not a POD type.  If the variable was a large container like a list or vector, this was not efficient.  This is why RPN lists were returned by a pointer.

However, with a C++11 move constructor, the return value is moved directly into the variable being constructed or assigned, each having a slightly different mechanism.  For a container class like a list or vector, the only thing that is moved is the pointer to the data, not the actual data itself.  Consider this example of a function that returns a list and code that calls the function:
std::list<SomeType> someFunction()
{
    std::list<SomeType> localList;
    // ... list created here ...
    return localList;
}
...
std::list<SomeType> list {someFunction()};     // caller
The new C++11 initializer syntax is used to show that the move constructor is called (but the old assignment syntax does the same thing).  Without a move constructor, the local list variable destructor would be called after the return value was copied to a temporary.  With the move constructor, the return value is moved directly into the receiving variable and the local variable is reset, and when the destructor for the local is called, it has nothing to do.  With a container class, minimal values are copied (essentially a pointer with a little other possible housekeeping like a size).  A move constructor has this basic layout:
ClassName(ClassName &&other) :
    member1{std::move(other.member1)},
    member2{std::move(other.member2)},
    ...
{
    other.member1 = {};
    other.member2 = {};
    ...
}
First the members of the receiving variable are initialized from the other instance (the return variable).  In the body of the move constructor, the members of the other instance are reset to defaults.  If one of the members was a pointer to allocated data, the pointer is transferred, and the pointer in the return variable that is about to go out of scope is set to null, so when its destructor is called, it has nothing to delete.
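The pointer case can be sketched with a small class that owns an allocated array.  BufferSketch is a made-up example and not something from the project:

```cpp
#include <cstddef>
#include <utility>  // std::move, for callers of the move constructor

// Hypothetical class that owns a dynamically allocated array.
class BufferSketch
{
public:
    explicit BufferSketch(std::size_t size) :
        m_data {new int[size]()},   // zero-initialized storage
        m_size {size}
    {
    }

    // Move constructor: take over the pointer and size of the other
    // instance, then reset the other instance so that its destructor
    // has nothing to delete.
    BufferSketch(BufferSketch &&other) :
        m_data {other.m_data},
        m_size {other.m_size}
    {
        other.m_data = nullptr;
        other.m_size = 0;
    }

    // Copying is disabled to keep the sketch short.
    BufferSketch(const BufferSketch &) = delete;
    BufferSketch &operator=(const BufferSketch &) = delete;

    ~BufferSketch()
    {
        delete[] m_data;    // deleting a null pointer does nothing
    }

    const int *data() const { return m_data; }
    std::size_t size() const { return m_size; }

private:
    int *m_data;
    std::size_t m_size;
};
```

After moving, the new instance holds the very same storage address the source had, so only the pointer and size were transferred and no element was copied.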

A move assignment is similar, but is used when a receiving variable that already exists is being assigned a return value, where the caller looks like this:
std::list<SomeType> list;
...
list = someFunction();
In this case, the value in the receiving variable must be destroyed first and then the return value moved in.  Without a move assignment, this is handled by the copy assignment, which destroys the old value and copies the new value.  The destructor for the local variable is then called.  A move assignment has this basic layout:
ClassName &operator=(ClassName &&other)
{
    std::swap(member1, other.member1);
    std::swap(member2, other.member2);
    ...
    return *this;
}
For each member, the values of the members are swapped.  The standard swap function does this efficiently with move operations, where a member is moved into a temporary variable, the other member is moved into the current member, and the temporary is moved into the other member.  No destructors are called for any of the moves.  Here, the other instance is the local variable of the function that is going out of scope.  This moves the old value of the receiving variable into the local variable, which then gets destroyed.  As with any assignment, a reference to the object is returned.
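The swap behavior can be made visible by moving from a named variable instead of a return value, so the old contents can still be inspected afterward.  This is a sketch only, and HolderSketch and its members are hypothetical:

```cpp
#include <list>
#include <utility>

// Hypothetical type with a container member and an integer member.
struct HolderSketch
{
    std::list<int> items;
    int codeSize {0};

    // Move assignment: swap each member with the other instance.  The
    // old contents of this instance end up in the other instance and
    // are destroyed along with it.
    HolderSketch &operator=(HolderSketch &&other)
    {
        std::swap(items, other.items);
        std::swap(codeSize, other.codeSize);
        return *this;
    }
};
```

After an assignment such as receiver = std::move(source), the receiver holds what the source had, and the source holds the receiver's old values until it is destroyed.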

The above descriptions may not be exactly what occurs internally, but they are my interpretation and explain the basic concept behind the move mechanisms.  One final mention is that if the object does not have a move constructor or assignment, then the old mechanism of copying through a temporary (copy constructor) is used.  There are reasons why an object may not have move operations, which will be covered in the next post.

One last note.  The implicit sharing feature of the container classes of Qt is another mechanism to make returning instances more efficient.  This feature acts like a shared smart pointer.  The return value is copied to the temporary, and the use count is incremented.  When the local variable is destroyed, the use count is decremented and nothing needs to be destroyed.  Shared smart pointers are another mechanism (as already used for RPN item pointers).
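The use counting can be illustrated with the standard shared pointer.  This sketch shows only the counting and is not Qt's implicit sharing itself, which additionally copies the data when a shared instance is modified.  The names SharedData and makeData are made up:

```cpp
#include <memory>
#include <vector>

// Hypothetical alias: a vector shared through a standard shared pointer.
using SharedData = std::shared_ptr<std::vector<int>>;

// Creates the data once; returning the shared pointer hands over the
// pointer and its use count, not a copy of the 1000 elements.
SharedData makeData()
{
    SharedData local {std::make_shared<std::vector<int>>(1000, 42)};
    return local;
}
```

Copying the returned pointer only increments the use count, and both copies refer to the same vector.  The vector is destroyed when the last copy goes away.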