Saturday, August 16, 2014

Number of Data Types Enumerator

Not only does the data type enumeration have the size of enumerator (which needs to be removed before changing to an enumeration class), it also has a number of enumerator, which was placed after the three main data types (double, integer and string).  This enumerator was removed, but it was used to dimension two arrays that needed to be changed.

In the Table constructor, there is a section that scans the secondary associated codes of a main code, the purpose of which is to set the expected data type for the main code.  If the main code has associated codes for both doubles and integers, the expected data type is set to number (for example, the minus operator); and if there are associated codes for all three types (numbers and string), the expected data type is set to any (for example, the plus operator).

This is accomplished by using bit masks where there is a bit for each type (double, integer and string).  For each associated code, the bit for the data type of the code is bitwise ORed into a running value.  After all the associated codes are scanned, the final value is checked to see if it has the two number bits set or all three bits set to determine the expected data type.

Originally, an array of three elements was defined with the bit masks for the three data types.  The number of enumerator was used to dimension this array.  With the number of enumerator removed, this array was replaced with an unordered map, with an initializer list to set the bit masks for each data type enumerator.
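As a rough sketch of the scan described above (not the actual Table constructor code), the bit masks and the check could look something like the following.  The names DataType, bitMask and expectedDataType are made up for illustration, the hash functor is the one described in the Enumeration Class Hash post, and returning None when neither pattern matches is an assumption:

```cpp
#include <cstddef>
#include <initializer_list>
#include <unordered_map>

// Hypothetical enumeration; the project's actual enumerators may differ.
enum class DataType { Double, Integer, String, None, Number, Any };

// Hash functor so an enumeration class can be used as an unordered map key.
struct EnumClassHash
{
    template <typename T>
    std::size_t operator()(T t) const
    {
        return static_cast<std::size_t>(t);
    }
};

// One bit per main data type, filled in by an initializer list.
const std::unordered_map<DataType, unsigned, EnumClassHash> bitMask {
    {DataType::Double,  1u << 0},
    {DataType::Integer, 1u << 1},
    {DataType::String,  1u << 2}
};

// OR together the bit of each associated code's data type, then test the
// result for the "both numbers" and "all three" patterns.
DataType expectedDataType(std::initializer_list<DataType> associatedTypes)
{
    const unsigned numberBits
        = bitMask.at(DataType::Double) | bitMask.at(DataType::Integer);
    const unsigned anyBits = numberBits | bitMask.at(DataType::String);

    unsigned bits = 0;
    for (DataType type : associatedTypes) {
        bits |= bitMask.at(type);
    }
    if (bits == anyBits) {
        return DataType::Any;
    }
    if (bits == numberBits) {
        return DataType::Number;
    }
    return DataType::None;  // assumption: no combined type applies
}
```

With this sketch, a main code with double and integer associated codes (like the minus operator) would give Number, and one with all three (like the plus operator) would give Any.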

The other use of the number of enumerator was in the equivalent data type function of the Table class, which contained an array from data type to equivalent data type, but when the sub-string and temporary string data types were removed, this function ended up essentially just returning the data type passed to it.  In other words, this function no longer does anything, so it was removed and the one use of it was replaced with the data type (in the LET translator routine).

[branch cpp11 commit 8f56e1117a]

Enumeration Class Hash

In order to use the STL unordered map class, a hash function is required for the key of the map.  Hash functions are provided for all the built in types plus some of the other STL classes (like std::string), but unfortunately, not for enumeration classes.  For the built in types (integers), the number itself is used as the hash.  However, enumeration class enumerator values can't be hashed this way because they can't be used as integers, even though their values are just numbers.  The solution is to use a generic (template) function type that will work for all enumeration classes by converting the enumerators to integers:
struct EnumClassHash
{
    template <typename T>
    std::size_t operator()(T t) const
    {
        return static_cast<std::size_t>(t);
    }
};
This works for any enumeration class by returning an integer for an enumerator (a size_t is an integer that is large enough to cover any enumeration) using a static cast (hopefully the only place a cast will be needed).  An unordered map to use an enumeration class with an initializer list to assign values to the enumerators would be defined as:
std::unordered_map<EnumName, QString, EnumClassHash> names {
    {EnumName::First, "First"},
    {EnumName::Second, "Second"},
    {EnumName::Third, "Third"}
};
The name of the map does not need to be repeated for an assignment of each value (which could be tedious for a long name or with a lot of enumerators).  This does not resolve the issue of forgotten values, but hopefully when an undefined enumerator value is accessed, the default value (in this case a blank string) would be detected or identified easily.
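To show what a forgotten value looks like, here is a small self-contained sketch.  It assumes std::string in place of QString (so no Qt headers are needed), deliberately leaves out the third enumerator, and uses a made-up helper named isMissing:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

enum class EnumName { First, Second, Third };

// Hash functor so an enumeration class can be used as an unordered map key.
struct EnumClassHash
{
    template <typename T>
    std::size_t operator()(T t) const
    {
        return static_cast<std::size_t>(t);
    }
};

// std::string stands in for QString here (an assumption for this sketch).
// EnumName::Third is deliberately forgotten.
std::unordered_map<EnumName, std::string, EnumClassHash> names {
    {EnumName::First, "First"},
    {EnumName::Second, "Second"}
};

// Hypothetical helper: operator[] returns a blank string for a key that was
// never assigned (and adds that key to the map as a side effect).
bool isMissing(EnumName value)
{
    return names[value].empty();
}
```

Calling isMissing(EnumName::Third) returns true here, and afterwards the map holds three elements because the [] operator added the missing key with a blank string.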

QMap vs. Standard Map (Initializer Lists)

Using enumeration classes will require using an associative array container class (like QMap or QHash) since the size of the enumeration (the number of enumerators) is not obtainable (without kludgy type casting) to dimension a C style array.  As mentioned in the previous post, using an associative array container class requires run-time assignments to fill the container.

C++11 provides a solution with initializer lists.  Unfortunately, the Qt containers do not support C++11 initializer lists (specifically, Qt4 doesn't, though Qt5 does contain support for initializer lists).  The Standard Template Library (STL) containers do support initializer lists.  Therefore, the STL containers will be used as needed until the inevitable change to Qt5.  STL containers are technically already available since the various Qt containers have method functions for converting STL containers to and from Qt containers.

For an associative array, either the QMap or std::map container could be used.  Both of these containers order the keys of the elements.  Ordering of the keys is not required in this case.  Qt provides the QHash class for an associative array not requiring ordered keys.  Similarly, STL provides std::unordered_map, which will be the class used.
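The difference between the two STL containers can be seen in a short sketch.  It assumes std::string values, the EnumClassHash functor from the Enumeration Class Hash post, and a made-up helper named valuesInIterationOrder.  Note that std::map only needs the less than operator, which enumeration classes already have, so no hash functor is required for it:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

enum class EnumName { First, Second, Third };

// Hash functor needed only by the unordered map.
struct EnumClassHash
{
    template <typename T>
    std::size_t operator()(T t) const
    {
        return static_cast<std::size_t>(t);
    }
};

// Ordered: iteration follows the key order regardless of the order
// in which the elements are listed.
const std::map<EnumName, std::string> orderedNames {
    {EnumName::Third, "Third"},
    {EnumName::First, "First"},
    {EnumName::Second, "Second"}
};

// Unordered: iteration order is unspecified, but lookups work the same.
const std::unordered_map<EnumName, std::string, EnumClassHash> unorderedNames {
    {EnumName::Third, "Third"},
    {EnumName::First, "First"},
    {EnumName::Second, "Second"}
};

// Hypothetical helper: collect the values in the container's iteration order.
template <typename Map>
std::vector<std::string> valuesInIterationOrder(const Map &map)
{
    std::vector<std::string> values;
    for (const auto &entry : map) {
        values.push_back(entry.second);
    }
    return values;
}
```

Since the maps here are only used for looking up a value from an enumerator, the ordering that std::map maintains is work that is never needed.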

Enumerators As Indexes – Using A Map

The first enumeration that will be changed to an enumeration class will be the data type enumeration.  One of the differences with enumerations is that unscoped enumerators can be used as integers.  One of the other things I preferred to do when defining an enumeration was add a size of enumerator at the end:
enum EnumName {
    First_EnumName,
    Second_EnumName,
    Third_EnumName,
    sizeof_EnumName
};
This size of enumerator can then be used to dimension an array, like to hold a conversion to another value (another enumeration, string, etc.).  This will not work with enumeration classes because the enumerators can't be used as integers without resorting to kludgy type casting (something I prefer to avoid if possible).  Plus, once this size of enumerator is added, then any switch statement using the enumeration will generate a warning since there is no case statement for this enumerator.  Adding a blank default statement is also kludgy.

Alternatively, using an associative array (map) instead of a C style array solves the issue of needing to know the size of the enumeration.  Compare the array solution to the map solution:
QString names[sizeof_EnumName] = {    QMap<EnumName, QString> names;
    "First",                          names[First_EnumName] = "First";
    "Second",                         names[Second_EnumName] = "Second";
    "Third"                           names[Third_EnumName] = "Third";
};
The array method is error-prone and care must be taken to make sure the right values (strings in this case) are applied to the right elements in the array.  This could be solved by using assignments that would look identical to the map assignments, though may not be as efficient because the assignments are done at run-time instead of at compile time, and the array still needs to be dimensioned.

The map method of using assignments (same for the array method using assignments) can also be error-prone because it is easy to miss an assignment of one of the values.  In the array case, if the elements are pointers (such as C strings), a null pointer would be returned, which will probably cause a segmentation fault unless it is checked for.

For the map method, however, accessing a value that doesn't exist simply adds a new element to the map with a default value (in this case a blank string).  This is still a problem, but much less fatal.  Alternatively, in the case of QMap, the value method function could be used instead of the [] operator, which doesn't add an element for a non-existing element and allows a default value to be returned for the non-existing element (for example, in this case a default value like "BUG" could be used).
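The difference between the two kinds of access can be sketched without Qt.  This is only an analogy: it assumes std::map and std::string in place of QMap and QString, leaves out the third enumerator on purpose, and uses a made-up helper named valueOr modeled on the QMap value method function with a default value:

```cpp
#include <map>
#include <string>

enum EnumName {
    First_EnumName,
    Second_EnumName,
    Third_EnumName
};

// Third_EnumName is deliberately forgotten.
std::map<EnumName, std::string> names {
    {First_EnumName, "First"},
    {Second_EnumName, "Second"}
};

// Hypothetical helper: look up a key without adding it to the map,
// returning the given default value if the key does not exist.
std::string valueOr(const std::map<EnumName, std::string> &map, EnumName key,
                    const std::string &defaultValue)
{
    auto it = map.find(key);
    return it == map.end() ? defaultValue : it->second;
}
```

Here valueOr(names, Third_EnumName, "BUG") returns "BUG" and leaves the map with two elements, whereas names[Third_EnumName] returns a blank string and leaves the map with three.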

C++11 – New Enumeration Class

With the original implementation of C/C++ enumerations (enum), I preferred to add a suffix to each of the enumerators to associate them with the enumeration so that when one appears in the code, it is easy to identify which enumeration it belongs to:
enum EnumName {
    First_EnumName,
    Second_EnumName,
    Third_EnumName
};
Without this suffix, it is difficult to see which enumeration the value is associated with.  (Though when using an IDE like Qt Creator, there is a command to go right to the definition.)  In addition, obviously the same name can't be used for two different enumerations.  A suffix (a prefix could have been used instead) solves this issue.

C++11 introduces a new enum class where the enumerators are scoped inside the enumeration.  The enumerators for these are also more strongly typed than the normal enumerations (which represent integers and can be used as such).  As an enumeration class, the above enumeration becomes:
enum class EnumName {
    First,
    Second,
    Third
};
In order to use these enumerations, they must be scoped, so to refer to the second enumerator, the name EnumName::Second would be used.  This is similar to my solution of using a suffix (in this case it is a prefix).  An example of an enumeration class was added to the try compile test program.  The intention is to change the enumerations used in the project to enumeration classes.
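A small sketch of both properties (scoping and strong typing) follows.  This is not the code from the try compile test program; the enumerations Color and Signal and the helper functions are made up for illustration:

```cpp
#include <type_traits>

// Two enumeration classes can share enumerator names because each
// enumerator is scoped inside its own enumeration.
enum class Color { Red, Green, Blue };
enum class Signal { Red, Yellow, Green };

// Enumeration class values do not convert to integers implicitly.
static_assert(!std::is_convertible<Color, int>::value,
              "an enum class should not convert to int implicitly");

// An explicit cast is required to get at the underlying number.
int toInt(Signal signal)
{
    return static_cast<int>(signal);
}

// Enumerators are always referred to through their enumeration name.
bool isStop(Signal signal)
{
    return signal == Signal::Red;
}
```

With plain enumerations, the two Red enumerators would collide, which is the problem the suffix was working around.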

[branch cpp11 commit 2a8e3899ee]