Sunday, July 14, 2013

Multiple String Assignments – With Sub-Strings

Multiple string assignments will be handled by the AssignListStr code.  At run-time, this code (like the other assign list codes) will pop the value to be assigned from the stack and then pop variable references from the stack, assigning the value to each, until the stack is empty.  This will not work if any of the references to assign is a sub-string.  With the old design, mixed string assignments were handled with the AssignListMixStr code.  This was detailed in the post on May 22, 2010, but no details were given on how this would be handled at run-time.
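The run-time behavior of an assign list code described above can be sketched as follows.  This is only an illustration under stated assumptions: the StackItem and Stack types and the assignListStr() function name are hypothetical, and std::string stands in for the project's string class.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical run-time stack item: either a string value or a
// reference (here a pointer) to a string variable.
using StackItem = std::variant<std::string, std::string *>;
using Stack = std::vector<StackItem>;

// Sketch of an assign list code: pop the value to be assigned, then pop
// variable references and assign to each until the stack is empty.
void assignListStr(Stack &stack)
{
    assert(!stack.empty());
    std::string value = std::get<std::string>(stack.back());
    stack.pop_back();
    while (!stack.empty()) {
        std::string *reference = std::get<std::string *>(stack.back());
        stack.pop_back();
        *reference = value;
    }
}
```

A loop like this only sees plain references, so it has nowhere to pick up the length or position arguments that a sub-string target would need.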

For the new design, if a multiple string assignment contains at least one sub-string, then there will be a specific assign code for each assignment instead of a single assign list code.  The specific assign codes will keep the value being assigned on the stack for the next assign code.  Only the last code will be a regular assign code.  Consider this mixed string assignment and its translation, in which AssignKeepRight processes C$ and 2, AssignKeepLeft processes B$ and 5, and AssignStr processes A$ (each code also uses the value of D$):
A$, LEFT$(B$,5), RIGHT$(C$,2) = D$
A$ B$ 5 C$ 2 D$ AssignKeepRight AssignKeepLeft AssignStr
The assign keep codes will pop the value to be assigned from the stack, pop the reference to assign (along with any sub-string arguments), assign the value to the reference, and push the value back onto the stack for the next assign code.  The final regular assign code will not push the value back, leaving the stack empty.
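The difference between a keep code and a regular assign code can be sketched for the plain string case.  The stack types, the pop() helper and the function names below are assumptions made for illustration, with std::string standing in for the project's string class.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical run-time stack holding string values and variable references.
using StackItem = std::variant<std::string, std::string *>;
using Stack = std::vector<StackItem>;

// Pop the top item, which must hold type T.
template <typename T>
T pop(Stack &stack)
{
    assert(!stack.empty());
    T item = std::get<T>(stack.back());
    stack.pop_back();
    return item;
}

// Regular assign: consumes both the value and the reference.
void assignStr(Stack &stack)
{
    std::string value = pop<std::string>(stack);
    *pop<std::string *>(stack) = value;
}

// Keep variant: performs the same assignment, then pushes the value
// back so the next assign code finds it on top of the stack.
void assignKeepStr(Stack &stack)
{
    std::string value = pop<std::string>(stack);
    *pop<std::string *>(stack) = value;
    stack.push_back(value);
}
```

Running the keep code and then the regular code over a stack holding two references and one value assigns both variables and leaves the stack empty.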

There will be five assign keep codes: AssignKeepStr, AssignKeepLeft, AssignKeepMid2, AssignKeepMid3 and AssignKeepRight.  In the table, these codes will be the second associated code for the AssignStr, Left, Mid2, Mid3 and Right code entries.
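One way the translator could choose the codes for an assignment list is sketched below.  The Target enumeration, the Codes structure, the lookup map and assignCodes() are all hypothetical; in particular, the map is only a stand-in for the associated codes in the table and does not reflect the table's actual layout.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical kinds of assignment target, in source order.
enum class Target { Str, Left, Mid2, Mid3, Right };

// Regular assign code and its keep variant for one kind of target.
struct Codes { std::string regular; std::string keep; };

// Stand-in for the table's associated codes (assumed layout).
const std::map<Target, Codes> assignTable = {
    {Target::Str,   {"AssignStr",   "AssignKeepStr"}},
    {Target::Left,  {"AssignLeft",  "AssignKeepLeft"}},
    {Target::Mid2,  {"AssignMid2",  "AssignKeepMid2"}},
    {Target::Mid3,  {"AssignMid3",  "AssignKeepMid3"}},
    {Target::Right, {"AssignRight", "AssignKeepRight"}},
};

// Return the assign codes to emit after the value expression.
std::vector<std::string> assignCodes(const std::vector<Target> &targets)
{
    assert(!targets.empty());
    bool anySubString = false;
    for (Target target : targets) {
        if (target != Target::Str) {
            anySubString = true;
        }
    }
    // A list of plain string variables needs only the single list code.
    if (!anySubString && targets.size() > 1) {
        return {"AssignListStr"};
    }
    // Otherwise emit one code per target, rightmost target first, since
    // its operands are nearest the top of the stack.  Only the final
    // code is a regular assign code.
    std::vector<std::string> codes;
    for (std::size_t i = targets.size(); i-- > 0;) {
        const Codes &entry = assignTable.at(targets[i]);
        codes.push_back(i == 0 ? entry.regular : entry.keep);
    }
    return codes;
}
```

For the targets A$, LEFT$(B$,5), RIGHT$(C$,2), this sketch yields AssignKeepRight, AssignKeepLeft and AssignStr, matching the translation shown earlier.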

Sub-Strings – New Design

Previously, with the original String class, sub-string functions were handled differently than the other string functions as an optimization.  Instead of returning a temporary string, they simply adjusted what part of the string they referred to, and would therefore work for either temporary strings (results of other string functions or operators) or reference strings (from variables).  Sub-string assignments would be handled with an assign sub-string code that would work with the result of a sub-string of a variable reference.  This was detailed in the series of posts in May 2010.

With the change to the QString class, this optimization will not work, and is not necessary.  The sub-string functions (LEFT$, MID$, and RIGHT$) will work like the other string functions and operators where they will return a temporary string.  However, sub-string assignments will need to be handled differently.  There will need to be specific new codes for handling sub-string assignments, to be named AssignLeft, AssignMid2, AssignMid3, and AssignRight.

Consider the following sub-string assign along with the old translation and the proposed new translation:
LEFT$(A$,5)=B$
Old: A$<ref> 5 LEFT$(<ref> B$ AssignSub$
New: A$<ref> 5 B$ AssignLeft
With the old translation, the sub-string reference would be on the stack (along with the value to assign) for the generic AssignSub$ to process.  The new translation is simpler: the new AssignLeft code will expect a regular variable reference, the length argument of the LEFT$ function, and the value to assign.  Internally, all the sub-string assignment codes will use the QString::replace() function.
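A rough sketch of what the left and right sub-string assignments might do internally follows.  Several things here are assumptions and not taken from the post: std::string::replace() stands in for QString::replace() so the example needs no Qt, the function names and signatures are hypothetical, and the semantics (overwrite at most the requested number of characters with the leading characters of the value, never changing the length of the target) are a guess modeled on the classic BASIC MID$ statement.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Clamp a BASIC integer argument to a non-negative size.
std::size_t toSize(int argument)
{
    return argument < 0 ? 0 : static_cast<std::size_t>(argument);
}

// Assumed semantics of LEFT$(target,length)=value: overwrite the start
// of the target without changing its length.
void assignLeft(std::string &target, int length, const std::string &value)
{
    std::size_t count = std::min({toSize(length), target.size(), value.size()});
    target.replace(0, count, value, 0, count);
}

// Assumed semantics of RIGHT$(target,length)=value: overwrite the end
// of the target, again taking characters from the start of the value.
void assignRight(std::string &target, int length, const std::string &value)
{
    std::size_t count = std::min({toSize(length), target.size(), value.size()});
    target.replace(target.size() - count, count, value, 0, count);
}
```

Under these assumptions, assigning "xy" to LEFT$(A$,5) with A$ holding "ABCDEFGH" would give "xyCDEFGH".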

New Translator – Testing and a Correction

Now that all four expression tests and the first three translator tests are working with the new translator routines, it is becoming somewhat time-consuming to run each test individually, including checking for memory errors.

Therefore, a temporary new memory test script (memtestn) was added that is basically identical to the current memory test script (memtest) except that the new translator is used and only the first three translator tests are run.  As more of the new translator is implemented and more tests are working, the script will be updated.  This change was put into its own commit, so that it can be reverted once the new translator implementation is complete and the old translator routines are removed.

While using the new memory test script, a problem was discovered with one of the error tests in translator test #3, which was causing a segmentation fault, but only when compiled for Release.  When compiled for Debug (as used for development), the segmentation fault did not occur, which made the problem difficult to find.

The debugging method used was the insertion of qDebug() calls until the location of the crash was found.  The problem occurred in the new outputLastToken() access function added to the Translator class so that the command translate routines can access the last token added to the RPN output list.  The problem was that this function did not actually have a return statement.  Flowing off the end of a non-void function is undefined behavior in C++, which is why the two builds behaved differently: when compiled for Debug, the correct pointer happened to be returned, but when compiled for Release, this was optimized out and a null was returned.
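For illustration, an access function of this kind with its return statement in place might look like the sketch below.  The Token structure, the append() helper, the m_output member and the empty-list check are assumptions made for the example and are not the project's actual code; a std::vector stands in for the RPN output list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal token.
struct Token { int code; };

class Translator
{
public:
    void append(Token *token) { m_output.push_back(token); }

    // Access function with an explicit return.  If the "return" keyword
    // were missing, the expression would still be evaluated, but the
    // function would flow off its end, which is undefined behavior.
    Token *outputLastToken() const
    {
        return m_output.empty() ? nullptr : m_output.back();
    }

private:
    std::vector<Token *> m_output;  // stand-in for the RPN output list
};
```

GCC and Clang can report this kind of mistake at compile time through the -Wreturn-type warning (included in -Wall), which can be promoted to an error with -Werror=return-type.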

[commit 06bd286162] [commit 8c872f68c6]