Saturday, February 26, 2011

Code Enumeration vs. Table Entry Indexes

While designing the error handling mechanism for the INPUT command, a thought occurred related to program codes. For efficient program execution at run-time, the index of the table entry will be stored in the internal program code, not the code enumeration value.

If the code enumeration value were stored instead, it would first need to be converted to a table entry index (needed to get the run-time handler function pointer) by going through the code-to-index array set up during table initialization. The intention all along has been to use table entry indexes in the internal program code.

For the INPUT command's error recovery, when it is backing up execution and checking for input parse codes, it will need to convert each table entry index to a code before it can check whether it is an input parse code. This would not be efficient during program execution. For the INPUT command, execution time is not critical, since it is about to stop and wait for user input, so a few extra program cycles won't matter much. But this problem could occur for other, more critical commands.
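The two conversions described above can be sketched as follows. This is only an illustration: the enumeration names, the `TableEntry` fields, and the function names are hypothetical and are not taken from ibcp.h. The table is deliberately in a different order than the enumeration, which is what makes the conversions necessary.

```cpp
#include <cassert>

// Hypothetical code enumeration (names are illustrative only).
enum Code {
    Add_Code, InputBegin_Code, InputParseInt_Code, InputParseStr_Code,
    Print_Code, sizeof_Code
};

typedef void (*RunHandler)();   // stand-in for the run-time handler type

struct TableEntry {
    Code code;
    const char *name;
    RunHandler handler;
};

static void placeholderHandler() {}

// Table entries are NOT in enumeration order.
static const TableEntry table[] = {
    { Print_Code,         "PRINT",         placeholderHandler },
    { InputParseStr_Code, "InputParseStr", placeholderHandler },
    { Add_Code,           "+",             placeholderHandler },
    { InputBegin_Code,    "InputBegin",    placeholderHandler },
    { InputParseInt_Code, "InputParseInt", placeholderHandler }
};
static const int nEntries = sizeof(table) / sizeof(table[0]);

// Code-to-index array, filled in during table initialization.
static int codeToIndex[sizeof_Code];

void initCodeToIndex()
{
    for (int code = 0; code < sizeof_Code; code++) {
        codeToIndex[code] = -1;   // no entry seen yet
    }
    for (int index = 0; index < nEntries; index++) {
        assert(codeToIndex[table[index].code] == -1);   // no duplicate codes
        codeToIndex[table[index].code] = index;
    }
}

// Code to index: one extra array lookup.
int indexOfCode(Code code) { return codeToIndex[code]; }

// Index to code: one extra table access.
Code codeOfIndex(int index) { return table[index].code; }

// What error recovery would do when the program stores indexes.
bool isInputParseIndex(int index)
{
    Code code = codeOfIndex(index);
    return code == InputParseInt_Code || code == InputParseStr_Code;
}
```

If the enumeration values were identical to the table entry indexes, both `indexOfCode()` and `codeOfIndex()` would reduce to a cast.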

Therefore, it would be desirable for the code enumeration values to be the same as the table entry indexes. One simple solution is to make sure the code enumeration values match the table entries. Unfortunately, this relies on the programmer to keep the two in sync, which is very error prone and just a general pain to begin with.

A better way is to generate the code enumeration automatically from the table entries using an awk script. This method would be similar to how the test_codes.h file (used by the test_ibcp.cpp source file) is generated automatically by scanning for codes in the ibcp.h file.

INPUT Execution – Error Handling

Errors can occur while parsing the input, in one of the input parse codes. When an error occurs, after it is reported, the already parsed values in the temporary input values stack need to be thrown away and execution needs to resume at the beginning of the INPUT statement. However, instead of issuing the prompt again, the cursor will be positioned at the beginning of the input, which will still contain the erroneous input so that the user can correct it instead of reentering it (the traditional “redo from start”).

The temporary input values stack can't simply be reset (setting the internal index to -1, the empty stack indicator) because elements may contain allocated string values, which need to be deleted to prevent memory leaks. Like the evaluation stack, the elements in the temporary input values stack won't have any indicator of what data type they are. Consider the basic format of the INPUT statement (only up to the parsing codes is shown):
InputBegin InputParseType1 InputParseType2 InputParseType3'End' ...
Say an error occurs on the second parse code. The program execution pointer will be pointing at the next code, InputParseType3, because the pointer is incremented after each program code word is read and before the code is executed by calling its run-time handler. Execution needs to be backed up until the InputBegin is reached (reading the program codes in reverse).

As each code is passed, one element will be popped from the temporary input values stack. If the code passed is an InputParseStr code, then the element popped is a string that needs to be deleted. The beginning is reached when a code that is not an input parse code is reached (it could be InputBegin, InputBeginStr or InputBeginTmp). The stack will now be empty.
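The backing-up procedure above could look roughly like this. It is a sketch under several assumptions: the program is modeled as a vector of codes rather than table entry indexes, `std::vector` stands in for the temporary input values stack, and the names `InputParseInt`, `InputParseDbl`, `InputAssignInt`, `InputValue` and `backUpInput()` are made up for illustration. It also assumes every parse code passed, including the one that reported the error, has exactly one element on the stack; how the failing code's slot is actually handled is not decided here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical subset of codes; only InputBegin, InputBeginStr,
// InputBeginTmp and InputParseStr are named in the text above.
enum Code {
    InputBegin, InputBeginStr, InputBeginTmp,
    InputParseInt, InputParseDbl, InputParseStr,
    InputAssignInt
};

// Stack element with no data type indicator, like the evaluation stack.
union InputValue {
    int intValue;
    double dblValue;
    std::string *strValue;   // allocated; must be deleted when discarded
};

bool isInputParse(Code code)
{
    return code == InputParseInt || code == InputParseDbl
        || code == InputParseStr;
}

// 'pc' points at the code after the one that failed.  Walks backward over
// the parse codes, popping one value per code, and returns the position of
// the begin code so that it can be executed again.
std::size_t backUpInput(const std::vector<Code> &program, std::size_t pc,
    std::vector<InputValue> &values)
{
    while (pc > 0 && isInputParse(program[pc - 1])) {
        --pc;
        assert(!values.empty());
        InputValue value = values.back();
        values.pop_back();
        if (program[pc] == InputParseStr) {
            delete value.strValue;   // deleting a null pointer is harmless
        }
    }
    assert(pc > 0);   // a begin code must precede the parse codes
    return pc - 1;    // position of the begin code
}
```

The type of each popped element is recovered from the program code being passed, which is why the stack elements themselves need no type indicator.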

The INPUT begin code will be executed again and the begin code will call the get input routine. The get input routine normally allocates the temporary input values stack. For error recovery, it will see that the stack is already allocated, so instead of outputting the prompt and saving the cursor position at the beginning of the input, it will restore the saved cursor position and get the input starting with the previously entered erroneous input. Execution will then resume with the corrected input.
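The decision made by the get input routine can be sketched as below. Everything here is hypothetical: `Console` is a minimal stand-in that only tracks a cursor column, `InputContext` and `getInput()` are invented names, and the actual line editing is left out. The point is only that "stack already allocated" is what distinguishes a redo from a first entry.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the terminal; only tracks the cursor column.
struct Console {
    int column = 0;
    std::string log;   // everything printed so far
    void print(const std::string &text)
    {
        log += text;
        column += static_cast<int>(text.size());
    }
    void moveTo(int newColumn) { column = newColumn; }
};

struct InputContext {
    std::vector<int> *values = nullptr;   // stack; null until allocated
    int inputColumn = 0;                  // saved cursor position
    std::string buffer;                   // input as last entered
};

// Returns true on a first entry (prompt issued), false on a redo.
bool getInput(Console &console, InputContext &context,
    const std::string &prompt)
{
    if (context.values == nullptr) {
        // Normal case: allocate the stack, prompt, remember where input starts.
        context.values = new std::vector<int>;
        console.print(prompt);
        context.inputColumn = console.column;
        context.buffer.clear();
        return true;
    }
    // Error recovery: stack already allocated, so no prompt; go back to the
    // start of the input and keep the previous (erroneous) text for editing.
    console.moveTo(context.inputColumn);
    return false;
}
```

A real routine would go on to read or edit `context.buffer` after either branch.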