Sunday, May 16, 2010

Sub-Strings

Because strings are variable length, their character arrays are dynamically allocated as needed at run-time. It is therefore beneficial to reduce the allocating, copying and deleting of character arrays as much as possible. This is why reference strings (the values of string variables and array elements) are used as-is: no new character array needs to be allocated and filled just to put a reference string on the evaluation stack. This does require extra code that knows to delete a temporary string on the stack but not a reference string, which will be accomplished with additional associated codes.

There is another way to reduce some of the allocating and deleting of temporary strings for the sub-string internal functions, namely LEFT$, MID$, and RIGHT$. These functions can take either a reference string or a temporary string as an argument. The obvious way to implement them is to create a new temporary character array of the appropriate size for the resulting string, copy the characters from the argument string to the new array and, if the argument string is a temporary string, delete it.
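To make the cost of the obvious approach concrete, here is a minimal sketch of LEFT$ done that way. The StrValue struct, its temporary flag and the leftCopy function are hypothetical names made up for this illustration, not the interpreter's actual code; the flag simply stands in for whatever mechanism tells a temporary string from a reference string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical string value as it might sit on the evaluation stack.
struct StrValue {
    char *ptr;        // character array (not NUL-terminated)
    std::size_t len;  // number of characters in the array
    bool temporary;   // true if this value owns ptr and must delete it
};

// LEFT$(arg, n) the obvious way: allocate a new array, copy into it,
// then delete the argument's array if the argument was a temporary.
StrValue leftCopy(StrValue arg, std::size_t n) {
    assert(arg.ptr != nullptr || arg.len == 0);
    if (n > arg.len) {
        n = arg.len;                      // result is never longer than the argument
    }
    char *buf = new char[n > 0 ? n : 1];  // allocation on every call
    if (n > 0) {
        std::memcpy(buf, arg.ptr, n);     // copy on every call
    }
    if (arg.temporary) {
        delete[] arg.ptr;                 // extra delete for a temporary argument
    }
    return StrValue{buf, n, true};        // the result is always a new temporary
}
```

Every call pays for one allocation and one copy, plus a delete when the argument is a temporary, and the result is yet another temporary that has to be deleted later.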

There is a simpler way to handle sub-strings that eliminates the allocation of a new character array, the deletion of the temporary string argument if present, and the copying for a reference string. Consider how a string is stored: there is a character array, there is a pointer to the character array, and there is the length of the character array. The pointer and length make up the members of the String class, along with the allocated array. A resulting sub-string will never be larger than the string argument, so why not use the same character array, since it has already been allocated? Next, details on how this will work with reference strings and temporary strings...
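As a rough illustration of the idea only, the pointer and length can be treated as a window onto the existing array. The StrView struct and the leftStr, rightStr and midStr functions below are hypothetical names for this sketch, and it deliberately leaves out who owns the array and when it gets deleted, since that is exactly the part that differs between reference strings and temporary strings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view: a pointer into an existing array plus a length.
struct StrView {
    const char *ptr;
    std::size_t len;
};

// LEFT$: same starting pointer, shorter length.
StrView leftStr(StrView s, std::size_t n) {
    assert(s.ptr != nullptr);
    return StrView{s.ptr, n < s.len ? n : s.len};
}

// RIGHT$: pointer moved forward so that n characters remain.
StrView rightStr(StrView s, std::size_t n) {
    assert(s.ptr != nullptr);
    if (n > s.len) {
        n = s.len;
    }
    return StrView{s.ptr + (s.len - n), n};
}

// MID$: start is 1-based, as in BASIC. Clamping a start of zero to one is an
// assumption of this sketch; a real interpreter might report an error instead.
StrView midStr(StrView s, std::size_t start, std::size_t n) {
    assert(s.ptr != nullptr);
    if (start < 1) {
        start = 1;
    }
    if (start > s.len) {
        return StrView{s.ptr + s.len, 0};  // past the end: empty result
    }
    std::size_t avail = s.len - (start - 1);
    return StrView{s.ptr + (start - 1), n < avail ? n : avail};
}
```

None of the three functions allocates, copies or deletes anything; each result is just a different pointer and length into the array the argument already had.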
