Saturday, December 5, 2009

Background Part I – BASIC Interpreter

I worked on my own BASIC interpreter back around 1980-1981. The first computer I used was a SWTPC 6800 (SS-50 bus computer) with 4K of memory (1977). It did not have enough memory to run either the 4K or 8K BASIC that was available. Eventually it was upgraded with enough memory to run the 8K BASIC and eventually it upgraded to a 6809 (around 1979). The 6800 machine code could not be run on the 6809 directly. The 6809 was source level compatible with the 6800, which meant the a 6800 assembly source file could be assembled with a 6809 assembler with no changes to produce 6809 machine code.

This BASIC interpreter started out as a simple conversion from 6800 to 6809. I used a 6800 disassembler on the 8K BASIC. With some modification of the output, I was able to get the 8K BASIC running on the 6809. Around the same time we acquired Microsoft BASIC for this computer, I don't remember the exact timing or whether it was for the 6800 or 6809. I do remember that the Microsoft BASIC was much faster. So I dug deeper to find out why.

It turned out that the 8K BASIC interpreter was a pure interpreter. That is it put the lines entered directly into memory as is. I did discover that it did convert the first command on a line (PRINT, IF, etc.) into an address (pointer) to an internal table that contain the text of the command and the location of the subroutine for handling the command. This was probably done for executing direct commands. But it only converted the first command on the line, not any other commands on a line (multiple statements separated by colons). For these it reverted to parsing mode for execution. The Microsoft BASIC on the other hand converted all commands, words (TO, THEN, etc.) and operators (<>, <=, etc.) into one-byte tokens. So when it ran the program, the parsing of the tokens was already performed.

At this point, I did two things to my 8K BASIC, I parsed all tokens into one-byte codes and I rewrote most of the code to take advantage of the 6809. This significantly made it faster, but still not quite the speed of the Microsoft BASIC.

What is an Interactive Compiler?

A Compiler is a program that converts a human readable programming language like BASIC into a language understood by a computer. Once converted, the program is no longer readable by a human and is not easy if not impossible to convert back to the original human readable source file. However, the compiled program runs fast.

An Interpreter is a program that directly executes a human readable programming language directly. Interpreters tends to run the program slowly since it is continuously reading and parsing the source file to execute the program. However, interpreters offer interactivity where no separate compile step is necessary and offers the ability to execute commands immediately (like for examining program variables when the program is temporarily stopped like for debugging).

An Incremental Compiler converts or compilers each line of a source file into an internal language understood by the computer as it is entered. It offers the advantages of a compiler (speed) and the advantages of an interpreter (interactivity).

An Interactive Compiler is technically any programming environment that is interactive, anything from a pure interpreter to an incremental compiler, but not a full a compiler. What I am referring to as an Interactive Compiler is one that its internal language is in the form that is easily converted back into the original source lines (though not necessary exactly as typed) and in a form that the computer can run efficiently without doing an time wasting interpretation (which is done during the creation of the internal code and not during run time as in an interpreter).