
Compiler module

This module is a complete etherforth compiler. On request from Interpreter or Loader, it reads one source block from SRAM, compiles the source code into a binary image, and sends this image as an ether message via the Link module to the target chip.

[Figure: compiler module floorplan, showing nodes 000 to 005, 101 to 105, 201, 204, 205, and 301]

Description

The compilation process starts in node 205, which receives an ether message from either Interpreter or Loader containing the number of the block to be compiled. It initializes the dictionary in node 204 and the address and slot pointers in node 102, and erases the image memory in node 301. It then sends an ether message to SRAM requesting the content of the block to be compiled. It reads words from SRAM, decomposes them into 6-bit characters, and sends them to node 105 for parsing. When it encounters the eob symbol (3C), it discards the rest of the block and informs Interpreter or Loader that the compilation has finished.
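The unpacking step in node 205 can be sketched in Python as follows. This is only an illustration, not the node's actual code: the function names are hypothetical, and the order in which the three characters are taken from an 18-bit word (high bits first) is an assumption. Only the 6-bit character size and the eob value 3C come from the description above.

```python
EOB = 0x3C  # end-of-block symbol, as given in the text


def unpack_chars(word):
    """Split one 18-bit word into three 6-bit characters.

    Assumption: the first character sits in the highest six bits.
    """
    return [(word >> shift) & 0x3F for shift in (12, 6, 0)]


def block_chars(words):
    """Yield characters of a block until eob; the rest of the block is discarded."""
    for word in words:
        for ch in unpack_chars(word):
            if ch == EOB:
                return
            yield ch
```

The sketch treats every 6-bit field as a symbol, so anything after the first 3C, including the remainder of the same word, never reaches the parser.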

The parser in node 105 reads symbols one by one, using a jump table indexed by the four low bits of a tag to select the appropriate code for text processing. Both green and yellow tokens are passed intact to node 104, preceded by zero. Numbers (decimal and hex) are converted to binary form and sent to node 104, preceded by a green or yellow number tag. Green, yellow, and red words are converted to an 18-bit number that keeps the last three characters of the string (or zeros if a shorter string is being parsed), and this number, preceded by a word tag, is sent to node 104. Other tags and the characters that follow them are ignored. Thus, the parser outputs a simplified family of tags, as shown in the table below.
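The reduction of a word's name to an 18-bit number can be sketched as below. The function name is hypothetical, and two details are assumptions: that earlier characters occupy higher bits, and that the zeros for short names fill the high end of the number.

```python
def pack_name(chars):
    """Keep the last three 6-bit characters of a name in one 18-bit number."""
    last = list(chars)[-3:]
    # Assumption: names shorter than three characters are padded with
    # zeros at the high end.
    last = [0] * (3 - len(last)) + last
    value = 0
    for ch in last:
        value = (value << 6) | (ch & 0x3F)
    return value
```

Because only three characters survive, two names that share their last three characters produce the same number under this scheme.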

hex   tag     color
00    token
02    char
03    num
04    num
05    char
07    char

Pairs of a simplified tag and the corresponding pre-parsed code enter node 104, from where they are distributed further. Token codes 2A and higher, which represent names of ports and registers (see the character set table), are sent to node 004, where they are converted into the corresponding numbers and placed on the compiler's stack in node 002. Yellow numbers are likewise placed on the stack in node 002. Green numbers are stored temporarily in node 002 memory until they are compiled into the image as literals. At the same time, a @p instruction is included in the compiled code for each literal.

Green words are first looked up in the dictionary in node 204, which contains the defined words and their execution addresses. If the word is found, nodes 102 and 002 are requested to compile a call to the corresponding execution address into the image in node 301. Otherwise, the system hangs and needs to be rebooted. Though not very user friendly, this is a much simpler solution than implementing an error recovery mechanism.

Yellow words are first looked up in the green dictionary in node 204. If the word is found, its execution address is placed on the compiler's stack in node 002. Otherwise, it is looked up in the yellow dictionary in node 005. If it is found there, the corresponding action is executed in nodes 005 and 004. If it is found in neither dictionary, the system hangs.
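The lookup order for green and yellow words can be modelled on a single machine as below. This is a rough sketch only: Python dictionaries stand in for the dictionaries held in nodes 204 and 005, an exception stands in for the system hanging, and all names are hypothetical.

```python
class CompilerHang(Exception):
    """Stands in for the system hanging on an undefined word."""


def compile_green(name, green_dict, image):
    """Compile a call to a defined word, or hang if it is undefined."""
    if name not in green_dict:
        raise CompilerHang(name)
    image.append(("call", green_dict[name]))


def execute_yellow(name, green_dict, yellow_dict, stack):
    """Try the green dictionary first, then the yellow one."""
    if name in green_dict:
        # A defined word: push its execution address on the compiler's stack.
        stack.append(green_dict[name])
    elif name in yellow_dict:
        # A compiler word: run its action.
        yellow_dict[name](stack)
    else:
        raise CompilerHang(name)
```

Note that, with this order, a user definition in the green dictionary shadows a yellow compiler word of the same name.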

Finally, when a red word is encountered, which indicates the beginning of a new definition, a request is sent to node 102 to fill all unused slots of the instruction word currently being compiled with . (nop) instructions, compile all literals waiting in node 002 into the image, and return the updated instruction pointer. This pointer is sent, together with the word's name (its last three characters), to node 204 with a request to enter it as a new definition in the green dictionary.

Two streams of data flow from node 104: one, carrying tokens and green words, goes via the LEFT port to node 102; the other, carrying numbers and yellow words, goes through the DOWN port to node 002.

The stream of tokens is examined in node 103 in order to process flow control words. Those that can be defined as f18 instructions (begin, end, next, if, and -if) are compiled from here. Resolution of forward references and compilation of the more complex flow control words (ahead, leap, zif, till, -till, for, then, else, when, and -when) are carried out in node 003.

Nodes 102 and 002 form the heart of the compiler. Node 102 keeps the instruction word being compiled, as well as the instruction and slot pointers. It compiles f18 instructions into the appropriate slots, as well as calls to green words. Whether an instruction word has enough bits available for a call is tested in node 002. It either returns the address in the right format to be incorporated into the current instruction word, or it instructs node 102 to fill the rest of the word with . (nop) instructions and place the call into the next instruction word. Node 002 also keeps green numbers in its memory before compiling them into the image as literals. Furthermore, it uses its stack to keep yellow numbers as well as destination addresses of flow control words, and it provides words for resolving forward references. This node also contains some words used to build an ether path to the target node used by the Link module.
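The interplay between slot filling and the call-fitting test can be illustrated with the following simplified model. It is a sketch under stated assumptions, not the code of nodes 102 and 002: the class and its methods are hypothetical, the address-field widths per slot (10, 8, and 3 bits) are assumed rather than taken from this page, "fits" is reduced to a plain size comparison, and instruction words are kept as tuples instead of being encoded in 18 bits.

```python
# Assumed address-field width when a call starts in slot 0, 1, or 2.
# A call cannot start in slot 3 in this model.
ADDR_BITS = {0: 10, 1: 8, 2: 3}
NOP = "."


class WordBuilder:
    def __init__(self):
        self.image = []  # finished instruction words
        self.slots = []  # slots filled so far in the current word

    def flush(self):
        """Pad unused slots with nops and close the current word."""
        if self.slots:
            self.slots += [NOP] * (4 - len(self.slots))
            self.image.append(tuple(self.slots))
            self.slots = []

    def fits(self, address):
        """Simplified test: does the address fit the field left in this word?"""
        width = ADDR_BITS.get(len(self.slots))
        return width is not None and address < (1 << width)

    def compile_op(self, op):
        self.slots.append(op)
        if len(self.slots) == 4:
            self.flush()

    def compile_call(self, address):
        if address >= (1 << ADDR_BITS[0]):
            raise ValueError("address does not fit even in slot 0")
        if not self.fits(address):
            self.flush()  # fill the rest with nops, use the next word
        self.slots.append(("call", address))
        # The address takes up the rest of the word, so the word is complete.
        self.image.append(tuple(self.slots))
        self.slots = []
```

For example, compiling an instruction followed by a call to a small address keeps both in one word, whereas the same call after two instructions forces nop padding and a fresh word.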

Compiled instruction words and literals are passed to node 001, which places them into the image in node 301 or into the memory used for initialization code in node 201. This node also handles compilation of constants (with the , word), and performs tail recursion with the help of code in node 000. Tail recursion detects a ; (return) instruction immediately following a call to a green word, and replaces such a call with a jump.
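The tail recursion rule amounts to a small peephole rewrite, sketched below on a symbolic instruction list. The representation and the function name are hypothetical, and dropping the return after the rewritten call is an assumption of this sketch (the return would be unreachable after a jump).

```python
def tail_call_optimize(code):
    """Replace each ('call', addr) immediately followed by ';' with ('jump', addr)."""
    out = []
    for instr in code:
        prev_is_call = bool(out) and isinstance(out[-1], tuple) and out[-1][0] == "call"
        if instr == ";" and prev_is_call:
            # Assumption: the return is dropped once the call becomes a jump.
            out[-1] = ("jump", out[-1][1])
        else:
            out.append(instr)
    return out
```

A return that does not directly follow a call is left untouched, so only true tail calls are rewritten.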

Node 101 builds an ether message from the image and the initialization code stored in nodes 301 and 201, respectively, and, together with some words from node 001, delivers the whole compiled code via Link to the target node. This action ends the compilation.