Sun May 31 15:22:19 CEST 2009

the 14-bit VM

I'm pasting the source code of the more exotic VM in this post as a
backup.  The code will be modified in place to move to a simpler DTC
(direct-threaded code) architecture.  There are 3 files:

dtc-control-i.ss      on-target immediate words (untested)
dtc-control-m.ss      same, but using host macros
dtc.ss                core interpreter



----------- dtc-control-i.ss -----------
#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

\ On-target immediate words implementing the control words.
staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute
staapl pic18/dtc


\ This needs "comma" and a way to back-patch words.  The idea is to
\ compile to a RAM buffer first, and transfer it to FLASH when it's
\ done.
staapl pic18/double-comma

macro
: _address  word-address lohi ;
forth

: _mask    #x3F and ;
: _lmask   _mask #x40 or ;
: _compile _mask _, ;  \ takes word address as 2 bytes
: _literal _lmask _, ;
: _0       0 0 ;

\ These compile an unconditional and a conditional jump.
: _jump,   ' _run    _address exitbit _compile ;
: _0=jump, ' _0=run; _address         _compile ;


\ Jumps are proper primitives.  They take a single argument which we
\ compile as a literal.
: _hole    _here@ _0 _literal ;
: _lpack   _>> _lmask ; \ pack byte address as literal
: _then    _>r _here@ _lpack _r> _! ;  \ patch hole

: _if      _hole _0=jump, ;
: _else    _>r _hole _jump, _r> _then ;

: _begin   _here@ ;
: _again   _lpack _, _jump, ;
: _until   _lpack _, _0=jump, ;


\ COMPLICATIONS: because of the exit bit, jump targets need to be
\ protected so the previous instruction doesn't get exit-tagged.  See
\ _space in dtc-control-m.ss.
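
Note (not part of dtc-control-i.ss): the hole and back-patch scheme used by _if, _else and _then above can be modeled off-target. The following is a minimal Python sketch under stated assumptions, not the Staapl implementation: the compile buffer is a plain list indexed by word address (so the byte-to-word packing done by _lpack disappears), and the numeric values for RUN and ZERO_RUN are hypothetical stand-ins for the addresses of _run and _0=run;.

```python
# Minimal model of hole back-patching; not the Staapl implementation.
EXIT, LIT, MASK14 = 0x8000, 0x4000, 0x3FFF

# Hypothetical word addresses standing in for the _run and _0=run; primitives.
RUN, ZERO_RUN = 0x0100, 0x0101


class Compiler:
    def __init__(self):
        self.code = []    # compiled 16-bit cells; list index = word address
        self.holes = []   # compile-time stack of hole addresses

    def here(self):
        return len(self.code)

    def prim(self, xt, exit=False):
        self.code.append((xt & MASK14) | (EXIT if exit else 0))

    def literal(self, n):
        self.code.append(LIT | (n & MASK14))

    def hole(self):
        # Remember where the placeholder literal lives, then compile it.
        self.holes.append(self.here())
        self.literal(0)

    def then(self):
        # Patch the most recent hole with the current address.
        self.code[self.holes.pop()] = LIT | (self.here() & MASK14)

    def if_(self):
        self.hole()
        self.prim(ZERO_RUN)

    def else_(self):
        pending = self.holes.pop()     # hole left by if_
        self.hole()                    # new hole for the jump over the else branch
        self.prim(RUN, exit=True)      # unconditional jump = exit-tagged run
        self.holes.append(pending)
        self.then()                    # if_ hole now points at the else branch


# if <0x0010> else <0x0011> then, with two hypothetical primitives as bodies
c = Compiler()
c.if_(); c.prim(0x0010); c.else_(); c.prim(0x0011); c.then()
print([hex(cell) for cell in c.code])
```

Each jump is a literal cell followed by a jump primitive, so patching only ever rewrites a literal cell and never touches the primitive itself.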

---------- dtc-control-m.ss -----------
#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

\ Macros implementing the control words.  For a self-hosted
\ interpreter these need to be replaced by immediate words.

staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute
staapl pic18/vm-core

macro

\ note: XTs need to be word addresses, since I have only 14-bit
\ literals. The return stack still contains byte addresses though, so
\ for now it's kept abstract.



\ create a jump label symbol and duplicate it (for to and from)
: 2sym>m      sym >m m-dup ;

\ jumps are implemented as literal + primitive (instead of reading
\ from instruction stream)

: m>jmp    m> literal ' _run     compile _exit ;
: m>0=jmp  m> literal ' _0=run;  compile  ;

: _begin    2sym>m m> label: ;   \ back label
: _again    m>jmp ;              \ back jump
: _until    m>0=jmp _space ;     \ conditional back jump

: _if     2sym>m m>0=jmp ;                  \ c: -- label1
: _else   2sym>m m>jmp m-swap m> label: ;   \ c: label1 -- label2
: _then   m> label: _space ;        \ c: label --

: _space  ' _nop compile ; \ necessary when 'return' needs to be isolated.

\ : _for    _2sym>m m> label ' do-for compile ; \ c: -- label
\ : _next   _m>literal ' do-next compile _space ;

: _for
    ' _>r compile
    _begin ;
: _next
    ' do-next compile  m>0=jmp
    ' _rdrop compile
    _space ;
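
Note (not part of dtc-control-m.ss): the reason _then and _until compile a _nop via _space can be shown with a small model. This Python sketch assumes, as in dtc.ss, that _exit sets the EXIT bit on the most recently compiled cell; the NOP value and the primitive 0x0010 are hypothetical stand-ins.

```python
EXIT = 0x8000
NOP = 0x0001   # hypothetical word address standing in for _nop


def place_label(code, isolate):
    """Return the address a forward jump will target at the current position."""
    target = len(code)
    if isolate:
        code.append(NOP)   # what _space does: give the target its own cell
    return target


def tag_exit(code):
    """Model of _exit: set the EXIT bit on the last compiled cell."""
    code[-1] |= EXIT


# Body of a conditional branch (one hypothetical primitive), then 'then' and ';'.
unsafe = [0x0010]
unsafe_target = place_label(unsafe, isolate=False)
tag_exit(unsafe)

safe = [0x0010]
safe_target = place_label(safe, isolate=True)
tag_exit(safe)
```

Without the extra cell, the exit tag lands on the last instruction of the branch body and the jump target points past the end of the word, so the path that skips the body never sees an exit. With the _nop in place, both paths pass through the same exit-tagged cell.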



------------ dtc.ss -------------
#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute

\ ************************************************************************

\ A direct-threaded composite code interpreter.  It has a number of
\ small differences from standard Forth.  The idea is that this will
\ run a version of Forth without parsing control words, using quoted
\ code instead.


\ *** CONTINUE resumes the execution of the VM, more specifically the
\ program pointed to by IP.  A program is an array of primitive
\ instructions.  Primitive instructions are primitive code (word)
\ addresses + a continuation discard bit (EXIT bit).  IP is
\ implemented by TBLPTR (the f register).

\ *** I want to express iteration using TAIL RECURSION.  This means
\ the caller needs to pass the proper continuation to the callee on
\ the RETURN STACK, discarding the current thread if necessary.  For
\ this purpose, one 'EXIT' bit will be reserved in the instruction
\ field, and the interpreter loop will pop the stack before calling
\ the next primitive.

\ *** A continuation can be invoked by RUN, so there is no distinction
\ between programs and continuations.  A continuation takes a data
\ stack as argument, just like ordinary programs.  RUN is the dual of
\ forth's EXECUTE, which is used here to invoke primitives.

\ *** The machine return stack is reserved for the underlying STC
\ forth / machine code.  The VM uses the STC retain stack as return
\ stack, to limit interference.

\ *** To treat composite code as a primitive, an array of primitive
\ instructions needs to be prefixed by a machine code element 'CALL
\ enter', which will save the current continuation (IP) and invoke a
\ new one.  This 'enter' could be duplicated if a large address space
\ is spanned, so a short branch can be used.

\ *** The interpreter is explicit: this is done so that primitives do
\ not need to end in NEXT, as is done traditionally, enabling the use
\ of native/STC primitives.  All 16-bit primitives are prefixed with
\ '_' (underscore) so they are easily mapped and debugged in STC
\ forth.


\ TODO: some modifications.
\ - all data sizes used (literals, primitives, composite) fixed at 14bit
\ - interpreter runs on top of memory model: composite code in ram possible


\ ************************************************************************

\ IP + RS
\ instruction pointer manipulation. only the ones that affect the
\ machine return stack and machine flags need to be macros. the rest
\ can be functions for ease of debugging.

macro
: @IP+  @f+ ;  \ read bytes from the instruction stream
forth

: _IP!
    _<< fh ! fl ! ;   \ store to IP

: enter \ asm (rcall ENTER) wraps composite code in prim
    _IP>r
    TOSL fl @!  \ TOS cannot be movff dst, but src is ok
    TOSH fh @!
    pop ;

: _>r    >r >r ;
: _r>    r> r> ;
: _rdrop rdrop rdrop ;


\ These 2 words govern the format in which threaded addresses are
\ stored on the return stack. For return stack tricks to work, these
\ are taken to be word addresses.

: _IP>r              \ save current IP to VM RS
    clc
    fh @ rot>>c +r !
    fl @ rot>>c +r ! ;

: _r>IP              \ pop IP from VM RS
    clc
    r- @ rot<<c fl !
    r- @ rot<<c fh ! ;


\ INSTRUCTION FORMAT + INTERPRETER / COMPILER

macro

\ in flash, code is stored as [EXIT | LIT | DATA]. the shift left when
\ reading will move EXIT->carry and LIT->negative flags.

: exit?    c? ;
: literal? n? ;
: prim@/flags         \ fetch next primitive from composition
    \ clc             \ low bit is ignored by PIC
    @IP+ rot<<c        \ this sets c and n flags
    @IP+ rot<<c  ;


\ compilation macros. DTC is compiled by mapping it to native forth in
\ symbolic form. these macros implement the encoding.

: pow2      1 swap <<< ;
: set       pow2 or ;
: mask      pow2 1 - and ;

: exitbit   15 set ;
: litbit    14 set ;
: mask14    14 mask ;


: literal  mask14 litbit   ,, ;  \
: compile  word-address ,, ;     \ xt --
: _exit    dw> exitbit  ,, ;
: _;       _exit ;

\ utility macros
: _c>>     rot>>c 2nd rot>>c! ;
: _<<c     2nd rot<<c! rot<<c ;

forth

\ inner interpreter loop
: continue
    prim@/flags                 \ fetch next primitive + set type flags
    exit? if _r>IP then         \ c -> perform exit
    literal? if 14bit ; then    \ n -> unpack literal
    execute/b continue ;        \ execute primitive

: 14bit \ interpret doubleword [ 1 | 14 | x ] as a signed value.
    _c>>                 \ [ x | 1 | 14 ]
    #x3F and             \ high bits -> 0
    1st 5 high?
    if #xC0 or then      \ high bits -> 1
    continue ;

: _bye      pop              \ quit the inner interpreter
: _nop      ;



\ trampoline entry. 'interpret' will run a dtc primitive or primitive
\ wrapped program.


: bye>r      enter ' _bye compile _exit
: interpret     \ ( lo hi -- )
    bye>r       \ install continuation into dtc code "bye ;"
    execute/b   \ invoke the primitive (might be enter = wrapped program)
    continue ;  \ invoke threaded continuation


\ CONTROL FLOW WORDS

\ 'run' is the dual of 'interpret'. it takes threaded code addresses. in
\ combination with the exit bit, this can be used to implement
\ conditional jumps.

: _run \ word-addr --
    _IP>r
    _IP! ;

\ : _0=run \ flag addr --
\     _run
\     or nz? if _r>IP then
\     drop ;


\ "go" = "run ;"

\ i don't want to use the word 'jump', but conditional jump is not the
\ same as conditional run.

: _0=run;  \ ? program --
    _>r
    or nz? if
	_rdrop
    else
	_r>IP
    then drop ;



forth
: do-next \ -- ?
    _r> _1- _dup
    _>r _0= ;
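
Note (not part of dtc.ss): the [EXIT | LIT | DATA] cell layout and the sign extension done by 14bit can be checked on the host. This is a Python sketch of the encoding only; the function names are made up for illustration, and the flag-based decoding on the PIC (shift left, then test carry and negative) is replaced by plain bit masks.

```python
EXIT, LIT, MASK14 = 0x8000, 0x4000, 0x3FFF


def encode_prim(word_address, exit=False):
    """Primitive cell: 14-bit word address, optionally exit-tagged."""
    return (word_address & MASK14) | (EXIT if exit else 0)


def encode_literal(n):
    """Literal cell: LIT bit plus the low 14 bits of n (two's complement)."""
    return LIT | (n & MASK14)


def sign_extend14(v):
    """Interpret a 14-bit field as a signed value (bit 13 is the sign)."""
    return v - 0x4000 if v & 0x2000 else v


def decode(cell):
    """Return (exit, is_literal, value) for one 16-bit cell."""
    is_literal = bool(cell & LIT)
    data = cell & MASK14
    return bool(cell & EXIT), is_literal, sign_extend14(data) if is_literal else data
```

Masking to 14 bits means literals outside -8192..8191 wrap silently, which matches what the mask14 macro does at compile time.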


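
Note (not part of dtc.ss): the inner interpreter loop, and the way an exit-tagged run becomes a tail call, can be modeled as follows. This is a Python sketch under several assumptions: stacks are Python lists, code is a dict from word address to cell, primitives are looked up by hypothetical small numbers rather than executed at machine addresses, and a None sentinel on the return stack replaces the "bye ;" continuation installed by bye>r.

```python
EXIT, LIT, MASK14 = 0x8000, 0x4000, 0x3FFF

# Hypothetical primitive numbers; on the target these would be word addresses.
DUP, ADD, INC, RUN = 1, 2, 3, 4


def sign_extend14(v):
    return v - 0x4000 if v & 0x2000 else v


def p_run(vm):
    vm.rs.append(vm.ip)      # save the current continuation
    vm.ip = vm.ds.pop()      # and continue at the given threaded code


PRIMS = {
    DUP: lambda vm: vm.ds.append(vm.ds[-1]),
    ADD: lambda vm: vm.ds.append(vm.ds.pop() + vm.ds.pop()),
    INC: lambda vm: vm.ds.append(vm.ds.pop() + 1),
    RUN: p_run,
}


class VM:
    def __init__(self, code):
        self.code = code     # dict: word address -> 16-bit cell
        self.ds = []         # data stack
        self.rs = []         # VM return stack
        self.ip = None

    def interpret(self, addr):
        self.rs.append(None)             # sentinel continuation: stop the loop
        self.ip = addr
        while self.ip is not None:
            cell = self.code[self.ip]
            self.ip += 1
            if cell & EXIT:              # discard the current thread first
                self.ip = self.rs.pop()
            if cell & LIT:
                self.ds.append(sign_extend14(cell & MASK14))
            else:
                PRIMS[cell & MASK14](self)
        return self.ds


DOUBLE = 10
word_double = {10: DUP, 11: ADD | EXIT}

# "21 double ;" with the call in tail position, and "21 double 1+ ;" without.
tail = VM({0: LIT | 21, 1: LIT | DOUBLE, 2: RUN | EXIT, **word_double})
plain = VM({0: LIT | 21, 1: LIT | DOUBLE, 2: RUN, 3: INC | EXIT, **word_double})
tail_result = tail.interpret(0)
plain_result = plain.interpret(0)
```

Because the exit bit pops the return stack before the primitive runs, the exit-tagged run in the first program pushes back the caller's continuation instead of its own. The return stack therefore does not grow across the call, which is what makes iteration by tail recursion workable.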
