This is the dev log for the Staapl project (aka Brood).

You probably want the edited blog:
http://zwizwa.be/ramblings/staapl-blog

For more information see 
http://zwizwa.be/staapl

---------------------------------------------------------------------

WARNING. This file is a dev log. It contains notes about problems I
encounter during the learning and development process, and mainly
serves as an archive and sounding board for myself.

It is largely unedited. Some notes might only make sense in my mind,
especially if it's about things I don't understand yet. Some notes are
just plain wrong, and this is not always indicated.

It is a story about how concrete ideas grow out of a puddle of mud,
how a series of mistakes and half-assed understanding can lead to
something beautiful in the end. This description of the process is
something you rarely find in research papers. However, I've always
found texts and talks about mistakes and the process of correcting
them much more valuable than any shiny end result. This text is my
contribution to that part of knowledge, and like most blogs, can be
extremely boring if you don't know what to ignore.

The single rule for this log is: I am allowed to delete embarrassing
erroneous entries if I explain exactly what went wrong in the
reasoning.

---------------------------------------------------------------------

Staapl consists of:

* Scat: a set of macros for PLT Scheme implementing a family of
  dynamically typed concatenative languages usable inside scheme code.

* Coma: a COmpositional MAcro language: an extension of the Scat
  language with data types representing target code, and a
  specification syntax to define target code pattern matching
  primitives.

* Control: Extension of the Coma language with Forth-style control
  primitives based on conditional and unconditional jumps, useful for
  low level programming.

* Comp: a compiler that instantiates Coma+Control macros to produce a
  code graph structure and performs a posteriori optimizations.

* Asm: a straightforward multipass relaxation assembler with arbitrary
  expression evaluation in terms of target addresses and a high level
  opcode definition language.

* Forth: parsing extensions for representing classic Forth syntax +
  PLT scheme language layers.

* Purrr: uses the Coma+Control core and Forth syntax to implement a
  template for a minimalistic Forth macro language.

* Pic18: uses the Purrr template specialized to the Microchip PIC18
  architecture, to implement a PLT Scheme #lang.



The idiomatic forth language and the ideas behind the peephole
optimizing compiler have remained fairly constant since the 1.x
version. The Forth dialect is implemented as a collection of macros:
functions which operate on a stack of assembly instructions, and a
stack of nested control structures.
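The macro-as-function idea described above can be sketched as follows. This is a toy illustration in Python, not the actual Staapl code; the opcode names are made-up, loosely PIC18-flavoured assumptions.

```python
# Toy sketch: a "macro" is a function from a stack of assembly
# instructions to a new stack. Opcode names are illustrative only.

def dup(asm):
    # 'dup' compiles to a push of the working register
    return asm + [("movwf", "POSTINC0")]

def drop(asm):
    # peephole optimization: 'drop' directly after 'dup' undoes it
    if asm and asm[-1] == ("movwf", "POSTINC0"):
        return asm[:-1]
    return asm + [("movf", "POSTDEC0", "w")]

print(drop(dup([])))  # dup followed by drop compiles to nothing: []
```

The point is that optimization happens at compile time, as a side effect of how each macro inspects the code already emitted.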

The thing which changed throughout the versions is the use of better
(higher) abstractions to implement the basic ideas, and factor the
design into simpler components.

Note that this project is much more about functional programming and
PLT Scheme than about Forth. The Forth dialect is used mostly as a
practical macro assembler on top of the clean Coma
architecture. However, the end result is nice for playing with PIC18
chips, and it is used for real-world work.


Entry: monads
Date: Sun Jan 28 14:43:30 GMT 2007

EDIT: i clearly didn't get it here.. much of the monad stuff i talk
about is not what you find in Haskell. the thing is: a stack that's
threaded through a computation behaves a bit like a monad. it solves
some of the same practical issues, especially if you start to use
syntax transformations that can convert pure functional code to code
that passes a hidden top element. but it doesn't have the power of a
generic bind operation. i talk about this later though.



i think i finally start to get the whole monad thing. in layman's
terms: it is centered around splicing together (using 'bind')
functions that take a simple object to a container.

in what i'm trying to accomplish, this is just compilation: take in
some code and output an updated state.

maybe i should give up on the whole CAT thing after all, and
concentrate on using scheme and some special structures to actually
create a proper language and macro language. i already have a way to
write concatenative code in scheme without too many problems (see
macro.ss)

the layer is probably just not necessary.. scheme is more powerful,
and everything i now do in CAT i could try to move over to a virtual
forth: write everything from the perspective of the forth
itself.. something like: 'spawn this process on the host'.

another thing that's wrong with CAT is the lack of decent data
structures. it's overall too simple for what i'm trying to do: proper
code generation on top of a simple basic language.

let's go back to my recently restated goals for BROOD:

* basis = forth VM on a microcontroller for simplicity
* write cross compiler in a language similar to forth
* use an FP approach for the compiler language


the middle one can take on different forms. i still think it is very
beneficial to have an intermediate language to express forthisms. but
this language can just be embedded in scheme stuff.

so let's just start to build the thing, right?



Entry: CAT design
Date: Sun Jan 28 17:28:56 GMT 2007

[EDIT: Sun Oct  7 00:20:32 CEST 2007
- the pattern matching grew to a 'quasi algebraic types' construction
- from forth -> machine code there are now a lot more passes
- the shared forth elimination is made machine specific.]

the design is quite classic forth, but it might be simplified a
bit. CAT consists of the following pipeline:

      (1)	  	    	       (2)
forth --> peephole optimized assembler --> absolute bound machine code

currently (1) is the compiler while (2) is the assembler. it might be
more interesting to actually split it up in two parts. introduce a
peephole optimizer that can separate out the forth compiler to a
higher level and the assembler to a lower, machine specific level,
making it a bit more like a frontend/backend multiplatform compiler.

also, several things could be made declarative. peephole optimization
is basically pattern matching. currently CAT implements it as a pretty
much imperative process: if the last instruction is dup, then drop
undoes dup, etc..

given the target we are using, it is possible to completely write the
assembly language in forth style.

so, summary: split the peephole optimizer in 2 parts:
- shared forth elimination (as a result of macro expansion)
- machine specific assembler optimization



Entry: declarative peephole optimizations
Date: Sun Jan 28 17:38:31 GMT 2007

basically, this is a rewriting system. currently i use a tree structure
for this (ifte). this is a list of transformations:

(
 [(dup drop) ()]
 [(dup save) ()]
 [(save drop) ()]
 [(1 +) (inc)]
 [(1 -) (dec)]
 [((lit?) +) ()]
 )

what to do with things that do not fit this? for example
literals.. i really do need predicates. actually, i should make a
list of all optimizations to make it a bit more clear. currently
the code is way too dense.
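a rule table like the one above could drive a generic rewriter. here is a minimal Python sketch of the idea (not the CAT implementation; predicates like lit? are left out):

```python
# Each rule rewrites the tail of the output code. After appending an
# instruction, keep applying rules until the tail stabilizes.

RULES = [
    (["dup", "drop"], []),
    ([1, "+"], ["inc"]),
    ([1, "-"], ["dec"]),
]

def rewrite(code):
    out = []
    for ins in code:
        out.append(ins)
        changed = True
        while changed:
            changed = False
            for pat, rep in RULES:
                n = len(pat)
                if out[-n:] == pat:
                    out[-n:] = rep      # replace the matching tail
                    changed = True
    return out

print(rewrite(["dup", "drop", 1, "+"]))  # -> ['inc']
```

since every rule here shrinks the code, the inner loop always terminates; expand/collapse rules would need the loop-prevention care discussed in the next entry.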


Entry: rewriting
Date: Mon Jan 29 00:26:40 GMT 2007

funny.. a google search on rewriting led me to the pragmatic
programmer. maybe i shouldn't read joel on software. especially not
his rant about never scrapping a whole project and starting
over.

anyways.. there are some serious things wrong with the way i'm trying
to solve the compilation / optimization problem. i'm using a massive
tool and am still writing in the guerrilla hack style most forths are
written in. i have proper data structures now, so why not use them?
why not make some minilanguages that do special tasks?

it would be interesting to start moving functionality over to the lamb
core as soon as possible. most of the code is optimization though,
so..

i'm curious about this rewriting business.. looks like there's
something to learn there. i know it works in a naive way, since that's
what i already have. but i'm curious if this can be taken
further. with faster static code it might be possible to do a whole
lot more (non deterministic stuff).

some problems to face are literals and some..

one thing that worries me is 'how to prevent loops'.  i know, if
things get smaller, there can be no loops, but i can imagine some more
fancy expand/collapse rules that might start looping with a naive
approach.

looking at the optimizations, most are about reducing stack juggling
and moving it to register transfers. this is almost universal on all
machines. i do need to think a bit about a sort of 'base line forth'
that will be the end of the optimizations, such that eventual
compilation is straightforward. this seems like an elegant solution.



Entry: purely compositional approach (joy)
Date: Fri Feb  2 12:35:56 GMT 2007

whenever program text is read, it is immediately compiled. each symbol
is replaced by its particular function, and each constant is replaced
by a function that pushes the constant.

sym   -> (lookup sym)
data  -> (lambda stack (cons data stack))

to get back the value of a data atom (practical issue), you pass any
list, apply the function, and pop off the data.

this can be done at compile time.

so, can we do with data structures composed entirely of functions?
probably yes.. probably this isn't even such a bad idea..

it looks like it is not a good idea to map composition to single
lambda expressions, but to have an interpreter for it instead, so we
can implement things like CAR efficiently: it is possible to implement
CAR on an abstract function which represents a list by 'testing' it on
a stack, however, this is a lot worse than just getting the left
element of the first pair..

then, why not represent constants by constants instead of their
wrapping functions? back to square one..
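the mapping above (symbols to their functions, constants to push-functions) can be sketched in Python. this is an illustration of the idea, not CAT code:

```python
# parse: replace each symbol by its function, each constant by a
# function that pushes it. run: apply the functions in sequence.

def parse(program, dictionary):
    def lift(const):
        return lambda stack: [const] + stack   # constant -> push function
    return [dictionary[w] if isinstance(w, str) else lift(w)
            for w in program]

def run(code, stack):
    for f in code:
        stack = f(stack)
    return stack

DICT = {"+":   lambda s: [s[1] + s[0]] + s[2:],
        "dup": lambda s: [s[0]] + s}

print(run(parse([1, 2, 3, "+", "+"], DICT), []))  # -> [6]
```

to recover the value of a wrapped constant you would indeed have to apply it to some stack and pop, which is the practical issue raised above.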



Entry: jit compiler + parser
Date: Fri Feb  2 15:24:47 GMT 2007

if i'm absolutely sure that function names are static, it's possible
to use a jit compiler without sacrificing this semantic property:
leave them as symbols until they are encountered, then compile
them. this would also eliminate the problem of forward/backward
declarations etc.
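a Python sketch of this lazy compilation (illustration only): cells stay symbolic until first executed, then the cell is overwritten in place with its compiled function, so definitions can arrive after the program text is read.

```python
# JIT idea: resolve a symbol only when execution reaches it, and
# memoize the result in the code list itself.

DICT = {}

def run(code, stack):
    for i, cell in enumerate(code):
        if isinstance(cell, str):          # still symbolic: compile now
            cell = code[i] = DICT[cell]    # overwrite cell in place
        stack = cell(stack)
    return stack

prog = ["one", "one", "add"]               # read before definitions exist
DICT["one"] = lambda s: [1] + s
DICT["add"] = lambda s: [s[1] + s[0]] + s[2:]
print(run(prog, []))                       # -> [2]
```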

this seems to work very well in the simple first experiments..

the parser seems to work too: parse code = list of things. if one of
the things is a list, parse it and wrap it in a lambda.

so what about closures?


Entry: rewriting
Date: Sat Feb  3 10:52:54 GMT 2007

i'm working on the rewriting, and it looks like this is ideal to use a
compositional mini language for.. so i've quickly extended the 'run'
function to take a 'compiler' argument which will resolve symbols to
functionality, still using the jit compiler.

the 'compiler' term could be used to give context (lexical) dependent
information about symbols. however, it should really be tied to the
stored code then..

the idea is to represent pattern matchers as ordinary composed code,
but using a special compiler (macro) to instantiate them.

so this gives the list of problems for today:
- solve lexical compilation issues
- how to 'execute' from within a primitive


lol. this is again exactly the same thing as i already had: each forth
word is a macro :)

so the problem reduces to the lexical thing (namespaces) and how to
compile a generic pattern matcher into a macro.



Entry: do i really need lambda?
Date: Sat Feb  3 11:24:04 GMT 2007

what i need is local names, just for the sake of code organization and
different sublanguages. i don't need lambda really. i don't need
runtime binding of symbols to names. the whole idea of
combinatory/compositional/concatenative languages is to eliminate
variable names...


Entry: macro semantics
Date: Sat Feb  3 12:37:36 GMT 2007

i have something like this now:

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup (lambda (asm . stack)
                       (pack (pack `(movwf POSTINC) asm) stack)))

which can be executed as

> (run (parse macros '(dup dup)) '(()))
(((movwf POSTINC) (movwf POSTINC)))
> 

so in this, compilation is the execution of one program to produce
another program. let's stay in the forth syntax as long as possible,
and rewrite this to:


(define-syntax forth
  (syntax-rules ()
    
    ((_ output () (rwords ...))
     (pack rwords ... output))

    ((_ output (word words ...) (rwords ...))
     (forth output (words ...) ('word rwords ...)))

    ((_ output (words ...))
     (forth output (words ...) ()))))

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup (lambda (out . stack)
                       (pack
                        (forth out (POSTINC1 movwf))
                        stack)))

> (run (parse macros '(dup dup)) '(()))
((movwf POSTINC1 movwf POSTINC1))
> 



Entry: dynamic code
Date: Sat Feb  3 12:45:23 GMT 2007

note that once something is run, it will be compiled in place and
can never be accessed as data again. it is important to make it
impossible to do things like

     (1 2 3 + +) dup run

and then use the 2nd copy. this is easily solved by first creating a
copy though, but sort of defeats the current way the JIT compiler
works. maybe i can make sure that the 'run' word, which is the
interface to the internals, always makes a copy of a list whenever it
encounters dynamic code?

to summarize

- parsed lists are safe. they are always pure code and can never be
  interpreted as pure data.

- anything that's 'run' at run time is not, so here a copy needs to be
  made.

the solution here is that 'parse' is necessary to run symbolic code:

    (1 2 3 + +) parse run

and 'parse' already makes a copy of the list, since it is functional.

NOTE:  (define (list-copy x) (append x '()))

so i defined   interpret <- (parse run)



Entry: bind and stuff
Date: Sat Feb  3 13:17:03 GMT 2007

i think it's starting to dawn on me.. the disadvantage of functions
like the above is that state accumulation is explicit: there is this
chunk of accumulated state on the top of the stack while none of the
functions actually need it to be there. enter the 'bind' concept. in
order to get rid of these arguments, you define a macro according to:
( -- code ), but automatically lift it to ( code -- code ).

ok.. this seems to work. what i have now is a way to generate simple
substitution macros.
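the lifting trick can be sketched like this (a Python illustration; the expansion for 'dup' is a made-up PIC18-ish opcode):

```python
# lift: turn a macro written as a plain code fragment ( -- code )
# into a function that appends to the accumulated code on the top
# of the stack ( code -- code ).

def lift(fragment):
    def macro(stack):
        code, rest = stack[0], stack[1:]
        return [code + fragment] + rest
    return macro

dup = lift([("movwf", "POSTINC0")])   # hypothetical expansion

# two applications accumulate two expansions in the code list:
print(dup(dup([[]])))  # -> [[('movwf', 'POSTINC0'), ('movwf', 'POSTINC0')]]
```

none of the macro definitions mention the accumulated state explicitly; that is the bind-like flavour noted above.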


Entry: rewrite macros
Date: Sat Feb  3 15:54:08 GMT 2007

the next step is rewrite macros. this should be done in two steps. in
order to make a single 'intelligent' macro, different patterns need to
be combined into one function, and one function needs to have
information about different patterns. a sort of 'transpose'.

- make a list of rewrite patterns
- compile it into code

rewrite macros are more easily understood as operating on output forth
code.

i don't know. ok.. time to be stupid then. state the previous
solution, then abstract it.

previous solution was explicit

(dup drop) -> ()

drop is a function that discards anything that comes before it and
produces a value (without other side effects: it's important to write
macros so that the last operation is a mutation)

so drop needs to be intelligent.

ok. it's easy enough to implement this in exactly the same way as in
CAT/BADNOP. however, there should be a more high-level construct that
eliminates the explicit if-then things.



Entry: compositional languages suck
Date: Sat Feb  3 17:07:00 GMT 2007

it's a feast to use them to glue things together, but more complicated
things are more easily expressed using lambdas.. i think the approach of
writing the core algorithms in scheme with full fire power, and
keeping the language itself mainly for interaction, is a valid one.

compositional languages are cool because they lift you from the burden
of having to name things, and allow you to think in terms of structure
(more geometrically) vs. random connections in parametrized things.


Entry: more lifting
Date: Sat Feb  3 17:44:56 GMT 2007

i ran into a new class of functions. i already had

  . -> code

which are just constants. now i have

  code  -> code

which are code transformers that need to look at the current generated
code state (never the source code!)

i added the default resolver for macros to be a quote to the forth
output stack. i do need to change the way other types are handled
though. it looks like this is better solved earlier in the process. to
keep the JIT compiler like it is, the parser could be adapted to
already compile constants to quoting procedures.

this works nicely.



Entry: now for the meta stuff
Date: Sat Feb  3 21:16:12 GMT 2007

some questions remain. how to generate more boilerplate for some kinds
of peephole optimizations, and how to check if it is actually possible
to optimize towards a 'core forth' that can be straight compiled to
assembly.

let's find out by systematically porting some macros.

the main question is: what about arguments?

    123 ldl

means load literal in the top of stack register.



Entry: cat snarf
Date: Sun Feb  4 00:16:56 GMT 2007

porting stuff from old cat to new cat. seems to work really well. not
having state on stack to deal with makes things a lot easier..

but for badnop this means the database needs to be designed in a
proper way. maybe for the assembler we use some kind of dictionary as
a state?

the thing is.. i'd like to keep as much as possible of the 'functional
OO' that was present in CAT. this makes it possible to do parallel stuff and
backtracking in a very easy way, especially now that it's kind of
fast.



Entry: intermediate language
Date: Sun Feb  4 10:51:20 GMT 2007

since i am optimizing for a register machine, it might be best to
write the rewriter in terms of the register machine primitives i
used in BADNOP.

the main thing to decide is: is it easier to optimize code like this:

    	 1 2 +

or this
	 (dup) (lda 1) (dup) (lda 2) (call +)

it's definitely easier to do the former.. maybe i should implement the
assembler now, so i can see this a bit clearer.


the problem i'm trying to solve is: rewrite forth in such a way that
assembly becomes trivial. things which make this problematic are
folded constants: constants that are already bound to a machine
operation as a literal. maybe i should just write them as 'pseudo
forth code' but group them, like this:

         1 2 +   ->   dup  (drop 1) dup (drop 2) 

here every grouped instruction is meant to be replaced later by one
machine opcode. the advantage of this approach is that there are no
'self-quoting' things in the code after a first pass.

considering the targets i'm using don't have an instruction to do
dup+ldl in one go, i guess this idiomatic approach is a valid one.

it is probably better to do this in more phases:

1 forth based semantic substitution (rewrite)
2 conversion to idiomatic representation (compile)
3 direct mapping from idiomatic cells to assembly code (assemble)


because 3 can be made invertible, it's possible to easily decompile,
flatten and semantically optimize back!



Entry: pattern matching
Date: Sun Feb  4 13:41:49 GMT 2007

having a look at the plt pattern matching code. i really need this
kind of stuff :) the basic thing is

(match  x  (pat expr) ...)

when x matches one of the pat, the corresponding expr is evaluated
with symbols of pat bound to values in x.

ok.. seems to work pretty well. but i still need to find out how to
reverse a pattern.


Entry: next
Date: Sun Feb  4 16:06:45 GMT 2007


* find out how to reverse patterns
* lift rewriters above 'rest'
=> compile patterns

* assembler
* state


Entry: compiling patterns
Date: Sun Feb  4 16:09:43 GMT 2007


i need to make my own pattern language to compile substitutors from a
more high-level definition like

((dup drop)  ())
((a b +)     (,(+ a b)))
((a b xor)   (,(bitwise-xor a b)))
((a not)     (,(bitwise-xor a -1)))
((a negate)  (,(* a -1)))

((dup dup (drop a) !)  ((,a sta)))
((dup (drop a) !)      ((,a sta) drop))

using syntax-case. hence the latter 2 expressions will be merged into
one '!' rewriter macro.

as a preparation i can already try to see if all macros fit in this
category. yes, they do.

but i need to solve the problem of type matching first, since the
arithmetic above only works if the numbers are immediate. type
matching is part of the match.ss language, but i didn't figure out yet
how to also bind a matched item to a name..

i think i solved the rewriter problem for the pattern language
above. just need to sort out some macro issues; probably best to use
syntax-case with some explicit rewriting.



Entry: syntax transformation
Date: Sun Feb  4 22:48:32 GMT 2007


1. one pattern i've been trying to solve is this

(definitions
	(some (special) structure here)
	(same (special) structure there))

it's easy enough to write the first transformation, but how do you do
the next one without having to explicitly recurse using names etc.. ?

in other words: "now just accept more of the same".
this seems to be the answer:

(define-syntax shift-to
  (syntax-rules ()
    ((shift-to (from0 from ...) (to0 to ...))
     (let ((tmp from0))
       (set! to from) ...
       (set! to0 tmp))    )))

one ellipsis in the pattern for every ellipsis on the same
level. or something like that.. i need to explain this better.


2. what's the real significance of the argument of syntax-rules ?

"Identifiers that appear in <literals> are interpreted as literal
identifiers to be matched against corresponding subforms of the
input."


3. how to get plain and ordinary s-exp macro transformers using
define-syntax?

i was thinking about something like this:

(define-syntax nofuss
  (syntax-rules ()
    ((_ (pat ...) expr)
     (lambda (stx)
       (match (syntax-object->datum stx)
              ((_ pat ...)
               (datum->syntax-object stx expr)))))))

(define-syntax snarf-lambda
  (nofuss (args fn)
          `(lambda (,@(reverse args) . stack)
             (cons (,fn ,@args) stack))))


but that doesn't work, since nofuss is not defined at expansion
time.. but it should work. there has to be some way of doing this.




Entry: pattern language 
Date: Mon Feb 5 11:30:59 GMT 2007

seems to work. some problems remaining though:
- default clause
- literal/parameter
- fix 'rest'
- type specific match

done


Entry: remaining problems
Date: Mon Feb  5 21:26:33 GMT 2007

two big problems remaining. the assembler and state storage. the
assembler is a bit nasty. lots of tiny rules to obey.. i wonder if i
can make something coherent out of this.

the state store is tightly coupled to the assembler. here i can
probably do another trick of accumulating the dictionaries using some
binding functions.

what are the tasks of the assembler?
-> creating instruction codes from symbolic rep
-> resolving all names to addresses (2 pass?)
-> making sure all jumps have the correct size

the result of assembly is a vector, and some updated symbol
tables. the input is optimized and idiomized forth code that can be
straight translated.

it would be nice to use nondeterministic programming for choosing jump
sizes, but that's probably overkill: moving code down is probably easier.

some observations: 

* if i only shrink jumps instead of expanding them, the effects are
  always local: no bounds are violated, but the solution might be
  sub-optimal. for backward jumps, the correct size is known, only
  forward jumps need to be allocated.

* a proper topological representation which indicates where jumps go
  and where they come from is a good thing to have: a single cell has:
  - list of cells that go from here: max 2
  - list of cells that come here
  - instruction + arguments

* maybe that's overkill. since it can always be generated if analysis
  is necessary.. what about this:

  - assume incremental code: old code is not going to be jumping to
    new code.

  - within a code block under compilation, forward jumps are possible,
    so they need to be allocated: use maximum size. however, they
    should be rare: function definitions could be sorted beforehand?

  - recursively work down from the current compilation point, and
    adjust all jumps. backtrack if necessary. this can be done in a list

* yep.. it's probably simplest to just perform the 2 ordinary steps of
  backward then forward address resolution, and add as many passes as
  necessary to resolve the shrinking.

* there are 2 x 2 x 2 types of jumps wrt. a single cell collapse
    abs/rel x start:before/after x finish:before/after

    -> relative : adj if they cross the border
    -> absolute : adj if they end past border



Entry: tired
Date: Mon Feb  5 21:44:26 GMT 2007

probably all a bit over my head atm. feeling a bit sleepy. maybe do a
bit of cleaning. like writing fold in terms of match instead of if
etc..

(define (fold fn init lst)
  (match lst
         (()       init)
         ((x . xs) (fn x (fold fn init xs)))))

(fold + 0 '(1 2 3))

good coffee :)

i'm feeling a bit ambitious.. instead of writing a flashforth based
standard forth, it might be feasible to try to write a functional
programming language for the micro.



Entry: ((dup cons) dup cons)
Date: Tue Feb  6 03:43:13 GMT 2007

just added 'compose', then i found out the quine in the title doesn't
work any more. it does work in joy, so what's up? the problem here of
course is that consing 2 quoted programs does not give a quoted
program: these are abstract data types and not lists.. manfred must be
using some explicit definition of cons on quoted programs somewhere,
or i don't really understand his list semantics.

the problem with mine is that quoted programs are in fact lists, but
they have a header containing a link to the symbol resolver.

so, am i missing something about Joy? is the: "quoted program is list"
necessary? have to check that.

in the meantime, i can get to quines by defining 'qons'.

there is a possibility of embedding lex information inside the list,
so instead of

   (lex a b c) -> ((lex a) (lex b) (lex c))

which might even be better, since it allows for mixing of dicts. this
also makes it possible to use a simpler interpreter, since the lex
state doesn't need to be maintained.

hmm.. tried, but too tired. but something is rather important. having
lists and programs on the same level as in joy is nice, but requires a
single semantics. since i already almost automatically introduced 3
kinds of semantics for symbolic code (cat, forth rewriter and forth
compiler) this is not really feasible.

so lists and programs should probably be separate entities, where
programs are abstract and not dissectable, but compose and qons are
defined to do run-time composition.

list	 program
concat	 compose (qoncat)
cons	 qons

where qons will take 2 quoted programs, and concatenate the quotation
of the first with the second, or (a) (b) -> ((a) b)

however, if i change the interpreter as mentioned above, the two
columns will be identical, and programs can be manipulated just as
lists, without giving up any other functionality.


so, good to learn: CAT is not really Joy because

- i'm using fully quoted lists instead of quoted programs to represent
  data lists: there is a clear separation between code and data.

- joy probably uses numbers directly in the lists, which i can't due
  to different number semantics for cat and purrr. for me, they have
  to be encapsulated in number objects.

- parsing from symbolic -> code is explicit because of different
  semantics: this allows reuse of interpreter for different mini
  languages.



Entry: interpreter cleanup
Date: Tue Feb  6 11:33:15 GMT 2007

instead of using set-car!, it might be better to use delayed
evaluation for the instruction type: everything evaluates to a
procedure. that way it stays functional.

ok. this seems to work: () and nil are the same now. plus there is a
structural equivalence: since compilation from symbol ->
procedure/promise is 1-1, the size of a compiled program (list size)
is equal to the original code list size.

ok.. now ((dup cons) dup cons) still doesn't work!!

the reason is that nested lists in code get executed.. is this still
valid? something smells here..

let's first change the other parser/compilers..

ok, if i swap around the quoting such that new definitions are always
wrapped in a closure that will call 'run', i should be safe.

this works, but i run into a difficulty:
i cannot unquote a program wrapped in a closure (interpret quoted stack)

the solution to this is to change back the semantics of 'parse' -> it
will return a quoted program instead of an executable one, and put the unquoting in 'def'.

i had to change this:

(define-word run     (code . stack) (run code stack))

to this

(define-word i        (code . stack) (run-quoted code stack))
(define-word execute  (code . stack) (run-unquoted code stack))


to only take quoted programs, not pure closures. 'execute' is only
there for completeness, since it is rarely needed (it is equivalent to
(nil cons i))

maybe that's again one of the key points? meaning, to make a very
clear distinction between quoted programs = aggregation of primitives,
and primitives.

ok.. this looks like it's working. this makes the language a bit more
introspective. it looks like the quine works too now.

this was quite surprisingly non-trivial!

so what do i learn?

------------------------------------------------------------------- 
it is necessary to explicitly distinguish QUOTED PROGRAMS from
PRIMITIVES. the latter is a black box, but the former is a list of
primitives. this structure is NOT recursive!
-------------------------------------------------------------------

having quoted programs obeying the list interface adds very flexible
introspection. this probably means that the difference from Joy is
purely syntactic now.


huh.. ((dup cons) dup cons) broke again...

ok. the reason is that after constructing a program from another
program, you need to 'compile' it before it can be run with 'i', so
the quine is relative to 'interpret' and not to 'i'.

i'm going to switch back to my previous notation and use 'run' instead
of 'i'. the conclusion here is:

       --------------------------------------------
       { QUOTED programs } is a subset of { LISTS }
       --------------------------------------------

i think this is not the case in Joy. whenever you operate on a quoted
program using list construction, you need to 'compile' it to a program
again. this is a projection from the set of lists to the subset of
quoted programs.

so 'compile' really needs to be a projection, meaning (compile
compile) and (compile) are equivalent.


it is possible to change this simply by having run-unquoted cons
everything to the stack that's not executable. however, keeping this
explicit allows more wiggle room in the semantics of different
sublanguages. or maybe better: it is cleaner, since there is no
'default' behaviour (the way the 'cond' is set up in run-unquoted
allows overriding.. that's not so clean).

so, final word. the interpreter implements:
* interpretation of a list of primitives as NOP,TC,RC
* lazy evaluation of primitives (JIT: delayed compilation)

all the rest needs to be implemented in the source transformers.

NOTE: [dup cons] is the Y combinator, sort of..



Entry: assembler
Date: Tue Feb  6 16:48:07 GMT 2007

alright. the assembler. finding instructions is the trivial part. the
hard part is finding addresses of jumps.

1. resolve backward references  (+ find instructions)
2. resolve forward references

if there are multiple size jump instructions, and there are relative
and absolute jumps, extra passes can be added that resolve efficient
allocation of these. allocating all forward references with maximum
range, and adjusting them one by one seems to be the best approach:

3-N. shrink forward references if necessary.

things get complicated if forward small offset relative jumps are used
in compilation, since constructs to work around this are necessary. i
need to find a way to abstract this kind of behaviour. basically mapping

A   O-O-O-O-O-O
     \_______/

to
      ___
     /   \ 
B   O-O-O-O-O-O
       \_____/

or the other way around. for PIC18 going from B to A reduces the jump
distance by 2, since the long jump is eliminated.

it can probably just be kept at 1 & 2 for now, with all jumps equal
size, and fixed later.
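passes 1 and 2 can be sketched like this (a python toy with invented names; all jumps are one word, and the relative/absolute distinction is ignored): pass 1 records label addresses and emits placeholders for forward jumps, pass 2 patches the placeholders.

```python
# minimal two-pass sketch: backward references resolve immediately,
# forward references are collected and patched afterwards.

def assemble(program):
    labels, code, patches = {}, [], []
    for ins in program:
        op = ins[0]
        if op == 'label':
            labels[ins[1]] = len(code)    # backward refs resolve here
        elif op == 'jmp':
            target = ins[1]
            if target in labels:          # backward reference
                code.append(('jmp', labels[target]))
            else:                         # forward reference: placeholder
                patches.append((len(code), target))
                code.append(('jmp', None))
        else:
            code.append(ins)
    for index, target in patches:         # pass 2: patch forward refs
        code[index] = ('jmp', labels[target])
    return code

asm = assemble([('jmp', 'end'), ('label', 'top'), ('nop',),
                ('jmp', 'top'), ('label', 'end')])
assert asm == [('jmp', 3), ('nop',), ('jmp', 1)]
```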



Entry: bookkeeping
Date: Wed Feb  7 09:49:20 GMT 2007

the other problem is the bookkeeping. for the assembler i basically need
a symbol table, which means the dictionary object from cat can be
reused, and some binding operations need to be devised.

there are things to separate:
- labels     (write accumulate, read random)
- 'here'     (read/write random)
- asm output (write only)

it is probably easier to do most of this in scheme, together with some
syntax. let's see.

maybe best to do everything in 3 steps:
1. assembly to polish notation, keeping symbol names

2. forward symbol resolve
3. backward symbol resolve

the last 2 are stateful, the first one is just pattern matching.



Entry: lifting problem
Date: Wed Feb  7 10:25:31 GMT 2007

how to call a generic prototype function within the body of a to-be
lifted prototype? this is still one of the bigger problems i had when
writing old cat: macros cannot be executed from the stack!


Entry: screen scraping
Date: Wed Feb  7 12:41:41 GMT 2007

ok, that was fun: using emacs macros to convert text pasted from the
pdf datasheet into lisp code :) it doesn't work very well though. i
think i should just get the data from gpasm, hoping it's a bit more
structured. (in the end i just typed it in.)





Entry: great success!
Date: Wed Feb  7 14:51:36 GMT 2007

writing the assembler, and i'm realizing something. scheme is really
cool :)

but i'm not sure if scheme is the core of what i'm finding cool. i
think it's pattern matching. since a compiler is mainly an expression
rewriter, this comes as no surprise in hindsight.

the biggest mistake in the previous brood system was to attempt the
problem without pattern matching constructs. brood's approach (and the
previous badnop) is really too low-level.

for expression rewriting, lexical bindings are a must. since the
permutations involved are mostly nontrivial, performing them with
combinators instead of random access parameters is a royal pain, and
the resulting code is completely unreadable.

i think this can be distilled in yet another "why forth?" answer, but
in the negative. if the task you are programming involves the encoding
of a very tangled data structure, then a combinator language is a bad
idea, since you have to factor the permutation manually.

so it's about this: forth is bad at encoding fairly random or ad-hoc
permutation patterns like you would find in a language
compiler/translator.


and, don't forget: match & fold are your friends!


Entry: assembler working
Date: Wed Feb  7 21:14:25 GMT 2007

at least the part that's not doing symbol resolution. now for the
interesting part: the assembler has some state associated:

- dictionary
- current assembly point

which has to be dragged along. i was wondering how hard it might be
to solve this with some closure tricks in scheme..
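the closure trick can be sketched like this (a python toy; make_assembler and the operation names are invented): the dictionary and the assembly point live in one closure, and the returned operations share it, so neither needs to be dragged along explicitly.

```python
# sketch: assembler state (dictionary + 'here' pointer) captured in
# a closure shared by the returned operations.

def make_assembler(origin=0):
    state = {'here': origin, 'dict': {}}
    def label(name):
        state['dict'][name] = state['here']   # record current address
    def emit(word):
        state['here'] += 1                    # advance assembly point
        return word
    def lookup(name):
        return state['dict'][name]
    return label, emit, lookup

label, emit, lookup = make_assembler(origin=0x100)
label('start')
emit(0x0000)
assert lookup('start') == 0x100
```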


Entry: lambda again
Date: Thu Feb  8 09:47:23 GMT 2007

trying to get my head around this lambda thingy.. there are a couple
of problems; the most important one is the decision whether lambda
should be a form or a function.

* form: everything is compiled at compile time. this means lambda has
  to be a parser macro, and the only way to do that consistently is to
  have it be a prefix macro. this would compromise the semantic
  simplicity of the language by introducing syntax.

* function: lambda does runtime compilation, in which case the lexical
  environment has to be bound to the compiled representation of the
  lambda call. it also introduces runtime compilation. speed wise this
  is no problem, since all dictionary lookups are postponed till later
  anyway, but conceptually it is different again.

maybe the latter is the lesser of the two evils. 'lambda' still needs
to be a parser exception since it needs to capture the parse
environment. so lambda is really delayed parsing. maybe that makes
more sense.

ok, following this:
- the argument needs to be symbolic, not a quoted program. (raw source)
- nested lambda's will work
- the run time part is called 'apply'

now, what does a compiled lambda expression look like?

   '(A B C) '(foo B bar) lambda  ->  (bind-C bind-B bind-A foo B bar)

that's the easy part. now, where is the storage? clearly, storage is a
runtime thing, so we can change the code to:

   (alloc bind-C bind-B bind-A foo B bar)

now 'B', for which code is generated at compile time, needs to know
where to find this storage. what about just putting it on the top of
the stack, and modifying all code that's not accessing the parameters
to ignore the bindings?

some problems here with passing the lexical state to
subprograms.. wait: this is always done by 'parse'. it's ok to think
about lexical scope as dynamic scope of the parser.

but... passing stuff on the data stack is kind of dangerous, since all
subforms which have lambdas will do the same, so how do the inner
forms find the values of the outer variables?

the only real solution is probably to have the interpreter pass around
an environment pointer..

maybe that's a good point to just stop, and leave out lambda entirely.




Entry: monads
Date: Thu Feb  8 19:18:49 GMT 2007

i guess it's safe to say that 'bind' really is 'lift' as i defined it:
take a function that maps values outside into the monad, and turn it
into a function that can be composed.


Entry: lambda again
Date: Fri Feb  9 10:37:59 GMT 2007

let's see.. what does lambda do? actually two things:
* functions as values (delayed evaluation)
* locally (lexically) defined names

i already have the first one as quoted programs. so the problem i
should be solving is not the lambda problem comprising both
subproblems, but only the latter subproblem: lexical variable
binding. this is forth's "locals".

some more ideas: write the interpreter in Y-combinator form
(CPS?). this would allow the interception of invocations, basically
allowing any kind of binding of the state that's passed around. maybe
this is the interesting problem for today?
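the CPS idea can be sketched as a python toy (invented names, nothing from core.ss): each word receives the rest of the computation as an explicit continuation, which is exactly the hook needed to intercept invocations and bind extra state.

```python
# CPS interpreter sketch: a word is stack, continuation -> result.

def interpret(program, stack, k=lambda s: s):
    if not program:
        return k(stack)
    word, rest = program[0], program[1:]
    # the continuation of this word is "interpret the rest, then k"
    return word(stack, lambda s: interpret(rest, s, k))

def trace(word):
    """wrap a word; the wrapper is the interception point."""
    def traced(stack, k):
        return word(stack, lambda s: k(s))
    return traced

push1  = lambda s, k: k(s + [1])
double = lambda s, k: k(s[:-1] + [2 * s[-1]])
assert interpret([push1, double, double], []) == [4]
assert interpret([trace(push1)], []) == [1]
```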

btw. i ordered friedman's "Essentials of Programming Languages"; first
edition, got it very cheap on amazon. Now reading "The Role of the
Study of Programming Languages in the Education of a Programmer."
Done. Gives me a bit of good faith that i'm on the right track. I just
need to study and experiment more.. and learn to smell ad-hoc
solutions.

One of the things the paper mentions is that it is a good thing to
learn to implement your own abstractions / language extensions /
... and to invest some time into learning the general abstract ideas
behind language patterns, mainly (automatic) correctness preserving
transformations.

It looks like the approach Friedman suggests is kind of radical. I'm
doing this from a Forth and Lisp perspective for quite a while now,
but it looks like i am getting stuck in certain simple
paradigms. Rewriting BROOD kicked me out of that and made me think
about better approaches, adopting pattern matching, a static language
and lazy compilation.


The idea with PF as one of the BROOD targets is probably a good
idea. It's going to be a hell of a problem to tackle though.


things to try:

- convert the dynamically bound code in BADNOP to something i can run on
  the new core.ss : this approach seems like a nice one and i can't
  really say why.. there's the idea that dynamic binding is bad, but
  it's quite handy from time to time (i use it in PF C code all over
  the place). why is this? and what should be the proper construct?

- see what CPS can bring. for one, it should make control structures a
  lot easier to implement. so THAT is what i was looking for. obvious
  in hindsight. but how to do this practically?



Entry: re re re
Date: Fri Feb  9 16:20:40 GMT 2007

so next actions.
1. is scoping important / feasible / desirable?
2. should i solve the assembler purely monadic?

one great advantage of NOT using static (or dynamic) scoping is the
independence of context. it does make a whole lot of sense to actually
just write the components as simple functions, and combine them
later.

what i have already is the core of the assembler: simple n-argument
functions generated from an instruction set table. these functions
return a list of opcodes for the instruction.

currently this is executed as:

(define (assemble lst)
  (map
   (match-lambda

    ;; delay assembly
    (('delay . rest) rest)  

    ;; assemble    
    (((and opcode (= symbol? #t)) . arguments) 
     (apply (find-asm opcode) arguments))
    
    ;; already assembled
    (((and n (= number? #t)) . rest) `(,n ,@rest)) 

    ;; error
    (other (raise `(invalid-instruction ,other))))
   
   lst))


instead of writing this as a map which is independent, i should write
it as a for-each (an interpreter which accumulates state changes).

ok that was easy enough: the interpreter is split into 2 parts: one
that does pure assemblers (independent of state), which are the ones
generated from the instruction set table, and one that does impure
ones.

now for the disassembler. it's probably easiest to organize this as a
binary tree decoder. the argument decoding could be done working on
the binary representation string.



Entry: values
Date: Fri Feb  9 20:23:22 GMT 2007

i never understood why 'values' would be useful. well, i think i
understand now..

to compose 2 functions A and B
A   (x y z) -> (x y z)
B   (x y z) -> (x y z)

one would need to write
(apply B (A 1 2 3))   , with A returning a list

using values this becomes something like

;; values
(call-with-values
    (lambda ()
      (call-with-values
          (lambda ()       (values 1 2 3))
        (lambda (x y z)  (values (+ x 1) (+ y 1) (+ z 1)))))
  (lambda (x y z) (values z y x)))

;; lists
(apply
 (lambda (x y z) (list z y x))
 (apply
  (lambda (x y z) (list (+ x 1) (+ y 1) (+ z 1)))
  (list 1 2 3)))


i'm not convinced about the values thing.. lists are easier for
debugging: they don't require a special call. i think what's easier to
read is a straight composition, where every function passes a
list to the next one, which is then appended to a list of
arguments, like this:

(chain `(,ins ())
       (dasm 1)
       (dasm 2))

maps to

(apply dasm
       (append '(2)
               (apply dasm
                      (append '(1) `(,ins ())))))


(define-syntax chain
  (syntax-rules ()

    ((_ input (fn args ...))
     (apply fn (append (list args ...) input)))

    ((_ input (fn args ...) more ...)
     (chain (chain input (fn args ...)) more ...))))

(chain `(,257 ())
       (dasm 4)
       (dasm 4))



ok. i got the disassembler body working. now still need to do the
search..

this binary tree search looks fancy, but is it really necessary?
it might even be simpler actually.

ok. i need some binary trees for that.. just made some code, but
it's kind of clumsy: the tree is created on the fly if some nodes
do not exist. less efficient, but easier to do is probably to
generate a full tree, and then just use set to pinch off a
subtree somewhere.

ok.. dasm seems to work.

some minor issues with parsing multiple word instructions
though.. will have to change the prototype.

so the next step is to move some code to runtime, and to unify
the dasm and asm: basically they do the same: convert between bit
strings and lists. the real 'problem' is the permutation of the
formal symbolic parameters into the order they occur in the bit
string.


Entry: asm/dasm cleanup
Date: Sat Feb 10 09:34:48 GMT 2007

fix the multiple instruction problem: it's probably easier and
cleaner to have one symbolic instruction correspond to exactly
one binary word. all the targets i have in mind are
risc-like. multiword instructions are then handled as multiword
opcodes.

once this is done, the asm and dasm pack/unpack could be combined
into one single 'interpreter'.

ok. maybe it's best to stop here. it's not 'perfectly clean', but
i guess what's left of the dirtiness can easily be cleaned up when i
encounter another instruction set that's not compatible with this
approach.

another thing i need to consider, or at least need a 'reason for
ignorance' for, is: "why am i not generating pic assembly
code?". the reasons are 1. full control, 2. have dasm available
in core for debug. 3. easier incremental assembly & linking.

adding support for text .asm output is rather trivial.

ok...

next: branches

the two passes, fairly simple.
1. backward branches can be immediately resolved.
2. forward branches need to be postponed.

this is a combination of the directives 'relative' 'absolute' and 'delay'


Entry: PIC18 compiler
Date: Sat Feb 10 12:44:07 GMT 2007

time for the crown jewel :)

but first, i need to clean up the core.ss register code to accept
an abstract store with default. ok done.

i don't like the way i've got the generic register compiler and
the PIC18 compiler completely separated. it is good to share
code, but in this case, the sharing can probably be done better
by just copy/pasting the patterns, or at least, inserting them
from a common include.

what about keeping the register compiler as a general purpose
example and figure out how to do proper sharing once i have
different architectures running?

yep.. i think it's best to keep that idiomatic compiler for other
experiments, and go straight for a proper pattern matching peephole
optimizer.


Entry: more PIC18 compiler
Date: Sun Feb 11 09:10:13 GMT 2007

i think i made a mistake by writing it as just a pattern
compiler.. this thing should be a proper language with recursion,
otherwise i can't implement recursive substitution macros and other
language patterns: one machine that maps forth straight to asm.

the only preprocessing stage should be the reducer, which folds
expressions like '1 2 +'. even better, this reducer should be part of
the compiler too, so that expanded macros benefit directly from this.

summarized: separate reduction and expansion phases might lead to
suboptimal performance: it's probably best to condense all this into a
single phase, and make an extensible pattern matcher.

this would be the same design as before. there are more of these: the
little interpreters for macro mode etc.. it was pretty good already it
seems. just the global variable thing was a mistake.

ok.. it probably pays to make the pattern matcher programmable. add a
minilanguage there too.

NEXT:
* control structures
* extend pattern language

the latter is not so trivial in the current implementation, since a
nice thing to have would be a 1 -> many mapping. i could use a special
'splice' word for this, though maybe it's best to just work around it.

another thing i'm thinking is: now that i'm no longer afraid of this
pattern matching business, why don't i write my own? this would make
it possible to do some of this at runtime, making it a bit more
flexible for additions etc..

time to take a break.


ok.. what i have is 2 conflicting operations: a pattern replacement
and a reverse. this needs to be sorted out properly: what exactly do i
want the programmable part to do?

ok.. it seems to work now. needs cleanup. i'm really curious about
runtime though.. probably these are all written in terms of the syntax
expander, and need to be syntax?



Entry: merging dictionaries
Date: Sun Feb 11 14:10:36 GMT 2007

i'm trying to port the intelligent macros now.

long standing problem.. should you merge macro and bottom language
dictionaries, or keep them separate? i think the best way to go is to
manually import or link what you need.

about variables and allocation. i think it's easier to just use
variable names for this, and shadow them when they are changed. then
after a compilation is done, the whole dictionary could be
filtered. the other option is to use a functional store like before,
which might be a good idea anyway.

NEXT: 
- functional store (it's cleaner, and might come in handy later)
- conditionals + optimization
- for loops + loop massage


Entry: stateful macros
Date: Mon Feb 12 01:25:53 GMT 2007


let's see if i actually learned something.. basically, i have two
options now: write all the macros as explicitly handling the asm
buffer, or have them spit out just a list of instructions.

i don't think there is any code that has to look back to the past asm
state: all words that do that are written as pattern matching partial
evaluators.

so, let's write all control structs as producers, just like the other
macros.

so.
i think i sort of disentangled the problem:

------------------------------------------------------------------
If there is a lot of state that has to be dragged along, split all
operations into classes that operate only on substates, or have a
simple, consistent way of operating on state, like concatenation.
Then, lift all these subclasses to a common interface that can be
composed.
------------------------------------------------------------------

The thing i'm using is really the Writer Monad.
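what 'lift-stateful' amounts to can be sketched like this (a python toy with invented names): a macro just returns a list of instructions, and lifting makes the append implicit, so lifted macros compose like anything else.

```python
# writer-monad sketch: producers return instruction lists; 'lift'
# hides the append so producers compose over an accumulated code list.

def lift(producer):
    """turn (args -> [instructions]) into (code, args -> code)."""
    def lifted(code, *args):
        return code + producer(*args)
    return lifted

def dup():  return [('dup',)]
def plus(): return [('addwf',)]

code = []
code = lift(dup)(code)
code = lift(plus)(code)
assert code == [('dup',), ('addwf',)]
```

the 'trivial work' that gets abstracted away here is exactly the concatenation.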



Entry: monads
Date: Mon Feb 12 00:53:53 GMT 2007

about a year ago i made a decision to use a functional dynamic store
to solve the problem of state, because i didn't understand the idea
behind monads. this was a mistake, but i guess a necessary one. i
probably wasn't ready for the ideas at that time.

now i think i sort of get it. monads (haskell style) are about
dragging along state implicitly.

the irony is, i implemented that!

what i did was to have an implicit state object being dragged along
as a top of stack element, invisible to some computations. this is the
'State Monad'.

the mistake is: this is too general. it's better to use a smaller gun
to solve the problem at hand on a more local scale, instead of basically
using a state machine model (albeit one without destructive mutation).

the small gun is mostly related to the 'Writer Monad'. the operation
that's made implicit is 'append'. i call this 'lift-stateful'. this
covers most cases, together with some other state dragging (if the
data stack is not used, it can be dragged along: some operations, like
the pattern matching peephole optimizer, work on the produced code as
a stack).

the thing that's really interesting though is this: if you start to
think about forth as a compositional language, then this whole monad
thing is nothing more than a way to 'lift' words so they can be
composed in linear code.

basically. if the things you want to compose are operations A x B -> A x
B, but what you have is operations like

A -> A
B -> B
A -> B
A -> A x B
B -> A x B
A x B -> A
...

together with a higher order function (hof) that will correctly lift
them to A x B -> A x B, then what you're doing is abstracting away the
trivial parts of such a map in this hof.

for the writer monad, the trivial part is 'append'. replace 'trivial
work' with 'hard work' and you get this:

http://lambda-the-ultimate.org/node/1276#comment-14113

"By using a monad, a simplified interface to the necessary
functionality can be provided, while the hard work of maintaining and
passing the context is handled behind the scenes."


so, what i need to do is to work out some abstractions so i can
perform this kind of magic in straight cat without having to resort to
scheme code.


Entry: backtracking
Date: Mon Feb 12 08:35:51 GMT 2007

in 2.x there are the for .. next macros that perform an optimisation
for which a decision has to be made early on. does it make sense to
use 'amb' for this?

probably yes, because explicit undo is going to be more expensive than
just going back to a previous point and re-running the compilation..

the tricky part is to keep it under control :) in an interactive
interpreter, where state can be accumulated on the stack, having
lingering continuations in the backtracking stack might be dangerous,
since 'fail' effectively erases all changes made since the last
success.

i've provided 2 low-level words:
kill-amb!     reset the backtracking engine
amb	      make a nondeterministic choice from a list

the code in amb.ss supports (possibly infinite) lazy lists in case i
ever need them.

so. let's make 'amb' binary. this way it's easier to implement lazy
amb by embedding another call to amb in one (or both) of the
alternatives. yep. this looks like a better idea.

haha. keep it under control! i've just been chasing a 'bug' where amb
apparently didn't return properly; in fact it was just waiting for
input: the continuation had a 'read' in it, and the fail depended on a
previous read, so it just wanted that read again. so conclusion:

-------------------------------------------
be careful with amb and non-functional code
-------------------------------------------

i fixed the 'cpa' "compile print assembler" loop to read lines instead
of words, so at least the backtracking is ok on a line base.
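the binary amb shape can be sketched in python (invented names; python has no call/cc, so this only rewinds the choice itself, not the whole computation like the continuation-based version in amb.ss): 'fail' forces the most recently saved alternative, and laziness comes from nesting another amb inside a thunk.

```python
# toy binary amb: a stack of saved alternative thunks stands in for
# captured continuations.

class Fail(Exception):
    pass

choice_points = []

def amb(first, rest_thunk):
    choice_points.append(rest_thunk)   # remember the lazy alternative
    return first

def fail():
    if not choice_points:
        raise Fail('no more alternatives')
    return choice_points.pop()()       # force the saved alternative

def kill_amb():
    choice_points.clear()              # reset the backtracking engine

x = amb(3, lambda: amb(4, fail))       # lazy: nested amb in the thunk
if x != 4:
    x = fail()
assert x == 4
```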



Entry: commutation
Date: Mon Feb 12 16:44:30 GMT 2007

there are a lot of places where just swapping the order of
instructions might be beneficial. i ran into a bug where it is not
possible, although on first sight the operations seem independent:

	((['movlw f] 1-!) `([decf ,f 0 1] [drop]))
	
because 'decf' has an effect on the flag that's used in the macro for
'next', this is not always correct! drop, being movf, sets the Z and N
flags. decf does set the carry flag though, so that could still be
used. for now i've disabled the optimization..



Entry: next actions
Date: Mon Feb 12 16:52:47 GMT 2007


- conditions
- variables
- constants in assembler


a variable allocation is just a dictionary operation, so it really
should be an assembler step. i need to think about that a
bit. something's wrong...


Entry: bored
Date: Mon Feb 12 23:18:29 GMT 2007

let's play a bit. 

generators.. a generator is easiest understood as something
which, when activated, returns a generator and a value. in other
words: a generator is a lazy list.

(((3) 2) 1)

is a finite generator

manfred von thun has an interesting page about using reproducing
programs as generators:
http://www.latrobe.edu.au/philosophy/phimvt/joy/jp-reprod.html


i wonder how to do this in lisp?
suppose fn is a state update function

(fn init) -> generator


(define (gen fn init)
  (lambda ()                    ;; a generator is a thunk...
    (cons init                  ;; ...returning (value . next-generator)
          (gen fn (fn init)))))

in cat it's quite simple too

(gen (2dup run swap gen) qons qons) ;; (init fn -- gen)

as mentioned by manfred in
http://www.latrobe.edu.au/philosophy/phimvt/joy/j05cmp.html
this is related to the Y-combinator: basically, a generator or lazy
list is a delayed recursion.

so in cat, applying 'run' to a lazy list, has the same result as
applying 'uncons' to a list.


Entry: misc ramblings
Date: Tue Feb 13 12:06:14 GMT 2007



i'm going to change terminology a bit so it's more Joy like, if
only for the reason that it makes joy code easier to read.

duck -> dip

http://www.nsl.com/papers/interview.htm

There is a ... combinator for binary (tree-) recursion that makes
quicksort a one-liner:

    [small] [] [uncons [>] split] [swapd cons concat] binrec

then for-each:
i need to find the more abstract pattern, which is 'fold'.

what about a fold over a lazy list?



Entry: lazy lists
Date: Tue Feb 13 12:22:52 GMT 2007

right now i use them in (amb-lazy value thunk), where 'value' is
returned immediately, and thunk will be evaluated later.

the question remaining is that of interfaces. if i say "a lazy
list" do i mean thunk or (val . thunk) ?

(there is another question about using 'force' and 'delay'
instead of explicit thunks. for functional programs there is no
difference, but for imperative programs there is. maybe stick to
thunks because they are more general.)

i think 'amb-lazy' should be seen as a 'cons' which contains only
forcing, and leaves the delay operation to the user. i provide 'amb'
to construct a full list from this. unrolled it gives:

(amb-lazy first
          (lambda ()
            (amb-lazy second
                      (lambda ()
                        (amb-lazy third
                                  (lambda () (fail)))))))


for generic lazy lists: maybe using 'force' and 'delay' is
better, since it allows 'car' and 'cdr' to trigger the
evaluation. this enables the definition of lazy-car and lazy-cdr
without fear of multiple evaluations that have different
results, and it still allows for non-functional lists.
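why a memoizing promise is safer than a raw thunk for non-functional code can be sketched like this (python toy; 'delay' here is a hand-rolled stand-in for scheme's delay/force): forcing twice evaluates the body only once, so lazy-car and lazy-cdr can never observe two different results.

```python
# memoizing promise sketch: the first force caches, later forces reuse.

def delay(thunk):
    cell = {'forced': False, 'value': None}
    def force():
        if not cell['forced']:
            cell['value'] = thunk()    # evaluate exactly once
            cell['forced'] = True
        return cell['value']
    return force

evaluations = []
p = delay(lambda: evaluations.append('x') or 42)
assert p() == 42 and p() == 42
assert evaluations == ['x']            # the side effect ran once
```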

ok.. cleaned it up a bit, and moved most of it to lazy.ss. lazy
operations have a '@' prepended to the name of the associated
strict operations. i have @map, but @fold doesn't make sense since it
has a dependency in the wrong direction.

i should also change ifold to something else.. there has to be a
proper lisp name for it. i renamed it to 'accumulate'. makes more
sense. (accumulate + 0 '(1 2 3))

the corresponding lazy processor makes sense, but only if it returns
the accumulator on each call. so it's more like
'accumulate-step'. it's better to just create the @integral
constructor, which gives a new list from an old one.


Entry: had this idea
Date: Tue Feb 13 21:28:02 GMT 2007

can you do something like:
1. resolve label
2. oops can't do. save 'cons' but continue
3. run all pending conses with the obtained info.

now, this isn't much different than storing all unresolved symbols in
a table and later fix them, only this stores actions. (don't set a
flag, set the action!)

more specifically, suppose there's the input

     		x x x y z z z

where y is not resolvable. the way to solve this is to have y run z z
z and then try to resolve y and concatenate the results. basically just
swapping the order of execution..

something that could be done is to make the assembler essentially
2-pass, where the first pass performs normal assembly, but on the fly
creates its reverse pass which just resolves the necessary items and
works its way backwards.

talking about overengineering :)
a simple 2 pass is probably good enough.

but still..  this is more efficient, since the reversing which would
happen in an explicit 2-pass is not necessary + the scanning of things
already compiled can be avoided.

so:   x1 x2 x3 lx y1 y2 ly z1

-> (... z1 (ly y2 y1 (lx x3 x2 x1 (...))))
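the "set the action, not the flag" idea can be sketched like this (a python toy with invented names): an unresolved reference stores a closure that knows how to patch its own slot, and resolving a label just runs all pending closures.

```python
# pending-action backpatch sketch: unresolved refs save closures,
# 'resolve' runs them with the obtained address.

pending = {}      # label -> [patch actions]
labels = {}
code = []

def emit_ref(label):
    slot = len(code)
    code.append(None)                  # reserve the slot
    if label in labels:                # backward ref: fill immediately
        code[slot] = labels[label]
    else:                              # forward ref: save the action
        pending.setdefault(label, []).append(
            lambda addr, slot=slot: code.__setitem__(slot, addr))

def resolve(label):
    labels[label] = len(code)
    for action in pending.pop(label, []):
        action(labels[label])          # run all pending conses

emit_ref('loop')    # forward reference: action saved
resolve('loop')     # runs the pending action
assert code == [1]
```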




Entry: backtracking -> an argument against dictionaries as sets
Date: Wed Feb 14 10:50:57 GMT 2007

another thing i didn't think about.. what's the actual cost of
the continuations? i don't think it's much, because the data is mostly
shared: asm is just appended to until it's completely finished, and
the code list is just run sequentially. there's no rampant data
copying going on: the garbage is created only at the compile end.

so, it might actually be better to NOT keep dictionaries stored as
sets, but just shadowed association lists, to make backtracking memory
efficient. (in case i want to create lots of choice points). the
redefining of 'current allocation pointers' tends to re-arrange and
copy things on functional stores..



Entry: bit instructions
Date: Wed Feb 14 15:58:04 GMT 2007


there are a lot of bit instructions that are better handled in a
consistent way. one of the problems with the assembler is that bit set
and bit clear have different opcodes. i think it makes more sense to
handle them as one opcode + argument.

all bit instructions are polar: they take an extra 'p' argument, so
they can easily be flipped as part of the rewriting process. the extra
argument is placed first, to make composition easier.

    bcf bsf -> bpf
    btfsc btfss -> btfsp
    bc bnc  -> bpc
    ...
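the polar folding above can be sketched as a lookup table (python toy; the pairings follow the list above, but which mnemonic gets p=0 and the argument layout are made up for illustration):

```python
# polar pseudo-op sketch: one opcode + polarity argument, expanded
# back to the concrete mnemonic at the end.

POLAR = {('bpf', 0): 'bcf',   ('bpf', 1): 'bsf',
         ('btfsp', 0): 'btfsc', ('btfsp', 1): 'btfss',
         ('bpc', 0): 'bnc',   ('bpc', 1): 'bc'}

def invert(p):
    return 1 - p          # flipping a condition is just 1-p

def expand(op, p, *args):
    """map a polar pseudo-op back to a concrete mnemonic."""
    return (POLAR[(op, p)],) + args

assert expand('bpf', 1, 'PORTA', 0) == ('bsf', 'PORTA', 0)
assert expand('btfsp', invert(1), 'STATUS', 2) == ('btfsc', 'STATUS', 2)
```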

ok, it seems to be solved with a set of pattern matching macros, and a
factored out label pusher :)


 ;; two sets of conditionals
 ((l: f b p bit?) `([bit?  ,f ,p ,p]))  ;; generic -> arguments for btfsp
 ((l: p pz?)      `([flag? bpz ,p]))    ;; flag -> conditional jump opcode
 ((l: p pc?)      `([flag? bpc ,p]))

 ;; 'cbra' recombines the pseudo ops from above into jump constructs
 ((['flag? opc p] cbra)   `([r ,opc ,(invert p) ,(make-label)]))
 ((['bit?  f b p] cbra)   `([btfsp ,(invert p) ,f ,b] [r bra ,(make-label)]))

then we have the recursive macro (if == cbra label) and the pure cat
macro (label == dup car car swap)

a lot more elegant than the previous solution. i like this pattern
matching approach.

NEXT: 
* variable and other namespace stuff
* forth lexer
* parsing words
* intel hex format


Entry: forth lexer + parsing words
Date: Wed Feb 14 21:40:13 GMT 2007

which is of course really trivial. see lex.ss
i'm not doing '(' and ')' comments again, just line comments '\'

i think i know why i always had problems with my ad-hoc parsers and
word termination etc.. splitting into lexing and parsing makes sense,
because the first one is purely flat, while the second one can be
recursive. it helps when in the 2nd phase there are no more stupid
problems with word boundaries..
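the flat lexing phase can be sketched in a few lines (python toy, not lex.ss): split on whitespace, strip '\' line comments, and leave all structure to the parser.

```python
# minimal forth lexer sketch: purely flat, no recursion here.

def lex(source):
    words = []
    for line in source.splitlines():
        line = line.split('\\')[0]     # '\' starts a line comment
        words.extend(line.split())     # whitespace separates words
    return words

src = ": double dup + ; \\ add a number to itself\n1 double"
assert lex(src) == [':', 'double', 'dup', '+', ';', '1', 'double']
```

word boundaries stop being a problem because only this phase ever looks at them.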

parsing words are, well, extensions of the parser :)

since these will make things move away from straight 1->1 parsing, the
parser needs to be rewritten as a recursive process / fold.

ok, the scaffolding is there: written in terms of
reverse/accumulate. now i need to really think about how to solve the
'variable' problem.

-> how to solve parsing words?
-> where to do the actual allocation?



Entry: ihex
Date: Thu Feb 15 00:26:02 GMT 2007

this used to be written in CAT, but was a mess. it's one of those
simple things that are hard to express in a combinator language
because they drag along so much state if you want to do them in one
pass. again, they are about merely re-arranging data!

maybe i should just try it again, but using a multipass algo, just
to see if i learned something.. on the other hand, this would be
nice to have as scheme code, so i can use it outside of the project.

ok.. it seems to work fine. got some binary operations for free that
can be used in the loader too.




Entry: parsing words
Date: Thu Feb 15 09:55:27 GMT 2007

so. i need:   

    : variable 2variable constant 2constant 

the thing which is different from the previous implementation, is that
i have a separate compile (parse) and execute phase, so parsing words
cannot be compilation macros.

on the other hand, parsing words are always about quoting things,
mostly names, so probably a simple list of names mapped to number of
symbols is enough. limiting the number of symbols to one makes it even
easier.

sort of got something going here with variables and constants, but
there's another problem:


Entry: dictionaries
Date: Thu Feb 15 12:01:35 GMT 2007

i'm using a hash table to store 'core' macros: those that are
fixed. however, a forth program can create macros, so these need to be
defined somehow..

maybe make that a bit more strict?

the same goes for constants.. i'm using fixed machine constants in a
hash table, and some user defined stuff in other places.

this needs some serious thinking..

constants can be implemented as macros which inline literals. so the
only remaining question is: how to handle macros?

macros are really compiler extensions. they are a property of the
host, not of the target code.

it would be really inconvenient having to split a project into two
parts, so i should aim for macro defs inside source files. however, a
clear distinction needs to be made between host and target things:

target properties are related to on-chip storage == addresses

host properties are related to code generation only

the result is that there are 2 possible actions on a source file:
- reload macros + constants
- recompile = realloc code and data

to track the state of a project, the only thing that needs to be saved
is the source code + a dictionary of target addresses. all the rest
(macros) can be obtained directly from the source code.

actually, this is a lot better than the old approach, where macros are
stored in a project state file.




Entry: new badnop control flow
Date: Thu Feb 15 12:28:56 GMT 2007

in  = project sourcecode
out = compiled target code + dictionary

1. PARSE EXTENSIONS

   Read all source files and extend the compiler to include the macros
   and constants defined in the source files. This effectively builds
   a new special purpose compiler for the code in the project.

2. COMPILE CODE

   Convert all code definitions and data allocations to a form that is
   executable by the CAT VM, and run this code. This generates
   optimized symbolic assembly.

3. ASSEMBLE CODE

   In a two-pass algorithm, convert the symbolic assembly to binary
   opcodes, allocating and resolving memory addresses. This process
   uses the current dictionary, reflecting the state of the target,
   and produces a new dictionary and a list of binary code.



Entry: parse extensions (borked)
Date: Thu Feb 15 13:13:17 GMT 2007

and presto, i'm writing a parser state machine again!

amazing what a not-so-good night's sleep does.. let's do this a bit
more intelligently using my favourite one-size-fits-all hammer:
pattern matching!

seriously, the syntax is really simple, so i shouldn't be writing a
state machine, just a set of match rules.

one thing though. how to extend it? previous brood needed parse words
to be written explicitly. i should do that now too.. just a dictionary
of parse words, that output a chunk of cat code and the remainder of
the input stream.



Entry: forth parser - different pattern
Date: Thu Feb 15 18:45:52 GMT 2007

ok. got some sleep.

the thing is that this is a different pattern than all the other
things i've been doing. the previous pattern matching code for the
assembler is basically a partial evaluator, which looks backwards
instead of forwards. so this needs new code!

in short i need a different kind of parser or a preprocessor to map
forth -> composite code.


let's try to arrange the thoughts a bit since i feel i'm not seeing
something really obvious..

i have an urge to write the parser as a state machine, or as a pattern
matcher. both of them seem to lead to code with a similar kind of
complexity, but with some obvious redundancy. i can't see the higher
level construct.

ok.. what i'm missing here is elements from SRFI-1

it's quite clear what i want to do: generic list pattern
substitution. so basically, the prototype is:

(in) -> (in+ out)

with (out) being concatenated.

let's call this the 'parser' pattern, and write an iterator for it.

ok. it needs a bit of polish, but the idea is there i think..
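a quick sketch of this 'parser' pattern, written in python rather than scheme just for illustration (all names here are hypothetical, not the actual CAT code):

```python
def expand(rules, default, tokens):
    """repeatedly apply the first matching rule to the head of the
    token stream.  each rule maps tokens -> (rest, emitted) or None;
    the emitted chunks are concatenated into the output."""
    out = []
    while tokens:
        for rule in rules:
            result = rule(tokens)
            if result is not None:
                tokens, emitted = result
                break
        else:
            # no rule matched: pass one token through the default map
            tokens, emitted = tokens[1:], [default(tokens[0])]
        out.extend(emitted)
    return out

def rule_variable(tokens):
    """a parsing word: 'variable <name>' quotes one symbol."""
    if tokens[0] == 'variable' and len(tokens) >= 2:
        return tokens[2:], [('quote', tokens[1]), ('variable',)]
    return None

print(expand([rule_variable], lambda t: ('word', t),
             ['variable', 'x', 'dup']))
# → [('quote', 'x'), ('variable',), ('word', 'dup')]
```

each rule consumes as much input as it wants and emits a chunk of output, which is exactly the (in) -> (in+ out) prototype above.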



Entry: ditching interpret mode
Date: Thu Feb 15 21:33:13 GMT 2007

what about ditching interpret mode and relying fully on partial
evaluation? i can use the following trick: the partial evaluator does
NOT truncate results to 8 bit during evaluation, only after. so in
principle, there is a complete calculator available with full numeric
tower.

maybe it's good to create some highlevel constructs for the partial
evaluator. literals are still encoded as symbolic assembly, which is
ok, only somehow a bit dirty. this is effectively a second parameter
stack..

to make this more explicit, the macros 'unlit' and '2unlit' are
defined. these will reap literal values from the asm buffer and move
them to the parameter stack. the implementation of these macros is
split into two parts: a pattern matching part, and a generic macro
part '_unlit'.
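the idea can be sketched like this in python (hypothetical names; the real macros are pattern matchers over the asm buffer in scheme): evaluate with full precision, truncate only at assembly time, and let 'unlit' reap a literal back from the buffer:

```python
def emit_lit(asm, value):
    # literals stay in the asm buffer at full precision
    asm.append(('movlw', value))

def unlit(asm):
    """reap the most recent literal from the asm buffer, moving it
    (conceptually) back to the compile-time parameter stack."""
    op, value = asm.pop()
    assert op == 'movlw'
    return value

def assemble_lit(value):
    return value & 0xFF        # truncate to 8 bit only at assembly time

asm = []
emit_lit(asm, 1000)
emit_lit(asm, 24)
a, b = unlit(asm), unlit(asm)
emit_lit(asm, a * b)           # full-precision compile-time multiply
print(assemble_lit(asm[-1][1]))
# → 192  (24000 truncated to 8 bit)
```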



Entry: more parsing
Date: Fri Feb 16 10:23:00 GMT 2007

so the basic infrastructure is there, now i just need to figure out
how to put the pieces together. this host/target separation needs some
more thought.

the problem i'm facing atm is 'constant'. this should define a
constant as soon as it's parsed, but the value comes from partial
evaluation which happens at macro execution time!

maybe i shouldn't really care about this 2-pass stuff.. i can just
compile code for its side effects, being the definition of macros..

another thing, which is related to the comment about the asm buffer
being a second parameter stack: why not compile quoting words as
literals instead of loading them on the data stack? this way a simple
pattern matching macro can be used to implement the behaviour of
parsing words..

i have to be careful though, since this arbitrary freedom must have
some hidden constraint somewhere..

the hidden constraint is of course: literal stack encoding is
machine-dependent! it's actual assembler dude!

maybe keep it the way it is, however, 'forth-quoted' feels wrong. also
the combination of literals coming from the asm buffer, and the symbol
coming from the stack, feels awkward. but it does seem to be the right
thing..

anyways.. it seems to work now.


Entry: dictionary
Date: Fri Feb 16 14:07:09 GMT 2007

so the only thing that's remaining is the runtime dictionary stuff:
variables (ram allocation) and associated things.

mark variable names as literals during parsing. done.


i'm still not sure whether the muting operations are such a good
idea.. maybe a separate macro parsing stage is better after all.. as
far as i understand, the thing which makes this difficult is the way
that 'constant' works: it's dependent on runtime data (partial
evaluator), so the definition needs to be postponed..

what about using some delayed eval here? or i can use the same trick:
reserve the name so it can be treated as a literal, but fill in the
value later?

so, on to the fun stuff..  dictionaries. basic functionality seems to
work using the 'allot' assembler directive.



Entry: parse time macro definition
Date: Fri Feb 16 14:53:23 GMT 2007

what if i can:
- define all macros
- reserve all constant/variable names (which are just literal macros)

during parsing only?

and fill them in whenever the data is there?

the problem is how i'm handling 2constant now.. this can be fixed with
a gensym.

ok. this looks doable, but not essential. something for later.




Entry: forth loading and machine memory buffer
Date: Fri Feb 16 17:53:05 GMT 2007

two things i just did:
- added a function to load symbolic forth code
- draft for memory stuff

need to figure out where to do 'load'

load is a quoting parser, then just executes..



Entry: optimizations - need explicit unlit
Date: Fri Feb 16 23:40:27 GMT 2007

i'm running into several conflicting eager optimizations, which is
normal of course.. i was thinking about making this a bit more
manageable, by prefixing operations that have a lot of different
combinations with virtual ops that will just re-arrange things for the
better..

the most common mistake is to combine a dup with a previous
instruction so the lit doesn't show up any more. i think in 2.x there
is an explicit 'unlit' that puts the drop back..

ok. this pattern matching is definitely an improvement for writing
readable code, but it does pose some problems here and there..



TODO:
- intelligent then
- better literal opti (unlit)
- port the monitor
- device specific stuff
- code memory buffer
- host side interpreter


Entry: optimization choices
Date: Sat Feb 17 10:37:12 GMT 2007

instead of having 'stupid' backtracking, it might be easier to do
'intelligent' backtracking. this means: at some point a choice is
made, but if at a later time it is realised this choice is the wrong
one, then this particular choice needs to be changed.

the pattern i encounter is this:
1. do eager optimization
2. realize later this optimization was not optimal
3. undo previous optimization to perform better one

every time there is an 'undo' this could be solved by an automated
backtracking step. what about a sort of 'electric save' ??

(it would also be interesting to somehow 'cache' the choices that have
been made in the mean time, so when a whole subtree is executed again,
the right choices are made first..)

interesting stuff :)

it looks like the search space is not really a tree, but more like a
snake line: 10010011001, where at some point one of the choices is
deemed wrong, for example 10010x11001. the remaining part 11001 then
needs to be re-done, but using the same pattern might be an
interesting optimization.

another thing is the storage of choices. backtracking needs a stack to
operate. well, i already have one! the asm buffer serves that purpose
quite adequately. this also solves the problem of the backtracking
using mutable state.
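the snake-line search can be sketched as plain chronological backtracking, where the list of choices doubles as the undo stack (illustrative python, not the staapl code):

```python
def search(n, ok):
    """depth-first search over n binary choices, trying the 'eager'
    choice 1 first; flipping the most recent choice is the undo step."""
    choices = []
    while True:
        while len(choices) < n:
            choices.append(1)          # eager optimization first
        if ok(choices):
            return choices
        # undo: drop exhausted choices, flip the last eager one
        while choices and choices[-1] == 0:
            choices.pop()
        if not choices:
            return None
        choices[-1] = 0

print(search(3, lambda c: c == [1, 0, 1]))
# → [1, 0, 1]
```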


on the other hand, working purely algebraically does have the
advantage of simplicity, but it requires the explicit construction of
inverse operators.


Entry: literal opti
Date: Sat Feb 17 11:33:49 GMT 2007

instead of making the partial evaluator operate on DUP MOVLW, let's
make it work on MOVLW only, so the extra SAVE is not necessary.

hmm.. i'm going in loops. the thing is that i'm using the literals in
the asm buffers really as a compile-time stack. simply making the
partial evaluation respect 'save' would make it possible to keep that
paradigm. otherwise the DUP in front of MOVLW (DUP MOVLW) needs to be
handled explicitly every time. this then needs to be handled by a
recombining DROP operation, which is really no different from handling
SAVE properly...

so back to the original solution.

to keep everything as pattern matching macros, i could also run an
explicit recombination after the literal operations.. quick and dirty.

wait a minute.

i can just dump code in the asm buffer, and add a bit to the pattern
macro to check for this, and execute it. then the only problem is:
quoted code or primitives? probably primitives are best, since they
are already packed into one item, and don't need 'run'.

ok. that seems to work just fine :)



Entry: monitor
Date: Sat Feb 17 16:22:13 GMT 2007

ok.. seems i'm almost to the point where i can compile the full
monitor code. some things are missing, like the chip specific configs,
but i can see that the partial evaluator is going to help quite a lot
to keep things simple: more things can be configured in the toplevel
forth source code file instead of a lisp style project file.

something that needs to change though is support for 'org'. this
probably means that assembly code needs to be tagged somehow.

ok. org is simply solved by embedding (org <addr>) in binary code.


Entry: intelligent then
Date: Sat Feb 17 19:59:12 GMT 2007


since i don't exactly remember what the code does, and i can't read
the old 2.x code just like that, let's decipher it.

the problem is something like this:

l4:
	btfsp 	1 TXSTA 1 0 
     r 	bra 	l5 
     r 	bra 	l4 
l5:

which comes from 

      begin tx-ready? until

which expands to

      begin tx-ready? not if _swap_ again then

the important part is the 'then', which should decide that it should
flip the polarity of the skip and the order of the two jumps IF the
first one corresponds to the symbol on the stack.

this works not only for branches, but for any single instruction
following the forward branch.

ok. implementation. this doesn't fit the pattern matching shoe, since
the label on top of stack needs to be incorporated in the
check. however, it is possible to just compile the 'then', and perform
the optimization afterwards, which is possible using a pattern
matcher.

ok. this works.
i don't check the label though.. should do that, or prove that it
can't be anything else..
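in python pseudo-assembly the rewrite looks like this (the instruction format and names are made up for the sketch; the real version is a coma pattern matcher):

```python
def compile_then(asm, label):
    """'then' normally just emits its label.  but when the buffer
    ends in
        <skip> ; bra <label> ; <insn>
    the forward jump can be dropped by flipping the skip's polarity,
    so <insn> becomes the skipped instruction."""
    if (len(asm) >= 3
            and asm[-2] == ('bra', label)
            and asm[-3][0] == 'btfsp'):
        p, f, b, a = asm[-3][1:]
        return asm[:-3] + [('btfsp', 1 - p, f, b, a), asm[-1]]
    return asm + [('label', label)]

asm = [('label', 'l4'),
       ('btfsp', 1, 'TXSTA', 1, 0),
       ('bra', 'l5'),
       ('bra', 'l4')]
print(compile_then(asm, 'l5'))
# → [('label', 'l4'), ('btfsp', 0, 'TXSTA', 1, 0), ('bra', 'l4')]
```

as noted above, asm[-1] can be any single instruction, not just a branch; a real version should also check that the label can't be reached from anywhere else.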




Entry: reverse accumulate
Date: Sat Feb 17 20:27:38 GMT 2007

now, something that has been getting on my nerves is the reverse tree
stuff.. there is absolutely no reason for it. the original reason was
to split code into chunks of forth idioms, but i sort of lost that...

this whole reverse tree stuff makes things too complicated, so it has
to go.

temporarily i will take out the reverse-tree function.

ok. this seems to have worked.
a lot of code is a lot simpler now..

no, there's still a bug. fixed.



Entry: tip
Date: Sat Feb 17 23:44:33 GMT 2007

(require (lib "1.ss" "srfi"))

yep. sometimes it takes a while to figure out the small things..

another thing: srfi 42 is about list comprehensions (loops &
generators). seems worth a look.


Entry: time to upload
Date: Sun Feb 18 08:37:04 GMT 2007

looks like stuff is in place to start dumping out hex files.  so, i
need to make an effort to not fall into the same trap as before: it
would be nice to have cat completely on the background, and do
everything from the perspective of the target system.

the easiest way to do this is to use the current debug interpreter,
and plug in a proper 'interpret' mode for interaction. yes, here there
is some confusion. what about interpret mode?

do i switch to compile mode explicitly? i kind of like the colorForth
approach where there is only editing and commands, no command line
editing.

everything between : ... ; is always compile mode. the tricky stuff is
what's before that, because i completely rely on conditional
compilation for constants etc..

but, constants are really the only exception. if i make an
interpret-mode equivalent of constant, then i could fake that.

oth. a proper compile vs interpret mode might be a better solution. it
is definitely cleaner.

so we converge on this?

-> compile mode = exactly the same as what's in files
-> interaction mode = all the rest

implemented as 2 coroutines.


Entry: state
Date: Sun Feb 18 08:59:04 GMT 2007

at this time, it becomes rather difficult to maintain all the state on
the stack, so i probably need to move to a more general state
monad. basically what i had before in 2.x, but without executable code.

first, let's see what state needs to be accumulated:
- assembler buffer
- target dictionary
- forth code log?

data necessary in different modes:
compile:       asm buffer
assemble:      asm buffer + target dictionary
interpret:     target dictionary

i can probably avoid explicit monads (i don't know how to really do
that: have to lift a lot of code!), and just use a main driver loop
that runs the applications with the dictionary dipped.


what i have is a proper class based system:

- classes are cat dictionaries (implemented as hash tables)
- inheritance is based on chaining these dictionaries.
- objects are association lists.
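chaining dictionaries for inheritance is exactly what python's ChainMap does; a minimal illustration of the idea (toy methods, nothing to do with the cat dictionaries themselves):

```python
from collections import ChainMap

# classes = dictionaries of methods; inheritance = chaining them
base    = {'self@': lambda st: st}                 # state base class
derived = ChainMap({'greet': lambda st: ('hi', st)}, base)

assert derived['self@'] is base['self@']           # found via the chain
assert 'greet' in derived and 'greet' not in base
```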


so that's for later. i'm in no need for objects with encapsulated
behaviour. the only thing i need is a local scope, so it's really just
used as a data structure.

this means i can start writing the main loop of the program, which is
basically written as a method bound to state.

the thing i need to be careful about though is tail recursion. this
works with 'invoke'.

now that i'm here.. looks like this is an interesting way to implement
the assembler too, by writing an object that's a list, and using a
'comma' opcode to compile instructions.

thinking about this, there are really 3 major ways of symbol binding:

- method: aggregate
- lexically nested
- dynamically nested



ok. brace for impact. going to do the asm 'object' thing.

ok... unresolved yet. this is too convoluted, precisely for the reason
of recursive calls. i'm still thinking dynamic binding here..

but there's something to say for the idea..


trying again..

TODO
- i need a better way to create a compiler for compositions:
(register, parse, name)
- should have a state base class with just: self self@ invoke


ok. done.


Entry: passing state to quotations
Date: Sun Feb 18 15:32:47 GMT 2007


now for code quotations. how to recurse?

the problem is that if quotations are executed using 'run', they will
not obtain the state, so they need to somehow be wrapped such that
running them passes along the state. is that at all possible?

yes. using some kind of continuation passing, instead of simply
wrapping the code in 'dip'.

so:

- quoted programs need to be parsed recursively

- they need to be modified such that running them results in the
  object being loaded on the stack.

- it is not possible to override every word that performs 'run' to
  incorporate this behaviour.

- this trick is LEXICAL only: no dynamic binding of words, only
  dynamic passing of state.



same old same old..
this goes way back :)

the problem is of course in the shielding. as long as every primitive
is really shielded from the state, there is absolutely no way to
access it. so  (blabla) dip  is not a good approach.

it should be hidden but accessible, and not shielded.

let's do this manually for now: when you want to use quoted code in a
method definition, you have to explicitly parse, compile and invoke
it. the default will be globally bound code only, and shielded
execution for simplicity.

the alternative is to compile quoted code as a method (recursive
parse-state). this is kind of strange since the invocation has to
happen manually. no 'ifte' for example.

so unless i find a way to solve the 'ifte' problem and other
implicit 'run' calls, there is no way to do this automatically: this
is really a modification to the core of the interpreter.

so i am going to let go of the scary bits, and conclude:

* only flat composition done automatically
* recursive composition possible using 'invoke'
* quoting method code is done manually using special parser/compiler


so it all remains pretty much a state monad. some special functions
can be thrown into the composition to act on the state through some
interface, while the rest is 'lifted' automatically.



Entry: fixing amb
Date: Sun Feb 18 16:15:23 GMT 2007

postponing the real work, i can try to fix amb to make it operate
only on the assembler store. what i need to do to make this work is to
return the continuation explicitly. so amb will do:

amb-choose   ( c1 c2 handle -- c1/c2 ) + effect of handle

here handle will store the continuation on a stack somewhere if c1 is
chosen. if this continuation is called, c2 will be chosen without
handler.
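a stripped-down sketch of that protocol in python, with the continuation replaced by a retry thunk pushed on a stack (which is all this use case needs; names are made up):

```python
fail_stack = []                    # stack of pending alternatives

def amb_choose(c1, c2):
    """pick c1 eagerly; register a thunk that yields c2 instead."""
    fail_stack.append(lambda: c2)
    return c1

def fail():
    """undo the most recent choice by taking its alternative."""
    return fail_stack.pop()()

choice = amb_choose('fused', 'not-fused')
print(choice)                      # → fused
print(fail())                      # → not-fused
```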

ok. looks like it's working.
still need to strip out the continuations in the assembler though.
done.



Entry: the app
Date: Sun Feb 18 18:14:22 GMT 2007

time to write the main loop.
- based on the store monad containing:
   - asm buffer
   - forth input stream (per line)
   - state memory
- written from the target perspective
- compile mode / interpret mode

ok..
seems i'm at least somewhere. now i need to think about the design a
bit more.. the state stuff is encapsulated in a small driver loop, the
rest is still functional.




Entry: byte continuations
Date: Sun Feb 18 19:07:52 GMT 2007

i was thinking about a way to use more highlevel functions in the 8bit
forth. obviously, a jump table can be used to encode jump points as
bytes. but why stop there? the return points can be mapped also,
giving the possibility of encoding return stack in bytes too, as long
as code complexity is small enough.

the compiler could do most of the bookkeeping.

this would make sense in a setting where the code is simple, but the
number of tasks is big, since that needs a ram return stack, which is
better implemented as a byte stack anyway.
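a toy python model of the idea: both call targets and return points go through a numbered table, so the return stack holds small indices instead of full addresses (entirely hypothetical, no relation to the real monitor code):

```python
table = []                               # one byte indexes a code point

def code(*words):
    """define a code point; returns its byte token."""
    table.append(words)
    return len(table) - 1

def run(start):
    out, rstack, ip, i = [], [], start, 0
    while True:
        words = table[ip]
        if i == len(words):
            if not rstack:
                return out
            ip, i = rstack.pop()         # two small ints: fits in bytes
            continue
        w = words[i]
        i += 1
        if isinstance(w, int):           # a call: push byte return point
            rstack.append((ip, i))
            ip, i = w, 0
        else:
            out.append(w)                # a 'primitive'

blink = code('on', 'off')
main  = code(blink, blink, 'done')
print(run(main))
# → ['on', 'off', 'on', 'off', 'done']
```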



Entry: application
Date: Mon Feb 19 09:30:44 GMT 2007

some remarks.
bin needed?
probably not.. just keeping the assembler and generating assembly on
the fly is probably best.

the basic editing step is:
- switch to compile mode, enter/load forth code
- switch to interpret mode -> code is uploaded

cat should only be for debugging

ok, so

CPA = forth compile mode. this is to edit the asm buffer using forth
commands. the asm buffer is stored in the 'asm file. in CPA mode it is
possible to test the assembly by issuing 'pb'. however, this doesn't
use the stored dictionary.. need to fix that.

ok, what i have now are 2 modes, switched using ctrl-D

* compile mode = compiled forth semantics ONLY
  not even special escape codes for printing asm etc

* interpret mode = simulated target console. target is seen as what it
  actually is + some interface to a server. the language used is forth
  lexed, but piggybacks on cat words.

looks like it's working fine this way. let's keep it.



Entry: literals again
Date: Mon Feb 19 12:05:39 GMT 2007

ok, i need to do this properly. back to the unlit strategy. basically:
try to recover literals one by one, instead of massive combined
patterns.

let's try this:

lits asm>lit asm>lit

ok. seems to work. still needs some explicit code that might be
optimized, i.e. the literal patching. but i can live with it like it
is now..



Entry: inference
Date: Mon Feb 19 14:27:02 GMT 2007

it should be possible to infer some more about the state of the stack
given there are no jumps from arbitrary places, which is a sane
assumption.


Entry: another day over
Date: Mon Feb 19 18:01:19 GMT 2007

and i'm running it in the MPLAB simulator. it generates correct code
at first sight. so, time to hook it up :)

still some features missing: one of them is proper byte/bit
allocation.

so TODO:

- host side monitor
- state save/load

ok, i'm getting bytes back from the monitor running on the chip. time
to start writing the monitor code.


Entry: dynamic code
Date: Tue Feb 20 00:41:11 GMT 2007


cleaning up a bit now. funny, what i need now is dynamic code :)
anyways. it's easy enough now that i have a general purpose store. all
kinds of hooks can be added here, which can be saved later. they all
go in symbolic form. to make them full circle (symbolic words in
symbolic words) i need to add some kind of explicit interpreter
probably..


Entry: parse
Date: Tue Feb 20 09:55:04 GMT 2007

to wake up today, i'm going to change all the 'parse' stuff to
'compile', since that's what it really does: parse+compile. 'bind'
would be better maybe. thesaurus.

well, 'compile' is really quite understandable.. so let's keep that.

maybe i better make compile = (bind + parse), and turn 'bind' into a
proper CAT function? this way the whole semantics and parsing thing
can be handled in CAT code.

the other thing to think about is CPS. does it make sense to use that?
i'm still thinking about run vs invoke. maybe it's better to just keep
it explicit until my current approach takes more shape and patterns
fall out..

change 'unquoted' to 'primitive'


parse:     ( source binder -- compiled )
find:      ( symbol -- delayed/primitive )


i changed names to the following protos:

a couple of syntaxes:
  cat-parse state-parse

a lot of namespaces:
  cat-find  <whatever>-find


ok, i need to clean up this stuff later.. maybe tonight.



TODO:
- fix the toplevel interpreter stuff + reload
- on reload, macros should be reloaded from source files also. means
  compile + ignore asm.
- fix proto of binder (+ parser?)
- CPS with dynamic variables?




Entry: duality
Date: Tue Feb 20 13:54:52 GMT 2007

something interesting happened here.. 'state-parse' is now implemented
as a delayed parse operation, which exposes the semantics:

parse:   list of things -> list of primitives
find:    thing          -> primitive

generalizing find's symbol -> primitive semantics. i could probably
find a better name, but let's stick to this since it's all over the
place. from now on 'find' means: map a "thing" to primitive behaviour,
and 'parse' means: map a collection of "things" to a LIST of primitive
behaviours, representing the functional composition of these
primitives.

in case of a 1-1 relationship between source syntax and compiled code
in list form, parse is really just (map find source). this is one of
the properties of CAT source code.

so there is something very simple hidden in all this..
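the simple thing, spelled out in python (toy primitives, nothing to do with the actual CAT words):

```python
def parse(source, find):
    """in the 1-1 case, parsing a composition is just mapping
    'find' over the source atoms."""
    return [find(atom) for atom in source]

# primitive semantics: functions from stack to stack
prims = {'dup': lambda s: s + [s[-1]],
         '+':   lambda s: s[:-2] + [s[-2] + s[-1]]}

program = parse(['dup', '+'], prims.__getitem__)

stack = [21]
for prim in program:    # interpreting = running the composition
    stack = prim(stack)
print(stack)            # → [42]
```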


---------------------------------------------------------------------

* PARSE: handle the structure or SYNTAX of source code.  this will
  translate source code to a very basic COMPOSITE CODE
  representation, which is a list of primitive code elements,
  effectively reducing any form of syntax to a simplified one. in
  doing so, parse can use 'find' recursively to translate primitive
  source objects to primitive machine code.

* FIND: handle the meaning or SEMANTICS of source code. this will
  take a source code atom and translate it to PRIMITIVE CODE,
  possibly using 'parse' recursively to translate atoms comprised of
  structured source code.

this is the source code / compiled code duality.

       parse code collection        <-> interpret primitive code list
       find semantics of code atom  <-> run primitive machine code

---------------------------------------------------------------------

here 'machine code' is the code representation of the underlying
machine, which in this case is scheme, with primitives represented as
functions operating on a stack of values.

this is just eval/apply in disguise. the difference being that for
lisp, the functionality is represented by the first element in a list,
while here it is a composition.

eval:   (head more ...)  ==  (apply (eval head) 
                                 (list (eval more) ...))



Entry: next actions
Date: Tue Feb 20 17:13:23 GMT 2007

run time state? or where to store the file handler? do this
non-functionally, since it's I/O anyway.. why not?

that seems to work. got ping working too. and @p+ next couple of
things should be really straightforward, but i am missing one very
important part: I CAN'T USE QUOTED PROGRAMS!!!

so i need to do something about that..

again, as far as i understand, the problem is in 'run'. if you hide
information by 'dipping' the top of the stack, there is no way to get
it back, unless you can bypass this mechanism somehow.

the thing that has to change is the interpreter.

ok. it should be possible by doing something like

    '(some app code) compile-app (for-each) invoke

making sure that the dict gets properly tucked away.

the nasty thing is this is dependent on the number of arguments the
combinator takes.

(invoke-1 swap run)
(invoke-2 -rot run)


invoke is bad for the same reason...

something is terribly wrong with the way i'm approaching this.. no
solution. too many conflicting ideas.

1. i need combinators to "just work"
2. i need to be able to run non-state code properly

possibilities:
- patch all quoted code -> parsed as state code
- do not patch combinators

maybe i should just try?

this is crazy...

i just don't get it.

heeeelp!


i don't know how to solve it.. but i can work around it :)

basically, the problem i have is that i can't use higher order
functions in combination with the state abstraction, because the
abstraction effectively uses a different kind of VM. to solve it, i
need to either accept i have to change the VM, or just make the data
i'm using persistent. there are several options:


* turn the n@a+ and n@p+ into target interpreter instructions. this
  just makes them static, so i do not have to use references to
  dynamic state in the core routines. might be the sanest practical
  solution.

* just forget about the functional approach to the dynamic state and
  store this in a global variable. a bit drastic, and i will probably
  regret that later, since it feels like giving up on a good idea at
  the first sight of real difficulty...

i will go for the first one so i can at least finish the interaction
code.. this has the advantage of making the monitor itself a bit more
robust, since it will provide full memory access.

one thing i didn't think about though: making ferase and fprog
primitives will make them a bit less safe (ending up sending random
data). i should add a safety measure.

ok, that seems to work.


Entry: monitor update
Date: Tue Feb 20 21:52:02 GMT 2007

triggered by some unresolved conflict between hidden dynamic state and
the interpreter, i made most of the functions in the monitor available
as interpreter bytecode. this makes it a bit more robust and
apparently a whole lot faster also.

still to fix is some kind of safety measure to prevent the erase from
being triggered accidentally by some unlucky combination of input
data. a password if you want :)


Entry: monitor progress
Date: Wed Feb 21 11:46:43 GMT 2007

got most of it working this morning. next actions:
- variable/bit alloc
- save/restore state
- sheepsint core compile + macros

i do rely a bit on parsing macros in the original sheepsint 3.x
code. that's not so good. time to think about working out some
abstractions a bit better.

for isr:

flag high? if flag low handler retfies then

now variables/bits

ok, no bits.. do that later, sheepsint doesn't use them: explicit
allocation.

next:
- state loading on startup
- interrupt handlers



Entry: getting tired
Date: Wed Feb 21 23:55:34 GMT 2007

yes, time to get it done.. overall, i'm quite happy with the
result. it's a lot better than the previous two. i can't really see
much further from here, other than elaborating towards higher
abstractions (different language), and fixing some jump related simple
optimizations.

the bad guy is quoted method code, which has a strange conflict of
concepts. more on that later.

another thing i miss is inline cat code, e.g. for generating
tables. i think i'd better do this in a different file, and only in
scheme: no more intermediate cat-only files. 1 1.1 16 table-geom

then the lack of proper run-time semantics is kind of weird. the
partial evaluator replaces this, but in an implicit manner: not
everything is accessible, and the bit depth is different.

about literal opti: still not completely happy, since the patterns
should do the literal preprocessing automatically.

looking at pic18.ss gives me a warm fuzzy feeling :) most of the
knowledge is encoded in 2 patterns: assembly substitution patterns and
recursive macros. language support is encoded in 2 more: some asm
state monad and writer monad.

the thing which would help a bit is reducing the redundancy for the
rewriter macro specification. the way it is right now is very
readable, but maybe a bit too much clutter. on the other hand, it
might be a bit overengineering.



Entry: monads again
Date: Thu Feb 22 00:57:22 GMT 2007

http://en.wikipedia.org/wiki/Monads_in_functional_programming

Alternate formulation

Although Haskell defines monads in terms of the "return" and "bind"
functions, it is also possible to define a monad in terms of "return"
and two other operations, "join" and "map". This formulation fits more
closely with the definition of monads in category theory. The map
operation, with type (t -> u) -> (M t -> M u), takes a function
between two types and produces a function that does the "same thing"
to values in the monad. The join operation, with type M (M t) -> M t,
"flattens" two layers of monadic information into one.

The two formulations are related as follows:

(map f) m ≡ m >>= (\x -> return (f x))
join m ≡ m >>= (\x -> x)

m >>= f ≡ join ((map f) m)

--

isn't that what i'm doing? 

'map' is my 'lift', it lifts a function operating on only a stack to
one operating on a stack + state information.

'join' is, for example, concatenation of lists in the writer monad
i'm using for assembly,

'return' i don't use? yes i do. it's how i initialize state, i.e. by
loading an empty assembly list on the stack, and how some functions
return a packet of assembly code.

http://citeseer.ist.psu.edu/wadler92essence.html
the basic idea  in monadic programming is this: a function of type
a->b is converted to one of type a->Mb (monadic form)

e.g. for assemblers:
an instruction '(movlw 123) is converted to '((movlw 123))

'bind' is there to compose 2 functions in monadic form.

in the example of assemblers, 'bind' will do the concatenation of the
assembly.



Entry: higher order pattern matching
Date: Thu Feb 22 09:56:59 GMT 2007

meaning: match pattern generation based on templates. it seems to
work, but involves double quoting, which is a bit hard to wrap your
head around.. there's one thing i've been trying to understand for a
while: how to do this:

`((['dup] ['movf a 0 0] ['lit] ,word) (,'quasiquote ([,opcode ,',a 0 0])))

without having to use the "quasiquote" symbol. maybe i should have a
look at paul graham's "on lisp" again...

ok, i think i got it:

;; ORIGINAL: explicit quoting of the quasiquote symbol
`((['dup] ['movf a 0 0] ['lit] ,word) (,'quasiquote ([,opcode ,',a 0 0])))

;; WORKS: using a name binding to avoid double quoting
`((['dup] ['movf a 0 0] ['lit] ,word) (let ((opc ',opcode)) `([,,'opc ,,'a 0 0])))

;; MAYBE WORKS: pattern generated is (quasiquote (unquote (quote
;; thing))) instead of (quasiquote thing)
`((['dup] ['movf a 0 0] ['lit] ,word) `([,',opcode ,,'a 0 0]))

yep it works..

----------------------------------------------------------------------------------
the trick is to generate this:     (quasiquote (... (unquote (quote thing)) ... ))
instead of attempting to generate: (quasiquote (... thing ...))
----------------------------------------------------------------------------------

to really understand this, it might be interesting to implement quasiquote.
http://paste.lisp.org/display/26298

another thing to note: this merging of quoted/unquoted stuff is
something the syntax macros do a lot better, automatically..



Entry: interpret mode
Date: Sat Feb 24 10:15:38 GMT 2007

i got the synth core to run. next actions:
- interpret mode
- setting interrupt vectors
- note table
- figure out line voltage + impedance
- identify


interpret mode seems to +- work

i'm using overriding: if a word is not in the target dictionary, it is
executed on the host. maybe this will lead to some obscure problems?
maybe i really need to separate the 2 a bit better, and use an
explicit debug mode.

then state association. i should have an 'identify' command, so a
connected target chip can tell the host which state file to load. but
how to implement this? i could reserve some space in the bootsector
for this actually.

so, applications..

i was thinking about keeping the monitor independent. i don't think
this is a good idea, since the boot code is really application
dependent. so an application is everything, including monitor.




Entry: syntax macros
Date: Sat Feb 24 21:45:46 GMT 2007

been playing a bit with macros.. i don't really understand them fully
though.. especially the use of local syntax in syntax expansion etc..

also, it's really better to move the preprocessor hook out of
pattern.ss DONE.



Entry: 16bit code
Date: Sat Feb 24 23:19:13 GMT 2007


looking at the sheepsint controller code.. there is no real reason to
not make it 16-bit. all computation i'm doing is on 16bit numbers, and
the overhead of switching everything to 16bit is probably minor.

todo:
- 16bit interpret mode
- way to map symbols



Entry: toplevel workflow
Date: Sun Feb 25 08:41:42 GMT 2007

mainly about program organisation. a program consists of these parts:

1. boot block (first 64 bytes)
2. monitor
3. fixed application code
4. variable application code


a project is a directory. 2. should be made as standard as possible,
and i really shouldn't care about the size, since that only matters
for code protection. 3. should, if possible, stay on the target too
(mark). 4. should have a 'scratch' character.

empty = erase till previous 'mark', but no further than monitor code.
-> replace dictionary with saved dictionary
-> round 'here' up to the next 64 block
-> erase from there
DONE

also, the reset vector should jump to #x40
DONE

and i need to find a way to update the monitor code on the fly.
-> either copy the monitor as a whole (since it's place-independent)
-> copy a minimal copy routine.
as good as DONE


reloading core with minimal effect on state = 'core'

setting interrupt vectors : should save 'here' etc.. -> interesting,
since it really involves a run-time assembler stack. maybe it does
make sense to de-scheme the assembler..


Entry: state monad
Date: Sun Feb 25 10:46:05 GMT 2007

http://www.ccs.neu.edu/home/dherman/research/tutorials/monads-for-schemers.txt

let's see. the problem i had was to use 'for-each' in state
code. because of the way the state needs to be passed, all higher
order functions need to be aware of it. i just need a special
'for-each'.

the way around this is to use the stateful functions ONLY to access
the data, and use pure functions to do manipulation. i.e. 'logic' vs
'memory'. currently in the interpreter it works fine.

this can seem like a drag, but in fact it is a good thing: functions
are not unnecessarily infected by state.

so, monads are about order of execution, and really central to a
compositional language, where there is only order! this in contrast to
lambda languages, where there is an intrinsic parallelism in the
order of evaluation.

about interpretation: if you see a concatenative language as a series
of sequential operations, it is 100% serial (the way it is
implemented). however, if you see it as composition of functions,
there is no evaluation order, because there is no evaluation, only
composition.
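this view can be sketched in a few lines of python (names invented): words are stack -> stack functions, a program is their composition, and evaluation order only appears when the composed function is finally applied.

```python
# concatenative words as functions from stack to stack

def lit(n):
    return lambda s: s + [n]        # push a literal

def add(s):
    return s[:-2] + [s[-2] + s[-1]]

def compose(*words):
    def composed(stack):
        for w in words:             # order appears only here, at run time
            stack = w(stack)
        return stack
    return composed

program = compose(lit(1), lit(2), add)   # "1 2 +"
assert program([]) == [3]
```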

i need to look into list comprehension etc...



Entry: call conventions : de-scheming
Date: Sun Feb 25 10:56:48 GMT 2007

instead of doing some real work today, i'm going to have some
fun. make the interpreter more reflective, meaning, converting all
important routines to operate on a stack.

actually, that might not be a good thing.. i'm using non-stack
functions for convenience, so primitives are simpler to code, and i
can use the lambda abstraction instead of combinators.



Entry: time to do some work
Date: Sun Feb 25 13:36:34 GMT 2007

- interrupt vectors DONE
- a/d converter for board
- 16 bit interpreter
- constants and variables


Entry: alan kay oopsla lecture and stuff
Date: Sun Feb 25 21:36:56 GMT 2007

- 'core' should not destroy any DYNAMIC state AT ALL, only static
  background.

- every restart is a failure.

- need better debugging.


i need to be more observant of things that are annoying, and fix them
immediately instead of chasing some short-term goal. the thing i'm
trying to do is to build a better tool, not to finish some product. i
need to always try to distill the important core idea, instead of
bashing away to 'just make it work'..



Entry: variables
Date: Mon Feb 26 09:52:27 GMT 2007

i've got a problem with variable names: the dictionary does not make a
distinction between flash and ram names, but the interpreter does need
to treat them differently. this is solved properly by using two
dictionaries, or a nested dictionary.

probably 2 dictionaries is better.. requires a little rewrite though.

ok. i started to rewrite the assembler to separate assembly from
dictionary operations, so it's easier to make the dictionary an
abstract object.

then i need to change to recursive operations '(ram here) instead of
'here etc..

seems to be fixed. the implementation is abstracted, and currently
solved as a simple sub-dictionary. maybe move this to 2 separate
dicts..


Entry: analog -> digital
Date: Tue Feb 27 08:25:45 GMT 2007

i don't know what's wrong, but it doesn't work properly. but first,
documentation. let's copy and paste the previous one.



Entry: reasons
Date: Tue Feb 27 12:12:44 GMT 2007

all i want is lisp, but:

- cat is terse
- cat is more editable
- forth works on small things
- forth with linear data is predictable

the first two are about writing software and interacting with a
system; the last two are practical solutions to needing a lot of
programming power under constraints (small or RT).

something i learned though: it's bad to waste time writing
combinators. in BROOD 3.x i solved this by writing some core things in
scheme. basically they are combinators: interpreters for certain kinds
of code propagating state.

in PF and previous forth experiments i ran into this problem several
times: trying to express something which really needs 'hidden
state'. mostly i solve this using global variables, which is ok as
long as there is no pre-emptive multitasking going on.


SCRIPT    = toplevel organization: large amount of trivial code
ALGORITHM = small amount of nontrivial code

the idea is to use a scripting language to glue together algorithms,
while the nontriviality of the algorithms is hidden, and the
connectivity between them is made manageable by the features of the
scripting language.



Entry: linear lists -> PF
Date: Tue Feb 27 12:18:46 GMT 2007

yes. it does make sense to rewrite malloc. malloc is not what i need
if i'm using linear data structures. i don't need free, only free
lists. and yes, it would be cool to have access to the page
translation table too :)



Entry: compilation is caching
Date: Wed Feb 28 01:41:21 GMT 2007

compilation is really caching.. maybe i should find a way to add
dynamic loading of code without full image reload, by using a custom
made 'promise'. one that can be un-cached whenever a new word (or
group of words) is defined, so code can be re-bound.

more about the caching.. this means that symbolic code is really the
only representation of code. the compiled representation is an
invisible optimisation, and should be hidden from the programmer. if i
replace all atoms with a struct containing their symbolic version and
a possibly cached behaviour, i can re-interpret on the fly..

this should give all the benefits of late binding, without the
drawbacks of having to reload the whole image all the time.. however,
cache invalidation probably needs to do this anyway: invalidate a
whole dictionary of code, unless all references can be found
somehow.. probably not.

so what's the difference? what would a proper cache offer?
uncompilation for one.. it's probably good to keep the symbolic data
and environment around..
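a sketch of such a custom 'promise' (names invented): the symbolic form stays authoritative, and the compiled form is just a cache that can be cleared when words are redefined, so code gets re-bound on next use.

```python
# a clearable promise: compilation as invalidatable caching

class Cached:
    def __init__(self, source, compiler):
        self.source = source        # symbolic code, never discarded
        self.compiler = compiler
        self._cache = None

    def force(self):
        if self._cache is None:     # compile lazily, memoize
            self._cache = self.compiler(self.source)
        return self._cache

    def uncache(self):              # force re-binding on next use
        self._cache = None

log = []
c = Cached(('add', 1), lambda src: (log.append(src), src)[1])
c.force(); c.force()
assert log == [('add', 1)]               # compiled only once
c.uncache(); c.force()
assert log == [('add', 1), ('add', 1)]   # recompiled after invalidation
```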



Entry: no more quotation
Date: Fri Mar  2 14:08:38 GMT 2007

quotation sucks.. and it's really not necessary if i install a default
semantics. my previous argument was: no default semantics (no
defaults!) because i need more than one.. however, everything will run
on the VM as primitives, so there is no real good reason to have no
defaults: the symbolic representation might be "the bottom line", with
compilation viewed as optimization/caching.

what needs to be done to fix this? i probably need a better object
representation, a more abstract one. an object has properties, one of
them being its cached rep.

so.. what is an object?
- syntax, form.. this is the 'data' part
- semantics in the form of an associated interpreter object

optimizable properties:
- cached semantics.

this is really just OO. need to look at smalltalk.. maybe it's good to
have some ideas propagate. data=object data, interpreter=class

summary: the idea is to parse into something which retains the
symbolic representation, so semantics can be late bound, and
compilation is still possible, but is done with memoization.

clearing the cache is then possible by scanning the entire memory from
the root and invalidating some bounds.

this trick can also be used in PF. a linear language with late binding
but aggressive memoization.

hmm.. i read something in "the essence of functional programming"
http://citeseer.ist.psu.edu/wadler92essence.html
about values versus processes. to paraphrase:

- in lambda calculus, names refer to values
- in compositional languages, names refer to functions

the first one has only values (functions being a special case of
values), while in compositional languages there are only functions,
with values represented as functions.

going the intuitive route: a name is a function, and only a
function. an object is only a function. it has an associated
action. data is represented by a generator.


Entry: a new PF
Date: Fri Mar  2 15:22:12 GMT 2007

summary:

- object oriented: objects are functions. each object has a 'syntactic
  representation S' and an 'associated interpreter I'. (the result of
  applying I to S is X, an executable program which acts on a data
  stack.)

- the basic composite data structure is a CONS cell.

- composite data is linear: no shared tails.

- the interpreter needs to be written in (a subset of) itself, to
  allow easy portability (to C).


problems:

all the problems are related to the linearity of the language. to make
things workable, some form of shared structure needs to be
implemented. however, this can lead to dangling references.

-> continuations / return stack
-> mutual recursion

if i clean up the semantics such that dangling pointers are allowed in
some form, like 'undefined word', this should be manageable. to keep
things fast, this needs to be cacheable: it should be possible to
detect whether an object is live etc..

to rephrase: looks to me that a completely linear language is really
impractical. how do you tuck away non-linearity so behaviour is still
real-time?

i keep running into the idea of 'switching off the garbage
collector'.. decompose a program into 2 parts: one that uses a
nonlinear language to build a data/code structure, and a second one
that runs the code: trapped inside the brood idea: tethered
metaprogramming.

-> a predictive real-time linear core (linear forth VM + alloc)
-> a low priority nonlinear metaprogrammer (scheme)

together with the smalltalk trick to simulate the real-time linear
core inside the metaprogrammer.

the VM:
- no if..else..then: only quotation and ifte
- no return stack access: use quotation + dip

this can be a lot more general than for next gen PF. i can run this
kind of stuff on a microcontroller too, to have a different
language. one with quotation, and no parsing words.. the idea is to
make the VM as simple as possible: i already have a way to implement a
native forth, maybe the catkit project should be just that: CAT is
that thing that runs on the micro? linear CAT?


Entry: linear CAT vm
Date: Fri Mar  2 15:59:41 GMT 2007

- run:     invoke interpreter
- choose:  perform conditional
- quote:   load item from code onto data stack

- tail recursion: this is really important
- continuations (return addresses) are runnable

using variable bit depth? code word bit depth is determined by the
number of distinct words. an 8 bit machine is for small programs,
while a 16bit machine is for larger programs and/or programs that need
to do more math. something in between is also possible. most practical
is 12 bit. but the most important thing is: the data stack needs to be
able to hold a code reference.

for the 18f, i think it's best to go to 16bit. the forth is for
inconvenient features, while the highlevel language should be that: a
highlevel language.

in order to properly implement tail recursion, the caller should be
responsible for saving the continuation.



Entry: direct threading
Date: Fri Mar  2 16:33:47 GMT 2007

i'm trying to write an interpreter with these properties:
- proper tail calls (caller saves continuation)
- continuations can be invoked by 'RUN'
- direct threading.


in direct threading, threaded code is a list of pointers that points
to executable code, and a continuation is a pointer that points to a
list of such pointers. so yes, these constraints can be satisfied:

- composite = array of primitive
- continuation = composite
- a composite code can be wrapped in a primitive using a simple header

TBLPTR -> composite code
PC     -> primitive code

see direct.f -- summary: the most important change is threaded code +
proper tail calls by moving the continuation saving to the caller.



Entry: linear languages
Date: Fri Mar  2 19:13:23 GMT 2007


http://home.pipeline.com/~hbaker1/Use1Var.html

"A 'use-once' variable must be dynamically referenced exactly once
within its scope. Unreferenced use-once variables must be explicitly
killed, and multiply-referenced use-once variables must be explicitly
copied; this duplication and deletion is subject to the constraint
that some linear datatypes do not support duplication and deletion
methods. Use-once variables are bound only to linear objects, which
may reference other linear or non-linear objects. Non-linear objects
can reference other non-linear objects, but can reference a linear
object only in a way that ensures mutual exclusion."


what he describes a bit further on is an 'unshared' flag. a refcount =
1 flag, but it looks like this is more in the context of a mark/sweep GC.

an attempt to make some patterns automatic? reverse list construction
followed by reverse! is an example of a pattern that might be
optimizable if the list has a 'linear' type: the compiler/interpreter
could know that 'reverse!' is allowed as a replacement of 'reverse'.
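the reverse/reverse! case fits in a small sketch: when the list is known to be unshared (linear), reversal can flip the cdr pointers in place, allocating no new cons cells. cells here are python [car, cdr] pairs, purely for illustration.

```python
# destructive reverse! on a chain of [car, cdr] cells

def reverse_bang(lst):
    prev = None
    while lst is not None:
        nxt = lst[1]      # detach the tail
        lst[1] = prev     # flip the cdr pointer
        prev = lst
        lst = nxt
    return prev

c = [3, None]; b = [2, c]; a = [1, b]
r = reverse_bang(a)
assert r is c                      # same cells, reused in place
assert r == [3, [2, [1, None]]]
```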

so as far as i get it, baker describes a 'linear embedded
language'. linear components are allowed to reference non-linear ones,
but vice versa is not allowed without proper synchronisation. so in a
RT setting, this means the only thing that is allowed to run in the RT
thread is the linear part, while the nonlinear part can maintain its
game outside this realm.

so, again:
- high priority linear RT core (forth)
- pre-emptable nonlinear metaprogrammer (scheme/cat)

the linear part contains only STACKS + STORE. the nonlinear part can
contain the code for the linear part. the compiler runs in the
nonlinear part. the nonlinear part is not allowed to reference CONS
cells in the linear part.

this can be implemented entirely inside of PLT. on the other hand,
having this structure independent of a PLT image makes it more
flexible: the core linear system should be able to do its deed
independent of the metasystem's scaffolding.

baker calls my 'packets' nonlinear types: names with management
information (reference counts): a strict distinction is made. this
allows a nonlinear type to be decoupled from its (possibly linear)
representation object.

in PF this means: packets are references to linear buffers. the result
is that underlying representation can change ala smalltalk's 'become'.

conclusion:
- cons cells are linear
- packets are nonlinear wrappers for linear storage elements
- packet access: readers/writers access protocol: mutation is only
  allowed when there are no readers (RC=0). (functional ops)
- 'accumulation ops' use shared state + synchronized transactions.
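the access protocol can be sketched as follows (hypothetical names, just the idea): a nonlinear packet wraps a linear buffer; a writer mutates in place only when it holds the sole reference, otherwise it copies first, so readers are never disturbed.

```python
# nonlinear packet over linear storage: mutate only when unshared

class Packet:
    def __init__(self, buf):
        self.buf = buf      # the linear representation object
        self.rc = 1         # reference count

    def share(self):
        self.rc += 1
        return self

    def write(self, i, v):
        if self.rc > 1:                      # readers exist: no mutation
            self.rc -= 1                     # writer drops its reference
            fresh = Packet(list(self.buf))   # copy the linear buffer
            fresh.buf[i] = v
            return fresh
        self.buf[i] = v                      # RC = 1: in-place is safe
        return self

p = Packet([0, 0])
q = p.share()
q2 = q.write(0, 7)
assert p.buf == [0, 0] and q2.buf == [7, 0]  # reader unaffected
```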




Entry: standalone forth
Date: Sat Mar  3 13:25:46 CET 2007

maybe he didn't get it, but writing this compositional language and a
standalone forth are conflicting ideas..

it's not so hard to give up on parsing words, other than true quoting
words: there will be only one left, let's call it '{'.

what's worse is that i need to dumb it down a bit. i'd rather define a
new language, but an ANS forth might be better for teaching. for the
simple reason that i don't need to write such an extensive
manual. maybe it still makes sense to run both languages on the same
VM ?

another forthism that's not really necessary: since i'm sharing code
between the lowlevel subroutine threaded forth and the direct threaded
forth, why not make the VM primitives equal to subroutine threaded
forth, instead of them being directly linked to a NEXT routine. in
other words, why not have an explicit trampoline? this will be
slightly slower, but uses less code since the primitives don't need a
separate binding, which would just call the other code anyway.


conclusions:
- interpreter loop allows primitives == native code (STC forth)
- 'enter' uses short branch -> code needs duplication
- primitives need no IP saving!! (compiler needs to distinguish
  between primitives and highlevel words)


the last one is a consequence of doing continuation management on the
caller side: the caller cannot be agnostic! it should be possible to
pass this information to 'enter' somehow, so enter can save/restore
depending on some flag. carry flag? that's ok, as long as this machine
state is guaranteed to be saved..

however, in this case, the primitive needs to call 'EXIT' in case the
carry flag is not set! so still, some compiler magic is necessary, or
all words need to terminate with an EXIT call, independent of whether
they terminate with a tail call.

this is a bit messy... let's try to summarize: the flag is called the
NTC flag: non-tail-call. 

- EXIT = leaves current context
- WORD -> ENTER conditionally saves the context (carry flag)
- PRIMITIVE: needs EXIT if TC flag is set.


again.. there are 4 cases:  PRIM/COMP and TC/NTC. what i'd like is to
solve the PRIM/COMP completely in 'enter', such that the interpreter
can be agnostic about highlevel words.

an instruction =   primitive + NTC flag

what does the NTC flag mean for the interpreter? nothing. it's just
extra information passed to 'ENTER': it means the rest of the code
thread can be safely ignored.

the interpreter completely ignores it, and just runs forever, assuming
the code stream is infinite. all threading changes are implemented by
other primitives.

so, given the current implementation, a solution is to always compile
EXIT, together with a bit that indicates an instruction is a tail
call. this is not very clean.. the exit bit should be universal.

semantics: the bit indicates that the current thread can be discarded
BEFORE passing control to the primitive. then the primitive can always
just save the continuation. (a possible optimization is to overwrite
the continuation, but let's do the former first since it's
conceptually simpler). this is different in that the interpreter is
not agnostic about the return stack, but effectively implements
'EXIT'.
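a sketch of that semantics in python (names made up; this is the idea, not the 18f code): every instruction carries a flag, and the loop discards the current thread before running the primitive, so a flagged call becomes a proper tail call with no separate EXIT word.

```python
# exit-bit interpreter: thread dropped BEFORE the primitive runs

def run(thread, stack):
    rstack = []
    ip = (thread, 0)
    while ip is not None:
        code, i = ip
        op, exit_bit = code[i]
        if exit_bit:
            ip = rstack.pop() if rstack else None  # drop thread first
        else:
            ip = (code, i + 1)
        ip = op(stack, rstack, ip)
    return stack

def prim(f):
    # ordinary primitive: act on the stack, continue wherever ip says
    def p(stack, rstack, ip):
        f(stack)
        return ip
    return p

def word(target):
    # composite word: save the (already exit-adjusted) continuation
    def p(stack, rstack, ip):
        if ip is not None:
            rstack.append(ip)
        return (target, 0)
    return p

dbl = [(prim(lambda s: s.append(s.pop() * 2)), True)]
addmul = [(prim(lambda s: s.append(s.pop() + 1)), False),
          (word(dbl), True)]          # tail call: just the exit bit
main = [(word(addmul), False),
        (prim(lambda s: s.append(s.pop() - 1)), True)]
assert run(main, [5]) == [11]         # (5+1)*2 - 1
```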



Entry: is code composite? run or execute? yin or yang?
Date: Sat Mar  3 16:37:37 CET 2007

in CAT it seems i've converged on only using composite code = list of
primitives as the quoted programs that can be passed to higher order
functions. however, original forth does not take this stance: threaded
code is a list of execution tokens, and execution tokens are the
canonical representation of quoted code, when treated as data.
this is wrong. why? reflection becomes more difficult.

the stuff on the return stack is the saved IP. this should be "a real
program". the inner interpreter deals with arrays of primitives, and
such arrays can be wrapped in a primitive by prepending them with
ENTER. however, the data representation of code should really be
composite, so no primitive address, but a composite address.

primitives == internals. it's better to treat primitives as singleton
composites, than to treat composites as primitives. in the inner
interpreter, the reverse view is better.

i think this view originates in original forth, and is mainly
historical: primitives came first. composites were treated as
primitives. i can't think of another reason really..


conclusion: 

------------------------------------------------------
programs are composites = array of threaded primitives
------------------------------------------------------

i'm going to reflect this in the following change:
- execute is reserved for primitives
- run is reserved for composites


also, if you look at native code, the picture is pretty clear:
primitives are machine instructions, and you simply cannot 'execute'
them, they always need to be inside a code body.. composite code =
list of instructions, referred to by an address. it's just the same...



Entry: reflection
Date: Mon Mar  5 02:26:40 CET 2007

have to think about this a bit more. something strange going on with
this primitive/composite thing. what about only having highlevel code:
composite code just links to more composite code.. there's no way to
plug in primitives here. for purely pragmatic reasons, using
primitives, and highlevel words wrapped in primitives, is workable..


Entry: essentials
Date: Thu Mar  8 16:57:26 EST 2007

* symbolic -> ast + room for it
* possibility to 'uncompile' an AST
* use abstract types (structures) and 'variant-case' in AST


Entry: delayed list interpretation
Date: Fri Mar  9 10:48:54 EST 2007

thinking about eopl : i need more data abstraction. car and cdr are
nice, but they really are quite lowlevel. there's too much
implementation leaking through. the asm monad is a good example of
abstraction, but code probably needs it too..

about using symbolic code and caching: parsing a list, it can be
either code or data, depending on how it is accessed. maybe it should
really have these 2 identities? if accessed using list processing, it
behaves as a list, but if accessed using 'run', it behaves as code
-> jit compilation cache. the benefit of this is that the semantics
can change.

so a list is really an object with different identities. all list
processors should be modified to take an abstract list object.



Entry: platforms
Date: Mon Mar 12 18:54:28 EDT 2007

ai ai ai... i'm spending money again, surfing on ebay.. discovered
this nice ARM7 board on sparkfun made by olimex. it has a 128x128
color lcd, usb client mode and ethernet. a dream platform for brood,
especially since this is THE standard 32bit chip, getting really cheap
too.. so i want: 8 bit PIC18, 16 bit DSPIC30 and 32bit ARM7.


Entry: itch
Date: Tue Mar 13 12:47:23 EDT 2007

want to start changing some abstract data type implementations.. the
most important one is probably 'quoted program' or 'composition'. this
has to be distinguished from a chain of cons cells in that it has more
structure. a composition can always be converted to a chain of cons
cells, and a chain of cons cells can be converted to a composition if

1. it's a proper list
2. some interpreter semantics is attached

so a quoted program is the above: a proper list with attached
interpreter semantics. the changes this requires are:

- all data operations that modify lists need to accept the
  'composition' data type and convert automatically.

- the parser needs to produce compositions instead of chained cons
  cells.


it's getting old.. but the structure of program (source compile
cached) is probably better written as (primitives) where each
primitive has its own semantics and cache: word (atom compile
cache)

there are several options for word
(source thunk/cache)
(source thunk cache)
(source compile cache)
(source compile env cache)

in general: do we want the environment to be explicitly specified, or
is an abstract representation of the binding operation enough?

one of the requirements is the ability to rebind, so at least cache
and binder need to be separate, since the binder uses (in the current
implementation) some mutable state. 

probably the following model is close enough to the current one
(basicly the same as 'delay' but with possibility to clear the cache)
to not need a lot of changes (or enable incremental ones) and will do
what's required:

prim = (source atom, interpretation thunk, cached compilation)

so:
- code can be re-interpreted by just clearing the cache
- all code, independent of semantics, can be specified as lists
- a 'data' mutator can be defined that strips a list from all
  executable semantics

so i guess the conclusion is:
- composite code is a concrete list of abstract primitives
- primitives contain memoization info
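the conclusion above can be sketched in python (hypothetical names): each primitive keeps its source atom, a rebindable interpretation thunk, and a clearable cache; composite code is a plain list of such primitives, and a 'data' mutator just strips the semantics.

```python
# word = (source atom, interpretation thunk, cached compilation)

class Prim:
    def __init__(self, source, interpret):
        self.source = source        # symbolic atom, always kept around
        self.interpret = interpret  # semantics; can be rebound
        self.cache = None           # cleared to re-interpret

    def compiled(self):
        if self.cache is None:      # memoized compilation
            self.cache = self.interpret(self.source)
        return self.cache

def as_data(composite):
    # 'data' mutator: strip a list of its executable semantics
    return [p.source for p in composite]

sem = lambda atom: ('compiled', atom)
code = [Prim('dup', sem), Prim('+', sem)]
assert as_data(code) == ['dup', '+']
assert code[0].compiled() == ('compiled', 'dup')
```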

this brings me to restate that in BROOD, the compiler itself is
written using OO techniques with mutable state, but the target
compilation is completely functional.

the reason is this:
- the host language is mostly about organization  -> mutable OO
- the target compiler is mostly about algorithms  -> functional + monads

the main practical reason for using a functional approach in the
compiler is the ability to work with continuations for very flexible
control structures. the 'constant' part is implemented as an OO
system.


Entry: reflection
Date: Tue Mar 13 13:46:25 EDT 2007

another thing i keep running into is the mixed use of 2 calling
conventions: scheme N->1 and cat stack->stack

it would be nice to have scheme only to provide primitives, and have
all other utility code be out in the open. however, given the way some
algorithms are implemented now, that is impractical. i can have all
the reflection i want, but not necessarily from CAT, since that would
make it harder to use scheme to implement the core of things..

maybe it's good to keep this in mind: CAT is just a minilanguage
inside scheme, and all the things i need to bring out can easily be
brought out if necessary. full reflection is not necessary yet.

probably CPS chapter in EOPL will make this a bit more clear..

ok.. getting rid of the parse/find abstract interface. it complicates
things too much..

one thing i didn't think of is that 'find' maps  thing -> word. so the
compiler for a symbol + find is something that looks up a word AND
dereferences the implementation.

yep. i noticed it is really a good thing to use closures instead of
explicit structures.. of course, this does mean that all the red tape
moves to the other side: all things that provide closures need to do
the binding.

it's not going as smoothly as expected: all through the code it is
assumed a primitive is either a function or a promise. so i guess it's
a good idea to change it now. the main problem is the 'find' as
expressed above, since i have an extra level of indirection that
distinguishes 'find' from 'compile', with explicit delayed compilation
(interpretation) instead of implicit.

i can probably work around it by providing:
- compile lifting
- special primitive registrars


i need to sleep over this..

current problem = pic18-literal: used in a lot of places. produces a
primitive but should produce a word. i changed this, need to check. the
rest should be straightforward..

also writer-register! is wrong, due to the lift not being
wrapped.. maybe better to just lift words instead of prims?
all this freedom!!


Entry: spaghetti
Date: Wed Mar 14 08:33:03 EDT 2007

the change above brings up some conceptual confusion.

- a word is a representation of an atomic piece of code. it retains
  its source representation, and a translator which defines its
  semantics.

- lifting is done on the primitives, not on the words. maybe that
  should change? NO

- the pattern (register! name (atom->word name compiler)) has 2
  occurrences of name. this is ok. the first one is an index, while the
  second one is there to recompile the word if necessary.

ok, it seems to work now..




Entry: reload
Date: Wed Mar 14 09:48:10 EDT 2007

i run into problems with redefining the structs: data lingering after
a reload is not compatible with the new type predicates and
accessors. this means code cannot survive a reload. this is a bit ill
defined, so i need to make a decision.


- if i don't redefine 'word' on reload, the struct can never be
  redefined at all, which is a bit of a nuisance.

- the other solution is to redefine 'word' and change all the data
  on the stack to reflect this change: the stack is the only thing
  that survives a reload, so it needs to be properly processed. it
  looks like i need a temporary structure to solve this.

- find a better way to implement reload.


the struct thing is really annoying. i need to find a solution to that
soon. all the rest works with reload, even using a 'symbolic'
continuation passing: after load, the repl loop itself is recompiled.

maybe i should separate the files into init and update. that way it is
possible to perform incremental updates.

ok. the solution seems to be to install a 'toplevel' continuation that
is passed the entire application state (stack). 'load' can then be
called with a symbolic code argument == continuation.


Entry: TODO
Date: Wed Mar 14 11:01:25 EDT 2007

got it hooked up. lots of things to fix:

- pic programmer endianness bug for high word?
- fix reload / scheme modules (using different files with include) DONE
- create a monitor/compiler for the 16bit threaded interpreter


a compiler for the threaded code would:
- map a list of words to their respective addresses
- perform tail call optimization

i should go straight for cat-like code with code quoting.

yep. the most important things to tackle now are modularity and
platform independence. aspect oriented programming :)

maybe i should leave the module stuff for later, since reloading is
not that easy... loading inside a module namespace might be possible
though.


Entry: languages
Date: Thu Mar 15 09:40:09 EDT 2007

how to combine 2 different languages in one project? i'm trying to
write the purrr language in terms of purrr/18, and i need an easy way
to switch between them.

i need a methodology.. what is a threaded forth? basically a set of
primitives. so what i need is:
- a list of primitive names
- a way to compile 'enter'

things to look at:
* unify all toplevel interpreters, so i can have more
* separate console.ss into machine specific things.


a toplevel interpreter is
- a string interpreter
- an exception handler
- a continuation


what if i store all the modes in the data store, symbolically?
seems to work.

now, about the VM. i think it's best i standardize on the VM mentioned
above: call threaded with return (jump / tailcall) bit. this can be
written in C too, so should eliminate most porting problems, with only
optimization problems remaining.

ok. back to where i started. should i allow 2 different languages on
one attached system? why would i want to do that? debugging of course,
but what else? there's a bit too much freedom here. 2 languages:
native + ST forth will complicate things, but will also make things a
lot easier to use.. and a threaded forth compiler isn't so incredibly
hard to build..

so, i work with one core language purrr/18 and build a threaded forth
on top of that using a different mode.

so.. next problem = representation for words. i'm using a simple name
prefix = underscore. maybe there's a better way to do this? name
prefixes allow use in the lower level language.

ok.. rambling on. the way to do it is to just translate highlevel
language into lowlevel forth, and pass it on to the compiler.

(1 2 +)  -> (ENTER ' _lit 1 ,, 2 ,, ' + EXIT)

here EXIT is a special word that installs the return bit in the MSB of
the last word.
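a python sketch of that translation step: literals are inlined after
_lit, and EXIT sets the return bit in the MSB of the last word (the
primitive addresses and the 16-bit layout are my assumptions):

```python
RETURN_BIT = 0x8000  # assumed: MSB of a 16-bit cell marks "return after this word"

# hypothetical code addresses for the lowlevel primitives
ADDR = {"_lit": 0x0020, "+": 0x0040}

def assemble_thread(items):
    """lay out a threaded definition as 16-bit cells: symbols become
    primitive addresses, numbers are inlined, and the last cell gets
    the return bit (EXIT)."""
    cells = [item if isinstance(item, int) else ADDR[item]
             for item in items]
    cells[-1] |= RETURN_BIT
    return cells
```

so (1 2 +) becomes _lit 1 _lit 2 + with the return bit set on '+'.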



Entry: TODO
Date: Thu Mar 15 13:17:53 EDT 2007

- variable abc abc 1 +
- flash addresses as literals
- exit bit
- write a paper about the absence of '[' and ']' and the relationship
  between literals and (dw xxx)


the first 2 are similar: it would be nice to partially evaluate some
code that uses words from the ram and flash dictionaries next to
constants.. this introduces another dependency.

currently the partial evaluator only resolves constant symbols. it
requires a new dependency [ dict -> compiler ] to resolve this
problem. there is a possibility to delay the evaluation of the
optimization until assemble time, by using closures..

there's a deeper problem here: name resolution needs to be fixed..

let's see..  partial evaluation can't fail in a sense that is
recoverable: if the literal optimization fails, it's a true error that
fails the entire compilation. this means the evaluation itself can be
delayed until the environment is ready, since the control flow does
not depend on the result.

a delayed evaluation has the form   \env -> value
while env is: name -> value.

the more i think about this, the better i like the idea.

ok, so the first 3 can be solved using some form of delayed evaluation
until exit.



Entry: delayed evaluation forced in assembler
Date: Thu Mar 15 15:56:08 EDT 2007

there is already one kind: symbolic constants. the addition that needs
to be made is generic expressions. there are several forms to
choose:
- symbolic lisp expressions
- symbolic cat expressions
- scheme closures

the first one is nice since it is symbolic, so easier
debugging. the last one might be simpler to implement. lisp style
expressions make more sense here since they have a single value, not a
stack.

now, this can be combined with the paper on partial evaluation:
partial evaluation should then be transformed to compile time meta
code evaluation.

actually:

1 2 +   ->   [ 1 2 + ]L

following colorForth, executed code always results in a literal on the
->green transitions.

ok, so this fixes the question above: 
* delayed code is symbolic cat.
* assembler does final evaluation of this code


so what is the context?

machine constant -> number
variable name -> data addresses
forth words -> code address

operations come from some dictionary, probably cat, but need to be
escaped somehow. let's say: search meta first, then variables, words,
constants.

this needs some changes: 

* the assembler needs to be a CAT word, so the stack can be used as
  context.

* it's probably better to wrap all symbolic names in a list, so the
  evaluation is uniform: either numbers or lists.


this seems to work pretty well.


Entry: 16bit threaded forth compiler/interpreter
Date: Thu Mar 15 18:06:23 EDT 2007

let's give them a name: the highlevel forth is PURRR and the lowlevel
forth is PURRR/18. here i use the shorthand names threaded and native
respectively.

first problem is the parser, since the forth needs its own parsing
words. that should be the only real problem. since this forth is
mainly for higher level stuff, i don't need machine constants: all
machine access is solved on a lower layer. actually, the different
namespace is a nice excuse for some simplification.

second problem is running code from the brood console. this needs a
little trampoline, since the only way to get out of a running
interpreter is to call 'bye':

' bye -> IP
<addr> _run CONT



Entry: added pattern debug
Date: Fri Mar 16 10:13:07 EDT 2007

the pattern compiler now has a debug method which dumps the source rep
of the patterns into the asm buffer for inspection. this is
implemented in the form of a match rule that matches ['pattern].


Entry: added robust reloading + logging
Date: Sat Mar 17 09:38:56 EDT 2007

error on reload: console waits for <ENTER> then reloads. this is to
give a chance to correct syntax errors without losing state.

also added 'cat.log' output, which enables the use of emacs'
compilation-mode. just run 'tail -f cat.log' as compile command.


Entry: trampoline
Date: Sat Mar 17 09:40:45 EDT 2007


ok, i got something wrong...

words stored in the dictionary are primitives. invoking a highlevel
word from within a lowlevel native context requires the use of these
primitives.

remember: highlevel 'run' ONLY takes composite words, while lowlevel
entry ONLY takes primitives. IF a primitive contains a highlevel
definition AND the primitive points to an ENTER call, THEN the rest is
highlevel code.

the correct way to build a lowlevel -> threaded trampoline is:

* set the highlevel continuation, saving the current one, to highlevel
  code that does (bye ;)
* call the primitive
* call the interpreter to invoke the continuation



Entry: delayed eval in assembler
Date: Sat Mar 17 10:18:14 EDT 2007

i should go to an architecture where the number of passes in the
assembler is not fixed, but just enough so all expressions can
evaluate with correct dependencies. maybe one pass is necessary to at
least find out if all the labels are defined.




Entry: literals
Date: Sat Mar 17 12:06:44 EDT 2007

should literals be handled in the interpreter?

problems: 

* a _lit instruction cannot drop the current thread, because the value
 needs to be accessed from the current thread. so the code "123 ;"
 translates to "LIT 123 NOP|EXIT"

* moving this to the interpreter by encoding the value in the opcode
  is possible, but then large words need 2 words. it also requires 2
  different lit instructions, implemented as:
  
  - LOAD + LOADHI
  - LOADLO + LOADHI
  - DUP + LOADSHIFT
  - LOAD + LOADSHIFT



the one with DUP requires only one explicit lit, but it always needs 2
words, even in cases where the data would fit into a single word.

i think code density is more important than any other constraint,
except conceptual simplicity of the language, which is independent of
the VM implementation. so it's probably going to be 2 different lits.

so how to implement that?

it's easy to detect if the high byte of an address is zero, since the
flags will be set. this could be the clue. the address space this
overlays is just the boot monitor, so that's no problem.

some bit twiddling. 'x>IP' uses movff so it doesn't affect flags; this
means testing the zero flag can be done after testing the carry
flag.

are literals important enough to give them half of the address space?
the answer is probably yes:
- they occur a lot
- if they are as cheap as constants, you don't need constants
- 14 bit signed words will cover most use for numbers (counters/accumulators)
- other literals are addresses: make sure memory model respects this opti


so how do we give them half the address space:
- effectively: use only 32 kb -> enough for now
- align words to 4 byte boundary (and possibly reclaim the storage..)

let's go for the first one: only 32kb address space. the other half
could maybe be used for byte access?

so, encoding primitives as

[ EXIT | LIT | ADDR/NUMBER ]

gives  EXIT -> c
       LIT  -> n  after one shift

this looks nice:

\ inner interpreter loop
: continue
    prim@/flags                 \ fetch next primitive + flags
    exit? if x>IP then          \ c -> perform exit
    literal? if 14bit ; then    \ n -> unpack literal
    execute continue ;          \ execute primitive

\ interpret doubleword [ 1 | 14 | 0 ] as a signed value.
: 14bit
    _c>>                 \ [ c | 1 | 14 ]
    1st 5 high? if       \ sign extend
	#xC0 or  continue ;
    then
	#x3F and continue ;


after fixing a bug in 'd=reg' it seems to work
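the cell decoding described above, as a python sketch (i'm assuming
bit 15 is the exit flag, bit 14 the literal flag, and the low 14 bits
the address or the signed number):

```python
def decode_cell(cell):
    """decode a 16-bit cell [ EXIT | LIT | addr/number ]."""
    exit_flag = bool(cell & 0x8000)
    if cell & 0x4000:
        n = cell & 0x3FFF
        if n & 0x2000:       # sign extend the 14-bit number
            n -= 0x4000
        return ("lit", n, exit_flag)
    return ("exec", cell & 0x3FFF, exit_flag)
```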


Entry: parsing
Date: Sat Mar 17 19:00:52 EDT 2007

just fixed the interpreters and direct->forth translators using
parser.ss

somehow something doesn't feel right though.. parsing words feel
'dirty'. i'll try to articulate why, since i don't think anything can
be done about it.

internally, quoting is no problem: you just build a data type (word)
that supports quoted functions/programs/symbols.. in CAT this is done
by creating primitives that map  stack -> (thing . stack)

however, in program source code it is problematic: non-quoted
compositional code has a 1->1 correspondence symbols<->semantics, and
the semantics of successive words is not related. quoting is about
modifying the semantics of symbols.

one example where this is done very nicely is colorForth: here the
color of a symbol is part of the source code, and represents
information about how to interpret a symbol name. in textual form this
would be something like

(red drie) (green 1) (green 2) (green +) (green ;)

here a pair of (color word) represents a single semantic entity. in
ordinary forth however, it's not done this way: not all words have a
prefix (a color). an other way to say it: most words use the default
'color'.

so, in a sense, the thing that is 'dirty' is the default
semantics. this is not so bad for convenience's sake, but it does
require a parser that introduces the semantics. otherwise we would have

: drie  number 1 number 2 word + word ;

which is really what it is parsed to in the end.. the thing i'm being
anal about is that CAT has a 1->1 correspondence between syntax and
semantics, inspired by Joy. although, this is not entirely true. a
syntactic shortcut in the form of (quote thing) is introduced to be
able to quote lists and symbols. but this is not entirely necessary:

'(1 2 3)  ==   (1 2 3) data
'foo      ==   (foo) data bar

with these operations being a bit less efficient. that concludes the
rant.


Entry: quasiquote
Date: Sun Mar 18 09:00:31 EDT 2007


which leads me to the following. it does make
sense to have lists of programs in CAT, where quasiquote would come in
handy.

`(,(+) ,(-))


Entry: program->word
Date: Sun Mar 18 09:09:50 EDT 2007

some nitpicking about constant->word. before i had quoted programs
wrapped using constant->word. this doesn't make sense, since the
'constant' is really a parsed thing, and not a source representation.

however, it does enable 'data' to do its work. but why don't i just
quote the source of the entire program, and store the parser as
semantics? that would be cleaner, but something doesn't feel right
there either..

well, actually. i can just delay parsing completely! that seems like
the right thing to do: the source can just be retained in its original
form, and initial recursion during parsing is avoided, which directly
solves the problem of setting! an atom's semantics.



Entry: lazy eval
Date: Sun Mar 18 10:06:18 EDT 2007

i think i start to see why a lazy language can be so convenient.. i
spend quite some time trying to figure out when it's best to evaluate
some expression. if this is always "as late as possible" this work
should disappear. nevertheless, it's an interesting exercise.

for the assembler it might be interesting to write it completely
lazily, including the optimizations necessary for jumps, which i still
need to implement.


Entry: disassembler
Date: Sun Mar 18 14:40:43 EDT 2007

disassembler needs to be smarter. i probably need to add some
semantics to the fields, and have a platform-specific map translate
them:


resolver closure + asm code -> [ shared code ] -> disassembled ->
prettyprint.


Entry: open files
Date: Sun Mar 18 17:15:50 EDT 2007

something is terribly wrong with the open files.. fixed by manually
closing. i think i need to read about how ports get garbage collected,
or not.. indeed. they are not, need explicit close or make an
abstraction:

http://list.cs.brown.edu/pipermail/plt-scheme/2004-November/007247.html



Entry: where to go from here?
Date: Tue Mar 20 01:01:02 EDT 2007

enough mudding about. roadmap:
- get dtc working with host interpret/compile
- make it self hosting
- combine with synth

- dspic asm + pic18 share/port


Entry: a safe language?
Date: Tue Mar 20 01:22:58 EDT 2007

[ 1 + ] : inc
[ 2 + ] : inc2

is it possible to make a safe language without too much trouble?
something like PF. without pointers.. 

[1 2 3] [1 +] for-each

the interesting thing is that i can use code in ram if i unify the
memory model. i think it's time to start to split one confusing idea
into 2:

- a 16/24bit dtc forth for use with sheepsynth dev: control computations
- a self contained safe linear language for teaching and simple apps


safe means: 
* no raw pointers as data
* no accessible return stack, so it can contain raw pointers
* no reason why numbers need to be 16 bit: room for tags
* types:
  - number   [num  | 1]
  - program  [addr | 0]

features:
* symbols refer to programs, special syntax for assigment
* assigning a number to a symbol turns it into a constant
* for, for-each, map, ifte, loop

[ 1 + ] -> inc
1 -> inc

[[1 +] for-each] -> addit
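the number/program tagging above, sketched in python with the tag in
the low bit (which bit carries the tag is my assumption):

```python
def tag_number(n):
    return (n << 1) | 1      # number:  [ num  | 1 ]

def tag_program(addr):
    return addr << 1         # program: [ addr | 0 ]

def untag(cell):
    """recover the type and payload from a tagged cell."""
    if cell & 1:
        return ("number", cell >> 1)
    return ("program", cell >> 1)
```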


now.. lists?
the above is enough for structured programming, but map and for-each
don't make much sense without the data structures.. so programs should
be lists, at least semantically. since flash is write-once, a GC would
make more sense than a linear language.. so what about:

purrr/18 -> purrr -> conspurrr

maybe it's best to stay out of that mess.. cons needs ram, not some
hacked up semiram.

what about using arrays? if programs are represented by arrays instead
of lists, not too much is lost:

[1 2 3 4] [PORTA out] for-each   ;; argument readonly = ok
[1 2 3 4] [1 +] map              ;; argument modified in place (linear)

the latter one needs copy-on-write.

[[+] [-]] [[1 2] swap dip] for-each

what about

[1 2 3] [1 +] map -> test

1. arrays are initially created in ram, as lists?
2. when assigned to a name, they are copied to flash
3. assignment is a toplevel operation, effectively (re)defining constants
4. flash is GCd in jffs style.
5. words can be deleted.

in ram: one cell is 3 bytes: 2 bytes for contents + 1 byte next
pointer. this leaves room for 256 cells, or 768 bytes.

it might be interesting to make assignment an operation that's valid
anywhere: persistent store.. on the other hand, that encourages misuse.

so..
- free lists make no sense in flash
- they do in ram
- persistent store rocks

in order to make this work, i need to write a flash filesystem first.

problem: does redefining a word redefine all its bindings? it
should. so each re-definition needs to be followed by a
recompilation. nontrivial. this gets really complicated...

can't we represent code as source, and cache it in ram? it looks like
variable bindings should really be in ram. but what with persistent
store?

damn. dumbing it down ain't easy. i think maintaining the late binding
approach is infeasible. maybe it's good enough to clean up the
semantics a bit? 1->1 syntax->semantics mapping (i.e. choose is the
only conditional) so code can be used as data using 'map'. maybe that
does make sense.. 'map' as 'interleave'.

ok, that's enough. 


Entry: language in the morning
Date: Tue Mar 20 09:02:11 EDT 2007

after 4 hours of sleep: it's hard to say goodbye to nice ideas when
they don't work for practical reasons.. still there's something here.
i think i just need to read the PICBIT paper by Marc Feeley and Danny
Dube, and base it on that. it looks like i just need to be wasteful:
everything is source code, the flash is a filesystem, and the ram
contains executable code. the most important of all: it should be
towered:

purrr/18 -> purrr -> cat/18

so to distill again:
- cons cells in ram
- a flash file system


it is interesting how the linear/nonlinear language thing i'm using,
and this linear ram and nonlinear flash memory model coincide.

the approach in PICBIT seems interesting: using fixed size cells of 24
bits = [2 | 11 | 11], with the types:

00 PAIR
01 SYMBOL
10 PROCEDURE
11 one of the others
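a python sketch of such a 24-bit cell = [2 | 11 | 11]; the order of
the two 11-bit fields is an assumption:

```python
TAGS = {0b00: "pair", 0b01: "symbol", 0b10: "procedure", 0b11: "other"}

def pack_cell(tag, a, b):
    """pack 2 tag bits + two 11-bit fields into a 24-bit cell."""
    assert 0 <= a < 2048 and 0 <= b < 2048
    return (tag << 22) | (a << 11) | b

def unpack_cell(cell):
    return TAGS[cell >> 22], (cell >> 11) & 0x7FF, cell & 0x7FF
```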


Entry: distributed system
Date: Tue Mar 20 09:53:19 EDT 2007

i was thinking: this tethered approach makes a whole lot of sense in
the case of one host controlling a huge amount of identical cores.


Entry: back to dtc
Date: Tue Mar 20 10:42:24 EDT 2007

got compile + interpret working. time for control structures. i'm
seriously considering only using code quoting. but how to implement?
same as in PF?

it's actually not so hard:

x x x { y y y } z z z
      |
   
x x x quot L123 ; y y y ; : L123 z z z


this does require a stack / recursion to associate the labels. another
way to deal with it is to solve it in the parser, and use real
lists. or the lowlevel forth could be extended to use something like
this, which is probably easiest.
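the parser-side variant could look like this python sketch: a stack
associates the labels, and nested quotations come out for free (token
and label representations are made up):

```python
from itertools import count

def lift_quotations(tokens):
    """replace each { ... } quotation by ('quot', label) and emit its
    body as a separately labeled block."""
    labels = count(1)
    code, blocks, stack = [], [], []
    stack.append(code)
    for t in tokens:
        if t == "{":
            stack.append([])                 # start a new quotation body
        elif t == "}":
            body = stack.pop()
            label = "L%d" % next(labels)
            blocks.append((label, body))
            stack[-1].append(("quot", label))
        else:
            stack[-1].append(t)
    return code, blocks
```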


Entry: hands on pic hacking
Date: Thu Mar 22 17:10:19 EDT 2007

playing with the synth board. it resets from time to time. found that
touching the PGM pin causes this. this pin is floating in my board, so
i guess that's where the problem is.

datasheet says:

CONFIG4L 300006 bit 2 LVP enable1 / disable0

indeed, this is on. as long as this is enabled, normal port functions
are disabled. moral of the story: disable it, or tie it high, or
enable weak pullups.



Entry: sheepsint
Date: Thu Mar 22 17:38:38 EDT 2007

after fixing the PGM bug (LVP disabled now), it still crashes from
time to time. i suspect it's some kind of interrupt thing.. lets
disable stack reset and see if it still crashes.

tss... watchdog timer was on. stupid.


Entry: modeless interface
Date: Thu Mar 22 18:55:22 EDT 2007

- modeless interface (unix socket) to send brood commands for emacs
- normal boot vs interpreter based on activity on reset


i should find a decent protocol to interrupt an app: to attach a
console easily, but to have it running most of the time.


Entry: partial evaluator
Date: Fri Mar 23 00:50:37 EDT 2007

i'm probably just getting tired, but isn't it a lot better to do
partial evaluation on source instead of assembly code? there is some
elegance to the greedyness of the algorithm. somehow, this feels
ok.. but if i type 1 2 +, it's always going to be equivalent to 3.. if
literals can be identified at the time they are compiled, their
compilation can also be postponed..

i don't really have a good explanation. what i do know is that this
works because it is fairly decentralized.. the price paid is "literal
undo" which is not so hard, and also works for pure assembler.

don't know if this is going to make sense.. a symbol's semantics is
only defined by what machine code it will be compiled into. (concrete
semantics) for forth, this is either a function call or some inlined
machine code. since the latter is highly machine specific, it doesn't
really make much sense to separate that out into partial evaluator +
optimizer, since the optimizer is going to add some bit of partial
evaluation anyway.. it's better to put some effort into making the
code separable: some patterns go for all register machines, some go
for all pic chips, ...

as i found so far, 

1. abstractions will arise whenever they are hinted by redundancy or
   "almost redundancy".

2. if you build an abstraction you don't use later, you
   lose. abstractions make code more complicated, and are only
   justified by frequent use.

3. don't hesitate to keep towering abstractions until the redundancy is
   gone. some problems really do need several layers to encode
   comprehensibly.

what i'm intrigued by:

4. solve only one thing per layer. (one aspect). if the abstractions
   do not stack, find a way to disentangle them, and weave them back
   together automatically.


Entry: compiler compiler
Date: Fri Mar 23 10:00:04 EDT 2007

seems you can't really use macros to write macros without extra
effort in mzscheme. it defines level 0 and level 1 environments
(normal and compiler), but a level 2 (compiler compiler) cannot be
easily used without the use of 'require-for-syntax'

the thing i ran into is this: i want to use a macro to generate a
pattern matching expression inside a define-for-syntax function that
is used to implement a macro that generates a pattern matching
expression.

maybe it's best i just switch everything to using modules, and reload
the full core when i'm reloading. i'm getting a bit tired of these
kind of problems.

questions:
1. is it possible to reload a module?
2. how to only recompile what's changed to reduce load time?



Entry: cat as plt language?
Date: Fri Mar 23 13:54:06 EDT 2007

ok, but what is apply in that case?

(apply fn args) == (run-composite stack composition)

in other words, exchange single code multiple data to single data
multiple code. apply then still means means: convert a data + code
into a data.



Entry: modularizing cat
Date: Fri Mar 23 14:35:26 EDT 2007

brings up a lot of problems.. some of the macros i'm using like
snarf-lambda are not very clean wrt names and values..

i also 'communicate using global values' which is not a very good
idea.. so it's going to take a bit longer than expected, but the code
should be a bit cleaner when it's done.

ok, now for the big one. pic18.ss

generic forth stuff: need to spend some time to separate out the
sharables, which is a lot..

i do wonder if i really need both writers and asm state monads.. it is
cleaner, but also a bit of a drag..

i need a proper mechanism to do this separation.

but first, get this thing to load properly.. got some bugsies here and
there. seems the compiler works fine, but the assembler has got some
problems.

ok. seems to work now. also compilation seems to work.



Entry: macro namespace
Date: Fri Mar 23 19:02:35 EDT 2007

there is really no reason to have multiple macro namespaces. i mean:
namespaces are defined using hashes. it's easier to just load the
generics, then overlay the specifics instead of having a lot of
special names in the dictionary.. in other words: the pic18* words
should be replaced by global unique things, denoting the fixed
functionality:


* machine constants
* simple/full forth parser
* macros
  -> recursive              
  -> pattern matchers
  -> writers
  -> asm state modifiers


all specific functionality is added on top by overlaying the
code. this used to be done with "load" but is now done using
"require". order of execution is preserved in require ???



Entry: double postpone
Date: Fri Mar 23 21:16:39 EDT 2007

i'm running into problems with macro generating code.. fixed
some. cleaned up some in vm.ss

now i have an interesting problem with delayed eval: macro defs (side
effects) get delayed till after the macros are used..

ok i think i got it..

what about tagging names that are supposed to be cat semantics in a
certain way?

ok.. this concludes a long run. from the top of my head, things are
better now because:

- badnop is better defined as a forth compiler with fixed
  functionality mentioned above

- code makes clear indication if functions are used as cat semantics
  == code that compiles something into a stack primitive.

- 'compile' and 'literal' are now CAT macros

- the state monad uses a more highlevel wrapper


things to do still:

- constants for disassembler
- disassembler
- core restart
- clean up source file layout, maybe split in more modules + docu

funny... running into an evaluation order problem again.. maybe i
should use some kind of module / scheme namespace trick to get rid of
this? because load/eval/parse order is kind of arbitrary now.. ->
nothing to worry about. it was a stupid typo.

got the meta-patterns macro working too. this is actually an
interesting idiom: just wrap a single macro around a body of 'define'
statements to alter the way they are used: it allows proper syntax
hilighting + individual testing. (syntax highlighting, that is.)



Entry: so what is badnop?
Date: Sun Mar 25 16:02:37 EDT 2007

a native forth compiler for register machines, with provisions for
harvard architectures, and provisions to build a dtc interpreter on
top of a native wordlength forth.

the platform specific parts are: assembler generator, pattern matching
peephole optimizing code generator, and some recursive macros.



Entry: persistent store
Date: Sun Mar 25 18:59:45 EDT 2007

so.. it would be way easier to just have the compiled forms cached on
disk. but i guess if that's really necessary i can always write out
scheme files and compile them. for the rest: all persistent data
should be SYMBOLIC.

this means:
- no compiled CAT code (word)
- no continuations in asm

this seems really important.. an area where compromise leads to
unnecessary complexity. i'm going to leave it open, and implement
restart by reload, giving only the parameter. 

this is turning into a "where to put stuff" quest again.. ok. keep it
like it is, and put the data stack in the state store + perform some
checking to see if data is serializable before writing it out.



Entry: debugging tools
Date: Mon Mar 26 15:10:12 EDT 2007

need more debugging tools:

- some safe way of dealing with the bootblock (mainly isr) OK
- on-demand console: interrupt app OK
- proper disassembler
- 'loket'
- documentation: how to document the language?

dasm needs some thought. the interrupt app is as simple as polling the
rx-ready flag, i.e.  "begin params rx-ready? until"


Entry: i need something new
Date: Tue Mar 27 10:22:26 EDT 2007

the dasm might be interesting.. maybe i should do that. but i'd like
to do something exciting today :)

wrote some badnop docs, changed some names.. maybe i should have
user definable semantics accessible in CAT itself? (more reflection?)



Entry: the road to PF
Date: Tue Mar 27 11:13:48 EDT 2007

ok. time to write PF in forth, by gradually bootstrapping into
different languages. the lifts are:

1. vector -> linear lists
2. non-managed -> refcount managed
3. untyped -> type/polymorph
4. proper GC
5. scheme


the first lift is the same as the one i already did, which is lifting
native code to vectored rep. the lower interpreter's composites become
the higher interpreter's primitives. however, if data is also being
lifted, the change is in no way trivial: primitives won't accept the
data until it's moved to a linear stack.

so maybe this needs to be separated. the lift to lists is different
for data than it is for code. on the other hand, it does look like a
nice place to insert some type checking code.

need to think a bit more.. 



Entry: multimethods
Date: Tue Mar 27 11:49:14 EDT 2007

i had this idea of representing types using huffman coding, in a
binary tree. this requires a set of fixed types and some information
about which ones are used most, but it might be quite optimal.

there is a lot of room for optimization here, moving type checks
outside of functions etc.. but it will probably require some type
specs.
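the huffman idea itself is standard; a python sketch of deriving the
codes from (made up) type frequencies:

```python
import heapq

def huffman_codes(freqs):
    """assign prefix-free bit strings to types; frequent types get
    shorter codes (standard huffman construction)."""
    heap = [(f, i, [t]) for i, (t, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    codes = {t: "" for t in freqs}
    n = len(heap)
    while len(heap) > 1:
        f1, _, ts1 = heapq.heappop(heap)
        f2, _, ts2 = heapq.heappop(heap)
        for t in ts1:
            codes[t] = "0" + codes[t]    # left branch of the merge
        for t in ts2:
            codes[t] = "1" + codes[t]    # right branch of the merge
        heapq.heappush(heap, (f1 + f2, n, ts1 + ts2))
        n += 1
    return codes
```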



Entry: poke
Date: Wed Mar 28 10:57:34 EDT 2007

let's write poke again, the PF vm. the first thing i need to do is to
generate C code from some sort of s-expression.

expression conversion seems trivial, just need to distinguish between
the builtin infix operators, and prefix expressions with comma
separated argument lists.

statements are more problematic. bodies are straightforward, but how
to handle special forms like for/while/do ?

seems i got most of it running now. main features:
- an s-expression interpreter with a primitive and a composite level
- used to implement 2 interpreters for statements and expressions
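the expression half of that could look like this python sketch (the
operator set and formatting are arbitrary choices, not the actual
cgen code):

```python
INFIX = {"+", "-", "*", "/", "==", "<", ">"}

def c_expr(e):
    """format an s-expression as a C expression: infix for builtin
    operators, f(a, b, ...) for everything else."""
    if not isinstance(e, (list, tuple)):
        return str(e)
    op, args = e[0], [c_expr(a) for a in e[1:]]
    if op in INFIX:
        return "(" + (" %s " % op).join(args) + ")"
    return "%s(%s)" % (op, ", ".join(args))
```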

now i was thinking if it would be possible to create some kind of
downward lambda. i can't use the gcc extension..

yes, but i do need to allocate ALL functions in structures, meaning
explicit activation records, and use lexical addresses. if this is
used, it's better to completely forget about any local C variables.


Entry: downward funargs
Date: Thu Mar 29 16:12:22 EDT 2007

so, attempt to create a 'downward lambda' for poke. allocating on
stack for now, with later possibility to allocate on heap. how hard is
this to have in some form?

simplifications:
- all cells are the same size
- values are pointers to 'object'

this needs quite a bit of support:
- environments
- closures

the function bodies themselves take:
- environment
- arg list (part of environment?)

a function invocation is:
- create environment extension
- run function
- cleanup environment extension


{
	object_t env[3];             // parent + 2 variables

	// invoke a function 'FUN'
	({
		// create new environment
		object_t ext[2];
		ext[0] = (object_t)env;  // link parent frame
		ext[1] = 123;            // init first and only arg
		FUN(ext);                // invoke fun
	});
}


this resembles PICO.


ok.. going a bit too far here. what about introducing these features
when they are really needed?

one question though.. if only downward closures are needed, why not
use dynamic binding instead?

nuff.



Entry: back
Date: Thu Mar 29 17:50:44 EDT 2007

back to the code generator. the reason i wrote this was twofold. one
is to have a portable target for brood forth. the main idea there is
to rewrite mole into something more graceful, and have a basis for
(re)writing PF.

and two: i need a language for expressing the signal processing code
in PF. this should not be forth, but a multi -> multi dataflow
language. maybe just forth + protos?

so. i think the next step should be to transform current cgen (poke)
so it has an extensible name space.

maybe it is a good time to look into defining new languages inside
PLT, since that's what i'm doing basicly, instead of mucking about
with explicit environment hashes and interpreters.

something to iron out: it's not a new language, it's a cross-compiler:
you want to define functionality accessible in one name space using
functionality accessible in another name space.



Entry: extending cgen name spaces
Date: Fri Mar 30 10:31:15 EDT 2007

i don't really need to make the hash tables available. it's much
easier to just create a new interpreter function which falls back on
the basic one defined in cgen.ss

hmm.. i got myself in trouble again. the above doesn't work since
statement/expression are mutually recursive. in addition to that,
statement uses closures. maybe i do need a hash?

ok. i think i got it ironed out a bit. using a hook for both the
expression and statement formatters, and calling this hook recursively
does do the trick.


Entry: compiler structure
Date: Fri Mar 30 15:01:41 EDT 2007

so.. basically, a compiler/assembler/whatever has the following
'natural' structure:

T = target language
S = source language
C = compiler language

it's best to separate the S -> T map into:

primitive macros  S -> T  (small)
composite macros  S -> S  (big)

you want to write both S -> T and S -> S maps in C. the reason you
want an S -> S map is that it contains higher level code than an
S -> T map.

one pitfall is to shield functionality in C by not properly mixing in
the T name space. the most straightforward way to implement both maps
is quasiquoting: quoted S or T and unquoted C. including the compiler
language is more precise:

primitive:   C,S -> T
composite:   C,S -> S

badnop is already organized this way: the primitives are peephole
optimizing pattern matchers, where C is scheme. writers and state
modifierd are composite, with C being cat. and the recursive macros
are a cleaner S -> S map, with C empty.



Entry: lifting
Date: Fri Mar 30 15:22:03 EDT 2007

now for the ambitious part. the thing that got my whole forth/PF thing
started is a desire to generate automatic control structure for video
DSP building blocks. basicly:

IN:  a highlevel description of how pixels are related through
     operations

OUT: a compiled representation processing images / tiles

the core component here is loop folding:

(loop { a } then loop { b }) -> (loop { a then b })

the win is memory bandwidth: intermediates should not be flushed to
main memory.

so compilation generates the control structure. compilation 'lifts'
the pixel building blocks into something interwoven with the control
structures.
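the folding rule above, sketched in python on a made-up representation
(('loop', body) tuples; this assumes both loops range over the same
iteration domain):

```python
def fold_loops(prog):
    """Rewrite (loop a) (loop b) -> (loop (a b)): adjacent loops over
    the same domain are fused so intermediates can stay local instead
    of being flushed to main memory between the two loops."""
    out = []
    for stmt in prog:
        if (out and isinstance(stmt, tuple) and stmt[0] == 'loop'
                and isinstance(out[-1], tuple) and out[-1][0] == 'loop'):
            out[-1] = ('loop', out[-1][1] + stmt[1])   # fuse bodies
        else:
            out.append(stmt)
    return out

prog = [('loop', ['a']), ('loop', ['b']), 'sync', ('loop', ['c'])]
print(fold_loops(prog))
# [('loop', ['a', 'b']), 'sync', ('loop', ['c'])]
```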



Entry: grid processing
Date: Fri Mar 30 16:01:57 EDT 2007

the possible optimizations depend tremendously on the amount of
information available on the individual processors, so the idea is
to keep the primitive set really simple, and look at their
properties.

* associative    (n-ary op consisting of n-1 binary ops)
* commutative    (binary op)
* linear/linear1


+   l a c
*   l a c
/   l1
abs


the typical structure to look at is a one dimensional FIR filter,
since this can be extended to 2D (space) and 3D (space+time)
filters.

(* gain
   (+ x
      (n x 0 -1)
      (n x 0 +1)))

let's analyze. constant subexpressions like a (/ 1 3) gain can be
evaluated up front. x is used in an 'n' expression, which we use to
denote membership of a grid. let's make all parameters into grids

(* (gain)
   (+ (x 0)
      (x -1)
      (x +1)))

so (gain) is a 0D grid, (x 0) is a 1D grid, (x 0 0) is a 2D grid, etc..

composite operations can be specified, for example

(processor (a b)
  (+ (a) (b 0) (b 1)))

this means all parameters need to be declared, since we need to know
the order. the syntax i'm using here requires ordered parameter
lists. i prefer this over keywords, since it is more compact, and we
need to fill in all inputs anyway (no explicit defaults).

another interesting operation on an expression is to compute its
reverse: an expression represents a dependency graph, which can be
inverted. however this is only interesting for multiple inputs, which
we won't use yet: use explicit subexpression elimination and graphical
programming instead.

ok, so we need parameter names.

another interesting operation is fanin: how many times is a single
value used? this is important for memory management
(linearization). note that linearization and operation sequencing is
almost equivalent to translation to forth.

maybe it's time to go for the first iteration binder. we map a single
function to an explicit iterator. i.e.

(+ (a 0) (a 1))

it has a single 1D grid input, and produces a single grid. ah!
something i forgot: what's the output type? a grid of dimensionality
equal to the maximum of the input grids.

so, an n-dimensional grid is placed on the same notational level as an
n-ary procedure.

ok. the above can be transformed to the loop body

(+ (index a (+ 0 i)) (index a (+ 1 i)))

where i runs over the line. the rest is border values:

(+ left (index a 0))
(+ (index a w) right)

so the idea is to make the loop body and the 2 borders.
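that split can be sketched in python (hypothetical representation:
valid body iterations are those where every stencil offset stays
inside the grid; whatever is left at each end becomes border code):

```python
def split_1d(offsets, width):
    """For a stencil reading a[i+k] for k in offsets, return the i
    range where all accesses land in [0, width), plus the border i's
    that need specialized code."""
    lo = max(0, -min(offsets))        # first i with all offsets valid
    hi = width - max(offsets)         # one past the last valid i
    body = ('loop', lo, hi)
    borders = ([('edge', i) for i in range(0, lo)] +
               [('edge', i) for i in range(hi, width)])
    return body, borders

# the (+ (a 0) (a 1)) example: one border iteration on the right
print(split_1d([0, 1], width=8))
# (('loop', 0, 7), [('edge', 7)])
```

a symmetric 3-tap stencil [-1, 0, 1] gives one border iteration on
each side, matching the left/right border expressions above.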

implementation (see ip.ss)

     implicit  ->  explicit
     (a 0          (a ([I 0]  0) 
        1             ([I 1]  1) 
       -1)            ([I 2] -1))
                          | 
            loop depth ---X



Entry: thinking error
Date: Fri Mar 30 19:53:11 EDT 2007

the error i made previously was to 'precompile' things: bind stuff to
tiles, then bind some stuff later in an interpreter. the problem with
this is that you're solving the same problem twice. not very good..

a much better idea is to keep everything in a highlevel description,
then compile it as composition goes on: one thing i'm dreaming about
is to build things in a pd patch, then hit 'compile' for an
abstraction, and it will compile an object that performs the
operation.

so, the other error was to use low level reps. forth has benefits, but
not for writing compilers, which is mainly template stuff: mixing name
spaces. you really need quasiquoting and random parameter access.

EDIT: this is what's so nice about the scheme macro system: the mixing
of compiler and target namespaces works really well.

Entry: monads and tree accumulation
Date: Sat Mar 31 10:44:58 EDT 2007

while writing the source code analysis functions i ran into the
following problem: map a tree, but also run an accumulation. now of
course it's easiest to just use local side effects here, since they
behave functionally from the outside. (linear data type construction).

but just out of curiosity, what kind of structure is necessary to do
this functionally?

basic idea of monads: if you don't save 'extra' data in the
environment, save it in the data. this requires 'map' to be
polymorphic, so it can act on this type accordingly. i don't think
it's worth the trouble here.



Entry: boundaries
Date: Sat Mar 31 12:14:13 EDT 2007

border values

using finite grids, borders need to be handled. basically, invalid
indexing operations need to be replaced by valid ones. some
strategies:

constant:    (a -1 0 0) -> c
repeat:      (a -1 0 0) -> (a 0 0 0)
wrap:        (a -1 0 0) -> (a (wrap -1) 0 0)
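the three strategies, sketched in python for a single index (the
('ref' ...)/('const' ...) tags are made up):

```python
def fix_index(i, n, strategy, const=0):
    """Map an index into grid dimension of size n to a valid access,
    using one of the border strategies: constant, repeat, wrap."""
    if 0 <= i < n:
        return ('ref', i)                      # already in range
    if strategy == 'constant':
        return ('const', const)                # (a -1) -> c
    if strategy == 'repeat':
        return ('ref', min(max(i, 0), n - 1))  # (a -1) -> (a 0)
    if strategy == 'wrap':
        return ('ref', i % n)                  # (a -1) -> (a n-1)
    raise ValueError(strategy)

print(fix_index(-1, 4, 'repeat'))  # ('ref', 0)
print(fix_index(-1, 4, 'wrap'))    # ('ref', 3)
```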

how to name border regions? there are several distinct cases, for
example a square grid has these:

(L L) (I L) (H L)
(L I) (I I) (H I)
(L H) (I H) (H H)


L  low boundary
I  bound to iterator
H  high boundary

with looping indicated by { ... } a full 2D loop looks like:

   (L L)           ;; top left
   { (I L) }       ;; top
   (H L)           ;; top right
   {
       (L I)       ;; left
       { (I I) }   ;; bulk
       (H I)       ;; right
   }
   (L H)           ;; bottom left
   { (I H) }       ;; bottom
   (H H)           ;; bottom right


that basically solves the problem. note that it's best to lay out the
code in an L I H fashion to keep locality of reference.
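the region cases are just the cartesian product of (L I H) per
dimension, 3^n regions in total; a minimal sketch:

```python
from itertools import product

def regions(rank):
    """Enumerate the 3^rank border/bulk regions of a rank-dim grid:
    each coordinate is L (low border), I (iterated), or H (high)."""
    return list(product('LIH', repeat=rank))

for r in regions(2):
    print(r)
# ('L', 'L'), ('L', 'I'), ('L', 'H'), ..., ('H', 'H'): 9 regions
```

the ('I', 'I') case is the bulk; everything else is border code with
one or more indices bound to a constant edge.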

on to representation.

the loop body is a serialization of an N-dimensional 3-grid. (a 2-grid
is a hypercube). it's serialized into a ternary tree.

how to represent ternary trees? the following representation looks
best in standard lisp notation:

     ((L . H) . I)

other variants have the dot in an awkward place. another possible rep
is (I H L) which can be written in mzscheme's infix notation as

      (H . I . L)

i'm going for the former, as it allows using (B . I) in case L and H
are the same, in order to generate the full loop body.

EDIT: it's easier to just use s-expressions: (range H I L), and have
'range' be a keyword..

loop borders can be constructed using the data structure provided by
'src->loopbody'

ah! it's possible to separate the operations performing loop order
allocation and pre/post expansion, but probably not very
desirable.. so let's combine them, so we can get rid of using natural
numbers.

note: i found out that when i need index lists, i'm doing something
wrong: applying a certain order on things...

so, in order to generate the tree above, we consume coordinates from
left to right. all loop transformations need to be done on the source
code before generating loop bodies.


Entry: lexical loop addresses
Date: Sat Mar 31 14:33:53 EDT 2007

i need a notation for addressing loop indices. currently i'm
converging on not updating pointers in a loop, but using indexed
addressing, since that's something that can be done easily in
hardware.

an optimization here is to use relative addressing only for the inner
loop, so only one index needs to be added, and cache the computation
for all other relative accesses.


each loop has exactly one index that's being incremented. the depth of
the loop determines how many indices are bound. what i'm trying to do
is to generate the border conditions that have not all indices
bound. how to do that?


loop a {

     ... data (a) ...
	  
     loop b {

     	  ... data (b c) ...

     	  loop c {

	       ... data (a b c) ...

	  }
     }
}


the inner loop here needs to be split into 3 parts

data (l  b c)
data (a  b c)
data (h  b c)

then the 2 unbound parts can be moved out of the loop.


so, basically.

BODY -> (nonfree . free)

as an example, take (+ (a 0) (a 1))

split in
      (+ (a 0)    (a 1))       ;; border
      (+ (a (i 0)) (a (i 1)))  ;; body


since code is originally in unbound form, it might be more interesting
to perform binding inward. start from the relative description, and
split this into a partially bound and partially filled structure.

              border <- relative -> bound

then iterate downward

before this is possible, all code needs to be translated to the full
'virtual grid' form. later on, it can be substituted back to its
original form.





Entry: representation
Date: Sun Apr  1 10:22:02 EDT 2007

ok, i think i got the basic idea, so it's time to start using some
abstract data structures. on the other hand, if using list structures
is possible, debugging is more convenient.. sticking to lists.


Entry: breadth-first
Date: Sun Apr  1 16:52:47 EDT 2007

i think this is the first time i ever encountered a problem that's
easier solved using breadth first expansion.

hmm.. that's probably plain bullshit.. it's just my particular
approach at this moment using an 'infinite' expansion with an escape
continuation:

(define (expand e)
  (call/cc
    (lambda (done)
       (let ((expand-once 
              (lambda (f) ...
                      (done e))))
            (expand (expand-once e))))))

basically, this just iterates expand over and over, and backtracks to
the last correct expansion 'e' whenever some termination point is
reached in expand-once.

ok, abstracted in 'expand/done'
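the same trick, sketched in python: an exception stands in for the
escape continuation, and expand keeps handing the last good result
back to expand-once (the toy 'double' expander is made up):

```python
class Done(Exception):
    """Raised by expand_once when there is nothing left to expand."""
    pass

def expand(e, expand_once):
    # iterate expansion to a fixed point; the last good 'e' survives
    while True:
        try:
            e = expand_once(e)
        except Done:
            return e

# toy expander: inline 'double' as 'dup +' until none are left
def step(e):
    if 'double' not in e:
        raise Done
    i = e.index('double')
    return e[:i] + ['dup', '+'] + e[i+1:]

print(expand(['double', 'double'], step))
# ['dup', '+', 'dup', '+']
```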



Entry: separation of concerns and exponential growth
Date: Sun Apr  1 19:15:54 EDT 2007

was thinking.. separation of concerns: hyperfactoring, whatever you
call it, is a means to move from linear -> exponential code dev..

once you can separate things into independent parts A x B, increasing
functionality in either will increase total functionality by the same
multiplication factor. if they are not separated, an increase in
complexity doesn't translate to an increase in functionality.

this is very badly explained, but i think i sort of hit a spot here.

compare the payoff of time invested in building independent/orthogonal
building blocks that can be combined, against the payoff of time of
tweaking a small part of a huge system. the added complexity
(information, code size) might be the same, but the added expressivity
(possible reachable behaviours) is hugely different. multiplication in
the first, and addition in the second.

it's the difference between adding a bit in state encoding
(exponential), and adding a state (linear).


Entry: the inner loop
Date: Sun Apr  1 19:26:11 EDT 2007


how to encode the innermost loop? for example start with

(+ (a (I 0) (I 1)) (a (I 0) (I 0)))

with the inner loop being the last index (arbitrary choice). the main
question to answer is: "relative or absolute addressing?"

either one uses explicit pointer arithmetic, or one uses index
registers. for the outer loops, increments occur infrequently, so
it's best to use pointers. a -> pa

(+ (pa (I 1)) 
   (pa (I 0)))
 
so, the number of registers used for addressing in the inner loop is
equal to the number of grids (including the output one), and one loop
index. if addressing modes like BASE+REL+OFFSET are not available,
extra pointers or indices are needed.

i seem to remember that incrementing pointers using the ALU is bad on
intel, and it's better done using AGU..

i guess there's a lot of room for doing this right or wrong depending
on the architecture. and i swore never to write intel assembly again :)
if C is the target language, i guess some experimentation is in
order. for simple processors, it seems quite straightforward how to
subdivide things so maximum throughput can be attained.

i guess the next target is to generate actual code. that should iron
out the conceptual problems..



Entry: inner loop cont
Date: Tue Apr  3 09:47:24 EDT 2007

the problem is, the indentation shown by 'print-range' is not the same
as the indentation for the C code loop blocks. setup code needs to be
moved out of the loops. going from inner -> outer:


   (+ (grid a (I 0) (I 0)) (grid b (I 0) (I 0)))

needs to be translated to


   (update a 0 (I 0))
   (update b 0 (I 0))

   (+ (grid a (I 0) 0) (grid b (I 0) 0))

   (downate ...)

effectively updating the pointers before the loop is entered. i was
thinking about just shadowing a single variable 'i'

in that case, what is necessary is to make sure each expression
referencing I has only one occurrence (or an occurrence in the same
position).


instead of constructing an intermediate range representation, it might
be more valuable to generate the loop structure directly, following
the same approach as before.


    (a (0 1 2))

->
    (a (L 0) (1 2))
    (a (I 0) (1 2))
    (a (H 0) (1 2))

->

    (let ((a (L a 0))) (a 1 2))
    (let ((a (I a 0))) (a 1 2))
    (let ((a (H a 0))) (a 1 2))

so, basically just specializing variable names. this boils down to
computing pointers.


so, to summarize, the downward motion is:

(expr (+ (a 1) (a 0)))

->

(bind ((a_p1 (S a 1))
       (a_p0 (S a 0)))
      (expr (+ (a_p1) (a_p0))))
...


     
ok, i think i got somewhere:

> (p '(+ (a 0 0) (+ (a 1 0) (a 1 1))))
{
	int i;
	for (i = 0; i < (400 * 300); i += 300)
	{
		float* a_p1 = a + (i + (1 * 300));
		float* a_p0 = a + (i + (0 * 300));
		float* x_p0 = x + (i + (0 * 300));
		{
			int j;
			for (j = 0; j < 300; j += 1)
			{
				float* a_p1_p1 = a_p1 + (j + 1);
				float* a_p1_p0 = a_p1 + (j + 0);
				float* a_p0_p0 = a_p0 + (j + 0);
				float* x_p0_p0 = x_p0 + (j + 0);
				*(x_p0_p0) = (*(a_p0_p0) + (*(a_p1_p0) + *(a_p1_p1)));
			}
		}
	}
}

now, there are quite some possible optimizations or simplifications.
one is to leave the inner level as indexed pointers. another is to
replace stride multiplication with addition.



Entry: scheme syntax
Date: Tue Apr  3 22:35:32 EDT 2007

today i (re)discovered curried define:

      (define ((x) a b) (+ a b))

which is shorthand for (define (x) (lambda (a b) (+ a b))). i was
surprised that it also works for

      (define (((x)) a b) (+ a b))

first saw it used in SICM




Entry: accumulation / values
Date: Tue Apr  3 23:29:08 EDT 2007

i need an abstraction for (linear) accumulation. no need to mess with
monads. the pattern i'm finding is:

* substitute expression in tree + accumulate a set

i want a function that returns 2 values, the original expression and
the accumulated set.

note that use of assignment like this isn't so bad, because it's
encapsulated (linear): there are no references to the object until
it's ready. also note (again) that using monads requires polymorphic
versions of generic list processing operations, and is overkill.

the 'lifting' techniques used in the compiler do need monads, because
they are open: each operation modifies a state, and intermediates are
accessible, so pure functional programming is a good idea to keep
backtracking/undo tractable.
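the two-value pattern described above, sketched in python (names are
made up): substitute in a tree while accumulating a set, returning
both. the mutation is local to the call, so it stays functional from
the outside.

```python
def subst_and_collect(tree, env):
    """Replace symbols found in env, and collect the ones that stay
    free. Returns two values: the new tree and the accumulated set."""
    free = set()                  # local, linear accumulator
    def walk(t):
        if isinstance(t, list):
            return [walk(x) for x in t]
        if t in env:
            return env[t]
        free.add(t)
        return t
    return walk(tree), free

tree = ['+', 'x', ['*', 'y', 'x']]
new, free = subst_and_collect(tree, {'y': 3})
print(new)   # ['+', 'x', ['*', 3, 'x']]
print(free)  # {'+', '*', 'x'}
```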



Entry: aspect oriented programming
Date: Wed Apr  4 08:56:29 EDT 2007


1972: Parnas "On Decomposing Systems"
1976: Dijkstra introduces term "Separation of Concerns"
1982: Brian Smith introduces "Reflection"
1991: Metaobject Protocols
1992: Open Implementations
1993: Mini Open Compiler
1997: First paper on AOP
1997: D
2001: AspectJ
2004: JBoss

http://www.cs.indiana.edu/dfried_celebration.html
Anurag Mendhekar: Aspect-oriented programming in the real world



Entry: back to sheepsint
Date: Wed Apr  4 12:49:40 EDT 2007

i need to restart the board design soon, but i do need a fully
functional dev env before i can do that. some more things are
necessary:

proper stateless message interface (CAT = object) for sending code and
performing command completions.


Entry: summary
Date: Wed Apr  4 19:21:27 EDT 2007

THINKING ABOUT PF 

been looking into bootstrapping PF from a lowlevel forth core. aspects:
polymorphy and types (clos), linear memory management (lazy copy),
transition from vector -> list.

the latter is interesting since it contains 2 parts: code needs a new
interpreter; data needs a lot of new primitives, maybe combined with
type checking. i wonder whether it's easier to just start from a cons
cell VM directly.


C CODE GENERATION

* separate statements and expressions
* plugin expression transformers


POKE

* using a non-blocked version of C gen


LOOP CODE GENERATION

i think i have the general idea:

* c code generation working
* functional specification mapped to assignment
* nested loops: blocks to bind locally cached index pointers
* additive index arithmetic
* inner loop uses a single index

the scheme code looks simple, and well factored. gut feeling says the
code is simplified enough for gcc's optimizer to tackle it.

i still need to do the border conditions. this will need to be example
driven. next month i might try to plug in some code.




Entry: from forth to PF
Date: Wed Apr  4 19:37:52 EDT 2007


1. data

a PF primitive written in forth looks like:

- (force) collect arguments (list -> vector)
- method lookup
- perform primitive forth code
- (lazy) push arguments (vector -> list)

so the stack is implemented like:

[ list | vector ]

the vector actually needs to be a circular buffer, because it behaves
as a deque: traffic between list and vector is on bottom end, while
primitives operate on top end, unless the primitives accept their
arguments reversed.


2. code

fairly straightforward.


because of the difficult impedance match between list and vector
machines, i think it makes sense to forget about building one on top
of the other, and write only the vm.

an interesting question is whether this can be abstracted. and also,
can i write the VM in itself?

been tinkering a bit with poke.ss and mole.ss
got the basic permutation worked out.




Entry: alan kay name dropping
Date: Wed Apr  4 19:59:37 EDT 2007

from "Proposal to NSF - Granted on August 31st 2006 - Steps Toward The
Reinvention of Programming"

i'm curious about the albert thing. what i read i don't understand
though.. better next time.

motivation and inspiration:
John McCarthy  LISP
... bootstrapping



Entry: persistence & late binding
Date: Thu Apr  5 16:11:14 EDT 2007

so, elaborating further on that article.. i ran into the problem of
saving parsed code, because semantics is stored as a procedure. what
about replacing this by a symbol?

assuming data will only be read by a system that has the bindings in
place, this is a valid approach. then bootstrapping can be solved
differently, and all internal representation is just cache.

so..

a word = code object
* a source representation
* a symbolic semantics (other word?)

* a cached transformer procedure (concrete semantics)
* a cached meaning = lambda expression

the cycles in this representation need to be broken somehow.

hmm.. this is actually a lot harder than it sounds, since the cache
really needs to be a cache. probably needs a from-scratch approach.

ok. started the 'symcat' project. for the current project i think i
can live with non-savable parse trees, since it's always possible to
save source code, and i have a working 'reload core' command for use
during compiler development. all in all, the system i'm writing is
fairly straightforward.

so no more about this really cool idea here. see symcat.



Entry: name spaces
Date: Thu Apr  5 16:59:24 EDT 2007

something that's getting on my nerves a bit are CAT namespaces. small
special purpose apps can benefit from the simplicity of a single
namespace and short names, but for CAT i'm not so sure any more. also,
i'd like to catch undefined names early on.



Entry: standalone
Date: Thu Apr  5 17:06:15 EDT 2007

time for the standalone forth. one of the things i've been wanting to
try for a while, but never got to.. i should have a look at flashforth
and also retroforth for inspiration. roadmap:

* 'accept' terminal input into buffer
* 'parse' words
* 'find' a word in the dictionary

compilation is straightforward, but requires some thinking since stuff
will need to go to ram first. (it's multipass, i.e. if .. then).




Entry: reflection
Date: Thu Apr  5 19:37:03 EDT 2007

the ideas of reflection and metacircularity probably go hand in
hand.. in CAT i'm getting a bit annoyed by having to choose between
implementing something as a scheme function, or as a cat function. for
example: semantics is implemented as a scheme function, so it's
technically not accessible from CAT.

let's re-iterate. the point of CAT...

usually, a forth compiler is written in forth. a cross compiler poses
problems in this sense, since the normal 'local feedback loop' doesn't
work. the (re)constructed rationale:

1. forth is extremely modular: a function is a composition of functions
2. a forth compiler is most naturally expressed in the same way: a forth
   compiler is a composition of compilers (macros).
3. most naturally, forth is implemented metacircularly.
4. i can't do that because the target is too simple -> simulated
5. the metalanguage best reflects the same structure: compositional
6. choosing for a functional language (CAT) -> monadic composition
7. CAT is written in scheme to avoid its own bootstrapping problem


the last one actually reads as: CAT is an impedance map from scheme to
a compositional language, to more easily implement an extensible
optimizing forth compiler. if CAT were metacircular, there would be no
need for scheme. this approach is not used because:

- (plt) scheme is packed with features
- i use a fair amount of scheme to provide primitives. in fact
  'primitives' is not really a good word for it..

so it's best to see CAT as scheme in disguise, and as a vehicle for a
decentralized compiler/interpreter, bound together by monadic
composition. the possibility of writing new CAT words exists mainly
as an extension property (for writing the compiler), not for the CAT
core.



Entry: nested scope
Date: Thu Apr  5 19:55:49 EDT 2007

as i've learned, these features are really necessary to write a
compiler:

* lexical variables
* quasiquotation
* pattern matching

however, they mostly serve to adapt to a representation that is
externally imposed, i.e. assembly language syntax. anything that is
non-compositional is better handled with something like scheme.

however, if you can design everything from scratch, it's probably
quite doable to get by with a couple of combinators and aggressive
factoring.

but, in the end, some form of lexical scope should be possible, if
only for the practical problem of name clashes.. there is only one
question. are names functions or values?

in lisp, they are values, because functions are explicitly invoked: if
a variable is in the head of a list, it's a function. in a
compositional language it would involve something like 'i'.


((a b c) locals
  ... a i ... b i ...)

treating things like values makes them more natural. an abstraction
could be added to do the other (bind as program). then, how to handle
the environments?



NOTE: got lexical variables and quasiquotation working in symcat, but
only by a more direct cat->scheme translation. i don't think it's
really necessary here, since i do most in scheme. also, some name
space issues are still not resolved. maybe i can switch for the next
rewrite tho :)





Entry: back in the solder lab
Date: Mon Apr 23 17:10:12 CEST 2007

things i need to get working before the end of the week:

* sheepsint input switches
* room for xtal on pcb
* capacitors on pots

random hacking:

* 3.3V serial interface
* usb?



Entry: emacs integration
Date: Mon Apr 23 20:03:23 CEST 2007

this screams for a 'once and for all' solution. i'd like to keep brood
portable, so using unix sockets for a console, as is done for pf, is
not the way to go. since we're running a lisp in a lisp editor, it's
probably best to keep the one 'default' interface on stdin/stdout as a
lisp channel, and run the console logic in emacs.

maybe a bit in the style of slime?

ok.. following slime to ielm.el, modified to connect it to a running
scheme process. slime is too big for me to make sense of, i might
return later for some features, but i need to get something running
first.

what i need is multiple languages on the same console, or maybe
different buffers?

the whole idea is to have most of the parsing in emacs, so emacs can
make the editing a bit smarter. maybe i should have a look at:




Entry: erepl
Date: Wed Apr 25 14:18:32 CEST 2007

looks like it's working reasonably well.. things to add:

* tab completion
* multiple languages

either parse in emacs, or send out raw lines. the former gives better
line editing (it already does that really..); the latter is better
because i don't need to rewrite anything, though forth parsers are
really simple and i'm not using a tremendous amount of special plt
read syntax. i wonder if emacs read syntax is extensible?

anyways, what i do need is a way to switch the mode in emacs, and not
in the target scheme image.



Entry: fresh install
Date: Thu Apr 26 09:16:42 EDT 2007

i tried a fresh install, but apparently my compile script tries to
compile stuff in the plt dist, starting with the deps of "match.ss".

"sudo ./go" should work..

so, how to install? should i keep all the source files as 'writable'?
should i keep it in dev land for a while? maybe best.



Entry: project directory
Date: Sat Apr 28 19:24:10 CEST 2007

i need to solve the following problems:
- core should be installed system-wide
- project directory should contain multiple projects

the idea is that 'clicking' on a state file should bring up
everything.

let's try to make sense of this: the brood system is aimed at
developers. in that sense, it is encouraged to hack the system, which
means the scheme files should not be stored system-wide, and they
should be writable. this allows the compilation cache to remain as it
is.

the source dir has a subdir called 'prj' which contains
subdirectories, one for each project. these individual subdirectories
could be managed using darcs.

it's absolutely essential to find a way to have the TARGET determine
which project to load. in order to do this, we use the reply of 'ping'
as the name of the project.

there is one default project for each architecture, which serves as an
example.

-> compilation from scheme: right now i invoke mzc, it's probably
   better to do so from a scheme script.


all this seems to work. next problems:

* windows / osx : emacs + serial port config

* using snot : rewrite all language repls to standard interface : one
  line (string) at a time, require from snot for it to be 1 or more
  valid s-expressions.

for the last one, i think i found it: just have 'prompt' display the
prompt and accept the next line input, this can be done using a simple
coroutine/continuation trick.


Entry: getting to working usb
Date: Sun May  6 13:04:44 CEST 2007


roadmap:

* constants as forth file
* platform dependent constants
* 2550 init
* get serial monitor working
* ...


Entry: usb debugging
Date: Mon May  7 13:36:31 CEST 2007

got the kernel messages going etc.. looked at doc/usb/asmusb.asm
(johannes adapted this from C code) to find out i need to enable full
speed instead of low speed: #0x14 -> UCFG. now i get transactions.

time for the highlevel protocol.


Entry: usb device descriptors : usb.ss
Date: Sat May 12 13:25:30 CEST 2007

looks like it's working: i can compile device descriptors from a more
reasonalble highlevel description. next step is to organize the tables
in flash.

ignorant of content, the thing it needs to do is to map

device     -> (n,addr)
(string,i) -> (n,addr)
(config,i) -> (n,addr)

the logic then needs to transfer the buffer in chunks

so i need a proper tree structure in flash. preferably one that can
handle errors so the device is a bit robust.

these things are read-only, so they can be implemented directly as
code. for example:

device/string/config  ( id -- string )

which is encoded as

: device  3 word-table 
  	  addr0 ,,
	  addr1 ,,
	  addr2 ,,

: addr0   length , 1 , 2 , 
: addr1   length , 3 , 4 , 5 ,



here 'word-table' does bounds checking + throws an exception


for error handling: it's probably easier to just use 'max' to limit
the offset, then install the last redirect as an error handler, so:

: config   3 min route
  	   config0 ;
	   config1 ;
	   config2 ;
	   error ;





Entry: conditionals < and >=
Date: Sat May 12 14:54:37 CEST 2007

in pic18-comp.ss they are implemented as macro predicates, following
the standard forth comparison operators: consume 2, leave condition.
( a b -- ? ). these can be followed by if.

i've been looking into a more general way of using the CPFS[EQ|GT|LT]
opcodes, by mapping them onto the conditional jump implementation.
been avoiding this for a while, because i have unsigned 'max' and
'min'.

the thing is 'cbra'. it consumes a condition, and compiles a
conditional branch. does this really make sense? the other
conditionals can be inverted, these cannot: only by swapping jump
targets. so:

- change 'not' to support a new pseudo op
- change 'cbra' to do this branch based swapping

looks like it's working.

an optimization is possible in case of single opcode instructions, but
it's probably better to just code them as macros. needs some thought.


Entry: usb descriptors again
Date: Sat May 12 15:53:12 CEST 2007

it's probably best to just keep using 'route' in combination with
'min' and an error handler. let's standardize a 'buffer' or a 'string'
to what i already use for the 'ping' command:

: my-flash-buffer
	string>f
	length ,
	0 , 1 , 2 , 3 , ;

this means that the word 'my-flash-buffer' sets the current flash
object (the f register). a string is a flash object which has its
length stored in the first byte. so '@f++' on a string object will
give the length, and leave f pointing to the raw bytes, so successive
'@f++' will read out the bytes.

the usb descriptors should be stored in exactly the same way: device,
configuration and string should just set the current flash object,
which is understood to be a purrr string.

so, the following output

((device
  (16 1 16 1 0 0 0 8 216 4 1 0 4 3 2 1))
 (strings
  ((23 3 68 101 102 97 117 108 116 32 67 111 110 102 105 103 117 114 97 116 105 111 110)
   (19 3 68 101 102 97 117 108 116 32 73 110 116 101 114 102 97 99 101)
   (5 3 48 46 48)
   (10 3 85 83 66 32 72 97 99 107)
   (28 3 77 105 99 114 111 99 104 105 112 32 84 101 99 104 110 111 108 111 103 121 44 32 73 110 99 46)))
 (configs
  ((9 2 25 0 1 0 0 160 50 9 4 1 0 1 3 1 1 1 7 5 128 160 8 0 0))))


can be transformed into:

: device  string>f <length> , ... ,
: string  5 min route string0 ; string1 ; ... ; string-error ;
: config  1 min route config0 ; config-error ;

: string0 string>f <length> , ... ,
: config0 ...


maybe it's easier to just eliminate the intermediate names, since
there is a notion of arbitrariness involved. they are just local
labels, as used with if ... then. all in all, just generating a couple
of <tag><number> names is probably easiest.
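the label generation can be sketched in python (the forth syntax is
simplified, and the word layout is an assumption, not the exact purrr
output; descriptors already carry their length byte first):

```python
def gen_words(tag, bufs):
    """Flatten a list of descriptor byte buffers into forth source:
    one route word with generated <tag><number> labels plus an error
    fallthrough, and one flash data word per descriptor."""
    n = len(bufs)
    labels = [f'{tag}{i}' for i in range(n)] + [f'{tag}-error']
    words = [f": {tag} {n} min route " + " ; ".join(labels) + " ;"]
    for i, buf in enumerate(bufs):
        words.append(f": {tag}{i} string>f " +
                     " , ".join(str(b) for b in buf) + " , ;")
    return words

for w in gen_words('string', [[5, 3, 48, 46, 48]]):
    print(w)
# : string 1 min route string0 ; string-error ;
# : string0 string>f 5 , 3 , 48 , 46 , 48 , ;
```

note the 'n min' clamp: any out-of-range id routes to the error word,
which is the robustness property wanted earlier.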

ok, done.

now loading. the thing to fix next is a global path for any kind of
file loading mechanism.



Entry: some weird bug with forth parsing
Date: Sun May 13 13:35:02 CEST 2007

apparently, for parsing macros (color macros) like 'load' and 'path',
there is a problem when the macro implementing the behaviour (popping
the name from the data stack) is not defined..

i don't know why.. maybe i need to make that macro parsing part a bit
more transparent.

currently parsing words are a bit of a hack. i need to get to the core
of the problem and fix it. again:

* forth macros are cat words, as such they are 1-1 semantic/syntactic

* forth parsing transfers parsing words to quoting code: something
  forth source cannot represent, but parsed cat code can.

maybe i need a symbolic intermediate form, where lists are quoted
explicitly? like PF. with a mapping like:

(load file.f) -> (('file.f load) run)

hmm.. it's probably just a bad day to make decisions.

ok. calmed down a bit. 

load-usb is working now.
next: hands on transfer.



Entry: state machine or task?
Date: Sun May 13 15:55:33 CEST 2007

a task that does usb transfers makes sense. however, since i'm still
debugging i think a more lowlevel approach is better. when i got it
running, i can write everything in blocking form.


Entry: jump bits
Date: Sun May 13 15:56:55 CEST 2007

words use relative addressing. this can lead to trouble. what about this:

* just assemble, but when an address doesn't fit, keep it symbolic.

* 3rd pass: gather all addresses, and compile words which contain a
  goto statement to the words that were called, but not reachable.

this will keep code small, and the assembler simple: no need for
variable size goto instructions inside words. the rationale is: this
forth is for lowlevel stuff. for highlevel things, use a DTC on top of
this: there you don't have a problem.
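a quick sketch of that 3rd pass in python. the reach value and the
mnemonics are made up for illustration (real pic18 relative calls
differ); the point is just: keep unreachable targets symbolic, then
emit one shared long-goto trampoline per target.

```python
# hypothetical reach of a relative call, in instruction words
RANGE = 1024

def resolve_calls(calls, trampoline_base):
    """calls: list of (site, target) address pairs.
    returns direct calls where the target is reachable, plus
    long-goto trampolines for the ones that are not."""
    patched, trampolines = [], {}
    for site, target in calls:
        if abs(target - site) < RANGE:
            patched.append(('rcall', target))
        else:
            # one shared trampoline per unreachable target
            if target not in trampolines:
                trampolines[target] = trampoline_base + len(trampolines)
            patched.append(('rcall', trampolines[target]))
    return patched, trampolines
```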


Entry: stamp dead
Date: Sun May 13 19:57:41 CEST 2007

serial port driver dead or something? i don't know. it doesn't seem to
be a software problem. the chip isn't doing anything. without a scope
it's hard to debug... so plan B

1. brood + snot (1 evening)
2. sheepsint buttons + audio out port (1 evening)

-> leuven for scope and other stuff..


Entry: stamp back
Date: Sun May 20 12:01:59 CEST 2007

something's going on here.. i tried stamp 2, which refused to work a
couple of times, until i got it going. then i replaced it with the
original 'broken' stamp, and now that one works too.

maybe it's just my breadboard.. since i did have to move 2 pins to the
left on the breadboard because the 2nd stamp's pin header is too
big.



Entry: late binding
Date: Sun May 20 12:47:48 CEST 2007

what i need next is some form of late binding to do incremental
debug. the code runs fine up to a point from which i need to make
small changes to the code. reloading there is a drag, so i need a
proper construct.

   defer broem

   2variable broem-hook
   : broem broem-hook run-hook ;


some premature optimizations: since these variables don't really need
to be accessible, it's maybe better to put them somewhere behind the
ram bank, for example shadowed by the FSR registers.. this way a hook
can be represented by a 1 byte XT.
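the 2variable hook above, sketched in python terms (the names are
mine): a deferred word is just a mutable cell holding an execution
token, so callers never need recompiling.

```python
class Deferred:
    """a deferred word: callers go through the cell, so rebinding
    the xt changes behaviour without touching any caller."""
    def __init__(self):
        self.xt = lambda: None          # default: no-op hook

    def __call__(self):
        return self.xt()

broem = Deferred()
broem()                   # runs the default no-op
broem.xt = lambda: 42     # late-bind a new implementation
```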



Entry: color macros
Date: Sun May 20 18:30:49 CEST 2007

what i mean by color macros is macros that modify the 'color' of
subsequent words. currently i have no way to implement new parsing
words in forth. this is not a good thing.. something is broken, but i
don't know what exactly. probably my understanding...

problem: parsing words use automatic name mapping. this is bad, since
it's viral. meaning, once you start doing things like that it's all
over the place: there is really no clean way to nest parsing words.

so i need a different approach: extend the partial evaluator to
include symbols. the deal is this: the PE uses the assembly buffer as
a data stack. because some words use the CAT data stack for 'data'
items, things get confusing.

so, the thing is: i need a single macro that quotes the next atom in
the input stream as a literal, and then use that.


Entry: partial evaluation revisited
Date: Sun May 20 19:31:53 CEST 2007


i ran into a pattern: the assembler buffer can be used as a data stack
to perform partial evaluation. i don't have a proper way to make this
sound, but it seems to eliminate the need for an 'interpret mode' in
the sense of classical forth. 

the interpret mode is replaced by a set of rewrite rules that will
perform compile-time evaluation. so instead of

	[ 1 2 + ]L 

we just have

        1 2 +

with the same result: 3 being compiled. actually, in the latter case
purrr will produce [movlw (1 2 +)], so the evaluation can be delayed
as long as possible.
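the "1 2 +" rule can be sketched as a rewrite on the tail of the
assembly buffer. the 'lit' tag and the tuple shape here are just for
illustration, not the real representation:

```python
def compile_add(asm):
    """compile '+': if the two most recent entries are pending
    literals, fold them at compile time; otherwise emit a real add."""
    if len(asm) >= 2 and asm[-1][0] == 'lit' and asm[-2][0] == 'lit':
        return asm[:-2] + [('lit', asm[-2][1] + asm[-1][1])]
    return asm + [('add', None)]

compile_add([('lit', 1), ('lit', 2)])   # folds to [('lit', 3)]
```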

this can be extended to the following pattern: allow target forth
values to be richer than just numbers, but require that they can be
combined into lowlevel constructs.

since i use this trick a lot, why not make it a feature instead of an
optimization? currently the postcondition of compiling a literal is
valid assembler code. what about relaxing this to a delayed literal
stack, and introducing a 2nd pass to comb out all the remaining,
non-optimized literals.

once i have this, partial evaluation becomes better defined: quoted
symbols can be included and can be used in parsing macros. the CAT
data stack can then be used for control operations only.

big change. probably requires a temporary fork.

NEXT: 'lit' macro preprocessing step

is it possible to make 'lit' a pseudo-asm operation? yes, but the
disadvantage is that it's not 1->1. is this required? yes. the asm is
1-1 sym<->bin, so this needs to be solved in the compiler.

considering the percentage of code that intersects with delaying
'lit', i guess it's best to wait until after the big deadline, and
work around the macro stuff now. as a matter of fact, i can still do
it the old way, just adding a single 'quote' operator, for example
backtick `.

that's a good idea, as long as there's a [`] too, meaning macros can
have literally quoted symbols in them. with those 2 primitives, all
parsing words can be implemented.



Entry: back to debugging -- deferred words
Date: Sun May 20 20:37:11 CEST 2007

if the idea is just to get debugging working, it's easy: execute will
do enough.



Entry: back to thinking about the literal stack..
Date: Sun May 20 23:53:26 CEST 2007

there's a juicy fruit on the tree somewhere.. but i can't see it
through the thick leaves. a literal stack is an interesting idea, and
so is commutation of some constructs with the literal stack..

i noticed that a problem atm is hardcoding of [lit a b] instructions:
the number of arguments is hardcoded. could be fixed with a postproc
step, but have to be careful there..



Entry: parsing macros
Date: Mon May 21 11:42:18 CEST 2007

forth parsing words require an input to be attached. my model does not
allow that: it requires parsing macros to live in a separate class.

hmm.. this is really kind of complicated. what about providing a
mechanism to create parsing macros as pure symbolic macros?

hmm.. ok, i got symbolic expansion macros now, but that's not the same
as recursive parsing macros!

i'm having difficulty getting my head around all this..

next step is to write a macro mode which recursively calls the parser.


ok, i think i found it now: the trick is to allow composition. the
best way to do this is probably to write the parsers as CAT words.



Entry: parsing
Date: Mon May 21 14:16:48 CEST 2007

i think i got it now. i'm just doing parsing wrong: each parser should
have an explicit 'read' and 'write' operation. then some glue can be
constructed to compose all of them.

'read' reads the next input atom, and 'write' outputs CAT code in
parsed or symbolic form.

i need to really let this go and get the usb driver working.. rewrite
stuff accumulated thus far:

- explicit literal stack with compile postprocess
- parser with recursive composition

anyway, the bigger picture becomes visible: 3 different interpreters

- compiler is kept in compositional mode: every source atom
  corresponds to a single action in CAT

- before: parser converts multiword constructs into single word constructs

- after: assembler uses localized arguments -> not compositional, just
  a sequence of independent commands
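the read/write parser idea above, sketched in python (the atom names
and the 'quote' tag are hypothetical): each parsing word maps (input,
output) to (input, output), the default rule moves one atom across,
and parsers compose as ordinary functions.

```python
def default(inp, out):
    """default rule: move the next atom from input to output."""
    return inp[1:], out + [inp[0]]

def quote(inp, out):
    """a parsing word like ': consume itself plus the next atom,
    and write quoted code instead."""
    return inp[2:], out + [('quote', inp[1])]

def parse(inp, rules):
    """drive the input through the rule table until it's consumed."""
    out = []
    while inp:
        rule = rules.get(inp[0], default)
        inp, out = rule(inp, out)
    return out
```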



Entry: grounding problems
Date: Mon May 21 16:24:16 CEST 2007

very strange: if i touch the table, the pic resets. some kind of EM
interference. i don't really know what's going on, but putting the
stamp in a cage worked: just a grounded metal top of a metal box.

if i stick the probe in the carpet, i can measure about 25 V
peak-to-peak 50Hz signal. maybe i should just ground my table?

ok, i connected the TV cable shield plugged into the cable modem to
the case of zzz. without this cable there's 114V ac across. this
seems to fix the problem: no more 50Hz on the carpet.


Entry: defer
Date: Mon May 21 17:11:41 CEST 2007

hmm... the only thing i really need is to 'overwrite' a
function. using a separate ram table for deferred words might be a
good solution if a lot of them are needed, but it sure does complicate
matters. moreover: it requires loading values to ram etc.. what i need
is really a cheap hack:

	 : someword nopf 1 2 3 ;

the 'nopf' could be overwritten, since it's #xffff. this opcode can
then refer to the next definition.



Entry: usb debugging
Date: Tue May 22 13:49:39 CEST 2007

using usbmon, i get this as first failure after the first request,
which is a device request:

d97cb540 144438646 S Ci:000:00 s 80 06 0100 0000 0040 64 <
d97cb540 145068664 C Ci:000:00 -84 0

the odd thing is the request length, which is set to 64 and not
8. status code is -84 which means
 
http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg25936.html

so why doesn't it respond at all?

maybe i need to acknowledge the TRNIF before sending a response?



Entry: the a & f registers
Date: Tue May 22 16:50:37 CEST 2007

i need a proper coding style. let's try the following: the caller is
responsible for saving the current object context. this means it's
regarded as a low level feature, and bad coding style to pass
arguments in the a and f register.

conclusion: use them only in small lowlevel words, and use functional
words or different object representations on higher levels.

CURRENT OBJECT = BAD !!



Entry: different macro implementation
Date: Tue May 22 16:57:42 CEST 2007

or better: an extension

currently 'macro' in forth only takes names from the macro dict. what
about allowing runtime behaviour here?



Entry: ram copy
Date: Wed May 23 11:17:14 CEST 2007

funny, but i don't have any ram copy facility! the reason is of course
there is only one free indirect addressing register to use. in order
to make a faster one, interrupts need to be disabled and one of the
two other regs needs to be used.

no time to think about that now, so i'm going to avoid mem->mem copy,
and save only what i need.. (SETUP request is what i'd like to copy)



Entry: something wrong with XINST
Date: Thu May 24 12:19:49 CEST 2007

this is probably the cause of a lot of my misery: somehow access bank
variables don't work right when XINST indirect addressing is
enabled. for the workshop i switched back to old inst, with access
bank.

need to figure out what's going on there later: somehow
fetching/storing address 96 doesn't work either.. if i stay low, it
works.


Entry: bouncing ball physics
Date: Thu May 24 15:22:02 CEST 2007

a bouncing ball can be made using the natural rollover from 255->0,
combined with some coordinate mapping.

A---B
|   |
D---C

->

A---B---A
|   |   |
D---C---D
|   |   |
A---B---A


so, using the high bit to signify whether a coordinate is reversed, the
operation simply becomes:

: bounce clc rot<<c
  	 c? if
	    rot>>c
	 else
            rot>>c #x7F xor
	 then ;

or even simpler: 1st 7 high? test if -1 xor then
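the simpler form, checked in python: the high bit marks the reversed
half, and complementing the byte mirrors the coordinate, so the
free-running 0..255 counter folds into a bouncing triangle.

```python
def bounce(x):
    """fold a free-running 8 bit counter into a bouncing coordinate:
    0..127 going up, then back down as the counter runs 128..255."""
    x &= 0xFF
    return x ^ 0xFF if x & 0x80 else x
```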


Entry: johannes config bits
Date: Sat May 26 18:52:37 CEST 2007

low voltage program off
HS oscillator
power up timer off



Entry: meta workshop notes
Date: Mon May 28 09:25:16 CEST 2007

all went really well after day 1 of total chaos, very happy with the
result in the end.

some remarks:

1. need a proper 'erase-all' in case the chip is messed up
2. need interaction words composition -> all symbolic
3. more docs or reference words -> find some automated mechanism
4. need simpler conditionals
5. maybe distinguish between @high? and high? -> btf are odd duck
6. investigate extended instruction set troubles
7. automate 'expose'


Entry: quoting symbols
Date: Mon May 28 09:35:39 CEST 2007

so, why not use syntax for this?

`hello : 1 2 3 ;

i think i need to preserve parsing words for the simple reason that
':' is a parsing word. changing that behaviour makes things very
different from standard forth. 

however, internally the parsing words should compile to the literal
stack.

the code above is actually quite clean. it has a symbolic
representation as CAT code, in the form of lisp's quote form. this
could be translated to forth in a minimal way. i could use this
symbolic representation as the output of the parsing stage. an
alternative lexer could then be used to make use of the more
functional forth described above (one without parsing words, only some
symbolic quote mechanism, where macros are purely concatenative).

note that since it's not legal to have a literal symbol not optimized
away, the ':' is redundant: symbols present after compilation are just
labels. maybe even better, symbols are always labels. so why not get
rid of the space?

:help 1 2 3 ;

so, if parsing macros are symbolic transformers, interactive macros
could be the same. 'test words' if you want. this could lead to a
better simulator. the first version looks better, and has `<word>
compile <word> as literal.



Entry: literal stack
Date: Mon May 28 10:01:19 CEST 2007

just a quick look at what it would take:
1. abstract all literal patterns
2. make a local change in the abstraction

so this boils down to writing a generic pattern generator for literal
opti, and a mechanism to execute arbitrary macros as a pattern. this
is already there, but a bit of a hack. maybe it should be the default?
ok. there is already 'lit' defined in comp.ss, which can be extended
to take multiple arguments.


Entry: cache
Date: Mon May 28 10:09:10 CEST 2007

an annoying thing in the current code is to have to reload everything
when an implementation of a word changes: the cache never invalidates.
or, it's not a cache.

so i need to change the implementation of 'word' to include a cache
mechanism. it would be interesting to plug into the cache mechanism of
scheme, but that would require either a lowlevel thing, or something
with namespaces.


Entry: bug fixing day
Date: Sun Jun  3 12:32:20 CEST 2007

time to clean up some minor annoyances:

* serial port settings: use 'system' + platform script when port is opened
* faster upload (faster baudrate?)
* snot integration + better emacs integration
* fix parser -> parse to symbolic code
* create interpret macros
* sheepsint: build board tests for proto



Entry: monad stuff
Date: Sun Jun  3 17:15:08 CEST 2007

one problem i have with the way i perform function lifting (monads)
is that it's not mixable: i can't just 'tag on' another monad.

maybe this should be made a little more explicit. the next thing i
need to implement is parsing macros: symbolic preprocessors to map
forth to something closer to 1-1 cat code.

last time i got lucky: i was able to use code as one of the input
streams. now that's not so easy any more: there is an input stream
which is not code.

i guess the easiest way to tackle this is to just define a prototype
CAT function for a parsing word, and work from there.

in rout -> in+ rout+

with rout a reversely accumulated list of atoms. it's like the
assembler proto, but with an extra 'in' state value.

the default parser moves an atom from in -> rout

it would be nice to be able to compose parsing macros, so they really
should be a special kind of macro: one built on top of ordinary
macros, with the input stream on top of stack, and a primitive 'read'
which takes an input object.



Entry: snotification
Date: Sun Jun  3 20:55:19 CEST 2007

-> entry point = load state + enter main loop
-> main loop = event dispatch

got it mainly working, but i'm experiencing problems with asynchronous
messages.. maybe i should get rid of the dots?



Entry: faith, evolution and programming languages
Date: Tue Jun  5 11:21:39 CEST 2007

by Philip Wadler, April 27, 2007
Google Tech Talks

http://video.google.com/url?docid=-4167170843018186532

a bit over my head, but things to look into:

- a logic corresponding to a programming language
- contracts
- haskell type classes for polymorphism

about logic & programming languages:
http://video.google.com/url?docid=-4851250372422374791



Entry: boot config
Date: Wed Jun  6 19:06:17 CEST 2007

i'm looking for a better default for the boot loader, to make sure a
project is either in one of 2 states:

virgin:   run purrr interpreter on boot, no interrupts
app:      fresh reset vector + isr installed

if i make it so that 'scratch' can safely erase the boot sector,
things might get more robust.

things that can go wrong:

- reset not defined, but isr defined
   solution: always define them in the same macro
- reset or isr defined, but target code is gone
   solution: always erase the boot block on 'scrap'



Entry: forthtv pro
Date: Tue Jun 12 04:18:08 CEST 2007

let's see. what i need is a dual processor 18f1220 system

VIDEO
- low bandwidth I2C master (pull: poll at line frequency)
- video using USART out
- audio sampled at video line frequency

HUB
- low bandwidth I2C slave (push)
- keyboard interface (bitbanged)
- USART for host serial


this is a nice excuse to prepare brood for multicore projects.

note that the 18 pin chips do not have I2C, so i need to go to 28 pin
versions.



Entry: bank select
Date: Tue Jun 19 15:44:50 CEST 2007

keeping bsr at a fixed value, the extra bit in the instructions that
access the register file can be used as an address bit. note 1x20 has
only 256 bytes of ram.



Entry: sheepsint core todo
Date: Wed Jun 20 14:52:53 CEST 2007


- standalone boot + fall into debugger
- battery operated
- brood async io
- 16 bit math for control ops
- note/exponential lookup tables
- pot denoise?
- keep it working (*)


(*) don't know if i can do that yet. what i can do is to freeze the
software: make a fork of brood. i also can't fix the boot block. but i
can fix the app block.. maybe i should go that way.


Entry: text interface
Date: Wed Jun 20 16:07:24 CEST 2007

i probably need to take a deep breath and change the monitor from
binary to text. this would make it a bit easier to standardize, and
also, make it usable without the brood system, for debug purposes..



Entry: application control flow
Date: Wed Jun 20 16:38:38 CEST 2007


1) boot
2) mainloop (contains RX check)
3) on RX, fall into interpreter

then from the interpreter, 2) can be entered. this works like a
charm. sheepsint runs fine on 2 AAA batteries too using
18LF1220.

summary:
- empty bootblock -> fall into interpreter ('warm')
- application -> install reset and isr vector (best at the same time) 

something which is important though: if there's no serial TX connected
to the pic RX pin, something needs to pull the line high. on the
CATkit board, the easiest way is to insert a jumper in between RX and
TX.


Entry: DTC
Date: Wed Jun 20 18:03:14 CEST 2007

time to do the real job: a dtc forth. what i'd like to do is to chop
the chip up in 2 pieces. first half is kernel + audio, second half is
DTC on top of that.

this is not as easy as it looks :)

but.. it might be more robust. basically i have the following choice:
A. go brood/snot and finish that interface (requires emacs + plt)
B. go binary and use just a terminal emulator

i basically promised B. which, for education and not-too-sophisticated
use, is what we need. getting A. ready to the point where i can teach
it is too much, so i have no choice really. i need a real forth!

and i need it before i can do more synth stuff.. or better, while
doing. so what is necessary?

1. terminal input with XON/XOFF
2. dictionary
3. compile link to ram
4. copy ram->rom



Entry: conditionals
Date: Wed Jun 20 21:16:41 CEST 2007

ok.. using the flag macros can be fast, but it's also really really
hard to use if a condition always needs to be a macro. so i need basic
'=' etc.. using nfdrop, and a proper if that accepts any kind of byte.

not completely tested, but asm looks ok.


Entry: mini module system
Date: Wed Jun 20 22:19:50 CEST 2007

basically, do something similar to PF: a 'provide' word will skip
loading the current file if the word already exists.


Entry: terminal.f
Date: Wed Jun 20 23:10:17 CEST 2007

thinking about this XON/XOFF thing: there is really no way around
doing this with interrupts and proper buffers. the problem is really
that the when we send an XOFF, a byte can be already in progress. in
fact, if there's no break, and the host is sending full speed, it
probably is.

so a proper interrupt/buffer scheme is necessary.

time to dig up those cool 15 byte buffers again :)
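the 15 byte buffer trick, sketched in python. i'm only showing the
standard power-of-two ring here (the pic version keeps everything in
single bytes): a 16-slot ring with masked indices holds at most 15
bytes, so full and empty stay distinguishable.

```python
class Ring:
    """16-slot ring buffer; wrap-around is a single AND, and the
    buffer holds at most SIZE-1 = 15 bytes."""
    SIZE = 16

    def __init__(self):
        self.buf = [0] * self.SIZE
        self.r = self.w = 0

    def room(self):
        return (self.r - self.w - 1) & (self.SIZE - 1)

    def used(self):
        return (self.w - self.r) & (self.SIZE - 1)

    def put(self, b):
        # caller (e.g. the rx isr) checks room() first
        self.buf[self.w] = b
        self.w = (self.w + 1) & (self.SIZE - 1)

    def get(self):
        b = self.buf[self.r]
        self.r = (self.r + 1) & (self.SIZE - 1)
        return b
```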


Entry: read/write pattern
Date: Thu Jun 21 12:02:31 CEST 2007

something which occurs a lot is an update to memory which i'd like to
put in a macro. till now i always solved this using a macro which
expects a memory address. maybe that's the only sane solution? need to
think about this..

a bit of a hack, but something that might be interesting: have a
'lastref' macro which compiles a ref to the last referred variable.



Entry: workshop
Date: Thu Jun 21 12:41:26 CEST 2007

this serial terminal thing is not going very fast.. maybe i should
focus on finishing the 16 bit words first, then build a tethered DTC
on top of that?

maybe indeed best not to stress too much. it is working. i just need
to add some control to the synth.



Entry: multiplication
Date: Thu Jun 21 21:38:45 CEST 2007

the first thing to do is to create a generic unsigned multiplication,
and derive the other muls from that.

let's call 'z' a 8 bit shift (256)

we need to compute (x0 + x1 z) (y0 + y1 z)
all coefficients are 0 - 255

this gives

z^0   x0 y0
z^1   x1 y0
      x0 y1
z^2   x1 y1

the lowest of the 4 result bytes is unaffected by the 3 partial
products below it in the table; the second byte only by the x1 y1 term
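the partial-product table can be checked with a python sketch of the
same accumulation:

```python
def mul16(x, y):
    """16x16 unsigned multiply from four 8x8 partial products,
    accumulated at byte offsets z^0, z^1, z^1, z^2."""
    x0, x1 = x & 0xFF, x >> 8
    y0, y1 = y & 0xFF, y >> 8
    acc  = x0 * y0            # z^0
    acc += (x1 * y0) << 8     # z^1
    acc += (x0 * y1) << 8     # z^1
    acc += (x1 * y1) << 16    # z^2
    return acc
```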

so, i'd like to do this
- fast
- functional

so no temp variables

the variables are presented as

  x0 x1 y0 y1

every number is used twice

now the juggling

done: i gave up on not using ram. it's probably possible to just use
the stacks, but it's really inconvenient due to the 'convolutive'
nature of multiplication. what i mean is: multiplication has
all-to-all data dependencies, and is not easily serialized. if it is
serialized, it needs random access (variable names) or at least
relative indexing. forth is not good at that.



Entry: refactoring
Date: Fri Jun 22 13:18:53 CEST 2007

some things that need to change in brood to make it easier to
understand and modify:

- words need to be cached, not delayed evaled, so incremental loads are possible
- parser macros -> purely symbolic, using only a 'quote' word for some 'pure forth'
- partial evaluator needs to be properly defined, so more elaborate operations are possible. i.e. explicit literal stack + commutation of operations with literals.

so, in short: CACHE, PUREFORTH intermediate (without parsing words),
and explicit PARTIAL EVALUATOR.

the PE needs to work together with the PUREFORTH, to be able to have
symbols as "ghost values".



Entry: forth vs DSP
Date: Fri Jun 22 13:27:09 CEST 2007

following the remark above about multiplication. most DSP stuff is
like that, so i wonder if it makes much sense to write a forth for the
dsPIC. anyways, it shouldn't be too hard once i clean up the compiler
code a bit.


Entry: sheepsint next
Date: Fri Jun 22 16:34:41 CEST 2007

ok, DTC and multiplier are working. time to get busy :)

maybe i do need to think a bit about the memory model though. might be
interesting to have full device control.


Entry: memory model
Date: Fri Jun 22 17:40:02 CEST 2007

what about simply:

* kernel is overlayed with RAM + EEPROM
* all the rest is flash

note that only the first 32kb can contain VM code, due to VM using 2
bits. the other 32kb is addressable, but only usable for tables etc..
not important now for PICs i use.

so ram max address space is #x1000; for data eeprom i'm not sure there
is a limit, but we have only #x100 and it's not used. what if we
map the flash to the upper 32 kb, and ram from the start? then eeprom
could be added later.


Entry: vm macros
Date: Sat Jun 23 22:47:10 CEST 2007

basically, i need control words. so i need a mechanism for vm macros.
ok, in place. next is to just write macros, and to add a mechanism for
loading.

actually, this is kind of interesting, since it requires 'control
stack' operations. to re-iterate, i have these kinds of macros:


- peephole optimizers (asm buffer + used as literal stack)
- control operations (use data stack as control stack)
- recursive macros
- simple incremental macros (writer monad)
- whole-state assembler macros (i.e. global optimization)


if i make the stacks a bit more obvious: literal and control stack
need to be independent. control stack is sort of a literal return
stack.

so i just need to write accessors that bridge literal stack (asm
buffer) and control stack (data stack).

the more general thing that interests me is to make more functionality
available to the forth level, so more powerful macros can be written
straight in forth, without having to resort to tricks. in short: i
need a meta-forth, not a meta-cat, so cat can be tucked away as an
implementation/intermediate language.


Entry: compilation stack and word names
Date: Mon Jun 25 11:40:58 CEST 2007

i find some standard forth words a bit confusing. it's probably
easier to start calling the compilation stack 'c' and be explicit
about the traffic.

there are only 2 label operations: localsym>c (generates a new label)
and label>c (compiles a label reference for the assembler).

in ordinary forth, labels can be patched, effectively implementing a
dual-pass assembly. since we're not using mutation, we just generate a
label at the first occurrence (instead of reserving an empty cell and
pushing its address) and attach it to opcodes as required. these
symbols will be bound by the multipass assembler later.

Entry: writer macros
Date: Mon Jun 25 11:56:23 CEST 2007

these are confusing. maybe i really shouldn't distinguish between
'writer' macros and 'asm buffer' macros. the writer thing is
clumsy and a bit hard to understand. so i'm taking it out.

+ it's simpler: i'm using some I/O style monad '>asm'

- writer macros can't be isolated any more (assumption needs to be:
  modifies the whole state, not just concatenation.)

this doesn't seem to be a big disadvantage. it's probably better to
use some kind of tag system to classify macros according to
properties. the only thing i use it for is optimization, where missing
a classification means some optimization can't be done, so it won't
cause fatal errors.



Entry: make-composer
Date: Mon Jun 25 13:24:19 CEST 2007

another thing i'm running into is my terminology about namespaces. if
i have a collection of words, i'd like to specify:

- source dictionary (semantics)
- destination dictionary (def)
- parser (syntax)

currently that's make-composer, but the names used are a bit
confusing. this can be done better. maybe i should just rename
make-composer to define/parse/find.



Entry: parser words
Date: Mon Jun 25 13:38:01 CEST 2007

this needs a thought about what to do with parsing words, mostly
quoted symbols. i guess it's safest to put them on the compilation
stack, so i don't need any literal optimizations.

Entry: todo
Date: Mon Jun 25 13:38:54 CEST 2007

- take out all writer stuff OK
- rename 
  	 asm-buffer-find       to       find-asm-buffer
	 asm-buffer-register!  to  register!-asm-buffer
	 state-parse	       to      parse-state
- fix parser macros: decide on lit/comp stack
- fix assembler evaluator


i'm not going to change the find/register!/parse names. this is just
cosmetics..

about fixing the assembler evaluator. what about requiring all literal
arguments to be cat code?



Entry: literal stack + compilation stack
Date: Mon Jun 25 14:37:20 CEST 2007

the important thing about stacks is that you need two of them, i once
read. which seems to be the case. currently i'm trying to figure out
what should go to what by default.

the idea of the 'literal stack' is simply to be able to do some
computation at compile time. a nice feature here is that a lot of
operations become more natural. for example:

	   1 2 +

is really just 3. and this is a mandatory optimization in
badnop. something you can rely on as a feature. standard forth would
make this explicit

     	  [ 1 2 + ]L

the reason i don't use the above is that my meta language is not
forth. it's CAT. more importantly, CAT is much more powerful than the
simple 8 bit forth is.

so, the idea goes:
- mandatory literal optimization (compile time evaluation)
- forth extended with 'ghost' types

the ghost types are things that make no sense for the microcontroller,
but when they are combined with other ghost types, result in things
that do make sense. the most obvious one is assembler labels:


           ' foo

will compile code that loads the (symbolic) address of foo. if this is
followed by a macro that consumes it, the whole can be reduced to code
that does have a meaning on the microcontroller.

i'm not 100% convinced this is a good idea (not being explicit), but
it does feel like one. what i'm looking for is to give it a decent
meaning. and to find out when to use the literal stack, and when to
use the compilation stack.

another thorn is the way the literal stack is implemented, but that
can be fixed later. right now i need to get the semantics right.



i'm not asking the right question..


what's the real problem here? the target chip has a clear separation
of ROM and RAM. this is both convenient (code is persistent), and not
(they need to be treated differently).

what i'd like to do is to make a source file correspond to only
ROM. standard forth doesn't do that: loading a file both writes code
and initializes data. i guess this is the main reason why things are
different for me:

 harvard: ram initialization (run-time code) and meta compilation
 (compile-time code) are strictly separate.

 von-neumann: both can be done at the same time (program load time),
 and blur together.

so what does this have to do with the literal stack?

- the meta language is not forth
- i'm trying to disguise this

basically, i'd like to not think about this thing being a
cross-compiler, and act as if everything runs on the target. one way
of doing that, is to require compile time evaluation whenever it is
possible. as a result, the simple recursive macro system, which does
not refer to the real meta language directly, becomes more powerful:
required partial evaluation gives it some run-time power, instead of
merely being passive concatenation of code.

so the real question is:

  how to simplify the target language such that no explicit reference
  to the meta language is ever necessary, and all macros have a
  compositional semantics.


the way that seems most natural to me is:

- partial evaluation is the default: act as if everything is done at
  run time (like "1 2 +"), but write the macros such that they perform
  compile time evaluation + raise an error when higher level things
  can't be resolved at compile time.

- some constructs use the COMPILATION STACK referred to as 'c'. this
  is mainly intended for code blocks, and serves a bit the role of the
  return stack.


this also gives the solution for parsing words: their default
semantics is to map something to a literal compiler. 

a common problem i encountered is a macro which has 2 references to
the same name. this is now easily solved using the compilation stack.

so the key is really in the words '>c' and 'c>'

   

Entry: vm words and literals
Date: Mon Jun 25 15:18:13 CEST 2007

so, looking at the remarks before.. the literal stack is really more
than just literals. it could contain words too. words in their normal
meaning are calls.

so: the assembler buffer is just a stack of symbols, bound to
semantics (literal, call, jump). what i really need is 2 new opcodes:
lit and word, that will be resolved in the assembler, but that can be
used in the optimizer and partial evaluator without too much trouble.

so i think i see the roadmap now:

1. fix assembler to take these opcodes:
      cw	 call word  (code)
      jw         jump word
      qw         quote word (data)

   which are really just the primitives used in the VM

2. fix the peephole optimizer to operate on those words


this will give a proper semantics to the literal stack: basically it
will then contain words + their meaning: code or data. again a simple
pattern: delay low level representation as long as possible.
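a sketch of what the extra assembly step could look like. the concrete
mnemonics and the dup-before-movlw expansion (from the 'save' rules)
are indicative only:

```python
def expand(ins):
    """expand one pseudo op into concrete instructions."""
    op, arg = ins
    if op == 'qw':                     # quote word: push a literal
        return [('dup', None), ('movlw', arg)]
    if op == 'cw':                     # call word (code)
        return [('call', arg)]
    if op == 'jw':                     # jump word
        return [('goto', arg)]
    return [ins]                       # already concrete

def compile_post(code):
    """post pass: filter the pseudo ops out of the buffer."""
    return [out for ins in code for out in expand(ins)]
```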

ok.
now i need to check first if the monitor code still runs..
it does. time to fix this.

it's probably easiest to create an extra assembly step which filters
out the pseudo ops. could be interesting to clean up the assembler a
bit.

i'm writing pic18-compile-post now, and will start using 'values' to
do the expansion. at first i thought this values thing was a bit
clumsy, but having to wrap things in a list is usually more work: it's
better to do this in the consumer using call-with-values than in the
producer, when there are a lot more producers than consumers. which is
the case here..


Entry: literals : save
Date: Mon Jun 25 16:02:53 CEST 2007


oops. too much coffee, going too fast..  i AM doing SAVE for each
literal, so the postprocess step should maybe perform the save too?
this is a bit more complicated than i thought..

so: when to do SAVE?

currently save does:

 ((['drop] save)            '())
 (([op 'POSTDEC0 0 0] save) `([,op INDF0 1 0]))
 ((save)                    '([dup]))
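
the same three rules, transliterated to python to make the dispatch
explicit (the real thing is a scat pattern-matching macro; 'movf'
below is just an example opcode):

```python
# illustrative python rendering of the three 'save' rewrite rules:
# drop + save cancel; a write to POSTDEC0 gets retargeted to INDF0;
# otherwise save just compiles a dup.

def save(asm):
    if asm and asm[-1] == ['drop']:
        return asm[:-1]                              # drop + save cancel
    if asm and len(asm[-1]) == 4 and asm[-1][1:] == ['POSTDEC0', 0, 0]:
        op = asm[-1][0]
        return asm[:-1] + [[op, 'INDF0', 1, 0]]      # retarget the write
    return asm + [['dup']]                           # default: dup

print(save([['drop']]))                      # []
print(save([]))                              # [['dup']]
print(save([['movf', 'POSTDEC0', 0, 0]]))    # [['movf', 'INDF0', 1, 0]]
```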


what about just a second compiler pass with the word 'save' ?

that 2nd pass seems to work. problem now is a lot of the literal
macros do need their arguments. am i going to try to fix that now?
maybe think a bit about how to do that in a smart way..

ok. roadmap:

- just added c> == '(qw) op>asm
- replace all lit macros by a qw macro, and remove them from expose hack

done, now for the calls

TODO:
- replace branches and calls with pseudo ops
- fix vm control ops
- start working on the control part of synth


Entry: multiple passes: pseudo assembly language
Date: Wed Jun 27 10:41:04 CEST 2007

so the pseudo assembly language is a bit more explicit now. between
forth and real assembler there is a representation where the opcodes
qw, cw and jw are used. they give a proper typed stack meaning to the
assembly buffer.

as a consequence, quoting in macros can be eliminated, and pure
postfix notation can be used, using the 'word>c' operation, which
takes a code word from the assembly buffer and moves the tag to the
compilation stack. in short, instead of

	    	   ' word <...>

one can do

                   word word>c <...>		                   

where <...> handles the symbol/address.



now wait.. if quote is no longer necessary in the compile time
semantics, why use it? (it is still necessary in the run time
code. back to that later.)

the whole idea seems to be: because all behaviour is postponed
(compilation), it doesn't have to be stopped before it
happens. meaning: if i enter 'broem' at a command prompt, it will
execute -> damage done. if i want to not run it, i need to 'quote' it,
which means postpone execution. during compilation, everything is
postponed, so no need for quotation! wonder if i can make that
a bit more formal.

this does ring a bell somewhere. been reading about the macro/module
system in PLT scheme. something with keeping run time and compile time
separated to make dependencies explicit.. anyways.

oops. not completely true: if it's a macro, and you want to refer to
it, it needs to be quoted.

an 'almost right' thing here.. if there are no macros, it's
right. macros are code, the rest is data during compilation. if
there's no code, true. back to my original point: quoting is
postponing execution. maybe i should just try it to see if i get into
situations that are awkward, because it does look promising.


Entry: vm compilation: one word to change semantics of parsed code?
Date: Wed Jun 27 11:25:18 CEST 2007

since the compilation buffer already contains the code/data
(quote/call) distinction, only a single word is necessary to convert
any operation to its vm equivalent. this word should leave macros
alone.

problem here is that i need a type (pattern) matching word, so not
yet..

simpler: i'd like to remove the quote in 'vm->native/compile' and in
the vm-core.f file, so i can easily compose macros. quote is really a
preprocessing thing, which is necessary to get from source -> forced
data semantics. once parsed to intermediate, no quote is necessary.

argh.. so i don't really need to remove the quote there, since it's
exactly that: a preprocessing step to generate native forth code. it's
ok this includes a quote operation. so '_literal' and '_compile' take
data atoms on the literal stack, which means quoting is necessary for
code atoms. this allows VM semantics (decision for code/data) to be
different than lower level language, which is a good thing.

check vm-core.f for some explanation. summary:
"purely compositional macros == good thing".

it's the basic idea of CAT. see the notes below.

question is though: can i make these macros powerful enough to have
some kind of lambda construct? postponed macros basically? the only
thing i want to solve now is conditionals, but better to aim for the
bigger thing.



Entry: language path
Date: Wed Jun 27 11:41:28 CEST 2007


FORTH with parsing words = symbolic ---> 
FORTH with only quote = symbolic ---> 
pure FORTH without quote, = compositional CAT code -->
intermediate assembler = effected macros (real asm) with pseudo asm (qw, cw, jw) -->
symbolic machine assembler -->
binary

i should give these a name
PURRR18/forth      (quote + parsing words)
---> PURRR18/quote (pure + quoting word)
---> PURRR18/pure  (purely compositional macro language, as CAT code)
---> PURRR18/asm   (PIC18/asm augmented with pseudo ops)
---> PIC18/asm     (my version of the symbolic assembler language)
---> PIC18/bin     (binary machine code)


Entry: i want it all
Date: Wed Jun 27 12:14:37 CEST 2007

what about postponing macros? i basically want conditional branching at
compile time, but full lambda (quoted macros) would be interesting too.

now what did i expect? forth is not CAT. this is a game of syntax, in
the end.. i'm trying to cram a meta language into the language syntax,
without using its quoting mechanism: lists. it looks like i can't make
it too powerful without introducing quoting syntax, which is what i'm
trying to avoid to keep it simple.

the problem which sprouted this line of thought is the VM return
operation. the words "_then _;" don't work because ";" expects a
word. so i'm going to need an extra primitive to solve this
conditional execution.

maybe there is only one real solution. make the ' operation a
syntactic one, like in lisp. 

- if quote is syntax, an intermediate language is not necessary.
- if it's not, a parsing stream needs to be available

the last one is obviously worse, since it makes composition harder. so
that's what it will be: quote needs to be syntax, and ' is a special
character.

so, pure forth in s-expressions is

<program> ::= ( {<atom>} )
<pure>    ::= <number> | <word>
<atom>    ::= <pure> | ( quote <pure> )
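
a quick way to check the grammar is to transliterate it to python
predicates (illustrative only; tuples stand in for s-expressions):

```python
# recognizer for the <atom> grammar above: numbers and words are pure
# atoms; (quote <pure>) wraps a pure atom; a program is a list of atoms.

def pure_p(a):
    return isinstance(a, (int, str)) and a != 'quote'

def atom_p(a):
    if pure_p(a):
        return True
    return (isinstance(a, tuple) and len(a) == 2
            and a[0] == 'quote' and pure_p(a[1]))

def program_p(p):
    return isinstance(p, tuple) and all(atom_p(a) for a in p)

print(program_p((1, 2, '+')))                   # plain pure forth
print(program_p((('quote', 'abc'), 'make-:')))  # with a quoted word
```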


to preserve previous syntax, the run-time semantics of "' word" still is
"load address of word on parameter stack"

so, summarize again:

is quote a lexing operation, or a syntactic operation? the answer
seems to be the former. the problem this solves is this: syntactically,
code and data are distinct. the full domain is split in 2 parts, but
semantically, code is a subset of data.

introducing quoting at the lexing level gives:
- better mapping to CAT (using the same lexical trick)
- saner semantics: quote is defined independent of an input stream
- quoting can be used in macros, using forth syntax, keeping the compositional property

in the language path above, the 'pure' and 'pure+quote' will now be the same, so i have


Entry: updated language path
Date: Wed Jun 27 13:31:57 CEST 2007


PURRR18/forth      (quote + parsing words, symbolic form is not CAT)
---> PURRR18/pure  (purely compositional macro language, has symbolic CAT form)
---> PURRR18/asm   (PIC18/asm augmented with pseudo ops)
---> PIC18/asm     (my version of the symbolic assembler language)
---> PIC18/bin     (binary machine code)

so the entry point is there to preserve original forth syntax, i.e. ":
abc". for internal processing, this will be mapped to "'abc make-:" or
as s-expression ((quote abc) make-:).

the 'make' name i need to think about still..

reason for having ' as lexing operation, instead of parsing, is that
it eliminates one parsing layer + it maps better to CAT.

this is different than forth, but in a way that is probably hardly
noticed.


Entry: again?
Date: Wed Jun 27 16:16:34 CEST 2007

so why not just a parsing step?
i need types to do this properly


macros:  pure+quote -> pure
forth:   forth-> pure

parsing words are merely frontends for pure

the alternatives are: 

1. lexing produces a stream of symbols and numbers. then there are 2
different parsers that map this to pure forth.

2. lexing already produces quotes

the first option is really simpler, so let's keep that.



Entry: parsing
Date: Wed Jun 27 16:36:30 CEST 2007


so now i need to redo parsing. currently, it's a bit of a hack. it's
not extensible. but do i really want it to be extensible? i need a
different 'kind' of word. a parser is not a macro.. they operate on
different levels.

so let's abstract it out a bit.

2 steps need to be separated

forth -> symbolic cat
symbolic cat -> parsed cat

both are parsing operations structurally, but it's maybe best to give
them different names?

i got it, except for the quoting stuff..

now, a problem i ran into is that ' abc actually compiles a byte
address. i wonder where this will fail if i change that.



Entry: bytes or words
Date: Wed Jun 27 18:25:42 CEST 2007

some conflict here

bytes:  ' abc org	needs byte addresses
words:	"' abc"		can be used as just a symbol.

maybe quote is more important. maybe we need to have "execute" take
word addresses everywhere? that's also better for the VM.

the thing is: data is always byte addressed, while code is always word
addressed. a unified address space (bytes) would be nice, but makes
things complicated since quoting is not just quoting..

so best seems to me:

* execute takes word addresses
* monitor JSR will also take word addresses
* quoting a symbol name has default semantics to load word address on stack




Entry: cosmetics
Date: Wed Jun 27 18:40:35 CEST 2007

TODO:
- make dtc intermediate code a bit more readable
- fix prj path as mutable state (arbitrary.. maybe see it as a constant?)

last one isn't so important.. first one requires some kind of
loopback, and i think it will make things too complicated.. need to
think about it.


Entry: dtc control primitives
Date: Wed Jun 27 20:49:18 CEST 2007

i need 'run' and 'jump' prims.. time to get confused about primitives
and programs again. if i remember correctly, the lesson is to never
let primitive addresses leak into the higher level code: it's not
convenient to have to deal with 2 kinds of code words. in cat, i only
use programs (lists of primitives) never primitives directly. same
here.

just like for primitives, i need to choose for some kind of basic
representation: byte or word addresses for composite code? the only
thing i need to take care of is that continuations (return addresses)
are compatible with "run".

i'm getting confused.. i guess i just need to write if/then/else and
we'll see how to continue. it does look like there's no easy way other
than:

	LIT L0
	BRZ
	<true>
L0:	<rest>

and

	LIT L0
	ROUTE
	<true>
	LIT L1
	RUN;
L0:	<false>
L1:	<rest>


ok, so be it. can't win them all.. maybe a good opportunity to use
ifte instead of if .. then .. else.

so.. primitives. can't 'run' primitives. can run programs. so the idea
is that quoting code always quotes programs, so i need something like
PF's { and } words. for conditional branching i can use 'route' as a
basic word. cloaqued goto or something.

route \ ? program -- 
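
modeled in python, 'route' is just a conditional invocation of the
quoted program (a hypothetical vm model, not the real primitive):

```python
# route ( ? program -- ): if the flag is false, run the quoted program
# (the cloaked goto to the false branch); if true, fall through into
# the true branch.

def route(stack, run):
    prog = stack.pop()
    flag = stack.pop()
    if not flag:
        run(prog)  # jump to the false branch

trace = []
route([False, 'L0'], trace.append)  # flag false -> goto L0
route([True, 'L0'], trace.append)   # flag true -> fall through
print(trace)  # ['L0']
```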



Entry: assembler bug
Date: Thu Jun 28 00:07:21 CEST 2007


performing meta evaluation needs to happen in the 2nd pass, because of
the presence of code labels.

time to clean up the assembler, and sort out all different
meanings. the bug is simple: just retry if there's an undefined
symbol.

then another problem: literals take 14 bits, but quoted programs are
byte addresses. can we resolve this somehow? if i really need the
return stack to contain word addressess, that can still be fixed
later. now i'm going for 'run' and 'run/b'.

ok, it seems to work now.



Entry: vm optimization
Date: Thu Jun 28 09:36:41 CEST 2007

now it's time to reduce code. it's not very fast anyway, so no reason
to start spilling bytes. but this is for later. got some stuff to get
ready now.

i'm happy with how it's looking though. some minor things need fixing,
probably the most important one being return stack alignment.

something to focus on is to limit the number of macros. i probably
only need conditionals, the rest can be written in forth even. macros
are only necessary for marking jumps.




Entry: sheepsint 8 bit interface
Date: Thu Jun 28 15:24:27 CEST 2007

so. i need a synth control layer. going to use the ordinary 8 bit
forth.


Entry: loading dtc forth
Date: Thu Jun 28 16:03:59 CEST 2007

problem. the mapping from vm -> native forth is not just syntactic. it
uses knowledge about target words being macros (as native macros) or
dtc target words. this means 'load' will not work properly.

so this decision needs to be postponed. 

easiest is to load both symbols (word and semantics) on the literal
stack, and have a macro determine the semantics.

ok, seems to work.



Entry: problem with dup and literals
Date: Fri Jun 29 09:49:11 CEST 2007

123 dup 456 doesn't give 2 literals on the stack.. if i let dup copy
the literal, some other things go wrong.. maybe it's best to have dup
copy the literal, and solve the other problems in a second pass?

i found an optimization that solves it in one pass, by realizing

1 (2 3 !)   ->    <...> 1

where <...> stores the value with stack effect = 0.
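
in python terms the rewrite looks something like this (names invented;
the real version is a qw pattern macro):

```python
# rewrite [.. qw v, qw a, !] into a single store with stack effect 0,
# so an earlier pending literal (the '1' in '1 2 3 !') stays pending.

def opt_store(buf):
    if (len(buf) >= 3 and buf[-1] == ('!',)
            and buf[-2][0] == 'qw' and buf[-3][0] == 'qw'):
        v, a = buf[-3][1], buf[-2][1]
        return buf[:-3] + [('store', v, a)]
    return buf

print(opt_store([('qw', 1), ('qw', 2), ('qw', 3), ('!',)]))
# [('qw', 1), ('store', 2, 3)] : the literal 1 is still pending
```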

other places where this might go wrong is where an explicit dup is
expected.. there are none outside of the '!' i think.



Entry: sheepsint core
Date: Fri Jun 29 10:51:39 CEST 2007

things to fix:

- noise
- sample playback

then for control, i need to find ways to map parameters to meaningful
ranges. this is where multiplication and exponential table lookup come
into the picture, which might be an interesting advanced topic.

ok, there's a problem with the buffering: i don't have a fixed sample
rate any more, so computed values need to be sent out immediately: i
have no idea when the next event will output the previous state!

ok, just moved it to the end of the isr.. now there's a bit of jitter,
but probably not really noticeable.


noise still isn't working. i can't find the problem. probably needs a
fresh look. also, notes aren't working..


Entry: unified namespace and rolling back
Date: Sun Jul  1 15:47:32 CEST 2007


for target stuff.. meaning: something defined as a variable should be
able to be redefined as just a target word. or not? this is not so
easy since all meta objects are compiled into the core, and are not
really seen as data..

there is also a conflict between forth's "first find" and my meta
language's last redefined. maybe the project file should index macros
somehow? so they too can be reverted.. this would be cool for variable
names etc..


Entry: VM and TBLPTR
Date: Sun Jul  1 15:57:09 CEST 2007

maybe it's not such a good idea after all.. the deal is this: the VM
should be easy to use. anything that needs speed can simply be moved
to primitive code, completely eliminating interpretation overhead. i
put some effort into making both layers interoperable, so why not use
it?

it seems as if each 'useful' feature of the VM makes it a lot
slower. why do i care? the whole idea is to make some kind of
standard. why not write the VM on top of the memory model for
instance?


Entry: swapf
Date: Sun Jul  1 17:53:10 CEST 2007

something is wrong with the nswap macro:

ok, i found it: nothing wrong with the macro. there was an error in
the assembler binary opcode.


Entry: control slides
Date: Sun Jul  1 18:17:29 CEST 2007

linear & exponential. in-place updates? probably best to go
out-of-place. with wrap-around?


Entry: control timer
Date: Sun Jul  1 18:37:05 CEST 2007

previous sheepsint had some fixed sample->control rate timer. here i'm
using a fixed sample rate for the noise generator (bit less than 8
khz), which increments a 32bit counter once every tick. this can be
used as a general fixed time source.

ok, trying to sync to bits of the 32 bit timer, i'm using this code:

\ control at 244 Hz    
: wait-control
    begin tick0 6 high? until
    cli tick0 6 low sti ;

but the cli/sti isn't necessary: the timer increment is atomic:
there's no read-modify-write.

one problem though, if the counter is reset, higher bits will never
get set! so a better strategy would be to wait for a bit to go low,
then wait for it to go high, so the transition is captured.
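
the better strategy, sketched in python (read_bit stands in for
reading the tick bit):

```python
# wait for the bit to go low, then high: this captures an actual 0->1
# transition, instead of deadlocking on a counter that was reset
# before the bit ever rises.

def wait_transition(read_bit):
    while read_bit():       # wait for bit to go low
        pass
    while not read_bit():   # then wait for it to go high again
        pass

# simulated tick bit: high, then low, then high = one full transition
samples = iter([1, 1, 0, 0, 1])
wait_transition(lambda: next(samples))
print("synced")
```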



Entry: fix macro loading
Date: Tue Jul  3 12:07:42 CEST 2007

really annoying to have these not synced to project.. maybe include
them directly in the project file. also need caching: timestamps would
work together with mark points. a problem point is missing variable
and function name spaces. once something has been a macro, it will
remain a macro. a single dictionary stack is easier to use.



Entry: transient controller
Date: Wed Jul  4 12:43:16 CEST 2007

this is fairly simple if it only needs to save the mixer config (one
byte). saving oscillator frequency state requires 6 bytes more. what
about making the transient word itself responsible for saving current
state, and just using the x stack.

if the time base is fixed (32 bit tick timer), control words become
fairly simple. remaining question: who is responsible for syncing to
note tick? this is a question of composition: i.e. hihat + kick at the
same time requires hihat word to sync to note, not kick.

best to keep control syncing independent of note syncing.


Entry: AD conversion
Date: Wed Jul  4 13:53:39 CEST 2007

2 things to determine:

- aquisition time (sample/hold settling)
- TAD (per bit sample time)

TAD should be as short as possible, but greater than the minimum TAD,
approximately 2uS for 18F1220. the datasheet says for the F version at
8MHz, to use 16TOSC, and for the LF version to use 32TOSC.

It was on 16TOSC, 20TAD.. put it to 32TOSC, but can't see a
difference. maybe the pots are too noisy. i tried to add a capacitor.
100n and 10u, but no difference..


Entry: noise
Date: Wed Jul  4 15:49:18 CEST 2007

noise is probably more useful as one of the oscillators instead of a
fixed 3rd one, just like sampler. using the 8 bit timer only for
control time base frees up some resources, and decouples noise
frequency from control frequency.

best seems to be OSC1, keeping in mind the formant mixer. changing the
mixers: silence, xmod, formant. and having OSC1 do noise/square/sample.


Entry: bootsector
Date: Wed Jul  4 16:02:52 CEST 2007

maybe it's best to reserve some functionality for chip erase, so i
don't need to worry so much about messing up the bootsector. basically,
just need a single piece that never changes, which has the ability to
influence the booting process to run the interpreter. probably an 'ack
bang' or something?

- keep boot sector free for fast isr
- reserve 2nd block for reset vector?

seems the core of the problem is that boot vector and isr vectors are
in the same block. what if:

- default reset vector = jump to second block
- add an application vector after this
- second block contains some kind of checking code to determine
  activation of application or debugger



Entry: metaprogramming
Date: Fri Jul  6 13:03:44 CEST 2007

more things from forth. i've been using the first couple of macros
that use the compilation stack explicitly. i could probably move more
code to be accessible from the forth macro language. to have a forth
like [ and ] section would make sense.

the point where i want to stop is s-expressions: once i'm introducing
that syntax into forth, there's nothing stopping it from becoming
something completely different. one of the aims really is to keep out
s-expressions. however, it's not so hard to have some kind of 'begin
... end' construct that maps directly onto cat code.



Entry: noise as osc1
Date: Tue Jul 10 22:22:35 CEST 2007

tested. seems to work.


Entry: macros and cat
Date: Tue Jul 10 22:25:32 CEST 2007

name space mixing in macros. the ultimate goal is to have a forthish
CAT that i can just include in PURRR/18 code. currently the 'c' words,
combined with the literal stack, work pretty well. i need to think
about cleaning up the semantics a bit. there's a lot of nice things
hidden here..

one of those is: you need 2 stacks. mapping behaviour in an asymmetric
way (i.e. return stack / data stack) is arbitrary "human meaning" to
ease understanding of components so they can be composed.



Entry: nand synth
Date: Tue Jul 10 23:45:29 CEST 2007

works like this:

- 4 schmitt-trigger based oscillators, cap select (decade) + pot

- chained: 2nd AND gate input turns oscillator off

- the NOT in the chain prevents subsequent oscillators from being OFF
  at the same time

so, the first oscillator A produces a square wave. during A's ON
period, the second oscillator B produces a square wave; during A's OFF
period, the second oscillator gives ON. and so the story continues...

....AAAAAA....AAAAAA....AAAAAA....AAAAAA
BBBB.BB.BBBBBB.BB.BBBBBB.BB.BBBBBB.BB.BB
C.C.CC.CC.C.C.C.CC.C.C.CC.CC.C.C.CC.CC.C

etc

this can give a quite complicated pattern after a couple of steps. one
thing is missing though, there is no resync: all capacitors keep state
between oscillator ON/OFF switches, so no formant-like tricks.
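
a toy python simulation of the chaining (periods are made up, and this
model resyncs the phase of a disabled stage, which the analog circuit
does not do; it's only meant to show the pattern structure):

```python
# each stage free-runs while its enable input is high, and is held ON
# while it is low (the NOT in the chain).

def osc(enable, period):
    phase, out = 0, 1
    for en in enable:
        if en:
            yield out
            phase += 1
            if phase == period:
                phase, out = 0, 1 - out
        else:
            yield 1    # disabled stage is held ON
            phase = 0  # resync (unlike the capacitors in the circuit)

n = 40
a = list(osc([1] * n, 5))   # free-running square wave
b = list(osc(a, 2))         # oscillates only during A's ON period
print(''.join('A' if x else '.' for x in a))
print(''.join('B' if x else '.' for x in b))
```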


Entry: noise as sync source
Date: Wed Jul 11 12:12:56 CEST 2007

It's possible to use 'filtered pitched noise' by using the RESO mixer,
together with a noise OSC1. however, the opposite: an oscillator
resynced by noise i don't have atm. maybe OSC0 should be able to do
noise too?

OK, that's a different game



Entry: boost converter hack
Date: Mon Jul 16 19:13:12 CEST 2007

as mentioned before (probably in brood 2 ramblings.txt), it is
possible to use a protection diode as rectifier for a signal -> power
converter by connecting a signal with a large enough duty cycle
directly to an input pin, and connecting a cap across the power pins.

related, it should be possible to convert that scheme into a boost
converter, by connecting a power supply to an input pin using an
inductor, and using the pin's output stage as a switch to charge
the inductor (by connecting the point to ground).

when the pin is switched to input, the inductor discharges the energy
stored in the magnetic field, and charges the capacitor through the
protection diode.

this way, the uC can regulate its own supply voltage. this scheme
just needs an initial push to charge the capacitor such that enough
energy is stored to boot the program that starts the feedback
mechanism.



Entry: filter bank on PIC18
Date: Mon Jul 16 22:13:34 CEST 2007

so, if i want to run a digital filter on the PIC18, for, say, some
demodulation. what performance am i looking at?

running on 5V and a xtal, i can get to 10 MIPS. for audio rate
signals, say up to 5kHz, this gives 2000 instructions per
sample. that's not quite nothing.

using half of this for the filtering, and the other half for the
decoding and the actual application, we're looking at 1000
instructions of DSP to burn. looks to me there's plenty of room.

to make it sound good, tones need to be quite stable. at least 1/16th
of a second. say 6.4kHz, this is about 400 samples.
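
checking those numbers:

```python
# instruction budget per sample, and samples per minimum tone length
mips = 10_000_000        # 10 MIPS at 5V with a crystal
fs = 5_000               # audio-rate signals, up to 5 kHz
per_sample = mips // fs
print(per_sample)        # 2000 instructions per sample
print(per_sample // 2)   # 1000 left for DSP after decode + app
print(6400 // 16)        # 400 samples per 1/16 s tone at 6.4 kHz
```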



Entry: PSK31 and meshing
Date: Mon Jul 16 22:27:06 CEST 2007

i think for the waag, we need to keep the basic objective simple:
PSK31, as it is tried and true, and there is decoding/encoding software
to actually test it.


Entry: human naming nature
Date: Tue Jul 17 01:21:33 CEST 2007

One of the things that's nice about a compositional language like CAT
is that it forces you to aggressively factor, simply because programs
become too hard to understand if you don't. Factoring is really
identifying (naming) substeps. In a compositional language, factoring
is totally arbitrary, from a machine point of view at least. Not for
the programmer. Since function arguments are not named, names have to
be introduced elsewhere.

This is that extra bit of 'meaning' in a program which transforms it
from the mess a computer just executes, to some meta-executed thing
represented in a human mind.. Those are really not the same. Being
able to program something and 'knowing' how it works are different
things. The 'knowing' is hard to explain sometimes.

It's just a force of (human) nature, really..  For a program to be
actually readable, a bit more than the connectivity (topology) is
necessary: the information encoded in the names themselves seems to
help the human brain to understand the connectivity, or at least give
it some analogy.  Maybe a bit like embedding a topological thing in a
geometry to make it more 'real', programming is embedded in the real
world of thoughts by associating some natural language to it. The two
ways to do this are either the lambda calculus (lexical scope) or
combinators.



Entry: get off that lazy ass
Date: Thu Jul 19 13:49:18 CEST 2007

i think i'm not made to idle around. depresses me. people tell me i
need to try harder, give it a couple of weeks of idling to find out
the true joy of life. i don't have time for that :)

so.. next things to tackle are:

* fix boot loader so the ICD2 can stay safely in the box for really stupid mistakes.
* interaction macros
* SNOT and sending code from emacs
* the slow highlevel forth on virtual memory



Entry: the boot block
Date: Thu Jul 19 13:54:04 CEST 2007


conditions:

* BLOCK 0 = empty OR 0000 and 0006 contain jumps to BLOCK 1 (soft reset)

  this ensures that an empty boot block is valid + interrupts and
  application invocation result in a reset when they are not defined.

* during boot, a DEBUG condition is checked. this will force it to run
  the interpreter to await commands.

* if the DEBUG condition is false, the application (addr 0002) is
  executed. if there's no application, a soft reset is run. (so
  eventually the chip responds).

* installing a new application:
   - clear boot block
   - install security jumps
   - install isr code

* possible conditions:
   - a pin
   - a boot wait + serial activity
   - break condition on serial port
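
the decision logic, as python pseudo-logic (names invented; the real
check lives in pic18 code in the second flash block):

```python
# boot flow: DEBUG condition forces the interpreter; otherwise run the
# application vector at 0002; if there's no application, soft reset,
# so the chip eventually responds.

def boot(debug_condition, app_installed, run):
    if debug_condition():
        run('interpreter')    # await commands from the host
    elif app_installed():
        run('application')    # vector at 0002
    else:
        run('soft-reset')

ran = []
boot(lambda: False, lambda: False, ran.append)
boot(lambda: False, lambda: True, ran.append)
boot(lambda: True, lambda: True, ran.append)
print(ran)  # ['soft-reset', 'application', 'interpreter']
```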

--

installing the bootblock can be done in a single interaction macro:
compile an init macro, then when this succeeds wipe bootblock, and
upload a new one.

the deal is that the boot sequence up to the DEBUG check is NEVER
changed! it's not enough to have your application perform such a
test. this can go wrong in its boot sequence before the check is
executed, or even during the check. get it right once, then keep it
like it is.

another possibility is to have the serial port operate from
interrupt. that way sending a break signal could actually stop the
program. however, this is more complicated and reduces freedom for
custom isrs.

--

thinking about it, why the one at 0006 ?  ok, it prevents problems if
there's a reset vector but no application vector installed. better to
be safe.

ok, default really is empty boot block: means app is gone. whenever
APP and ISR vectors are installed the 'reset-vector' macro needs to be
included.




Entry: new stuff
Date: Mon Jul 23 13:44:31 CEST 2007

done doing goto10 admin stuff. time to make a list of things that need
a different approach.

BROOD:

* streams (don't save intermediate state)
* macro namespaces
* interaction macros
* clean up pattern matching macros
* SNOT
* clean up / document / reflect on the forth macro semantics (partial
  evaluation + parsing words)

PURRR:

* boot block updates
* highlevel forth on virtual memory


Entry: name spaces
Date: Mon Jul 23 13:54:02 CEST 2007

i guess i need a proper name space mixing for the macro system. it
should all be just scheme functions, not hashtables full of
structures.

currently i have the following name spaces: cat, state, store, meta,
asm-buffer, forth-parse, macro, badnop

so.. let's see if i actually understand the plt scheme namespaces. a
namespace is something that maps symbols to storage cells, for words
like 'eval' and 'load'.

so instead of using hash tables and explicit lookup, using namespaces
one could use 'eval'. the advantage is that run time 'eval' could be
avoided, and macros could be used where possible.

so, what do i want really.. 

* access macros using scheme names in scheme code.
* compile (eval?) a symbolic cat function straight to scheme fn
* be able to change cat macro name bindings just like scheme


questions i need answered:
* can an entire namespace be hidden in a module?
* is it possible to dynamicly add stuff to a module?
  (i guess so, using module->namespace)
* how to 'merge' namespaces?
* can i abstract the rather awkward symbol prefix merging?
* is prefix merging really awkward?


name spaces in scheme:

* once evaled/compiled, an expression is bound to a certain name space
  and independent of the current one




Entry: callout
Date: Mon Jul 23 22:30:28 CEST 2007

i need some knowledgeable people to discuss this stuff with. don't know
where to find them though. things to try:

* plt list
* comp.lang.forth
* picforth list
* gnupic list



Entry: BROOD 4 takeoff
Date: Tue Jul 24 00:00:00 CEST 2007

EDIT: this is where the ramp up to brood 4 starts, with the move from
interpreter -> macros.




Entry: really on top of scheme
Date: Tue Jul 24 19:02:23 CEST 2007

so, i need to get rid of the explicit interpreter. or not? i'm mostly
concerned with name spaces here, not implementation.

(1 2 +) ->

(lambda stack (apply cat:+ (cons 2 (cons 1 stack))))
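
the same compilation scheme in python (a sketch, not the scat
implementation): each atom becomes a closure from stack to stack, and
the program is their composition:

```python
# compile a concatenative program like (1 2 +) into a stack -> stack
# function. 'ops' maps word names to their functions.

def compile_atom(a, ops):
    if isinstance(a, int):
        return lambda s: s + [a]  # literal: push
    return ops[a]                 # word: look up its function

def compile_prog(prog, ops):
    fns = [compile_atom(a, ops) for a in prog]
    def run(stack):
        for f in fns:
            stack = f(stack)
        return stack
    return run

ops = {'+': lambda s: s[:-2] + [s[-2] + s[-1]]}
f = compile_prog([1, 2, '+'], ops)
print(f([]))  # [3]
```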

what about preserving original source form? do i actually still use
that? yes, when printing code. for example, doing (1 2 +) creates a
quoted CAT program, which when compiled doesn't have a source form.

so, how to associate original source form to lambda expressions?


i really should define my interface first. i don't need to use raw
functions as representations. the 'stuff' that's bound to names can
just as well remain a word structure. in the end, i'm doing nothing
but replacing hash tables by name spaces.

so..

* modules: separate code into logical entities
* namespaces: allow run-time eval/compile

the latter part is not really necessary for the core! so, i should
build macros first, make sure i have a direct map from:

CAT (or any monad language derivative) -> 'raw' cat -> scheme

raw cat is just cat with scheme words.


so how to do this?

- all CAT code is compiled: use modules
- how to separate name spaces: (i.e. how to prefix names?)

so.. it's seeping through. names are compile time stuff. macros are
compile time stuff. anything that juggles names should be a macro. so
(cat +) is a macro, which expands to a lambda expression, or a variable.

it's not enough to have it expand to just a lambda expression. storage
should be shared, so (cat +) should return a binding in case of a
single expression, or a composition (cat 1 +) in case of multiple
arguments.

so, what about this:

  any CAT-like languages use the (<language>: <word> <word> ...)
  syntax, where the macro <language>: (i.e. 'cat:') transforms the
  code into a function that maps stack -> stack


this way everything is directly accessible from scheme. for example
(cat: 1 +) is a lambda expression. neat. even, ':' could signify THE
cat. then 'cat-compile' is no more than (eval (cons 'cat: src))

note that i don't really need to ever run any programs. cat is just
functions, and in scheme, they can be applied to data.

the thing is, i don't need an interpreter. i just need a proper way to
associating compiled code to original source form (reflection). this
does mean giving up some reflection: the current source/semantics
association probably needs to change. it's not a small rewrite..



Entry: the macro way..
Date: Thu Jul 26 11:17:42 CEST 2007

let's start with some basics.

apparently structures can be used to implement behaviour of
procedures, using struct-type properties. this should be enough to
convert completely to macros.

i started cat-base.ss

so, here we go.. all the freedom is there again. 


* i'm starting with one modification: low level CAT source
  representation is reversed. this makes writing the macros a bit
  easier.

  this makes (a . b) be 'compose a AFTER composition b', so:

  (pn-compose  a b c)  ==  (apply a (pn-compose b c))

* 3 phases are separated:
    - compile: atom -> representation of behaviour (apply/cons)
    - compose: list of words -> nested apply/cons
    - abstract: application -> lambda expression

  compile can be recursive due to the presence of quoted programs

* reversal is introduced early on: it's too confusing to have it
  around after the nested 'apply/cons' is in place. i'm switching from
  pn- to rpn- prefix at the point of abstraction (converting code to a
  scheme lambda expression).

* snarfs can be stolen from previous implementation. maybe the code
  reversal should use a generic reverse macro too. (done)

* now all that's left is to solve the name resolution.
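to make the compile/compose/abstract split concrete, here's a sketch of
the nested apply/cons form the composer could produce (the names 'p-add'
and 'composed' are made up):

```scheme
;; hypothetical result of the three phases for (rpn: 1 2 +):
;; compile maps atoms to behaviour, compose builds nested apply/cons,
;; abstract wraps the whole thing in a lambda over the stack.
(define (p-add a b . stack) (cons (+ a b) stack))

(define composed
  (lambda (stack)
    (apply p-add (cons 2 (cons 1 stack)))))

(composed '(5))  ; -> (3 5)
```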



Entry: separating syntax from semantics
Date: Fri Jul 27 13:05:09 CEST 2007

I got the syntax working. Now i'd like to build an abstraction that
takes a binder macro, and produces a compiler macro:

cat-bind   ->   cat::

Assuming the structure of the language remains the same.

The problem is i keep running into compilation phase problems and i
don't really know why.

It's quite intriguing, this macro programming. Not quite the same as
regular lisp hey :) It's a bit like a lazy language with pattern
matching. Maybe it is a lazy language? Would be nice to read a bit about this..

Anyways, i do start to see some programming patterns. I have a problem:
i'd like to keep both semantics and syntax abstract. Currently, i
pass around 'compile', but it's too general. I'd like to specialize
only some compile behaviour, and keep the rest open. So: message passing!


That seems to work quite well.

Now, on to semantics.



Entry: macro expansion
Date: Fri Jul 27 16:36:51 CEST 2007

One problem i run into is that (cat: ....) seems to be looking for
symbols in the toplevel. I guess if i know why, i'm a big step
further in understanding this whole module/namespace stuff..

From the manual: 5.3 Modules and Macros

"... uses of the macro in other modules expand to references of the
identifier defined or imported at the macro-definition site, as
opposed to the use site."

This looks like the 'no surprises' rule, or the 'dynamic binding is
evil' rule to me.

The toplevel can still be used for dynamic binding, hence the macro
expands to (#%top . xxx::+)

So it looks like i have two options. Either i make sure the names
are available at the point where the macro body is defined, or i put
them in the toplevel explicitly.

Let's see if the former is doable.

Ok, trivial but still feels a bit weird. Maybe i'm too much accustomed
to late binding by 'load/include', which is, as far as i get it,
exactly what the module system tries to avoid.

* Circular dependencies are allowed within a module
* Not in between modules
* Undefined symbols in a module are not allowed.
* Any late binding is to be done in the toplevel (but feels dirty)


Ok, time to clean up the utility code.


Entry: control structures
Date: Fri Jul 27 18:24:03 CEST 2007

.. become a lot easier to implement:
 
(define (xxx.choose no yes condition . stack) 
	(cons (if condition yes no) stack))
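applied to a stack list with the 'no' alternative on top, then 'yes',
then the flag. the values below are only an illustration:

```scheme
;; same definition as above, repeated so this snippet is self-contained
(define (xxx.choose no yes condition . stack)
  (cons (if condition yes no) stack))

(apply xxx.choose '(10 20 #t 1 2))  ; -> (20 1 2)
(apply xxx.choose '(10 20 #f 1 2))  ; -> (10 1 2)
```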



Entry: where to store the functions?
Date: Fri Jul 27 18:34:08 CEST 2007

This remains a question. I thought it was necessary to have them in a
scheme name space. Not true. As long as they can be identified at
compile time, and mapped to storage, all is well.

Not true, and also not convenient, because i really can't find a good
way to do it except for explicitly creating an empty name space and
dumping all the references there. 

Another thing: i don't really need the extra level of indirection a
name space cell provides: it is ok to just mutate the word structure
that's permanently attached to a certain name. It already behaves as a
cell:

instead of          NAME -> CELL -> WORD
we could just have  NAME -> WORD

since every cell is a word.

So why not just dumping stuff into hash tables? If (compile function
sym expr) returns a word structure, all is well. Since my language
doesn't have anything else than words, each name simply IS a word.

Make that nested hash tables, so i have a mutable real store to go
with the functional store. Maybe i can even unify them?
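a sketch of the NAME -> WORD idea. the helper names are made up, an
assoc list stands in for the hash table, and a bare closure stands in
for the word struct:

```scheme
;; one entry per name; redefining mutates the entry in place, so a name
;; already behaves as a cell: NAME -> WORD directly, no extra indirection.
(define words '())

(define (word! name fn)
  (let ((entry (assq name words)))
    (if entry
        (set-cdr! entry fn)                        ; mutate the existing slot
        (set! words (cons (cons name fn) words))))) ; or create a new one

(define (word-ref name)
  (cdr (assq name words)))

(word! 'add (lambda (a b . stack) (cons (+ a b) stack)))
(apply (word-ref 'add) '(1 2 5))  ; -> (3 5)
```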


Entry: macros really are better
Date: Fri Jul 27 18:46:44 CEST 2007

* no VM, no custom control structures that invoke the interpreter. just 'apply'.
* functionality can still be stored in a hash table: each name refers to a fixed cell = word struct.
* hash table needs to be available at compile time



Entry: 2 stores
Date: Fri Jul 27 19:10:32 CEST 2007

Why not store the functions in the functional store? The main reason
is that the functional store is supposed to be dynamic, and the
mutable store static, never mutated, except for debug purposes. But debug
is always!

So is there a better reason?

* It's not serializable. 
* It's fully derived from source, and just a cache.

So a better division is:

- everything that's completely derived from source, and doesn't change
  during a regular, non core-dev session goes into the hash store.

- all the rest, the real state which is result of computations (like
  assembler labels) goes into the functional store.



Entry: compile time hash
Date: Tue Jul 31 20:14:25 CEST 2007

let's do this namespace thing: a hash module, used at compile time and
later run time to solve all binding problems.

something i forgot: a namespace has both runtime and compile-time
semantics. however, i need to transfer everything explicitly from
compile time to run time if i want to use a hash..

now i am really confused. does this even matter?  the hash is not
accessible at run-time, but it is possible to have it around at
compile time and just have a macro spit out some values..

the real problem is: modules can be compiled independently, and all
state accumulated over such a run needs to be saved if it is to be
used somewhere else. so what i'm trying to do will probably not work.


Entry: got snot working async
Date: Wed Aug  1 22:49:16 CEST 2007

so now it's time to do some real work. i still don't want to give up
the idea of putting cat names in modules, and using eval to compile
code at runtime. it really can't be that hard. would be a good
exercise to find out what a namespace needs next to being empty to
just compile code..



Entry: cat and #%top / lexical variables?
Date: Thu Aug  2 09:00:15 CEST 2007

what about this: redefine #%top in the cat syntax expander to go look
for the cat namespace. this should enable the use of lexical scope to
do name resolution.

i found something easier: using 'identifier-binding', names that are
not lexical can be drawn from a namespace object. this gives maximal
scheme<->cat interplay, while keeping the namespace mechanism we had
before.

so:

- compilation to lambda expressions
- top level name resolution

are now separate. at this point it looks like i'm where i was before,
only with word rep changed a bit, and lexical scope.


Entry: namespace again
Date: Thu Aug  2 10:19:01 CEST 2007

so all name resolution is a runtime thing. at runtime, a tree of hash
tables is available which contains permanent bindings to word
structures. the code expands to forms that get bound to this word
structure whenever they are executed, using 'delay' forms.

so, with this delay mechanism in place, is there a need for storing
semantics in word structures? probably not.

..

something is not right:

- can't have (apply (delay expr) body ...)
- can't insert a word structure at compile time either

i wanted to do the latter to avoid a delayed expression. the only
solution is to use a different apply.


ok, i got it now. just using delay in the macro and force in the
applicator.
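a minimal sketch of that arrangement. the dictionary, the word and the
name 'apply-force' are hypothetical:

```scheme
;; the macro wraps the dictionary lookup in a promise; the applicator
;; forces it at run time, so the binding is late but computed only once.
(define dict
  (list (cons 'dup (lambda (a . stack) (cons a (cons a stack))))))

(define (apply-force promised stack)
  (apply (force promised) stack))

;; what a reference to the word 'dup might expand to:
(define compiled-dup (delay (cdr (assq 'dup dict))))

(apply-force compiled-dup '(1 2))  ; -> (1 1 2)
```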


Entry: lot of work
Date: Thu Aug  2 18:40:57 CEST 2007

got myself in a lot of work because i'm not respecting
interfaces.. maybe fix that temporarily? it was necessary for the
control structures because they're low-level, but maybe not for the
rest of the code?

next: the 'compositions' macro parameterized by:

* source name space
* target name space
* compilation macro

maybe it's best to take a step back, and respect the interfaces.. it
looks like this is going to work, so i can just as well make the step
and replace the entire vm code.



Entry: weird macro bug
Date: Thu Aug  2 20:52:12 CEST 2007

  ;; This driver could be generalized into eager evaluation for macros.

  (define-for-syntax (process-args op stx stx-args)
    (datum->syntax-object
     stx
     (map (lambda (x)
            (if (and (list? x)
                     (eq? ': (car x)))
                (op (cdr x))
                x))
          (syntax-object->datum stx-args))))
    
  ;; This utility macro calls another macro with an argument list
  ;; reversed if it is tagged with ':'. This is necessary for PN <->
  ;; RPN conversion.
  
  (define-syntax reverse-args
    (lambda (stx)
      (syntax-case stx ()
        ((_ m . args)
         #`(m . #,(process-args reverse stx #'args))))))



The code above doesn't work.. Something about the syntax gets lost
maybe? Expanding the macro seems to do the right thing though..



Entry: base functionality working
Date: Fri Aug  3 10:34:04 CEST 2007

got cat/cat.ss as absolute minimum: anonymous and named functions.
(like lambda and define).



Entry: macro weirdness
Date: Fri Aug  3 10:38:41 CEST 2007

i'm confused again.. syntax-rules macros are like normal order
application: 

(macro arg1 arg2)

the arg1 and arg2 forms are left alone until after the expansion of
macro.

This is how it should be i guess (the only way to get non-eager
evaluation in scheme is by constructing macros). But somehow it's hard
to switch between both ways of writing code..
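a tiny example of that normal order behaviour (the macro name is made up):

```scheme
;; the argument forms are handed to the macro unexpanded and unevaluated,
;; so the macro can rearrange them before anything runs.
(define-syntax swap-args
  (syntax-rules ()
    ((_ op a b) (op b a))))

(swap-args - 1 10)  ; expands to (- 10 1) -> 9
```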

One of the things i miss is the ability to parametrize a macro with an
'anonymous macro'. Something that behaves as a transformer, but does
not have a name. More specifically:

(compositions (lambda-macro ...)    ....)

Is this possible, or am i just confused about something??


and another one:

why is it so difficult to get this working:
 (define-syntax lex/cat-compile (syntax-ns-compiler cat-ref (cat)))

 (define-syntax syntax-ns-compiler
    (syntax-rules ()
      ((_ ref (ns ...))
       (syntax-rules (global)
         ((_ c global s e)   (apply-force (delay (ref '(ns ... s))) e))
         ((_ args (... ...)) (cat-compile args (... ...)))))))
    
  

i'm importing the module that has 'syntax-ns-compiler' as
require-for-syntax, but i get the error:

ERROR: cat/stx.ss:146:10: compile: bad syntax; function application is
not allowed, because no #%app syntax transformer is bound in:
(cat-compile lex/cat-compile dispatch 3 (pn-compose lex/cat-compile (2
1) s))


but this works :

 (define-syntax define-syntax-ns-compiler
    (syntax-rules ()
      ((_ name ref (ns ...))
       (define-syntax name
         (syntax-rules (global)
           ((_ c global s e)   (apply-force (delay (ref '(ns ... s))) e))
           ((_ args (... ...)) (cat-compile args (... ...))))))))
    
i don't get it..

Update: the answer might be that the latter is a pure rewriting macro,
and thus doesn't need any phase separation.. The former does, and the
problem is just that i don't understand the separation here..



Entry: list operations on code
Date: Fri Aug  3 13:44:12 CEST 2007

since all compiled code should have its source rep still attached,
generic list operations are possible. i'm inserting a call to 'source'
for most of them.

Now, why not have 'run' accept data? This will make the language
simpler, and representation just a matter of optimization.

So.. a consequence here is that there always is a default or base
semantics. Maybe that's better.



Entry: Conclusion
Date: Fri Aug  3 14:52:34 CEST 2007

Maybe a bit early since i don't have the old stuff ported yet, but the
main conclusions seem to be:

* name space storage can be kept abstract: it's ok to do part of the
  binding at runtime, as long as this behaviour is abstracted
  (cat/lang.ss)

* defining a new language as syntax instead of explicit interpretation
  is good, because scheme's scoping stuff carries over: it's possible
  to only replace the global name space, but to keep lexical variable
  bindings.

And, macros can be simple, if you stick to syntax-rules. The more
general syntax-case can become very confusing very fast. The most
important thing to remember for syntax-rules is that it is a DIFFERENT
language than scheme! It is normal order (breadth-first) instead of
applicative order (depth-first).

So.. time to look into CPS a bit more. There's this SRFI 53 i might
have a look at, but before that, i had a go at rev-k and rev-arg in
stx-utils.ss

seems to work..



Entry: and beyond
Date: Sun Aug  5 01:14:43 CEST 2007

So..

Maybe it is time to make a proper module based CAT language. Modules
really are a nice way to factor a design.. and i am already running
into the simplest of problems: name space clutter. A lot of temp
functions i'm using are littering the name space.


Entry: porting
Date: Sun Aug  5 16:08:52 CEST 2007

so, i started porting badnop to the new cat core. the first nontrivial
problem i run into is 'state-parse'.

Maybe i should keep 'define-symbol-table'. This needs some thought,
since the whole namespace thing changed. In effect, it's the same:
there are still hash tables with functions.

Wait.. the 'make-composer' things need to be macros now..

so, what's needed is: sourcedict, compiler, target.
currently, 'cat' is sourcedict+compiler, and 'cat!' adds destdict. i
need a better naming for this, since it's so general..


Entry: mzscheme things to look into
Date: Mon Aug  6 10:35:47 CEST 2007

* what is a 'transparent repl'
* moving more snot functionality to scheme
* snot and syntax coloring


Entry: anonymous macros
Date: Tue Aug  7 00:30:38 CEST 2007

is it at all possible to have anonymous macros? what i need is to
parametrize one macro with an implicitly defined other macro.

maybe this is not necessary: it is possible to have 'local' macros,
meaning macros defined by other macros, with names from syntax
templates. those names never clash, so it serves the purpose.

  (define-syntax compositions
    (syntax-rules ()
      ((_ (gen-def! . gen-args) . definitions)
       (begin
         (gen-def! CAT! . gen-args)
         (compositions CAT! . definitions)))
      ((_ def! (name body ...) ...)
       (begin
         (def! name body ...) ...))))


Entry: lifting
Date: Tue Aug  7 09:09:06 CEST 2007

when i want to do lifting, a decision needs to be made based on whether
a symbol is present in one namespace or not. this is a run-time
decision, since i'm using late binding. that doesn't look too difficult.

i think i have it now, overriding 'global' and 'constant' methods. the
rest should just work.

but. it's good to have a better look at monad theory and the 'lifting'
formulation to clean up my terminology a bit.

Let's see:

map	(a -> b) -> (M a -> M b)
unit	a -> M a
join	M (M a) -> M a


Setting a 'stack' as the base type t, the monad type M t will be a
stack with added state.

map is trivial and already used, however, the other two operations are
hidden somewhere else: in the words that implement the monad
dictionary. Does it make sense to make them explicit?

The thing that confuses me is that i am doing the 'lifting'
automatically, based on a name space distinction. All the functions
inside the monad dictionary actually do the mapping, joining and
returning, but in a way that's not factored into those 3 operations.
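a sketch of how the first two operations could look here, taking
M t = (state . stack) as the extended type. the representation and the
name 'lift' (for map) are assumptions:

```scheme
;; unit: plain stack -> stack with (empty) state attached
(define (unit stack) (cons '() stack))

;; map: lift a stack -> stack function to the extended type,
;; threading the state through untouched.
(define (lift fn)
  (lambda (m) (cons (car m) (fn (cdr m)))))

(define (add-top s) (cons (+ (car s) (cadr s)) (cddr s)))

((lift add-top) (unit '(1 2 5)))  ; -> (() 3 5)
```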



Entry: state lifting works
Date: Tue Aug  7 11:54:47 CEST 2007

now i need to think about some proper abstraction names, so the
'compositions' declarations look nice and readable.

maybe it's best to standardize on the following syntax:

(compositions
	(syntax (dst ...)
        	(src ...) ...)

  def ...)


* 'syntax' refers to the macro used to compile the body of the
   code. this is actually a compiler which needs source semantics.

* '(src ...) ...' are the namespaces representing the source semantics
  used by the compiler.

* '(dst ...)' is the namespace used to store the resulting code
  object.



Entry: program quoting and lifting
Date: Tue Aug  7 13:29:37 CEST 2007

i ran into this before.. in lifted code, how do i quote programs?
because of automatic lifting, the only sane way is to default to
non-lifted cat semantics. so i need to fix it up a bit..

looks like it's fixed now.


Entry: things that need fixing
Date: Wed Aug  8 12:10:32 CEST 2007


Probably the parser in forth.ss needs to be rewritten.. maybe as
macros? The thing that needs to change is that the parser always
returns symbolic cat code. No tricks with inserting internal
representations.

another thing i need to fix is default semantics: what to do if a
symbol is not found? maybe using a parameter?

done..

so the parser macros. if it's entirely built on top of the ordinary
cat macros, i could disentangle them and get them to work first, then
rewrite the parser macro preprocessor.

so let's start top-down.


Entry: macro.ss and literal + compile
Date: Wed Aug  8 21:09:02 CEST 2007

now i get it:
they need to be in (asm-buffer) and c> and c>word in (macro) need to
refer to them. that way 'macro-prim:' can be used together with
lexical binding.

ok, that works.

actually, it's quite cute this way. lexical scope to mix scheme and
cat code is nice..

this makes me think: if i implement the preprocessing macros also as
lexical extensions, that property remains. maybe that's overkill?
maybe the current code is ok, as long as i make it fully symbolic?



Entry: hygiene and the rewrite-patterns macro
Date: Thu Aug  9 11:35:50 CEST 2007

It's fairly complicated, but the name bindings introduced are only:

  make-word-compiled
  lift-macro-executable
  lift-transform

what if i factor it into 2 parts:

  a nonhygienic part that creates just the match clauses
  a hygienic part that binds the function and macro names


It looks like this is sort of working. Now what about preserving
syntax information in the expression parts of the match clauses?

(match ---
       (pattern expression))

so the expression part can refer to lexical variables etc..

let's do that, but first see if this non-hygienic version works.

one important question: when peeling off syntax with syntax-e, and
using datum->syntax-object to put it back, is the original syntax that
wasn't peeled off preserved? it really has to be..

seems to work.. at least the expansion does, but i can't see what can
go wrong with the quoting..


Entry: reduce
Date: Thu Aug  9 14:34:52 CEST 2007

transforming

((a . 1) (a . 2) (a . 3)
 (b . 4) (b . 5))

into

((a . (1 2 3))
 (b . (4 5)))

is called 'reduce', at least that's what i recall... but, i think
maybe the more general 'fold' is also called reduce sometimes.. so i'm
going to call it 'collect' for now.



Entry: require-for-syntax
Date: Thu Aug  9 17:55:48 CEST 2007

look at the macro compiler-patterns. find a way to put the utility
functions in a module without getting the error:

 pattern-core.ss:94:11: compile: bad syntax; function application is
 not allowed, because no #%app syntax transformer is bound in: (begin
 (ns-set! (quote (macro +)) (make-word-compiled (quote +)
 (lift-macro-executable (lift-transform (lambda asm (with-handlers
 (((lambda (ex) #t) (lambda (ex) (pattern-failed (quote +) asm))))
 (match asm ((((quote qw) b) ((quote qw) a) . rest) (appen...

i don't get it. when i make them local to the transformer expression,
all is well, but using 'require-for-syntax' doesn't work.


i tried the following isolated case:



;; Utilities for syntax object processing.
(module stx-utils mzscheme
  (provide (all-defined))

  ;; Reverse a syntax list.
  (define (reverse-stx stx)
    #`(#,@(reverse (syntax-e stx)))))


(module test mzscheme
  (require-for-syntax (file  "~/plt/stx-utils.ss"))

  (define-syntax reverse-quote
    (lambda (stx)
      (syntax-case stx ()
        ((_ list)
         #`(quote #,(reverse-stx #'list)))))))


and this seems to work fine, so i'm doing something else wrong..



Entry: CPS macros are fun
Date: Thu Aug  9 18:29:06 CEST 2007

but not really practical when syntax-case is around. now that i'm
understanding it a little better, there isn't any reason to keep the
CPS macros for list reversal.

the other thing to consider is the 'compile' macro. i'm using
something akin to CPS there too, only it's more like message
passing: pass the current object (self).


Entry: datum->syntax-object
Date: Thu Aug  9 19:17:06 CEST 2007

thinking a bit more.. i'm still not convinced that

#`(#,@(syntax-e #'some-list-stx)) 

is doing what i think it is doing: the manual says
datum->syntax-object is used, but does it see the syntax substructure?

reading the manual again, now that i know what i'm looking for:

"(datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx])
converts the S-expression v to a syntax object, using syntax objects
already in v in the result."

for (with-syntax ((pattern stx-expr) ...) expr)

"If a stx-expr expression does not produce a syntax object, its result
is converted using datum->syntax-object and the lexical context of the
stx-expr."

then for quasiquoting syntax:

"If the escaped expression does not generate a syntax object, it is
converted to one in the same way as for the right-hand sides of
with-syntax."


so i guess we're ok!


Entry: symbolic macro names
Date: Thu Aug  9 20:05:26 CEST 2007

something i run into is the (macro x) function from comp.ss: (macro:)
won't work because the variables in the patterns are symbolic!

this is really confusing..

i'm replacing the symbolic function with 'macro-ref' to make it more
clear this is a run time symbolic lookup, not something that can be
bound once.



Entry: lexical quoted
Date: Thu Aug  9 20:41:58 CEST 2007

with the new syntax approach, i can use lexical variables like

(let ((xxx (lambda (a b . stack) (cons (+ a b) stack))))
   (base: 1 2 xxx))

Which is convenient. However, i ran into at least 2 cases where the
more convenient thing to do is to insert a constant instead of a
function. However, the semantics of a symbol is always a function in
CAT. Except.. when it is quoted!

So what about this

(let ((yyy 123))
   (base: 1 'yyy +))


Meaning (base: 1 '123 +) ???

This is very convenient, but looks a bit weird. The reason is of
course that stuff after base: is NOT SCHEME. Quote in the cat syntax
only means: "this is data".

The benefit of this is that it somehow resembles pattern variable
binding as in syntax-rules.

A better explanation is this:

 The scheme and cat namespaces are completely separate: scheme has
 toplevel and module namespaces, while cat has everything from a
 separate hierarchical namespace. The only way they can interact is
 through lexical variables: this is the only set of names that is
 fully controllable.

 In cat expressions: 

 * free identifiers come from the associated name space
 * identifiers bound in scheme are 
    - used as functions when they occur outside of quote
    - used as data when they occur inside of quote



This can be implemented by mapping quote -> quasiquote, and unquoting a
symbol whenever it is lexical.
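in plain scheme, the trick boils down to this (abc lexical, def not;
the cat form in the comment is just for comparison):

```scheme
;; roughly what the cat code (base: 1 'abc 'def) would quote to:
;; quote becomes quasiquote, and the lexical name gets unquoted.
(let ((abc 123))
  `(1 ,abc def))  ; -> (1 123 def)
```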

It seems to work fine.
Quote for macros is now also fixed.

Another attempt to justify myself:

 The quote operator in cat language is NOT the same as the quote
 operator in scheme code. More specificly: lexical variables will be
 substituted whether they are quoted or not. i.e. both (base: abc) and
 (base: 'abc) will be substituted if the variable abc is bound. The
 quoting just indicates the atom is not to be interpreted as a
 function, but to be loaded on top of the data stack.

 The substitution is there to make metaprogramming easier.



Entry: pattern transformer extensions
Date: Fri Aug 10 10:23:06 CEST 2007

I'm trying to perform the pattern extensions properly. A true test
about this phase separation thingy, since i have a couple of phases
here:

0 matcher runtime
1 execution of pic18-pattern transformer
2 execution of pic18-pattern transformer generator

an extra problem is that i'm matching transformer names -> syntax
transformers.

this gets a bit complicated, because the name of the pattern
generator, i.e. 'unary->mem' is used both as a macro template, and as
a function name, so the transformer generator needs to be generated!

too many levels of nesting: this has to be simplified somehow..

wait:

the thing that needs to be generated is a pattern expander function,
which can be used in pic18-comp.ss to create the extended compiler-pattern
macro.


ok, i'm running into the problem again: if i put pic18-meta-pattern
and pic18-process-patterns in a different module, and
require-for-syntax it, i get the #%app error again..

so until i figure that out, maybe best to always use local transformer
procedures?

i guess it has something to do with binding identifiers. here the
problem seems to be 'lit'. in the binary-2qw pattern..


i ran into the problem again, in pattern-utils.ss /
extended-compiler-patterns and it was something like:

      #`(namespace . #,( ---- ))

which needs to be

      #`(namespace  #,@( ---- ))

because the #, expands to an s-expression, which is then just inlined
too, leading to 'process-patterns' not being quoted. weird stuff can
happen.. ALWAYS CHECK EXPANSION when this #%app thing occurs!

wait.. that's not it.. damn!

i'm going to leave it at not using any syntax for it.. it's not too
bad, and understandable.


Entry: for-syntax
Date: Fri Aug 10 14:08:57 CEST 2007

I still get into trouble with higher order macros where i completely
don't understand what's going wrong. Well, i guess it will come with
time. I'm glad i got syntax-case to a point where i don't need to use
unhygienic constructs any more.

And, if i get into trouble with name bindings, it's always possible to
put local functions in a transformer expression.

I'm done for today though.. head is hurting :)


Entry: preprocessor
Date: Fri Aug 10 14:50:57 CEST 2007

Time to adapt the parser/preprocessor, and change it to something
purely symbolic.

Seems to work. It's a lot simpler too now that the macro
representation supports quoting etc..



Entry: cat name space organisation
Date: Fri Aug 10 17:45:16 CEST 2007

Things changed:

* i found a way to easily debug modules and scheme code using snot
* cat code is now fully embeddable in scheme
* things got separated out a bit more

So, what i'd like to do is to separate pure scheme code from stuff
that accesses the global 'stuff' name space. This doesn't include
private name spaces that are only written in a single module, like
'asm' or 'forth', but it sure does include 'base'.

base is full of junk.. maybe that's the real problem?

Maybe i should just implement more in scheme, and have this base thing
as a scripting language only...

Or. I need to add an easy syntax for creating 'local words' using the
lexical stuff i have now.

(letrec
    ((a  (base: 1))
     (b  (base: 2))
     (c  (base: 4)))
  (begin
    (ns-set! '(base broem)   (base: a b + c /))
    (ns-set! '(base lalala)  (base: a b c))))


(local ((a 1)
        (b 2)
        (c 4))

       ((broem  a b + c /)
        (lalala a b c)))



this requires a different syntax, since the anonymous compiler
needs to be available always.

very straightforward. works like a charm.


Entry: wordlist search path
Date: Sat Aug 11 15:21:18 CEST 2007

i need to change the 'find' macros so they accept multiple paths..

one thing i'm wondering about is how 'force' is implemented.. somehow
i suspect that the thunk is not erased... it probably is..

found it:

(define (force p)
  (unless (promise? p)
    (raise-type-error 'force "promise" p))
  (let ((v (promise-p p)))
    (if (procedure? v)
        (let ((v (call-with-values v list)))
          (when (procedure? (promise-p p))
            (set-promise-p! p v))
          (apply values (promise-p p)))
        (apply values v))))

thunk is erased. the only thing to optimize is to not use the values
stuff, but a single return value. probably not worth it.

Haha, something i didn't see at first there: the p is either a
procedure or a list, so i can't make a single atom of it, because it
could be a procedure :)


another trick:

creating the '<language>:' macro at the spot where the '(language)'
dictionary is created and populated with primitives uses the module
dependency system to somehow enforce dependencies on namespaces, which
are not checked.


Entry: next: namespace
Date: Sun Aug 12 21:10:45 CEST 2007

more specifically, it's time to start using eval on the 'macro:' syntax,
and i run into the problem that this happens in a toplevel where it's
not defined. can you tie 'eval' to a module context?

update:

what about making this namespace explicit? i just need a single
namespace object which contains all the relevant compilers.

hmm... it looks like the easiest way to implement this is to require
each 'lang:' macro to be associated with an 'eval-lang' function.

argh... can't do that, since those also need eval.. looks like i can't
escape this namespace thing..

Entry: lazy data structures
Date: Mon Aug 13 01:33:23 CEST 2007

so, what do i need to make the assembly process lazy? match needs to
work on lazy lists.


Entry: disentangling
Date: Mon Aug 13 10:43:14 CEST 2007

i can't just separate out all the code that defines names in the
namespace, because the namespace is used for other things. there's
some conflict here..

what about i start doing it anyway.

- more namespaces

- populate each name space at the same place where the scheme code
  that uses it is exported.


the real trick is of course to see the direct specification of
namespaces as 'internal'. this should be wrapped by functions.





Entry: name space trick doesn't work
Date: Mon Aug 13 15:56:35 CEST 2007

the problem seems to be that data constructed using the struct from
rep.ss is different from data constructed using run time evaluation in
the separate namespace..

i need a different approach. first to identify the problem:

  - mzscheme has strict phase separation
  
  - "(eval '(macro: ,@src))" works when the runtime name space, the
    current one when that code is executed, has that macro available.

  - somehow, the 'rep.ss' gets loaded twice, since i run into
    incompatible instances.

now, why is it loaded twice?

one thing i could eliminate was a for-syntax dependency on rep.ss, so
the messages are a bit less confusing now.

it looks like the namespace trick creates a new instance. that's where
my trouble is.

let's simplify it a bit. the only thing that really needs dynamic
compilation is 'macro:'. so i'm going to put that dependency in the
code itself. but this should be independent of the problem, so i leave
it like it is.

i found a solution: just make sure the current namespace has the right
symbols. there's no other way. currently i just dynamic-require them
in, but i don't know if this is better than just requiring them at load
time, or namespace creation time..

it does feel a bit dirty though.. why use modules if they start
injecting stuff in your namespace? Maybe i should just pass the macros
upstream.. whatever..


Entry: quoting parsers
Date: Tue Aug 14 12:59:37 CEST 2007

They seem to work now. Map well onto the literal stack / typed macros
approach. The question is how to map them. It doesn't look like a good
idea to keep the same symbol, but how to change it then? I'd like both

  ' <filename> load 

and

  load <filename>

to work. Prefixing them with 'def-' seems not right. What about
"/load". A single symbol seems the right thing.. ~#$%& are
ugly. "*load" seems a good compromise.

I got it sort of working, and factored it out a bit. However, might
need a bit of name cleanup to distinguish the following source
representations:

- a filename
- a list in symbolic forth format
- a list in symbolic macro
- the latter compiled into executable code

I switched to the following naming convention:
- files use 'load'
- strings use 'lex'
- list -> compiled code uses the ':' prefix
- all other operate on lists directly

Entry: default semantics
Date: Tue Aug 14 13:55:56 CEST 2007

Also, i start to wonder if it's a good idea for 'run' to take literal
lists as an argument. The only real benefit is Joy like introspection,
but since in badnop most source reps are macros, this doesn't make
sense: source is not the only thing, semantics needs to be added, so
default semantics might not be good.


Entry: long run times
Date: Tue Aug 14 14:46:10 CEST 2007

seems to go into a loop somewhere.. time for a break.

it seems it's just really slow!

and all the time is spent during compilation. looks like i got some
quadratic things going on in the expansion..

so, i suspect this is syntax-rules.. i'm using a lot of rewriting to
avoid syntax-case.. maybe i should just come back from that? already
eliminated the if-lexical? macros.

ok.. let's see. first make the expansion a bit less dramatic: some
things can be abstracted in a function.

then i replace rpn-compile by a single syntax-case macro. it still
calls the compile macro, which can be customized by stuff built on
top.. there's no difference in speed, so i guess it's somewhere else..


so it's this:

    (rpn-compile *forth* 'macro:)

if *forth* is about 150 atoms, there's about a second delay.
maybe it's the nesting of the macro?

i wonder if i can write a macro that's faster..

let's try something different.

currently, the rpn-abstract macro is using fold. it's still calling
the 'compile' macro. looks like that's what i need to replace.

so, how to implement modified behaviour. instead of using macros, why
not use functions? i do need proper phase separation to do this. let's
see if that's possible by moving stuff out to for-syntax-rpn.ss


Entry: running into #%app trouble again
Date: Tue Aug 14 22:15:17 CEST 2007

This is the smallest example i could find that doesn't work as i expect..

(module for-stx mzscheme
  (provide (all-defined))

  (define (break-stx fn args)
    #`(#,fn #,@args)))

(module test-stx mzscheme
  (require-for-syntax "for-stx.ss")

  (define-syntax (bla stx)
    (syntax-case stx ()
      ((_ fn . args)
       (break-stx #'fn #'args)))))

Then, putting the 'break-stx' definition inside the define-syntax def
works fine..

Ok, if i change the quoting mechanism above to:

  (define (break-stx fn args)
    (datum->syntax-object fn
                          (cons fn args)))

It does work. Now i'm really confused.
I found this on the plt list:

http://groups.google.com/group/plt-scheme/browse_thread/thread/327013d5c6f61017/9a12e93d683a5f94?lnk=gst&rnum=2#9a12e93d683a5f94

  (require-for-template mzscheme)

in the module that generates syntax seems to solve it.


So, on to replacing the old 'compile' macro with a functional
approach, which works a lot better. There's really no reason to mess
with syntax-rules for anything other than simple patterns.



Entry: disentangling
Date: Wed Aug 15 14:19:05 CEST 2007


rpn-tx.ss        lowlevel syntax generation, parameterized by 'find'
rpn-runtime.ss   runtime support for the above
rpn.ss           bind a 'find' closure generator to lowlevel syntax
ns-utils.ss      support code for namespace lookup, to be used in find closures.
state-stx.ss     namespace namespace -> state syntax "language:" compiler
base-stx.ss      namespace -> base syntax "language:" compiler
composite.ss     create named words from compiler


Entry: mission accomplished
Date: Wed Aug 15 15:15:54 CEST 2007


Looks like i got it back online. The transformer works a whole lot
faster now. Let's repeat the conclusions:

- don't use syntax-rules if you need CPS tricks. it's ad-hoc and
  slow. use syntax-case with real functions instead.

- when using complicated syntax-case macros (compilers for embedded
  languages), separate out the transformer procedures and the template
  runtime support into different modules, so they can be tested
  separately.


I did this for pattern.ss  -> pattern-tx.ss and pattern-runtime.ss


Entry: better error reporting
Date: Wed Aug 15 16:21:00 CEST 2007

so... it would be great to be able to relate errors to where they
occur in the source code. however, to use the builtin syntax readers i
need to move both the lexer and the parser so they can
operate/generate syntax objects.

pretty clear what's to do next then:

- rewrite forth parser so it operates on syntax objects + create a
  proper 'forth:' macro that goes with it.

- make the lexer behave as 'read-syntax'

when this is done, i should be able to compile forth files straight away.

First part was easy: driver works. The rest should be
straightforward. However, moving this to compile time requires some
phase magic...

I was thinking about doing a proper phase separation in the forth code
too. Instead of defining macros as side-effect, it's probably better
to isolate them.




Entry: predicates->parsers
Date: Wed Aug 15 21:57:18 CEST 2007

I don't remember why exactly the map 'vm->native/interactive' is not
purely syntactic. it really should be.. refer to previous code to find
the previous functionality, but i'm breaking it and taking out the
'dict' dependency and will replace the predicates->parsers with
something that doesn't evaluate.

the previous 'predicates->parsers' behaviour is too dense. took me a
while to understand it. better to separate it out into different
mechanisms: 1. syntactic transformation 2. run time symbol lookup


Entry: produced first monitor.hex
Date: Thu Aug 16 00:31:51 CEST 2007

looks like i got it mostly running now. didn't test the code yet since
a lot of things are still broken, mostly the interactive part. but it
looks ok.


Entry: brood 4
Date: Thu Aug 16 00:36:17 CEST 2007

enough things changed, and i'm in a broken state for a bit now. this
means it's time to up the version, and rewind the brood-3 archive to a
working state.

it's archived as brood-3 on apatheia. this is the last patch included:

Mon Jul 23 21:29:12 BST 2007  tom.goto10.org
  * namespaces and next projects

at that time i was changing stuff to the boot block.. i'm not sure if
that code actually works.. might be better to revert a bit more back
till after the workshop.


Entry: next
Date: Thu Aug 16 01:01:53 CEST 2007


- test the target code, see if the monitor still works
- fix the interaction code
- fix the vm interaction/compile code
- fix snot for interaction/compile mode
- factor some badnop code: use local words


Entry: separate compilation
Date: Thu Aug 16 10:51:12 CEST 2007

got me thinking: can't i do the separate compilation trick for macros?
i already run into the non-transparency problem several times: trying
to define some code with some macros not defined..

one of the problems is 'constant': it needs run time compilation so i
can't just do this... another is that macros defined immediately start
influencing compilation of code after their loading.

but.. can the loading of forth files be made free of side effects? or
at least somehow separated? let's see what kind of side-effects we got:

 constant-def!
 2constant-def!
 macro-def!

those are easily isolated into separate dictionaries to separate 'core'
from 'project' macros and constants..

as long as project macros are loaded AFTER core macros, they can be
safely deleted as a whole.

the short version: it's impossible to change it now without real phase
separation..





Entry: literal pattern matching
Date: Thu Aug 16 11:20:58 CEST 2007

Patterns like

         ((['qw a] ['qw name] *constant)
          (begin (def-constant name a) '()))
  
are a bit redundant.. a better notation would be "(a name *constant)"



Entry: assembler cleanup
Date: Thu Aug 16 11:36:25 CEST 2007

Can't i get rid of the 'constants' namespace?

Again, why are they different from macros? To postpone symbol ->
number conversion until assembly time. So they can't be macros,
because at assembly time all macros have run.



Entry: compilation syntax
Date: Thu Aug 16 13:42:21 CEST 2007

i'm thinking about adding some syntax to compile code using different
syntax..

(a b c) is still default semantics quoted code, but

(lang : a b c) is interpreted as compiled with 'lang'.

or maybe

(lang: 1 2 +)

let's see if i can do this first on the rep.ss level: just store a
symbol naming the rep.

probably the first thing that needs to change is to change state-stx
to take an anonymous compiler as 2nd op..

it's really annoying to be at the border of compile/run the whole time!

first, the above is not really possible, since the state-stx fallback
code is not derived from a named compiler.


Entry: override semantics
Date: Thu Aug 16 15:05:36 CEST 2007

Introduced the (language: ...) syntax for overriding language
semantics while quoting code. It's implemented as follows: the default
'program' compiler checks if the first symbol in a list ends in ':',
if so, the whole expression is passed to the scheme expander,
otherwise the default 'represent' method is used to compile the code
anonymously.

It's a small step from here to a 'lambda:' macro.

I also fixed the semantics annotation. However, it is possible to run
into code which doesn't have the semantics annotated because it works
with an anonymous macro.. This could be cleaned up, but i guess it
serves the debugging purpose now: 'ps' displays macros as (macro:
....)



Entry: name mangling
Date: Thu Aug 16 16:01:27 CEST 2007

Maybe i should give the name mangling a go again.. If i recall the
thing i did wrong last time was to get rid of syntax information for
names, so they were mapped to toplevel names.

This macro seems to do the trick:

  (define (prefix pre name)
    (->syntax name ;; use original name info
              (string->symbol
               (string-append (symbol->string (->datum pre))
                              (symbol->string (->datum name))))))


So basically now i have a mechanism to use the mzscheme module system
for handling namespace and dependency management.

I bet i can use some kind of 'module-local?' predicate on the syntax
to find out if a name is local to a module, and if so use that
instead.

I guess a good time to find out if i have the namespace stuff
sufficiently abstracted.

Something about naming conventions: the 'rpn-' modules do not need or
depend on the namespace implementation.

I do need a different kind of 'compile' macro, but for the rest it
works perfectly. Maybe time to rename some things..



All good and well, but how do i combine? Doesn't seem like a good idea
right away.. This works better as all or nothing..

So, combining runtime namespace lookup and static modules.. how to?

One of the things to change is to not inherit from a namespace, but
from a named compiler macro.

What about starting from the ground up? Making the base language
static, then moving things from dynamic -> static?

Starts with snarfing. Instead of snarfing to dictionary, snarf to
prefix.

Start with separating primitive.ss into snarf.ss and ns-snarf.ss

So in principle, it should be really easy now to move the
implementation of base.ss to static functions without anybody
noticing. That is, if i can somehow make delegation work using just a
language: macro instead of namespaces..


Entry: the royal DIP
Date: Thu Aug 16 17:35:27 CEST 2007

i guess the solution is to use 'dip' from base to create state syntax
abstractions. and maybe, to add an optimization that:

	      (+)

does not create an extra lambda, but returns the primitive + right away.

i guess the optimization can be left until later..

so..

the idea is to make the delegate compiler abstract. this requires
quite some change, but should make the code a lot simpler.. it would
also fix the annotation problem mentioned above.

so
	    (ns-base-stx badnop: (badnop) base:)

instead of
            (ns-base-stx badnop: ((badnop) (base)))


let's call this 'extend-base-stx'

haha. gotcha!  of course, the delegate: is a static thing, and the
namespace delegation is a dynamic thing, so there's no way to compile
this: the information necessary to decide about delegation or not is
not available at compile time, when the delegation needs to be
frozen..

it needs to work the other way around !!!

if a symbol is not defined at compile time, the resolution can be
postponed until runtime.

so i guess the gentle way to move things to static implementation is
to use the 'module-local?' predicate mentioned before. that way
module-local symbols can bind first, and cannot be overridden.

the number of methods in the compiler is getting larger. maybe use
real objects? prototypes?

also, if i use a decent prefix, symbol capture is not a real problem,
so i can put it on always? maybe just a dot or a pound sign..



Entry: pff... done coding..
Date: Thu Aug 16 20:49:59 CEST 2007

today was a bit intense. i start to get a bit more of this syntax /
lexical / static stuff.. it would be nice to make more things
static. there are only a couple of places that have 'plugin'
linking. one of them is 'literal' and 'compile' in the macro-prim
dictionary, so it looks as if i do need some dynamic binding.

however, i wonder if it's not better to solve this using units. more
standard tools = better, now that i know what i want at least..

one thing is bugging me though. some paradoxical thought:

i'd like to define words that fall back on another
dictionary. however, using static linking there is no such thing: a
symbol is there, or it is not. and there's no override..

maybe i should stick to dynamic.. it's really different and no easy
migration to static.


Entry: if i go static
Date: Thu Aug 16 23:29:47 CEST 2007

one name mangled namespace is enough, since i can use ordinary modules
to organise code and hide details, just like in scheme. let's stick to
'rpn.'

so i built that in: names like 'rpn.xxx' that are visible at compile
time get used as functions, and bind variables 'xxx' in the
compositional code, just like lexical variables.

it looks like delegation from dynamic -> static parts is not
possible. since this is quite a deep thing to change, i'm not going
to. it's still possible to move highly specialized code into modules
to shield them from the main dictionaries.

what is possible is to add a static interface to words in
'base'. they could still include code to register to the dynamic space
also, but at least this would make it possible to freeze some
functionality. so, maybe this: all base words are exported

- rpn.xxx variables from the rpn-base.ss file

- exported in a dynamic dictionary from base.ss, which gets the
  functionality from rpn-base.ss

is a bit confusing.. maybe leave it as is..



Entry: because i can
Date: Fri Aug 17 00:05:30 CEST 2007

there's a lot of 'because i can' code in tethered.ss ... as i found
out, some tasks are just easier to code in scheme. if it's anything
algorithmic, meaning intricate data dependencies, you're usually
better off writing a scheme program. they are easier to understand,
probably because they are a bit more verbose, and because 'automatic'
permutation and duplication of names avoids mental gymnastics for stack
juggling. there is nothing in the way now that i have both 'base:' or
'prj:' in scheme, and 'scheme:' in cat.

for what is the cat code useful then? simple patching and scripting,
there it clearly wins. as long as not too much data juggling is
needed, cat is really easier to patch things together.

also, imperative code looks nicer in cat. because cat is just
composition, it looks sequential. it happens that all (most)
imperative code i use is for communication. in scheme imperative code
seems always ugly.. maybe it's because synchronisation is easier to
imagine in a linear instruction flow: threads of execution joining
together at certain points, breaking the linearity of composition?


Entry: joy
Date: Fri Aug 17 01:28:07 CEST 2007

added a joy interpreter. it doesn't have much, but it can do

   ((dup cons) dup cons) i



Entry: interaction
Date: Fri Aug 17 10:45:30 CEST 2007

got a bit off track again.. time to fix interaction. first thing to do
was to put 'tinterpret' and 'tsim' code in prj.ss together with the
supporting code dip/s and ifte/s

so.. why is this so ugly?

by default, quoted programs and run + ifte use functional context to
limit surprises. however, sometimes i want to do things like:

 (tsim        (prj: dup tfind not) dip/s
              (prj: tinterpret)    ifte/s)


the xxx/s words are the analogs of xxx but pass stack + state to the
programs, and 'prj:' compiles state words.

is there a way to do this automatically? probably not using my current
setup, unless i make 'run' understand state words, which means they
should be type tagged. since that only takes away the /s notation, i'm
not going to do this. so the convention:


  functionals <xxx> do NOT pass state to quoted programs,
  while the corresponding <xxx>/s DO
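
A rough sketch of that convention in Python, with 'ifte' as the
example. State is modelled as a second value threaded alongside the
stack; the names are illustrative, not taken from the actual prj.ss
code:

```python
# Sketch of the <xxx> vs <xxx>/s convention. A quotation is just a
# function; state rides along as a second value.

def ifte(stack, state, cond, then, els):
    """Functional variant: quotations never see the state."""
    test = cond(list(stack))               # run cond on a copy of the stack
    branch = then if test else els
    return branch(stack), state            # state passes through untouched

def ifte_s(stack, state, cond, then, els):
    """/s variant: quotations receive and may update the state."""
    test, state = cond(list(stack), state)
    branch = then if test else els
    return branch(stack, state)
```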


but...

if one uses types to do this automatically, which means that the core
'apply' routine should be made aware of state, and rep.ss should
implement some kind of tagging for state words.. what would be the
real problem?


Entry: Monads
Date: Fri Aug 17 12:10:45 CEST 2007

i don't know much about type theory, but i think i understand how my
ad-hoc approach relates to monads using the unit-map-join formulation.

X     is state type
S     is stack type
( . ) is cons

unit ::  S             -> (X . S)
map  ::  (S -> S)      -> ((X . S) -> (X . S))
join ::  (X . (X . S)) -> (X . S)

so 'unit' introduces a new state object on the data stack. 'map' will
create a function that does what it did, but ignores the X part, and
'join' will accumulate one piece of state into another.

the first two are trivial, and i do use them fairly explicitly. but
the last one seems to be hidden a bit deeper, because i never use it
explicitly:

every state dictionary has a couple of words that bring stuff into the
monad, but they have type:

A     is assembly opcode

asm ::   (X . (A . S)) -> (X . S)

here 'A' is not the same as 'X', but in spirit it does the same
flattening operation.

looks like i'm missing some of the fun. clearly the 3 law formulation
has some benefit due to a higher level of abstraction, but what would
it bring me to make this a bit more explicit?

first of all, i need a proper type system. the monad objects should be
somehow tagged. that way 'unit' and 'join' can be made
polymorphic. 'map' should not be polymorphic, given i implement
monads as 'things on the top of the stack'.

;; map

(lift   (dip) curry)


the other two are problematic.  'join' is possible to do, since monad
types could be tagged so it _could_ be made polymorphic. but 'return' /
'unit'.. such polymorphism won't work because i can't infer the type!

i.e. 'return' is normally plugged into some expression that expects a
monad type. i have no way of determining something like that, so
'return' should have explicit annotation, probably best using just a
different name.

for example, the assembly 'return' for a single opcode would be

(asm:return    '() cons)   ;; wrap the single opcode in a list
(asm:join      append)     ;; concatenates the 2 state lists

what i do is to just combine those 2 operations into one that conses a
single opcode to the assembly list.
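
As a sketch (Python, with target code modelled as a list of opcode
tuples; the function names are mine, not from the actual code):

```python
# asm:return wraps one opcode in a fresh state list ('() cons);
# asm:join concatenates two state lists (append); the operation that
# gets used in practice fuses both: cons one opcode onto the assembly.

def asm_return(opcode):
    return [opcode]

def asm_join(newer, older):
    return newer + older

def asm_cons(opcode, asm):
    return asm_join(asm_return(opcode), asm)
```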

i think i sort of get the gist of it.. or not?



so, the other formulation uses

bind: (X . S) -> (S -> (X . S))  ->  (X . S)

so bind is like 'join' in that it combines a monad data type with a
function that maps from outside the monad to inside, and returns a
monad type. 

note: because i have only one type (a stack: each function maps stack
-> stack), two simplifications apply:

* in the general case, the source and destination monads for the
  'bind' operator do not need to be the same, but in my approach they
  are, since there is only one type that can be "monadified"

* i do not have the concept of a type constructor (types do not have
  an abstract representation), so i can leave that out.


so a stupid question maybe. how do you get stuff OUT of a monad?

i think there's something i didn't get. the type signature of bind is:

   M t -> (t -> M u) -> M u

so i guess if M u == t, bind can get things out of a monad. 

in general, it can get t out of M t (multiple times!), apply the
function (t -> M u) (multiple times), and combine 'stuff' from M t
with the (multiple) M u, and return an M u.


conclusion: not having 'real types' makes all this a bit difficult to
formulate.. it might be a nice exercise to try to do it anyway.



nice base for some more reading on the subject. maybe "Monadic
Programming in Scheme"

http://okmij.org/ftp/Scheme/monad-in-Scheme.html

this talks about the case where there's a single monad, or types of
different monads do not get mixed.



Entry: source annotation
Date: Fri Aug 17 16:45:25 CEST 2007

really.. does it make sense to NOT have the source annotation be
formal, if with a little more effort it can be?

It's sort of formal now.. Things that are not uncompilable have #f
semantics, the others are created straight from the named macro, so
should be right, or by composition from such, so should be right
because all code is syntacticly concatenable.

It sort of strikes me as odd that i can't have 'curry' or 'lift'
defined in a generic way, because quotation of data is not standard. I
could try to force it. Anyway, for 'lift' i only need base
semantics/syntax.

Wait, lift is possible if semantics is defined, but it requires that
quoted programs are always available. (even in forth macros!)



Entry: and so on..
Date: Sat Aug 18 01:48:03 CEST 2007

time to get back to pic programming... i didn't really anticipate this
static change and the move to brood 4. but things are really better
this way.

once the pic part is back online, it's time to look at interaction
macros, or how to create interactive meta functions.


so, timeline:

- interaction macros

- the standard 16bit forth (requires interrupt driven serial I/O and
  an on-chip lexer + dictionary)

- write something about compile/runtime and the different ways to fake
  the single machine experience.



Entry: state
Date: Sat Aug 18 14:34:18 CEST 2007

got interaction working. i changed it so the commands available in the
interaction mode need to be specified explicitly. this has to be done
for commands that take arguments anyway, so why make an exception for
0cmd?

see interactive.ss

so, i've been doing the snot-run thing, which works quite well. it's a
relief that state is stored elsewhere, so my functional core can just
be reloaded. however, there are a few spots where i'm using state still..

one is IO. since it's non-functional anyway, storing the name of the
serial port couldn't hurt, right? wrong.. on restart, it needs to be
reset.

i made the 'boot' word which loads all the macros from source. this is
slow, i guess because of the constants?

so maybe i should just put the constants back as scheme file..


Entry: KAT and TAK
Date: Sun Aug 19 14:05:15 CEST 2007

I'm looking for a better way to explain the pattern matcher. Usually
generalizing helps. The reason why it seems special is because it is
only used with "macro pattern" and "quasiquoted scheme template"



Entry: no phase separation
Date: Sun Aug 19 15:16:21 CEST 2007

Now that i finally understand the point of phase separation, i wonder
if i can do something similar with the forth?

Maybe it's not necessary for small projects, but it does feel a bit
weird to first struggle to write scheme code that obeys mzscheme's
phase separation rules, to see that it's a good thing, and then to go
back to some non-separated way.

I see a roadmap on how to do this: just turn everything into scheme
syntax. The result after loading is a single function that generates
the program when evaluated. That way i know i'm going to get there. On
the other hand, i do not know what to give up then. My whole design
needs to change.

An other way is to do it incrementally. First make sure i can separate
code into macro definitions and rest. For just macro definitions this
is not so difficult. However, constants are a different story, since
they require compile time computation..

I guess that's where the problem is:


Entry: constant
Date: Sun Aug 19 15:27:10 CEST 2007

What are constants? Phase separation violation! In contrast to normal
macros, which obey separation because they do not use any values
created at compile time, macros generated by 'constant' join 2 phases.

The 'constant' word could be termed a "phase fold". The compiler after
'constant' is not the same one as before: it is extended with a
macro. 

This kind of behaviour prevents modularization of code, because it is
not clear what the definition of the new macro depends on, the only
thing that can be assumed is that it depends on all the previous code,
and that all the following code depends on the new macro.

The solution is that this behaviour needs to be unrolled: instead of
updating the compiler on the fly, an extension phase (where macros are
defined) needs to precede a compilation phase (where macros are
executed).

There is a general way to unroll 'constant': split the code in 3
parts: the part before, the definition of the new macro, and the code
after. This is rather cumbersome and entirely unnecessary..

However, in the case of Purrr18 it is usually possible to transform
the code to a macro definition. Instead of writing

	   1 1 + constant twee

one could write

           macro : twee 1 1 + ; forth

This enables the macro definition to be distinguished from the rest of
the code, to clarify the dependencies of a file's plain code on the
macros defined in that file.

The only reason not to do it the second way is that it loses the
name 'twee' in the eventual assembly code.

Removing 'constant' could lead to a better transparency in the code:
compiled macros could then be seen as 'only cache'.

Note that i would do this just for more transparency, not to eliminate
undefined symbols: macro name binding is still late.


Entry: phase separation
Date: Sun Aug 19 16:34:26 CEST 2007

So, a forth file contains both macros (M) and forth (F) code. The
forth code always depends (->) on the macros  (M -> F)

If a forth file depends on another forth file, the macros from the
former depend on the macros of the latter, and the forth code depends
on both macros and forth code from the latter. Due to transitivity,
the arrows from M -> F in between files can be omitted, so one gets
something like

    Ma -> Fa
    |     |
    v     v
    Mb -> Fb

where the arrow from Ma -> Fb is left out.
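
The picture above, expressed as data. A sketch using Python's graphlib
to check that any compile order respects the arrows; the nodes are
just the Ma/Fa/Mb/Fb from the diagram:

```python
from graphlib import TopologicalSorter

# node -> set of nodes it depends on
deps = {
    'Ma': set(),
    'Fa': {'Ma'},
    'Mb': {'Ma'},
    'Fb': {'Mb', 'Fa'},   # Ma -> Fb follows by transitivity
}

# a valid compile order lists every node after all its dependencies
order = list(TopologicalSorter(deps).static_order())
```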

What this would buy me is that i solve the problem of keeping the
macros consistent with the state of a target:

Target state is a consequence of compiling all the Forth code in a
project. However, as a side effect, a project defines macros that are
used to generate this code in the first place. There needs to be a
clean way to 'reload' these macros from the source code, so we can
connect to a target with the macros instantiated.

I'm trying to see how to make this more rigorous: how to make
incremental compilation work without having to manage dependencies
yourself? Basically, how to map the nice module system of mzscheme to
incremental Forth development.

This is clearly not for now. It requires a lot of change. One of them
would be management of storage on the controller: if dependencies of
separately compilable modules are fully managed, incremental uploads
are still possible, and become 'transparent'. I.e. changing a module
but not changing its dependencies makes it still possible to update a
system on the fly, in a transparent way.

I'm still quite happy with the ad-hoc hacked up way of incremental
development. But knowing this is possible might make the itch a bit
stronger.


Entry: dynamic updates and functional programming
Date: Sun Aug 19 16:53:53 CEST 2007

I guess most of this train of thought started after i got to using
sandboxes with SNOT. Currently it works +- like this:

SNOT (the bootloader)
  * manages memory: stores project state in a single toplevel variable
  * manages purely functional sandbox
  * implements REPL

outside of the system the edit-compile cycle runs: changes are made to
the collection of functions that acts on the state and a compiler
recompiles those that have changed. then 'restarting' the system is
almost instantaneous: the state remains, only the operations on the
system can change.

the requirements for this are of course that all state is stored in a
fairly concrete way: representation must not change from one version
of the system to the next.

if representation changes, a small 'converter' could be made..
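
The edit-compile cycle described above can be sketched like this
(illustrative Python, not the actual SNOT code):

```python
# All mutable project state lives behind one root reference; a
# "restart" swaps the functions acting on it while the state survives.

STATE = {'counter': 0}           # the single toplevel state variable

def make_ops(increment):
    """'Compiling' a version of the system: build fresh operations."""
    def bump(state):
        state['counter'] += increment
    return bump

bump = make_ops(1)               # version 1 of the system
bump(STATE)
bump = make_ops(2)               # edit + recompile: swap operations only
bump(STATE)                      # state carried over across the reload
```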

What i'd like is something like a smalltalk environment, but then for
scheme. A lisp environment with incremental loading comes close, but
transparency is necessary. 

Smalltalk solves this by being completely dynamic: compilation is just
cache, and code can be edited on the fly. There is no 'off', it's
always running.

MzScheme solves this by being static, with well-managed dependencies,
and separate compilation to make 'restarting' cheap. There is an
'off', but it can be made small.

Using the approach above: managing ALL state separately renders a
virtually always-on system. The off period can approximate zero since
it's just "swapping a root pointer" once the code is
compiled. Compilation can take longer if changes to core modules are
made, but there remains a 1-1 correspondence between the system and
the source code.

I guess it's possible and not even too difficult to delay compilation
in the scheme case, making compilation behave more as a cache.



Entry: purification
Date: Sun Aug 19 17:17:13 CEST 2007

So i need to eliminate state. There are 2 cases where i've introduced
state because i thought it "wouldn't hurt"..

* the target IO port
* the project search path

The rest really behaves just as cache.

So if i'm allowed to be really anal about eradication, these things
need to change. Project search path is the easiest. Target I/O is
more difficult because it requires moving from a functional to a
monadic implementation.


Entry: eliminating global path variable
Date: Sun Aug 19 17:41:15 CEST 2007

to be able to eliminate the path state, i probably need dynamic
variables (parameters).

is this cheating?

not really.. since i'm using with-output-file already, and that
doesn't really feel like cheating.

this would also solve the problem with IO of course.. still i'm not
convinced it's not cheating..

one could say it's not cheating because the value has finite extent?
so why not implement monads as dynamic variables? because dynamic
variables are not referentially transparent, which you would want when
you 'run' a monad: it should still act just on the state provided, not
on something else...

so why are parameters different then? are they less evil when they are
constant? they represent 'context'.

* one thing is sure: they are less evil than global variables due to
  limited extent.

* if they are constants, they are less evil than when they are not
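
The finite-extent point can be sketched with a dynamic variable in
Python, a rough analogue of mzscheme parameters (names are
illustrative):

```python
from contextlib import contextmanager

_path = ["."]                     # dynamic variable: project search path

@contextmanager
def parameterize_path(value):
    """Rebind the search path for a limited extent, restoring on exit."""
    _path.append(value)
    try:
        yield
    finally:
        _path.pop()

def current_path():
    return _path[-1]
```

Inside the `with` block the new value is visible; once the extent
ends, the old binding is back, unlike a plain global assignment.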

To really answer the question is to implement dynamic variables with
monads, and see how they are different. The problem i'm facing in my
ad-hoc state hiding approach is that i can't combine monads:

When i'm running something in the macro monad, i can't access anything
else. To have access to path, the monad should be bigger and include
'compilation context'.

The real solution is of course to make compilation independent of file
system access. Source code needs a preprocessing step that expands all
'include' statements. Since it's only one keyword, this can be
implemented in the forth-load function. That function already
implements 'file system dereference', so why not include path
searching?

Ok, made it so. 'load' is now a load-time word so file system access
is concentrated in one point. 'path' is removed: this needs to be
specified in the state file, because it really is a meta-command.


Entry: cleaning up interaction
Date: Sun Aug 19 19:07:52 CEST 2007

This is the biggest change. Probably best to separate it out in a
different monad. The state associated with interaction is:

* I/O port
* target address dictionary
* assembly code

With this data it can start assemble code and upload it to the target.

But.. looking at the contents of the state file, there is not much
else!

(forth)       ;; might come in handy for interaction
(file)        ;; in case we want to access the file system
(config-bits) ;; on the fly reprogramming? some day probably
(consoles)    ;; this is the only real meta data not necessary for interaction..


maybe it's not worth it to split interaction off of prj. maybe it's
even just a bad idea: you'd want the 'fake console' to have power over
the whole project. impossible without giving it all the state.

let's just clean up tethered.ss and move out functions to badnop.ss

but... i'm using with-output already. so why not just have the i/o
commands do the same?

done. this immediately solves the problem of having more than one
device attached. i.e. a distributed system with all identical devices.


Entry: side-effect free macros
Date: Sun Aug 19 20:00:19 CEST 2007

i was thinking: if macros are side-effect free, constants can be
eliminated. because it's always possible to see if a macro is just a
constant: execute it, and if the result is '((qw <value>)) it is!

the only thing you would need constants for is to 'uncompile'.
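in python pseudo-terms the detection could look like this (a sketch only; the names and the ('qw', value) tuples are stand-ins for the actual instruction representation, not staapl's real API):

```python
# a sketch, not staapl's representation: a macro is modeled as a function
# from a list of pseudo-assembly instructions to a new list.
def is_constant(macro):
    """run the macro on an empty state; it denotes a constant iff the
    result is a single literal-load ('qw', value) instruction."""
    out = macro([])
    if len(out) == 1 and out[0][0] == 'qw':
        return out[0][1]          # the constant's value
    return None

lit3 = lambda code: code + [('qw', 3)]             # compiles the literal 3
call_foo = lambda code: code + [('call', 'foo')]   # compiles a procedure call

assert is_constant(lit3) == 3
assert is_constant(call_foo) is None
```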

another thing: what about making the partial evaluator reference
macros if it can be guaranteed the macros perform only computations
that can be completely reduced to values?

i need to disentangle this a bit..



Entry: no values
Date: Mon Aug 20 15:12:00 CEST 2007

i owe this to Joy: it's really good to have no "function value
quoting", i.e. just (foo) instead of something like 'foo

this leaves ' free for quoting literals, and has the benefit of a
simple abstraction syntax.


Entry: distributed programming
Date: Mon Aug 20 15:15:37 CEST 2007

The next hardware project is going to be krikit. It's going to be a
distributed system of small devices.


Entry: done
Date: Mon Aug 20 16:27:08 CEST 2007

yes, i guess so.. no pressing changes ahead, except for the macro/code
separation, side-effect free macros, and maybe dependencies.. which is
a biggie. another thing is interaction macros. so the todo looks like:

- move the words "constant, macro/forth, variable" and the 2-variants
  to preprocessor stage which can separate code into macros and forth
  code.

- add interaction macros


Entry: brood.tex
Date: Mon Aug 20 23:03:09 CEST 2007

i'm starting an explanation of macro embedding with a purely
functional approach. while i'm on the right way with my notion of
compilable, the effectful part is less obvious.

the idea is this: [ a 1 2 + b ] can be simplified to [ a 3 b ] if a
and b are effects.

somehow i'm missing something important.

maybe the situation is symmetric? instead of having language and
metalanguage, which both share some evaluation domain, they also have
functions that act on their full domain only.

i think i sort of got the duality now: the target depends on run time
state which is not representable, meaning only pure functions can be
evaluated.

...

my explanation is not completely sound.. when i'm talking about target
and host language, i never make the explicit conversion. there's
something wrong there. almost right, but not quite.

a compilable macro is something which can be 'unpostponed'. meaning,
it is a function that all by itself produces a program that can be
evaluated on the target.

...

another thing is that macros, in the way i implement them, are not the
macros i'm describing in the paper.

my macros are EAGER, they are a combination of the partial evaluation
strategy AND their original meaning.

the macros in the paper, at least the partial evaluation strategy, is
monadic. for compilable macros this makes no difference, but for other
algorithms, order does matter.



Entry: monoids and stacks
Date: Wed Aug 22 16:09:11 CEST 2007

something which has been tickling me for a while because i haven't yet
formalized it in my head:

functional programming with stacks.. how does this work, really?

what's the relationship between state and stack?



so, compositional programming languages use compositions like [fg] to
express programs. all functions are unary: stack in, stack out. that's
nice to give some
framework about evaluation order (it being arbitrary, if there's a
representation of composed functions). so:

  Functional compositional languages make it easy to talk about
  partial evaluation: it's just the associativity law. Whether this is
  of any practical sense depends on whether we can partially evaluate
  FUNCTIONS to something simpler.

so let's start with inserting that thought in the paper..

then the other one is about locality of action. the fact that a
language is compositional doesn't really do much about this. you need
a way to ensure separation. this is where stacks come in.

but this is more about continuations than about being able to perform
partial evaluation.. really, the only thing i need to know is 

  * POSSIBLE: that [1 +] is equivalent to [1] followed by [+]
  * ECONOMIC: that representation is actually simpler

that's the end of the story. the fact that the thing uses stacks is
relevant to proving that [1 +] is equivalent.
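to make the two points concrete, here's a minimal python sketch of a toy concatenative evaluator (nothing here is staapl code; the word set is made up):

```python
# minimal concatenative evaluator (a sketch): a program is a list of
# words; each word maps a stack to a new stack.
def run(program, stack):
    for word in program:
        if word == '+':
            stack = stack[:-2] + [stack[-2] + stack[-1]]
        else:
            stack = stack + [int(word)]   # any other word is a literal
    return stack

# POSSIBLE: [1 +] is the composition of [1] and [+] ...
assert run(['1', '+'], [41]) == run(['+'], run(['1'], [41])) == [42]
# ECONOMIC: ... and partial evaluation may replace [1 2 +] by the simpler [3].
assert run(['1', '2', '+'], []) == run(['3'], []) == [3]
```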

i need to clean up notation.. i'm using two different notations for
application. one rpn, and one pn. let's stick to pn, because i use
functions somewhere else, and reserve rpn only for compositions.

...

there's another thing that's really wrong in my explanation. something
i noticed yesterday already... macros are about IMPLEMENTATION of
partial evaluation. i really have only a single language! that's what
it feels like also when programming. so i think i can plow over my
whole text again... frustrating, but i'll get there eventually.

maybe this is why i like programming so much. making sense is only
defined from the point of works/notworks. math is too free for me.. i
am not strict enough.

ok. the plan: get rid of the notion of 'macro' and introduce it only
later. keep everything abstract, just show a way to translate forth
into a functional language operating on state + metastate.

looks like i'm getting somewhere.. and this is going to turn up some
conceptual bugs. looks like i needed to spend this time plowing
through misconceptions..

again, this is wrong.. ARGH!

the compiler is not a map. it's a syntactic transformation. what i
call a compiler now is just the property 'compilable'.

so the compiler is something that proves a program is compilable!

ok, i got it sort of explained now. so this composite function thing
is about semantics, which leaves more room to talk about the
implementation of the proof constructor (compiler).

just added a note about function definitions. creating new names is
either something which happens outside of a program, or has to use
side effects. currently it's the latter, but i'd like to move to the
former.


Entry: real compositional language?
Date: Wed Aug 22 22:37:47 CEST 2007

actually, the step to a real compositional language is not so big any
more. just adding the parsing words '[' and ']' for program quotation,
and possibly an optimization for ifte -> if else then conversion
should do it. all other constructs can then be translated into higher
order functions.


Entry: phase separation
Date: Wed Aug 22 22:44:48 CEST 2007
Name: phase_separation

i guess now that base.tex seems to be mostly bull-free, the next step
is phase separation for forth files. basically this means:

1. collect all names and macros. this includes constants, variables, AND the
macros used for compiling function calls.

2. compile the code.

so.. it should in principle be possible to have proper semantic
separation of names before a source file is compiled. currently, words
have a default semantics (target word). however, i could catch
undefined names if i catch all occurrences of ':', and register a macro
for each of them that will compile a procedure call. that way i can
remove problems with macro/code confusion...

so 2.. a name always maps to a macro explicitly. otherwise it is not
defined. no more default semantics. the macro might choose to compile
a call instruction using a symbolic reference.

this means the language becomes a bit less flexible: 
         : (2)variable (2)constant 

are no longer accessible from forth, and are preprocessor directives
that change the code into a form:

(macros
  (a 1 2 3 +)
  (b 5 -))

(constants
  (c 1)
  (d 2 5))

(tape
  ((broem) a bla 
   (lalala) bla broem))

where the tape is the layout of code memory with labeled entry
points. this structure is there to preserve multiple entry points
(fallthrough) and multiple exit points.

if macros are side effect free, constants can be eliminated. they are
simply macros that evaluate to a literal sequence, if they evaluate at
all.

i can even keep the current context for 'constant'. suppose a forth
file starts with the code:

1 2 + constant broem

so the loose code "1 2 +" can be interpreted as a macro. the
consequence is of course that it's not possible to define constants
after the first function.

hmm.. i do need constants if i want constants in the assembly. because
to get them there, every constant needs to have a macro associated to
it that will compile the constant value.. so let's leave them in, but
employ the mechanism above to give them macro semantics. maybe a
constant is a macro that evaluates to a literal, so the actual macro
code can be stored somewhere else?

maybe the more important thing is to unify the compile-time constant
evaluation with macro execution? not really.. ai ai.. time to go to
bed..


Entry: set & predicate
Date: Wed Aug 22 23:56:09 CEST 2007

it never occurred to me, but a set is indistinguishable from a
predicate function. operations on sets are then

(define (union        a b) (lambda (x) (or  (a x) (b x))))
(define (intersection a b) (lambda (x) (and (a x) (b x))))

a thing you can't do here is to iterate over the elements.
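the same trick in python, for illustration:

```python
# a set represented by its characteristic predicate: membership is just
# function application, and set operations are logical combinators.
def union(a, b):        return lambda x: a(x) or b(x)
def intersection(a, b): return lambda x: a(x) and b(x)

evens      = lambda x: x % 2 == 0
small      = lambda x: x < 10
small_even = intersection(evens, small)

assert small_even(4)
assert not small_even(12)       # even but not small
assert union(evens, small)(3)   # odd but small
```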


Entry: a day in bruges
Date: Fri Aug 24 10:32:26 CEST 2007

tourist in my own country.. anyway, i made some notes:

* partial evaluation/optimization

replacing a composition [fg] by a specialized function is always
possible in a compositional language. the reason why it doesn't work
for me is mainly because of 'hidden quotation'

for example the sequence "1 THEN +" contains a jump target. which is
not purely compositional.

solution: only pure quotations. all branching should be optimization
Forth is too dirty, need a syntax preproc. is there a way to have "[ 1
+ ] [ 1 - ] ifte" as the base form, and translate it into "if 1 + else
1 - then" ?  should i move all macros that break the compositionality
to a different level?

* terminology/concept cleanup: define compilability in terms of the
  existence of a retraction.


* proper credit

MOORE: required tail recursion, multiple entry (fallthrough) and exit points.

VON THUN: program quotation + combinators, program = function
composition and constants are functions, monadic extensions: top of
stack is hidden.

DIGGINS: typed view + things you can't do (whole stack ops kill stack
threading)

FLATT: separate compilation + phase separation

* semantics of jumps?

they get in the way of FCL formulation. a jump could be a
non-terminating evaluation? is there a way to make this sound?


* closures versus quoted programs

note that quoted programs are not closures, since they are not
_specialized_. for closures you really need dynamic behaviour: at run
time, some values need to be fixed. something that could emulate
closures is the consing of an anonymous function with a state
atom. this operation is called 'curry' in kat. it could be combined with
a monadic state for more elaborate emulation of closures & objects.


Entry: i hate it when this happens
Date: Fri Aug 24 13:31:09 CEST 2007

i have something in my head, about the relationship between
compositional stack languages, monads, virtual machines for the
implementation of functional languages and the lambda calculus and
combinators.. but i can't quite express it due to lack of literacy..

argh..

again i have only a vague notion, without knowing where the crux lies..

so:

1. compositional language -> put partial evaluation and meta
programming in a simple framework. independent of set!

2. elaborate on the set's substructure.


--

so, 1. gives a framework on how to build a compiler. but without
stacks, composition isn't really useful. so the stacks are needed as a
tool to create general functions that can be applied in several
concrete settings. so these functions need to somehow be independent
of SOMETHING. that something is the way in which run time data is
organized.

need to find a better explanation..

something that hit me just now: a computation on a stack language
always involves saving some state, and recombining it later.. there
are 2 ways this happens:

* most functions leave the bottom of the stack intact
* 'DIP' leaves a part of the top of the stack intact

this is probably related to normal order and applicative order
reduction.



--

another problem.. why is it so hard to get this formulated correctly?

in my exposition about parsing words, i cannot really use "variable
abc" as a good example, because it really is not compositional code,

that needs to be disentangled..

conclusion is right though: in order to disentangle this system, it is
neccessary to remove some reflection. to 'unroll' the dependencies.

and the picture is really about dependencies. functional programming
is more about getting your graph free of cycles than anything
else.. maybe that's the reason for stressing the Y combinator: how
to introduce cycles, but not really.

there's another example in dan friedman's book: essentials of
programming languages. i can't find it now, but somewhere about
implementing an environment there is a need for a circular reference,
but he uses a trick to not have to do this..

maybe it's about how to make things static. to keep them from moving
so they can be looked at peacefully and quietly :))


--

basically:
- stack = environment (de Bruijn index)


Entry: so.. what's the most important thing now?
Date: Fri Aug 24 16:06:47 CEST 2007

a lot of ideas need some fermenting still. but there's one that's
quite clear: names cannot be created dynamically, because that kills the
representation as a declarative language.

so i need a preprocessing step that takes out all creation of new
names. this makes some things problematic. one of them is multiple
exit/entry points.

multiple entry points can be translated:

: foo a b c
: bar d e f ;

->

: foo a b c bar ;
: bar d e f ;

then at the point where '(label bar)' is assembled, the jump to bar
can be eliminated.

multiple exit points need to be translated to an else clause

: foo if a b c ; then d e f ;

->

: foo if a b c else d e f then ;

so it looks like it's not just names, also 'implicit names' or labels.
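a sketch of the entry-point rewrite in python (the (name, body, terminated) representation is invented for illustration, not how staapl stores code):

```python
# a definition ending without ';' falls through into the next one, so
# the rewrite makes the tail call explicit.
def explicit_fallthrough(defs):
    """defs: list of (name, body, terminated) triples, in code order."""
    out = []
    for i, (name, body, terminated) in enumerate(defs):
        if not terminated and i + 1 < len(defs):
            body = body + [defs[i + 1][0]]   # jump to the next entry point
        out.append((name, body))
    return out

defs = [('foo', ['a', 'b', 'c'], False),   # : foo a b c
        ('bar', ['d', 'e', 'f'], True)]    # : bar d e f ;
assert explicit_fallthrough(defs) == [('foo', ['a', 'b', 'c', 'bar']),
                                      ('bar', ['d', 'e', 'f'])]
```

later, when '(label bar)' is assembled right after the jump, the jump can be dropped again by the optimizer, as described above.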



Entry: environment and stack
Date: Sat Aug 25 09:14:26 CEST 2007

let's elaborate on this a bit more. the stack can be seen as
related to an environment, which is a way to implement
substitution in lambda expressions. to simplify, suppose we
have only unary lambdas.

(lambda (a)
  (lambda (b)
    ((+ a) b)))

this can be rewritten using de Bruijn indices (starting from 0) as

(lambda (lambda ((+ 1) 0)))

where the numbers refer to an index into the environment
array. this gives an easy way to represent a closure as a (compiled)
lambda expression, and an environment.
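a small python sketch of the conversion (the representation is invented for illustration; unlike the example above, the free '+' is pre-bound in the environment here, so it gets an index too instead of staying symbolic):

```python
# unary lambdas as nested tuples ('lambda', var, body); a variable
# becomes its index into the environment, counting outward from the
# innermost binder (starting at 0).
def debruijn(expr, env=()):
    if isinstance(expr, str):                      # variable reference
        return env.index(expr)
    if expr[0] == 'lambda':
        _, var, body = expr
        return ('lambda', debruijn(body, (var,) + env))
    f, x = expr                                    # application
    return (debruijn(f, env), debruijn(x, env))

src = ('lambda', 'a', ('lambda', 'b', (('+', 'a'), 'b')))
assert debruijn(src, ('+',)) == ('lambda', ('lambda', ((2, 1), 0)))
```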

maybe the missing ingredient in my understanding is the SK calculus?


Entry: paper again..
Date: Sat Aug 25 10:42:23 CEST 2007

in fact, i need to distinguish between syntax and semantics a bit
better. a compiler works on syntax, (a representation).

von thun has some text about this.. 

again, i'm amazed by how untyped you can be in scheme! i'm just
performing operations on lists, without ever having to clarify what
things are.. interpretation is a consequence of what functions you
apply on the symbols..

so, let's say that "working with symbols" is always untyped. they are
a universal tool of delayed semantics. maybe that's the idea behind
formal logic, right? by just specifying HOW to operate on symbols,
you never need to explain what you are actually doing.

Quite an adventure, trying to provide a model for the language and
compiler.

* read Flatt's paper about macros again
* logic and lambda calculus.
* monads and their relationship with compositional programs.
* a purrr module system + compositional language



http://zhurnal.net/ww/zw?StokesTheorem

Funny. I have that book on my shelf, and i tried to start reading it
on thursday. I guess it has a major truth. Once the necessary
structure is in place, the conclusions are often trivial. So all the
effort is in the creation of structure. Sounds like programming.

Try "Once things are clearly defined, the solution is at most a single
line.", "Write the language, and formulate your solution in it.", "Ask
the right question."



Entry: fully declarative and compositional
Date: Sat Aug 25 11:45:01 CEST 2007

declarative:

all names defined in a source file are to be known
before the body of the code is compiled. that way, a program is a
collection of definitions.

compositional:

make all branching constructs fit the compositional view by using
combinators only.

both are largely independent, but should lead to a better
representation. advantages:

D
- side-effect free macros
- detection of undefined words 
- possibility of modularization (later)

C
- correct optimizations in the light of branching


let's learn a lesson from the past.. i can't afford to break it
again. the changes that need to be made can be made without changing
the semantics so much that a radical rewrite of forth code is
necessary. all constructs used at this moment need to be preserved.

is there an incremental path? the following syntactic transformations
are necessary:

1. constant -> macro
2. variable -> macro
3. word definition -> macro
4. split a file into macro + code



Entry: monads in Joy
Date: Sat Aug 25 13:56:47 CEST 2007

http://permalink.gmane.org/gmane.comp.lang.concatenative/1506
http://citeseer.ist.psu.edu/wadler92essence.html

so that's what i've got to do today.

after reading manfred's comments, i think i need to read more of his
work before i attempt to re--invent his ideas.

the paper by wadler gives some relation between monads and cps. might
contain what i need to explain the relation between monads and
stacks. probably reaching the conclusion that stacks are monads.

let's see if i can learn something from this.

for each monad, provide bind and unit.

one complication is that functions in cat return a stack. let's see if
that makes things worse.

unit:  x -- M x
bind:  M  fn -- N

bind extracts values from the monad, applies fn to each of them, and
constructs a new monad from the output.

it's easier to use 'join', since 'map' is so trivial.

wait, is this really the case?

map is (a -> b) -> (M a -> M b)

from http://en.wikipedia.org/wiki/Monads_in_functional_programming

  (map f) m ≡ m >>= (\x -> return (f x))
  join m ≡ m >>= (\x -> x)

  m >>= f ≡ join ((map f) m)

this is a little different than what i've been talking about
before.. maybe it's best i try to formulate this in scheme first. See
brood/mtest.ss
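before going to scheme, the three identities can be checked in python for the list monad (a sketch; list comprehensions stand in for the haskell notation):

```python
# list monad: bind concatenates the per-element result lists, return
# wraps in a singleton list.
def bind(m, f):  return [y for x in m for y in f(x)]
def ret(x):      return [x]
def fmap(f, m):  return [f(x) for x in m]
def join(mm):    return [x for m in mm for x in m]

m, f = [1, 2, 3], lambda x: [x, 10 * x]
assert fmap(lambda x: x + 1, m) == bind(m, lambda x: ret(x + 1))  # map via bind
assert join([[1], [2, 3]]) == bind([[1], [2, 3]], lambda x: x)    # join via bind
assert bind(m, f) == join(fmap(f, m))                             # bind via map+join
```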

-- the misconceptions

M a -> (a -> M b) -> M b

does not mean the monads are different!

it merely means: unpack, process, repack


so what is 'map'. map really is map!

see the next entry for more interesting stuff about monads in scheme..







Entry: monads in scheme
Date: Sat Aug 25 16:18:59 CEST 2007


;; Monads in scheme.
(module mtest mzscheme

  ;; Monads are characterized by

  ;; - a type constructor M
  ;; - unit :: a -> M a
  ;; - bind :: M a -> (a -> M b) -> M b

  ;; In words: something that creates the type (ad-hoc in scheme),
  ;; something that puts a value into a monad (unit) and something
  ;; that takes values out of a monad, applies them to generate
  ;; several instances of the monad, and combines them into one.
  
  ;; Let's create some monads in scheme, using ad-hoc typing:
  ;; representation is not abstract, and there is no type check. Start
  ;; with the list monad.

  (define (unit-list a)
    (list a))
  
  (define (bind-list Ma a->Mb)
    (apply append (map a->Mb Ma)))


  ;; Using monads, functions need to be put into monadic form. Simply
  ;; wrapping them with 'unit' is usually enough.
  
  ;; (bind-list '(1 2 3) (lambda (x) (unit-list (+ x 1))))


  ;; So what is 'map' for the list monad? Haha! It's map!

  (define (map-list a->b)
    (lambda (Ma) (map a->b Ma)))

  

  )


so now introduce polymorphy. instead of storing stuff in a hash, it's
easier to just store a pointer to the monad structure in the record
for a certain monad, i.e. use single dispatch OO.

so i got a polymorphic bind, and a fairly decent interface that
abstracts away the polymorphism, so 'unit' and 'bind' for each monad
can operate on the representation only.

  (define-monad Mlist
    (lambda (a)        (list a))
    (lambda (Ma a->Mb) (apply append (map a->Mb Ma))))


so.. what i can i do with this?

maybe best to try to translate some examples from wadler's paper into
this mechanism.

or to define a 'do' macro. the Haskell code for the list monad

  do {x <- [1..n]; return (2*x)}

is a bit too mysterious.. let's try something simpler. the maybe monad.

wait, all my functions are unary.. damn. how to take multiple values
into a monad? can't really do that.. will need explicit currying.

this uses letM*
http://okmij.org/ftp/Scheme/monad-in-Scheme.html

(define-macro letM
  (lambda (binding expr)
    (apply
     (lambda (name-val)
       (apply (lambda (name initializer)
                `(>>= ,initializer (lambda (,name) ,expr)))
              name-val))
     binding)))

so i transform this to my code..

try this:

  a = do x <- [3..4]
         y <- [1..2]
         return (x, 42)

  a = [3..4] >>= (\x -> [1..2] >>= (\y -> return (x, 42)))


now

(define-syntax letM*
    (syntax-rules ()
      ((_ () expr) expr)
      ((_ ((n Mv) bindings ...) expr)
       (bind Mv
             (lambda (n)
               (letM* (bindings ...) expr))))))
  
leads to this:


(letM* ((a (Mlist '(1 2 3)))
        (b (Mlist '(10 20 30))))
    (unit Mlist (+ a b)))

#(struct:monad-instance
  #(struct:monad Mlist #<procedure> #<procedure> #<procedure>)
  (11 21 31 12 22 32 13 23 33))




wicked. the macro expansion gives

(bind
 (Mlist '(1 2 3))
 (lambda (a)
   (bind
    (Mlist '(10 20 30))
    (lambda (b) (unit Mlist (+ a b))))))



let's see if the type of 'return' can be inferred in a structure like
this. no. the return type of the entire expression is determined by
the return in the letM* block. this type is arbitrary and only
determined by the context of the expression, to which we have no
access in scheme. one possibility to fake this is using a dynamic
variable.

so, i guess it makes more sense to switch to map and join as basic
operations?

no. i ran into a problem with double wrapping of structures that
requires the 'join' operation to be aware of the wrapping. so i'm
going to revert the changes.

next exercise: the state monad.

i never really understood this. a state monad contains a function that
will return a value and a new state.

-- "return" produces the given value without changing the state.
return x = \s -> (x, s)
-- "bind" modifies transformer m so that it applies f to its result.
m >>= f = \r -> let (x, s) = m r in (f x) s
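the same state monad written out in python (a sketch mirroring the haskell above; 'get', 'put' and 'tick' are invented example actions, not part of any library):

```python
# state monad: a monadic value is a function from a state to a
# (value, new-state) pair.
def ret(x):
    return lambda s: (x, s)

def bind(m, f):
    def run(s):
        x, s2 = m(s)          # run m, threading the state
        return f(x)(s2)       # feed the value to f, run the result
    return run

# example actions on a counter state:
get = lambda s: (s, s)
put = lambda n: lambda s: (None, n)

# tick: return the current count and increment the state.
tick = bind(get, lambda x: bind(put(x + 1), lambda _: ret(x)))

# nothing happens until the composed computation is run on an initial state.
two_ticks = bind(tick, lambda a: bind(tick, lambda b: ret((a, b))))
assert two_ticks(0) == ((0, 1), 2)
```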


EDIT: see monad.ss


Entry: kat monads
Date: Sat Aug 25 21:35:12 CEST 2007

the problem seems to be that 'return' and 'bind' need to be formulated
in a way that properly deals with the stack. somehow it seems to get
in the way.

let's take a new look at it, modeling things on 'map'.

fmap   	  s.a->s.b Ma -- Mb
join	  MMa -- Ma
return	  a -- Ma
bind	  s.a->s.Mb Ma -- Mb

(bind     fmap join)

the thing which bothers me is 'map'. something is smelly about map in
joy, because of the stack "doing nothing".

it's strange that 'for-each' feels really natural, because it has
threaded state. but map somehow feels wrong..



Entry: for-each is left fold
Date: Sat Aug 25 21:45:30 CEST 2007

for-each is foldl is sort of 'universal iteration'.

	  '() '(1 2 3) (swons) for-each   ==  '(3 2 1)

foldr is more like 'universal recursion', and i don't have a direct
analog in kat. maybe i should create one like this:

          '() '(1 2 3) (cons)  foldr == '(1 2 3)
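in python terms (a sketch): reduce is foldl, and foldr can be had by folding the reversed list:

```python
from functools import reduce

# 'for-each' with swons is a left fold that reverses the list;
# foldr with cons rebuilds the list unchanged.
swons = lambda acc, x: [x] + acc          # swap-cons: push x onto acc
cons  = lambda x, acc: [x] + acc

def foldr(f, init, xs):
    return reduce(lambda acc, x: f(x, acc), reversed(xs), init)

assert reduce(swons, [1, 2, 3], []) == [3, 2, 1]   # for-each / foldl
assert foldr(cons, [], [1, 2, 3]) == [1, 2, 3]     # foldr
```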



Entry: state monad
Date: Sun Aug 26 13:08:41 CEST 2007

a state monad is a nice example of a computation. nothing 'happens' as
long as the monad is not executed explicitly by applying the value to
some initial state. i think this is a nice starting point to formalize
what i'm doing, since it's about the same principle: build a
composition that represents the compilation, and execute it on an
initial state.

so, really, monads are a way to formulate any computation as a
function composition. doesn't that sound familiar?

the thing to find out is 

* how my very specialized way of state passing fits in the general
  monad picture.

* why does 'map' feel so strange in Joy/KAT ?

* what is a continuation in KAT?

the last one i can answer, i think. it's a function that takes a
stack, and represents the rest of the computation. so the continuation
of 'b' in [abcd] is just [cd]. i've added call/cc to base.ss


let's re-read von thun's comments


Entry: closures & stacks
Date: Mon Aug 27 14:49:10 CEST 2007

something to think about: a compositional language can have first
class functions without having first class closures, and without this
leading to any kind of inconsistency. like 'downward closures only'.

this brings me back to linear vs. non--linear.

a key observation is that linear data structures are allowed to refer
to non--linear ones, as long as the non--linear collector can traverse
the linear data tree (acyclic graph in the case we work with reference
counts as an optimization). but non--linear structures are NOT allowed
to refer to linear structures. (because otherwise they would not be
able to be managed by the linear collector).

this makes the non--linear collector trivially pre--emptable by linear
programs. PROVE THIS!



Entry: linear memory management
Date: Mon Aug 27 17:57:00 CEST 2007


something to think about is how to embed a linear language in scheme as a
model. as long as its primitives never CONS, this should work.

i'm trying to formulate a machine that can express the memory
management part of a linear language, if it is given a set of
primitive functions. see linear.ss

this is an attempt to make poke.ss work, but from a higher level of
abstraction.

something i did wrong on the first attempt is to change the tree
structure WHILE still using the old addressing mode. permutation of
register contents needs reference addressing, so my macros are wrong.

this means i need a different representation of REFERENCES.

let's say a reference is:
- a pointer to a cons cell
- #t for CAR and #f for CDR


funny. i'm running into a problem numbering binary trees. the most
visually pleasing numbering is breadth first

1
2 3
4 5 6 7
8 9 10 11 12 13 14 15

this corresponds to the binary encoding:

   1abcd...

where a is the first choice, b the second, etc...
the one i chose intuitively was

  1...dcba

which is not so handy, but is more efficient to implement when the
labeling doesn't really matter that much.


ok i got it working. i have a tree permutation 'engine' which is
accessed by numerical node addresses. now what does this buy me? a
simple way to talk about embedding linear trees.

in practice, some of the nodes are constant, and are better put in
registers.


Entry: binary trees
Date: Mon Aug 27 21:51:58 CEST 2007


still not 100% correct.. i'm losing nodes.

ok. i'm making a mess of it, but i think i can conclude the following:

1. it is possible to use a tree as the data universe
2. normal forth operations can be written as binary and ternary permutations on a tree
3. such a tree is conveniently addressed numerically

what i'm about to do is to:

- create an embedding of normal forth operations in a single tree, by:
  * fixing the positions of the stacks
  * associating each operation to a permutation

- find a way to efficiently generate code for these operations, with
  the possibility of mapping some fixed nodes to registers.


AHA!

one pitfall i knew, and i ran right into it. there's one operation which
is not allowed: if R points to a cons cell, it is not allowed to
swap the contents of R with CAR or CDR of that cell, because this
creates a circular link, effectively losing the cell.

more generally, it is not allowed to exchange R1 and R2 if they are in
the same subtree.

baker's machine contains no operations that can lead to such
permutations. it only talks about exchanging the contents of registers
with cons cells. this is different.

i'm trying to write the permutation for '>r', written as (D . R)

the following sequence of permutations is legal

((d . D) . R) -> ((d . R) . D) -> (D . (d . R))

which is (5 3) followed by (2 3). can this be written as a single
cycle (2 3 5) ? one would say yes..

so i guess i had a bug? since it created a circular ref in my previous
implementation.

now i can get (2 3 5) to work, but (5 3 2) doesn't!

i think i don't understand something essential here..

this is getting interesting!

i think i see the problem now. one is that my permutations are
inverted, and two is that (2 3 5) is not legal, but (5 3 2) is.

how to distinguish legal from illegal permutations?

and the inverse of (5 3 2) is not (2 3 5) but (2 3 7)

it looks like this encoding of the nodes is not very useful for tree
permutations.


Entry: legal permutations
Date: Mon Aug 27 23:39:23 CEST 2007


it looks like a more interesting approach is to start with operations
that are legal and invertible, and find their closure.

the difference with baker's machine is that i'm trying to use only one
root. hmm.. there has to be a way to see if a permutation is legal..

why is (2 3 5) not legal. because 5 gets the value of 2, which points
to 5. so a condition is that a register x cannot receive the contents
of a register y if x is in a subtree of y.

in (5 3 2) none such assignment happens. 
- 5 is not a subtree of 3
- 2 is not a subtree of 2
- 3 is not a subtree of 5

'subtree of' can be computed by comparing box addresses

    [1]
   [2|3]
[4|5] [6|7]

        [1]
      [10|11]
[100|101] [110|111]

a is a subtree of b if b matches the head of a.

this way, no circular refs can be introduced. instead of thinking
about cons cells, think of binary trees. it indeed does not make sense
to swap nodes if one node is a subtree of another node.
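the prefix test, sketched in python on heap-numbered nodes:

```python
# node a lies in the subtree rooted at b iff b's binary form is a
# prefix of a's binary form.
def is_subtree(a, b):
    return bin(a)[2:].startswith(bin(b)[2:])

# a swap may not move contents between x and y when one lies inside
# the other's subtree (that would create a circular reference).
def legal_swap(x, y):
    return not is_subtree(x, y) and not is_subtree(y, x)

assert is_subtree(5, 2)        # 101 starts with 10: 5 is below 2
assert not is_subtree(5, 3)    # 101 does not start with 11
assert not legal_swap(2, 5)    # nested: would lose cells
assert legal_swap(4, 6)        # disjoint subtrees: fine
```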

what about enumerating all legal binary permutations on an infinite
binary tree?


()           identity

(2 3)

(2 6)  (2 7)
(3 4)  (3 5)

(4 6)  (4 7)
(5 6)  (5 7)

(2 12) (2 13) (2 14) (2 15)
(3 8)  (3 9)  (3 10) (3 11)
(4 12) (4 13) (4 14) (4 15)
(5 12) (5 13) (5 14) (5 15)
(6 8)  (6 9)  (6 10) (6 11)
...



back from tree rotations, which are not general enough...

in binary

()

(10 11)

(10 110,111)  (11 100,101)

(100,101 110,111)


back to numbers

level (bits)
1                /
2                (2 3)
3                (2 6,7) (3 4,5)
                 (4 5,6,7) (5 6,7) (6 7)
4                (2 12,13,14,15) (3 8,9,10,11)
                 (5 8,9,12,13,14,15)

it's quite hard to specify without exclusion statements.. 

but i guess i got what i was looking for: limited to only binary
permutations, the legal ones are easy to characterize.

what about using multiple coordinates, and then embedding them in a
numeration? It is always possible to encode an n-tuple of natural
numbers as a single one by interleaving the bits.

a legal binary permutation from node A and node B (A < B) can be
written as the tuple (A - 2, s, d) where s denotes the same level
trees and d the depth from it. this is really clumsy and doesn't work..

it looks like what i am looking for is a primitive dup and drop. the
reality is, these are not primitive!


Entry: tree rotations 
Date: Tue Aug 28 00:03:47 CEST 2007

can i work with just tree rotations? yes. moving an element from one
stack to another is a tree rotation. the essence of a tree rotation is:

- reversal of P -> Q  to Q -> P
- movement of one of Q's subtrees to P

so a rotation is parameterized by 2 adjacent nodes P -> Q, and the left...

wait!

it's not a rotation, since the subtree that moves is the one in between.

it is a rotation if the stacks are encoded as

  ((D . d0) . (r0 . R))

then a rotation is simply

  (D . (d0 . (r0 . R)))

trees which represent associative operations have a value which is
invariant under tree rotations.
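a sketch with python tuples standing in for cons cells -- the rotation itself, plus the value-invariance for an associative operation:

```python
def rotate(tree):
    """((D . d0) . R)  ->  (D . (d0 . R)) : a rotation at the root."""
    (D, d0), R = tree
    return (D, (d0, R))

def tree_sum(t):
    """Interpret leaves as numbers and internal nodes as +, an
    associative operation, so the value ignores the tree shape."""
    if isinstance(t, tuple):
        l, r = t
        return tree_sum(l) + tree_sum(r)
    return t
```

e.g. rotate(((1, 2), (3, 4))) gives (1, (2, (3, 4))), and tree_sum is the same before and after.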

is this helpful at all, or am i moving away from my point? with 2
stacks, a data stack and a free stack, motion can be implemented by
rotations. 

this is not general enough.. i have no need to preserve ordering.


Entry: different primitives
Date: Tue Aug 28 00:55:17 CEST 2007

so with a 2-stack system (D . F) with D rooted at 2 and F at 3, the
primitives are:

D  = 2
D+ = 5
D0 = 4
D1 = 10

F = 3



the free list needs to be flattened. this can be done when reserving a
new cell or when dropping a data structure.

the latter is probably best since:
* it is more predictable: deleting a large structure takes time
* all references to externally collected objects can be removed

so, i do have a need for rotation! if the CAR of the free list is not
NULL, rotate the free list, then DROP the newly exposed top and DROP
the part we rotated to the stack.

: >free  (D+ F) (D F) ;
: swap   (D0 D1) ;
: free>  (D F) (D+ F) ;   \ [a.k.a. nil / save]

: drop   null? if >free ; then rotate drop drop ;
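roughly what that 'drop' does, modeled in python (the representation and names are mine, not the actual implementation): cells are tuples, atoms are leaves, and dropping deconstructs the whole structure onto a flat free pool:

```python
def drop(x, free):
    """Drop a linear structure: deconstruct it and move everything it
    owns onto the free list, keeping the free list flat."""
    todo = [x]
    while todo:
        c = todo.pop()
        if isinstance(c, tuple):      # a cons cell: take it apart
            todo.extend(c)            # (the 'rotate drop drop' case)
        elif c is not None:           # an atom: just move it ('>free')
            free.append(c)
    return free
```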



like baker remarks, a lot of operations can be coded so they avoid
copying of lists. i have a lot of this code in PF already.. 

the moral of the story is: 

* this linear stuff is quite nice to build a language on top of, but
  you need a decent layer below it to create a proper set of optimized
  primitives to make it work efficiently.

* using a single tree works just fine, but is probably not necessary
  if the basic structure (like where the D, R and F stacks are)
  doesn't change.

* only use binary permutation of disjoint trees. disjoint trees are
  easier to spot for binary permutations.

* numbering trees in 1abc... fashion works well, and is easy for
  drawing diagrams.

* drop needs to deconstruct its argument.


the hash consing thing in this paper i don't get
http://home.pipeline.com/~hbaker1/LinearLisp.html

but still... about ternary permutations: they are easier to
understand, because the rotation i'd like to perform has to be
factored in a non-intuitive way using only binary ones..

maybe it's just the rotation operation that's difficult to express
that way? instead of focusing on the movement of the data stack's
first CONS cell, it's easier to focus on the movement of the cell we
want to get rid of. so in the picture painted above, the operation
'rotate' is actually 'uncons' and would be (9 5) (4 5)

i think that settles most of the questions. the rest is fairly
straightforward to fill in.



Entry: next
Date: Tue Aug 28 14:23:06 CEST 2007

after this small detour about trees, NEXT on the list:

* clean up syntax preprocessing & purely functional macros

* investigate on HOF syntax for Purrr18

* determine if Purrr is a valid project, or if it's best to aim for
  Poke.


Entry: ANS Forth - poke - PF
Date: Tue Aug 28 14:26:07 CEST 2007


the last question is quite an important one.. if i'm planning to write
a language for education, do i really want ANS Forth? the only reason
would be to have something 'standard', but for what reason? better
documentation?

i never used ANS Forth, and the more i get into this language
simplicity thing, the more i start to dislike it. i think i have all
the elements for a decent linear VM ala PF. it should fit on a pic18.

and.. a cleaner language is easier to teach. moreover, a poke language
can be made safe.

is it worth stopping somewhere in the middle, to use a slightly more
optimal language instead of one based on CONS cells?

this is not something to decide in an instant, but i think life is
already complicated enough without filling it with problems created by
weirdness in ANS that i don't use.. Forth is dead. long live KAT &
PURRR :)


Entry: Haskell
Date: Wed Aug 29 13:15:22 CEST 2007

just watched Simon Peyton Jones' OSCON 2007 tutorial, which clarified
a lot of things. he talked mostly about type constructors, type
classes, and the IO monad.


* IO a   is   world -> (world, a)

* and a type class is implemented as a record of functions that
  'travels independently' from values, i.e. dispatch based on return
  type.

* type constructors are also used for destructuring. this generalizes
  the 'list' constructor, and tuples (which are not constructors i
  think..)






Entry: hash consing
Date: Tue Aug 28 20:23:22 CEST 2007

so what's that all about. see:

http://home.pipeline.com/~hbaker1/LinearLisp.html
Reconstituting Trees from Fresh Frozen Concentrate

first, that section is not about hash consing, but about something
different: "our machine will be as fast as a machine based on hash
consing"

i don't get it..


Entry: compositional and?
Date: Wed Aug 29 14:30:25 CEST 2007

i was wondering what the deal is with compositional view. it allows a
simple framework for metaprogramming, but that's all.. i made this a
bit more clear in the paper.



Entry: curry-howard
Date: Wed Aug 29 16:42:46 CEST 2007

quite remarkable. i'm running into cases where operations from the
code i thought were merely a hack, like the 'snarf' operation, turn
out to be quite important for a monadic formulation of a stack
language.

in other words: i'm extracting some mathematical structure by naming
the types of all the transformations that are present in the code. i
think i'm just going to do this exhaustively..

in other words, by hacking around semi-blindly, following just an
ideal of 'elegance', i end up with a nice description of what i'm
doing in a categorical sense.


Entry: arrows
Date: Fri Aug 31 00:29:36 CEST 2007

reading 'programming with arrows' by hughes.
this 'dip' business is really arrows..

just rewrote brood.tex to give a categorical relationship between a
TUPLE language and a STACK language.

what remains is to explain their difference...

it's been quite a day.. what did i learn really?

given a tuple language, mapping it to a stack language makes explicit
the need for a run time 'cons' if the tuple language can create
closures.

ok, i need to go over this again since i lost direction a bit..

the CTL -> CSL bit is good though, since it reflects a 'real' part of
brood, namely the relationship between scheme and kat.

I'm still not really satisfied with the explanation. I probably need
some more time thinking about closures and dynamic memory: how to:

- combine a low level language with just stacks and function
  compositions, both implemented as vectors, with a linear memory
  model that supports closures.

- how to add 'constant trees' to a linear memory tree.

- what about trees and reference counts.

Also, i need to read Hughes paper about arrows.

what about this vague rambling:

- data stack = future data
- return stack = future code


Entry: stacks and continuations
Date: Fri Aug 31 18:19:05 CEST 2007

from wikipedia
http://en.wikipedia.org/wiki/Continuation

Christopher Strachey, Christopher F. Wadsworth and John C. Reynolds
brought the term continuation into prominence in their work in the
field of denotational semantics that makes extensive use of
continuations to allow sequential programs to be analysed in terms of
functional programming semantics.


for the linear memory case, i need to implement:
- closures (== cons)
- continuations (== a stack copy)

to do this efficiently, i need baker's approach to linear data
structures, which can be implemented using reference counts because
they cannot be circular.

something tells me i'm chasing something really obvious.. i guess the
next thing to tackle is to describe the linear language, and write a C
model for it. i.e. to implement POKE.



Entry: CSL vs CTL
Date: Fri Aug 31 22:03:26 CEST 2007

i talked myself into a pit.. what about "1 2 3 +". how can this be
seen as a CTL? only by making + operate on more than 2--tuples. this
means all arrows T_i -> T_j are also in T_{i+n} -> T_{j+n}


Entry: linear
Date: Fri Aug 31 22:29:04 CEST 2007

the next thing to do is to create closures without garbage
collection. this would make PF interesting.

so the deal is: tree structured data allows for 1--ref structures
which can be optimized using reference counts. i guess this is the
hash consing business.

hash consing =
- a table of CONS cells
- on (cons a b) -> check if the cell is already in the table: if so,
  increment its refcount, else allocate a new cell

so that should be able to speed it up.. it's a bit smelly though.
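a minimal sketch of that scheme (hypothetical python, not baker's actual formulation):

```python
class HashCons:
    """Hash-consed cells: cons(a, b) returns the existing cell for
    (a, b) with its refcount bumped, or allocates a new one."""
    def __init__(self):
        self.table = {}      # (a, b) -> cell
        self.refcount = {}   # (a, b) -> count

    def cons(self, a, b):
        key = (a, b)
        cell = self.table.get(key)
        if cell is None:
            cell = key               # allocate a fresh cell
            self.table[key] = cell
            self.refcount[key] = 1
        else:
            self.refcount[key] += 1  # share the existing cell
        return cell
```

the effect is that structurally equal cells are physically shared, so equality checks become pointer comparisons.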


Entry: poke
Date: Sat Sep  1 12:28:16 CEST 2007


yep.. time to get practical. this linear thingy is the most
problematic one.. i guess the thing i need to investigate is:

- write a linear memory manager in terms of a low-level set of
  operations (forth machine)

- write the linear machine's interpreter in itself.


i'd like to take a different approach with this: first write it in a
testable highlevel setting, then just map it to lowlevel code.

remarks:

 * by making the code storage nonlinear, a large problem is already
   solved: the return stack does not need to copy continuations. the
   return stack is a program == a primitive program | list of
   programs.

 * CDR coding. all code in flash is stored as CDR-linked lists, but
   encoded such that they can be represented as vectors. this works
   very well with the remark above. it looks like this solves my
   earlier problem of vectors vs lists.

 * no branches. only combinators.

 * types: - primitive
          - integer
          - cdr-coded nonlinear cell
          - ram cell
  
 * type encoding: since there are only 4 types, 3 of which are memory
   addresses, it can be solved with a memory map, and N-2 bit integers.
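purely as illustration -- the address ranges below are invented, not a real pic18 map -- a 16-bit word could be decoded like this, with the top quarter of the map giving the N-2 = 14 bit integers:

```python
def decode(word):
    """Decode a 16-bit cell word via a (hypothetical) memory map:
    three address regions for the three pointer-like types, and the
    top quarter of the address space carrying 14-bit integers."""
    if word >= 0xC000:
        return ('integer', word & 0x3FFF)   # N-2 bits of payload
    if word < 0x4000:
        return ('primitive', word)          # flash code address
    if word < 0x8000:
        return ('flash-cell', word)         # cdr-coded nonlinear cell
    return ('ram-cell', word)               # ram cell
```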
   

there's one important part i forgot: VARIABLES
those don't really fit in the picture..


Entry: partial application vs. curry
Date: Sat Sep  1 23:35:14 CEST 2007


curry:  ((a,b)->c) -> (a->(b->c))

then partial application is e.g. curry (+) 123

so maybe i should follow christopher in:
http://lambda-the-ultimate.org/node/2266

and call what i'm calling curry 'papply'

and apparently, partial evaluation != partial application. so how do
they differ?
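in a python sketch, the naming distinction looks like:

```python
def curry(f):
    """((a, b) -> c)  ->  (a -> (b -> c))"""
    return lambda a: lambda b: f(a, b)

def papply(f, a):
    """Partial application: fix the first argument of f."""
    return lambda b: f(a, b)
```

so papply(f, a) and curry(f)(a) compute the same thing; currying is the type transformation, partial application is supplying one argument. partial evaluation is something else again: it specializes the *body* of a program given part of its input, producing a residual program, while partial application just fixes arguments without transforming any code.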


Entry: XY and stack/queue
Date: Sat Sep  1 23:53:44 CEST 2007

the [d r] thing i described about continuations yesterday is made explicit here:
http://www.nsl.com/k/xy/xy.htm

XY by Stevan Apter


Entry: goals
Date: Sun Sep  2 12:44:01 CEST 2007

the reason brood.tex doesn't work well is that i'm not setting goals
for the project. i started wandering when talking about categories...

so the goals are:

- create a language based on the ideas behind Forth, which is

  * easily mapped to a target (i.e. has very lowlevel elements)

  * less resistant to static analysis than Forth.

  * requires small resources in base form (i.e. just some stacks)

  * contains some highlevel constructs that can be easily optimized,
    i.e quoted programs ala Joy.

  * serves as an implementation language for a CONS based
    language. either a linear or nonlinear one.


Entry: references
Date: Sun Sep  2 13:15:59 CEST 2007

time to collect some references.


Entry: language levels
Date: Sun Sep  2 13:46:47 CEST 2007


- macro assembler / virtual forth machine: purely static. macros do
  not rely on any run time kernel support.

- macros with run--time support: some constructs that can not be
  translated to straight assembler require run time support code. for
  example indirect memory access using '@' and '!'

- dynamic memory: cons


what i'm guessing is that i need to get my dependencies straight. this
means:

- get rid of side--effects in macros (all names are identified in first pass)
- create a purely compositional base language with 'required optimization'

so where to start? it's a big job, but really needs to be done before
i start implementing linear CONS.

it looks like the end result here is going to be quite different from
what i have now. i'm basically moving from a linear to a block
structured language.


Entry: block structure
Date: Sun Sep  2 13:54:45 CEST 2007

the real question is: should i implement the block structured language
on top of the linear one, or provide a set of macros to translate
forth into a block structured language, which is then transformed back
into a linear one?

it seems reasonable to keep the forth layer as the lowest one, and
translate into it. so basically i need a lexer with list support.

time to factor out the basic problems:
* stream.ss
* stream-match.ss



Entry: lazy lists
Date: Sun Sep  2 16:35:20 CEST 2007

added stream.ss and corresponding matcher.

funny how reverse accumulation is no longer needed when you use lazy
lists!

maybe i should propagate this to the asm buffer? there is one problem
with the asm buffer though: it is used as a stack.

anyways.. i can make the lexer lazy. DONE. it's simpler now.



Entry: on lazy lists
Date: Wed Sep  5 17:29:30 CEST 2007

let's see if i can say something intelligent about this.. what i
notice is that streams make you avoid the following pattern:

* read list, process, accumulate as push.
* reverse the list

lexing/parsing fits this shoe nicely.
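in python terms (generators standing in for lazy lists; the 'lexer' here is a toy that just upcases characters):

```python
def lex_eager(chars):
    """Accumulate-and-reverse style: push tokens onto the front of a
    cons list while reading, then reverse at the end."""
    acc = None
    for c in chars:
        acc = (c.upper(), acc)          # push
    out = []
    while acc:                          # walk the reversed list...
        tok, acc = acc
        out.append(tok)
    out.reverse()                       # ...and restore input order
    return out

def lex_lazy(chars):
    """Lazy style: yield tokens as they are produced. no accumulation,
    no reversal; the consumer drives the computation."""
    for c in chars:
        yield c.upper()
```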

so.. are streams processes?

instead of using '@cons', one could just as well write:

- read
- process
- write

so what is the difference? it looks like the lazy list approach is
less general, since it has only one output: multiple outputs need to
be handled using multiple lists, while the process view uses one
process and multiple streams.

and yes, these are processes. since the non-evaluated tails act as
continuations. every '@cons' should be read as write+block.

so, what about the asm?  it still needs to be used as a stack,
however, multiple passes can now be done lazily.





Entry: onward
Date: Wed Sep  5 22:27:58 CEST 2007

i keep getting distracted.. i got some work to do!

first one is elimination of side effects in macros: all side effects
in the brood application are to be cache only. this is an important
part that will open the road for more interesting changes, hopefully
leading to a fully compositional lowlevel language with a module
system.


Entry: monads and map
Date: Wed Sep  5 22:40:42 CEST 2007

so.. what about writing a macro for this 'generalized map - not quite
a real monad - collect results in a list' pattern?

i guess this is just unfold..

no it's not..

got this macro + usage:

  (define-syntax for-collect
    (syntax-rules ()
      ((_ state-bindings
          terminate-expr
          result-expr
          state-update-exprs)
       (let next ((l '())
                  . state-bindings)
         (if terminate-expr
             (reverse! l)
             (next (cons result-expr l)
                   . state-update-exprs))))))

  (define (@unfold-iterative stream)
    (for-collect
     ((s stream))
     (@null? s)
     (@car s)
     ((@cdr s))))
     
      

but it looks just ugly, so i'm going to forget about it.. i guess, if
this pattern shows up in code, it means i'm not using a proper hof.

what about writing it as a hof instead of a macro?

i think i'm getting a bit tired.. just reinvented unfold.. no, it's
unfold*
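written as a plain higher-order function instead of a macro, the pattern is just this (a sketch; argument names mine):

```python
def unfold(done, head, next_state, state):
    """Collect head(state) for successive states until done(state):
    the 'generalized map / collect results in a list' pattern."""
    out = []
    while not done(state):
        out.append(head(state))
        state = next_state(state)
    return out
```

e.g. walking a list-as-state, like @unfold-iterative does with a stream: unfold(lambda s: not s, lambda s: s[0], lambda s: s[1:], [1, 2]) gives [1, 2].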



Entry: linear parser
Date: Thu Sep  6 00:25:26 CEST 2007


the parser can definitely be moved to streams. the fact that it
contains syntax streams is not really relevant to the structure of the
algorithms.. for example: i'm using 'match' in forth.ss

it changes a lot: the prototype of the parsers now is @stx ->
@stx. but the code should be a lot easier. 

due to the linearity of forth / compositional code, writing a macro
transformer as a stream processor instead of a tree rewriter makes a
lot of sense actually..

the preprocessor will translate a token stream -> s-expressions.

occurrences of syntax-case can be replaced by @match, which is exactly
what i avoided in a previous attempt.. maybe i should just create a
@syntax-case macro that's similar to the @match macro, taking
partially unrolled syntax streams.

hmm.. pure syntax-case is a bit clumsy.. but the 'no rest' parser
macro i'm using does fit pretty well.




something i've been talking about before:

syntax-case: matcher for compilation: merge 2 namespaces (pattern var + template)
match:       matcher for execution: only a single lexical namespace

i don't know how to make the pattern more explicit, but it boils down
to something like this: if you're using match together with
quasiquote, you're actually COMPILING something, not computing
something.

in that case, pattern matching using syntax-case might be more
appropriate, even if you're not using scheme macros, because of the
merging of template and pattern namespaces. (which have to be mixed
explicitly using quasiquoting).

actually: syntax-case merges 3 namespaces:
- pattern
- template
- transformer namespace







Entry: SRFI-40
Date: Thu Sep  6 10:16:17 CEST 2007

it's been fun, but time to move to a standard implementation:
http://srfi.schemers.org/srfi-40/srfi-40.html

it would indeed be strange if this were not somehow standardized..

(require (lib "40.ss" "srfi"))

but, 40 has problems:
http://groups.google.com/group/plt-scheme/browse_thread/thread/637cc74047a7ada9

anyway: thing to remember: streams can be ODD or EVEN
http://citeseer.ist.psu.edu/102172.html

i'm using EVEN style: (delay (cons a b)) instead of (cons a (delay b))
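the difference shows up with explicit thunks (python sketch; 'item' records when an element actually gets computed):

```python
forced = []

def item(n):
    forced.append(n)       # record when an element is computed
    return n

# ODD style: (cons a (delay b)) -- the head is computed the moment
# the cell is built, before anyone asks for it.
odd = (item(1), lambda: (item(2), lambda: None))

# EVEN style: (delay (cons a b)) -- nothing at all is computed until
# the cell itself is forced.
even = lambda: (item(3), lambda: (item(4), lambda: None))
```

building both streams only evaluates item(1) (the odd head); the even stream stays completely cold until forced.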


so what exactly is the problem with
http://srfi.schemers.org/srfi-45/srfi-45.html

?

it can be seen in @filter, as explained in the srfi-45 document:
a sequence of

(delay
   (force
      (delay
         (force

is not tail recursive. this is because 'force' cannot be tail
recursive: it needs to evaluate, and cache the value before
returning. srfi-45 solves this by introducing 'lazy'

easy to see in:
(define (loop) (delay (force (loop))))
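the same issue and the srfi-45 fix can be mimicked with explicit promises (hypothetical python -- python has no tail calls at all, so 'force' has to loop):

```python
class Delay:
    """An unevaluated promise wrapping a thunk."""
    def __init__(self, thunk):
        self.thunk = thunk

def force(p):
    """Iterative force: keep unwrapping promises in a loop instead of
    recursing, so a delay/force chain of any depth runs in constant
    stack -- the essence of srfi-45's 'lazy'."""
    while isinstance(p, Delay):
        p = p.thunk()
    return p

def chain(n, value):
    """Build n nested promises around value, like the chain that
    @filter can build up."""
    p = value
    for _ in range(n):
        p = Delay(lambda p=p: p)
    return p
```

a naive recursive force would blow the stack on chain(100000, ...); the loop doesn't.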


ok. so i'm sticking with my own lazy stream implementation. most of it
should be fairly easy to replace with some decent standard library
later. i don't think i'm doing anything special..


Entry: linear parser begin
Date: Thu Sep  6 15:36:48 CEST 2007

- all parsers are @stx -> @stx
- parser-rules: easily adapted (used by predicates->parsers)
- named-parsers


i'm forgetting something.. a parser needs to distinguish between
'done' and 'todo': the driver will stitch the stream back
together. otherwise each parser needs to explicitly invoke the driver
routine as the second argument to '@append'.

the reason we use a driver is to make each individual parser agnostic
of its environment..

concretely: the current implementation can be largely reused, but
list tails need to be replaced by streams.

then the remaining question is: does a primitive parser return 2
streams, or a list and a stream?

again:
- if parser does 1 expansion, it needs to return 2 streams.
- if it does multiple, it suffices to return only one

it's best to let the driver decide, so the first one is more
general. making both of them streams keeps the interface simpler.

looks like the only thing this needs is a proper syntax-case style
syntax stream matcher so i'm not jiggling too many syntax<->atom
conversions. need to think about that a bit better, to see what the
prototype needs to be.


Entry: parser rewrite
Date: Thu Sep  6 22:46:20 CEST 2007

the end is near.. code seems to simplify a lot.

need to write 2 more generic parsers:

- delimited
- nested


interesting.. this stream business is deeper than i thought. i do run
into a problem though: (values processed rest) what if rest is only
determined if processed is completely evaluated?

by moving the 'append' to somewhere else, the forcing order can no
longer be trusted. does this really matter?? i need a break.

ok.. i got it worked out as '@split' which returns 2 values: the first
is a stream before a delimiting value, and the second is the stream
after.

the code i have now needs a certain evaluation order. i can make it
independent of that by forcing until the rest-stream becomes true.

that works. also got @chunk-prefixed working: which separates a
prefixed stream into a stream of prefixed streams.
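a sketch of that '@split' in python (names invented): the part before the delimiter is forced up front, which pins down the evaluation order; the rest stays lazy:

```python
def split(it, delim):
    """Return (before, rest): the elements before the first occurrence
    of delim as a list, and the untouched iterator positioned just
    after it.  Forcing 'before' eagerly makes the evaluation order
    independent of how the two results are consumed."""
    it = iter(it)
    before = []
    for x in it:
        if x == delim:
            break
        before.append(x)
    return before, it
```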


Entry: macro mode
Date: Fri Sep  7 13:24:38 CEST 2007

i found out that ';' can just as well be used in macro mode for 'jump
past end', if macro mode can only contain prefixed definitions.  this
will bring multiple exit points to macros. can change this later.

anyways.. all parsers are now token (syntax) stream processors. it
should be really straightforward from here to:

- separate macro and code definitions
- perform separate compilation for forth files (macro definitions)


about the use of ';' in macros: this probably needs some dynamic
variable because of context: a macro representing a forth file != a
normal macro. in a forth file ';' means return to sender, in a macro
it means jump past end.. maybe i should avoid this?



Entry: bored
Date: Wed Sep 12 21:28:51 CEST 2007


i had some days off writing an article for folly, and my mind is
wandering away from the lowlevel forth stuff.. talking to a friend
yesterday i realized i need something different. i'm getting
stuck. let's rehash the problems i'm facing right now:


- i need pure functional macros: no side effects except hidden in
  cache / memoization. this requires a true code dependency
  system. doing this half-assed makes no sense, so i should at least
  have something like mzscheme, possibly piggy backed on top of
  it. that however is not easy, since this will probably mess up my
  namespace stuff. so i'm a bit stuck because i can somehow foresee the
  problems that are coming after i fix up my macros.

- i want to give up on portable ANS forth idea, and design a safe
  PF-like linear language. the stumbling block there is variables,
  since it's incompatible with the linear idea. at least, doing it
  using references to cells. maybe i can use some trick here? can
  variables be managed externally so they never need to be deleted?
  can they be seen as data roots like machine registers? something is
  not right in my intuition here..


EDIT: Mon Oct 8 21:06:17 CEST 2007 

Pure functional macros work now, and make things a lot better, but
this linear language variable thing i'm still quite puzzled by.




Entry: sticking to forth as basis
Date: Sat Sep 15 05:07:40 CEST 2007

reading http://lambda-the-ultimate.org/node/2452 forth in the news


i'm more and more convinced that forth should be the lowest level, not
some block structured higher level construct, which would require more
elaborate optimizations. it's best to have the pure control structs
(i.e. for next) as direct macros, and implement the higher code block
quoting constructs in terms of them. 

forth has this way with return stack juggling that's very powerful for
making new control structures. this is hard to do efficiently when you
tuck it all away in combinators..



Entry: brood paper
Date: Sun Sep 16 14:47:05 CEST 2007

actually.. it would be interesting to go over my ramblings and make a
list of things i got really wrong, or saw too simplistically. then see
what solution i got or how i came to understand the issues.



- monads are not just hidden top of stack items
- the relationship between closures and CONS
- syntax-rules and composition
- pattern matching and algebraic types
- lazy lists vs. generators: lists remain 'connected'
- 'natural' compiler structure: scoping rules, quasiquoting and syntax-case (3 levels)
- more specifically: quasiquote vs syntax-case: when to use macros? is it code or data?
- looping and boundary conditions (i.e. image processing)
- cdr coding and lists as arrays
- importance of side-effect free 'loading' + relation to phase separation.


Entry: linear structures, variables and cycles
Date: Mon Sep 17 16:06:58 CEST 2007

in a linear structure (tree or acyclic graph if hash consing is used)
cycles are not possible. so how do you represent datastructures that
have some form of self-reference?

the thing we're looking for here is something akin to the Y
combinator: instead of having a function refer to itself directly, a
separate combinator is applied to the function to "tie the knot".
let's start with:

http://scienceblogs.com/goodmath/2006/08/why_oh_why_y.php

i'll try to put it in my own words, see next post. the link above has
an interesting comment on self-application. also, the wikipedia page
has some interesting links:

http://en.wikipedia.org/wiki/Y_combinator

so how do you apply this trick to data structures? my guess would be
to start from data structures in the lambda calculus, and then make
things more concrete.


Entry: Y combinator
Date: Mon Sep 17 18:55:07 CEST 2007

a fixed point p of the expression F satisfies F(p) = p. the Y
combinator expresses p in terms of F as p = Y F. combining the two we
get:

     F (Y F) = (Y F)

simply expanding this gives exactly what we want:

     Y F = F (Y F) = F (F (Y F)) = F (F (F (...)))

where the dots represent an infinite sequence of self applications.
that's all folks. in order to implement useful recursion, simply write
the 'body' F, and Y will take care of the rest.

let's make this a bit more intuitive. suppose we want to create a
function f which is defined recursively in terms of f. look at F as a
function which produces such a function f,

    F : x -> f

the recursion is a consequence of the infinite chain of applications 

    f = Y F
      = F (F (F ...))
      = F f

so what are the properties of F? first it needs to map f -> f. and
second if a finite recursion is desired, it needs to do this in a way
that it creates a 'bigger' f from a 'smaller' one, eventually starting
from the 'smallest' f which does not depend on f: this leads to a
finite reduction when normal order reduction is used.

let's solve this problem in scheme, for Y F = factorial. so we know
that:

   factorial = F (F (F (...)))

or

   factorial = F factorial

in words, F is a function that returns a factorial function if it is
applied to a factorial function. so the factorial function is a fixed
point of F. the Y combinator finds this fixed point as 

   factorial = Y F.

the rest is fairly straightforward: a nested lambda expression which
uses the provided 'factorial' function to compute one factorial
reduction step:

F =

(lambda (factorial)
  (lambda (x)
    (if (zero? x)
        1
        (* x (factorial (- x 1))))))
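since scheme (like python) is strict, the plain Y above would loop forever; the applicative-order variant (the Z combinator) eta-expands the self application so it is only unrolled on demand. a python sketch:

```python
# Z combinator: Y adapted to strict evaluation by wrapping the
# self-application x(x) in a lambda, delaying its expansion.
Z = lambda F: (lambda x: F(lambda v: x(x)(v)))(lambda x: F(lambda v: x(x)(v)))

# the same F as above: one factorial reduction step, given a
# 'smaller' factorial to recurse into.
F = lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1)

factorial = Z(F)
```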


the thing which always tricked me is 'fixed point', because i was
thinking about iterated functions on the reals used in many iterative
numerical algorithms like the newton method. in the lambda calculus,
there are only functions and applications, so a fixed point IS the
infinite nested application, since that fixed point value doesn't have
another representation, while a fixed point of a function on the reals
is just a point in the reals.





Entry: algebraic data types
Date: Tue Sep 18 13:44:48 CEST 2007

look no further.. plt-match.ss actually has this kind of stuff, at
least the pattern matching associated with algebraic types. and i
think it is extensible.

http://download.plt-scheme.org/doc/371/html/mzlib/mzlib-Z-H-34.html
http://en.wikipedia.org/wiki/Algebraic_data_type

"In computer programming, an algebraic data type is a datatype each of
whose values is data from other datatypes wrapped in one of the
constructors of the datatype. Any wrapped data is an argument to the
constructor. In contrast to other datatypes, the constructor is not
executed and the only way to operate on the data is to unwrap the
constructor using pattern matching."


Entry: pic network
Date: Tue Sep 18 20:40:10 CEST 2007

1. simple: 2 wires
2. robust: working boot loader



Entry: parser-tools lexer
Date: Thu Sep 20 19:20:11 CEST 2007

i'm replacing the lexer with the one from parser-tools. this is a lot
easier than writing your own. what a big surprise; too bad i postponed
it for so long..



Entry: message passing
Date: Thu Sep 20 21:15:05 CEST 2007

hmm.. message passing concurrency seems to be the real solution of
tying a core and metaprogrammer together. i should find out how to
formalize message passing (i.e. Peter Van Roy and Seif Haridi's
book "Concepts, Techniques, and Models of Computer Programming"
http://www.info.ucl.ac.be/~pvr/book.html)



Entry: work to do
Date: Sat Sep 22 19:42:35 CEST 2007


* documentation
* bootloader (+- DONE)
* independent of emacs?

preparing for waag & piksel, the most important problem to solve is to
make the bootloader robust. this is probably best solved as:

    serial cable  plugged -> start console
    unplugged (i.e. with jumper to gnd) -> start app (at address 0x200)
    all interrupt vectors moved to 0x200 block

then this block can be made write-protected, so there's absolutely no
way to mess it up -> can eliminate ICD2 connector on boards.


Entry: purrr manual questions + necessary fixes
Date: Sun Sep 23 13:30:49 CEST 2007

* can i get at least a 16--bit library running without making it stand-alone?
* how difficult is it to unify macros and words from user perspective?
  -> interaction always compiles a 'scrap' function.
* is it possible to write all control structures in terms of tail recursion?

the more filo ones:

* exceptions are imperative features.. is this bad? when is this bad?
  it's like using continuations, which is interesting for backtracking
  etc. i'm leaning toward pure functional programming, but some
  imperative features are really OK as long as they are
  shielded. i.e. global mutable variables are clearly not. (namespace:
  single assignment = ok + possible to hack for debug).


Entry: new bootloader fixes
Date: Mon Sep 24 12:37:41 CEST 2007

i got the monitor working, now i need to get the synth back up. some
things that need fixing from the debugging side:

* a correct jump assembler (+- DONE: throws exception)
* a correct disassembler (+- DONE: lfsr broken)
* constants in console (DONE)
* cache macro compilation
* a command to erase a block of code during upload


note about field overflows: for data values, it should be ok: it's
quite convenient to assume they are finite size. for example, banked
addressing.

for code it's an error, since you don't have any control over this
while programming.


Entry: error reporting
Date: Mon Sep 24 14:15:54 CEST 2007

yes, i am at fault here. never really gave it much thought, but it's
starting to become a problem. my error reporting sucks.

one of the most dramatic problems is the loss of line numbers to
relate errors to original code. a solution for this is to use syntax
objects everywhere.

second is the way errors are handled in the assembler. currently i
have some code that's a bit hard to understand: i got used to hygienic
macros, and symbol capture looks convoluted to me. maybe i just need
to rewrite that first?

hmm.. what about systematically replacing 'raise' with something more
highlevel. one of the things that is necessary is a stack trace. there
was some talk on the plt list about this recently. let's have a look.

there is (lib "trace.ss") which doesn't really do what i need, since
it's active. what about taking this error reporting seriously, and
giving it its own module? would be good to eventually document all
possible errors etc.

what about the following strategy: every dubiously reported error will
be fixed, no matter what it takes.


>> c>
ERROR:
#<case-lambda-procedure>: no clause matching 1 argument: (qw)

this is a stack underflow error

i was thinking about installing an error translator in rep.ss, but
this kills the tail position. therefore, errors need to be translated
at the top entry point, which in this case is in prj.ss

it's really not such a simple problem.. need to define what
information i'd like to get: errors need to be reported at
'interface' level, which is either compile/run of files/words.

compile errors are most problematic since they need to be related to
source location..


Entry: state mud
Date: Tue Sep 25 14:05:35 CEST 2007


the prj.ss file should do nothing more than fetching/storing state and
passing it to pure functions. i am a bit appalled by the way things
work in prj.ss, because this state binding tends to swallow
everything..

maybe it's not such a good idea after all? i guess it is still a good
idea, but its only function should be to manage state.

let's rehash state stuff:

* only prj.ss contains permanent state
* I/O uses read-only dynamic scope for the read/write ports
* macros etc.. are supposed to be read-only cache
* all the rest is functional

UPDATE: Thu Sep 27 22:56:03 CEST 2007
- moved some functionality to badnop.ss
- adopted a left/right column notation for state/function



Entry: boot code and incremental upload
Date: Tue Sep 25 15:05:23 CEST 2007

the basic rule for forth is: code is incremental. if you need to patch
backward, you need to do an erase + burn cycle. how to do this
automatically?

it's probably not so hard to solve by performing (CRC) checks on
memory.
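the CRC idea could look something like this sketch (python, with
made-up helper names; the real check would talk to the monitor over
the serial connection):

```python
import zlib

def needs_reflash(target_crcs, image_blocks):
    """Compare per-block CRCs of what is already on the target with the
    freshly compiled image.  A mismatch in an existing block means a
    backward patch happened, so the incremental upload must fall back
    to a full erase + burn cycle."""
    for addr, data in image_blocks.items():
        old = target_crcs.get(addr)
        if old is not None and old != zlib.crc32(data):
            return True    # existing block changed
    return False           # only new blocks: incremental upload is fine
```

appending new code past the end of the already-verified blocks then
stays cheap, and only a real backward patch triggers the full cycle.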



Entry: core syntax
Date: Tue Sep 25 18:05:36 CEST 2007

just writing the purrr manual and i got back to this language tower
thing... i really need a core s-expression based syntax for code with
multiple entry and exit points, instead of forth.


Entry: or
Date: Tue Sep 25 19:44:24 CEST 2007

Something that's really handy in scheme is a short-circuiting 'or'.
i'm in need of something like that to define interactive word
semantics: try executable words first, then try variable names, then
try constants (or later macros). In scheme this is easy because
variables are referenced multiple times, in CAT this is awkward due to
explicit copying/restoring of the argument stack.

Some backtracking formulation would be nice, but generic backtracking
is overkill. It also requires explicit handling of the continuation
object. Escaping continuations work fine here, and they can be stored
in a dynamic parameter, so no explicit manipulation of continuation
objects is necessary.

With 'check' being a word that aborts the current branch if the top of
the stack is false, using the quasiquote (see next post) this is
simply:

`(,(foo check do something check more stuff)
  ,(bar check do something else)
  ,(in case everything fails)) 
attempts

The apology:

 In a compositional language, escape continuation (EC) based
 backtracking might take the role of a conditional expression because
 it's often easier to go ahead and backtrack on failure than to
 perform a number of tests/asserts ahead of time which might CONSUME
 your arguments, so you need to SAVE them first. An EC can be used to
 restore the contents of the stack before taking another branch.


The disadvantage of course is that words that use 'check' are only
legal within an 'attempt' context, and are not referentially
transparent. I guess this is ok.. same as using catch/throw.
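a rough model of 'check'/'attempts' in python, using an exception as
the escape continuation and a stack copy for the save/restore (all
names here are hypothetical, not brood code):

```python
class Fail(Exception):
    """Raised by 'check' to abort the current branch."""

def check(ok):
    if not ok:
        raise Fail()

def attempts(stack, *branches):
    """Try each branch in turn; the first that does not 'check'-fail
    wins.  Each branch runs on a copy of the stack, which plays the
    role of the escape continuation restoring the arguments the failed
    branch may have CONSUMEd."""
    for branch in branches:
        saved = list(stack)
        try:
            return branch(saved)
        except Fail:
            continue    # backtrack; original stack untouched
    raise Fail('all attempts failed')

# interactive word semantics: executable words first, then variables
def as_word(s):
    check(s[-1] in {'dup', 'swap'})
    return ('word', s.pop())

def as_variable(s):
    check(s[-1] in {'x', 'y'})
    return ('variable', s.pop())
```

the point of the copy is exactly the apology above: no need to test
before consuming, just consume and restore on failure.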

I do feel a bit like a cowboy now.. What about distinguishing 'bad'
exceptions from 'good' ones? Using exceptions in CAT has always been
awkward, but the 'attempts' syntax here seems nice.



Entry: quasiquote
Date: Tue Sep 25 22:12:34 CEST 2007

what about postscript style [ ] quotation to create data structures
with functions?  i can't use [ ] or { } since mzscheme sees them as
parentheses. only angle brackets are left alone.. so either i'm
creating a syntax extension, i.e. (list: (bla) (foo) (bar)), or i use
an angle bracket structure. since the latter will work, i'm using that:
<* *>

what about just using the quasiquote here? i'm not using it anywhere
else and i'm already using quote. it's only legal on programs, and
unquote means: insert program body here.



Entry: assembler optimizations / corrections
Date: Wed Sep 26 02:05:11 CEST 2007

A) jump size optimization

currently i have none. recently i introduced at least error reporting
on overflow. i think the deal is that doing it 'really right' is
difficult; i'm not sure there exists an optimal algorithm. the
simplest approach is: 
 
  * convert small -> long jump
  * increment/decrement jumps before/after the instruction
  * update dictionary accordingly

it's probably easiest to do this on an already fully resolved buffer
(after the 2nd pass). this algorithm is confusing due to the
forward/backward absolute/relative distinction. also, doing this
without mutation seems troublesome.


B) jump chaining

was really easy in the original badnop due to use of side-effects.


somehow this problem looks as if there's some weird control structure
that might help solve it in a more direct way.

OK... finding the optimum is apparently NP-complete

http://compilers.iecc.com/comparch/article/07-01-037

> [There was a paper by Tom Szymanski in the CACM in the 1970s that
> explained how to calculate branch sizes. The general problem is
> NP-complete, but as is usually the case with NP-complete problems,
> there is simple algorithm that gets you very close to the optimal
> result. -John]

or not?

http://compilers.iecc.com/comparch/article/07-01-040

  If you only want to optimize relative branch sizes, this problem is
  polynomial: Just start with everything small, then make everything
  larger that does not fit, and reiterate until everything fits.
  Because in this case no size can get smaller by making another size
  larger, you have at worst as many steps as you have branches, and
  the cost of each step is at most proportional to the program size.


so, it looks like the simple approach of using short branches and
expanding/adjusting + checking is good enough. 
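the simple approach from the quote, sketched in python (instruction
positions, sizes and ranges are invented; real addresses would come
from the assembler buffer):

```python
SHORT_RANGE = 127             # assumed reach of a short relative branch
SHORT_SIZE, LONG_SIZE = 1, 2  # words per branch form (assumed)

def size_branches(branches):
    """branches: {position: target}, in instruction positions, assuming
    every non-branch instruction occupies exactly one word.  Start with
    every branch short, widen any branch whose displacement no longer
    fits, and reiterate.  Widening only increases distances, so this
    terminates after at most len(branches) passes."""
    size = {pos: SHORT_SIZE for pos in branches}

    def address(pos):
        # real address = position + extra words of long branches before it
        return pos + sum(size[p] - SHORT_SIZE for p in branches if p < pos)

    changed = True
    while changed:
        changed = False
        for pos, target in branches.items():
            disp = address(target) - address(pos)
            if size[pos] == SHORT_SIZE and abs(disp) > SHORT_RANGE:
                size[pos] = LONG_SIZE
                changed = True
    return size
```

this is the monotone iteration from the comp.compilers post: no size
ever shrinks, so the fixpoint is reached quickly and is safe, if not
always optimal.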




Entry: platforms
Date: Wed Sep 26 05:11:06 CEST 2007

been thinking a bit about platforms. some ideas:

* 32 bit + asm makes no sense. GCC is your friend here, and should
  generate reasonably good code for register machines. split language
  into 2 parts: POKE for control stuff, and some kind of dataflow
  language for dsp stuff.

* AVR 8 bit doesn't make much sense either. there is GCC and i already
  spent a lot of time optimizing 8 bit opcodes.. learning the asm
  sounds like a waste of time.

* don't know if PIC30 makes a lot of sense. it is an interesting
  platform (PDIP available), and they are reasonably powerful, if a
  bit weird.

maybe focus on PIC18, and a small attempt to get a basic set of words
running for PIC30?



Entry: capacitance to digital
Date: Wed Sep 26 05:26:25 CEST 2007


http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=2599&param=en531579    

CAPACITANCE TO DIGITAL CONVERTER

To convert the sensor’s capacitance to a digital value, three things
have to happen. First, comparators and the flip flop in the comparator
module must be configured as a relaxation oscillator. Second, the
desired sensor must be connected to the relaxation oscillator. Third,
the frequency of the oscillation must be measured. The configuration
of the comparator and the SR latch require configuring the
comparators, the SR latch, and the appropriate analog
inputs. Connecting the sensor to the oscillator requires the control
software to select the appropriate analog input to the comparator
module’s multiplexer. It must also select the appropriate input to any
external multiplexer between the sensors and the analog inputs of the
chip. To measure the frequency of the oscillation, TMR1’s clock input
must connect to the output of the relaxation oscillator, and a fixed
sample period will be controlled by TMR0.

To start a frequency measurement, both TMR0 and TMR1 are cleared. The
TMR0 interrupt is then enabled. When the interrupt fires, TMR1 is
stopped, and the 16 bit frequency value in TMR1 is retrieved. Both
TMR0 and TMR1 can then be reset for the next measurement.

To keep the accuracy of the frequency measurement consistent, the
interrupt response time for the TMR0 interrupt must be kept as
constant as possible, so no other interrupt should fire during a
measurement. If one does, then the measurement must be discarded and
the frequency measurement must start over.

Once the 16-bit value is retrieved, the detector/decoder algorithms
can determine if the shift in frequency is a valid touch by the user
or not.
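the decision step at the end could be modeled like this (python
sketch; the baseline and threshold values are invented, and the real
logic lives in the TMR0 interrupt service routine):

```python
def touch_detector(baseline, threshold):
    """Decide whether a TMR1 count indicates a touch.  A finger adds
    capacitance to the sensor, lowering the relaxation oscillator
    frequency, so the count drops below the untouched baseline."""
    def detect(count, other_irq_fired=False):
        if other_irq_fired:
            return None    # jittered measurement: discard and retry
        return (baseline - count) > threshold
    return detect
```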

For more information on the interrupt services routine for TMR0, and
the initialization of the relaxation oscillator, refer to application
note AN1103 on Software Handling for Capacitive Sensing.


Entry: todo list
Date: Wed Sep 26 19:46:52 CEST 2007

URGENT:

* word reference manual

the primary goal would be to have documentation available at the
command console or during emacs editing, instead of just in paper
form. a tutorial can come later. now where do i specify it?

* code protect the boot sector (OK)
* interaction macros (needs syntax + minor support in prj.ss)
* readline console
* command line completion
* make it installable (-> solve library deps? : install in collects?)
* check battery / BREAK resistor
* simplify prj.ss into chunks that operate on state explicitly (OK)


NOT URGENT:

* macro cache (maybe explicit files? read about compilation)
* scheme library split (module path handling)
* bootloader: automatic boot (2ND) sector patching
* assembler changes + functional macros


Entry: boot code
Date: Thu Sep 27 15:57:28 CEST 2007

this is actually pretty important. once i start sending out kits, it's
not so easy to change the bootloader.

some things to note:
- monitor.state -> (dict ...) format is the only important part
- boot sector is independent of any macros : only words count
- machine model obviously needs to stay stable (hasn't changed in years)
- binary api (the monitor commands) needs to stay stable


what about making the bootstrap interpreter simpler? get rid of
anything other than 'receive transmit execute', and leave the rest to
a dictionary? if there's ever a problem for portability or whatever,
this might be the way to go: this interface allows hiding all
functionality in the dictionary associated with the boot kernel. right
now it's still quite pic-specific. some things will become less
efficient though..

also, the way the code is organised, sending commands will become more
difficult. the set i have now is complete enough, and reasonably
efficient. let's keep it simple and stick with the current one.

another thing: fixing the boot block. let's try that:
setting 30000B to A0 does the trick (CONFIG6H : WRTB)


Entry: using the ICD2 pins.
Date: Thu Sep 27 15:53:34 CEST 2007

last couple of days were a bit too much on the dreaming side. i need
something concrete to fix. i was thinking about simplifying the
programming interface. was thinking about using the ICD2 pins to also
do debug serial comm. but why? if my boot kernel is stable, this is
entirely unnecessary, except for reset!


Entry: ramp up to purely functional macros
Date: Thu Sep 27 20:57:29 CEST 2007

the parser.

STAGE 1:

- rewrite 'constant' as a macro definition

- separate macros from the body code, which is seen as a single
  function with multiple entry/exit points.


problem still not solved: 'variable'

currently, variable creates a constant containing a symbol, and 'code'
that performs the allocation later during the assembly phase. so in
fact, it's not so problematic.


Entry: prj.ss
Date: Thu Sep 27 22:59:52 CEST 2007

simplified it a bit: made state ops more explicit, and moved
functionality to badnop.ss

this looks like a nice approach in general. i do wonder why i still
need 'functional state' at the prj.ss level: most state updates are
intermingled with microcontroller state updates which are dirty
anyway.

one thing: it keeps me honest. on the other hand, i'd like to move to
some "image" representation. cached macros would be cool. maybe i should
look at that now.



Entry: macro cache
Date: Fri Sep 28 00:04:07 CEST 2007

it looks like the bulk of the 'revert' time is spent in needlessly
compiling code. there aren't so many run-time created macros: and
constants are currently not 'eval'ed. maybe i should make that so i
can snarf them out.

hmm.. spaghetti. the problem is that constants are still treated
separately. i can't unify them with macros until macros are purely
functional so they can be evaluated to see if they produce constant
values. solution dependences:

      file parsing to distinguish macro/code
then: purely functional macros
then: elimination of assembler constants

however, doing the first one requires elimination of assembler
constants!

looks like this is the reason why i can't get an overview of the
problem: it's quite a big loop. anyways, i can write the parsing step
and test it
leaving the side-effecting macros intact. then move to side-effect
free macros and change the constant parsing to translate constants to
macros.

so. maybe i need an S-expression syntax first, so i can translate code
to it! for macros this is easy: i'm already using one. for composite
code however, it becomes more difficult due to the multiple entry-exit
points. this can be left alone in a first attempt.




Entry: product vision statement
Date: Fri Sep 28 01:05:20 CEST 2007

http://www.codinghorror.com/blog/archives/000962.html

for (target customer)
who (statement of need or opportunity)
the (product name) 
is a (product category)
that (key benefit, compelling reason to buy)
unlike (primary competitive alternative)
our product (statement of primary differentiation)


for embedded software developers
who want to program small embedded systems
the Brood system
is a tool chain
that supports incremental bottom up development
unlike C
our product has integrated metaprogramming through built-in macros.


something like that..
interesting.




Entry: documentation
Date: Fri Sep 28 14:03:33 CEST 2007

write a purrr manual in tex2page by sending queries to the brood
system. this should use an interface similar to snot.

brood needs to be centered around services, of which snot is one. so
let's try this:


services with

  - direct access to brood for SNOT and RL
  - document generation


Does services.ss run inside the sandbox?

YES

So all calls from snot.ss -> services.ss go through a sandboxed
eval. Services.ss itself does not need to take care of this, and can
use direct calls.

the deal is this:

a CONSOLE needs to separate:

   - TOPLEVEL (represented by eval)
   - STATE (a data structure stored independent of toplevel)


Entry: persistence
Date: Fri Sep 28 19:26:54 CEST 2007

i must not forget that the way i use persistence is a SOLUTION, and
not the original problem.

the real problem is a conflict between two paradigms:

* TRANSPARENCY as in MzScheme's module system
* image persistence and run--time self modification


as usual, my problem is rooted in ignorance. i've been jabbing about
the distinction between the two above for a while, but the real
problem is compiler compilation time.

i need to have a look at MzScheme's unit system. it should be possible
to reload units after recompiling them because they are mere
interfaces.


Entry: services
Date: Fri Sep 28 23:24:59 CEST 2007

hmmm.. i didn't really get anywhere today. but at least i figured out
what 'services' should be. it's just the stuff that snot has access
to, but without the snot interface. i renamed it 'console.ss' and took
it out of 'snot.ss', which is now just a bit of glue.


Entry: forth preprocessing
Date: Sat Sep 29 15:51:12 CEST 2007

parsing and lexing.
it's divided in a somewhat unorthodox way

LEXING

there are 2 front ends:
  forth-lex              :: string -> atom stream
  forth-load-in-path     :: file,path -> atom stream

the lexing part flattens the load tree. i.e. during lexing, the source
code is made independent of the filesystem.


PARSING

this is where i have to break things, so let's commit first.

1. flat forth stream -> compositional forth stream with macros removed
2. constants -> macros

let's see if i understand: constants are bad. there is no way around
the fact that 'constant' swallows a value: it's the worst case of
reflection. this is not compatible with the current parser. keeping it
would require lookahead.

so 'constant' needs to be replaced entirely by 'macro' in source code.

looking at the previous entry [[phase-separation]] what is required is
indeed a parsing step that can translate

       1 2 + constant x  -->  macro : x 1 2 + ; forth

yes, this is of course possible, but is it really worth it? maybe it's
better to clean up the Purrr language semantics now than to carry
around the code that allows this. ad-hoc syntax is a nuisance.

so, current path: CONSTANTS are being removed.

that was easy :)
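for reference, the rejected 'constant' to 'macro' translation could be
sketched as a token-stream rewrite (python, purely hypothetical; this
is exactly where the lookahead awkwardness shows up):

```python
def rewrite_constants(tokens):
    """Rewrite '... expr constant name ...' into
    'macro : name expr ; forth ...': each constant becomes a macro
    that recomputes its value.  The lookahead problem shows up as the
    backward scan for the start of the expression, here approximated
    as everything since the last ';' (or the start of the stream)."""
    out = []
    boundary = 0    # index in out where the pending expr starts
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == 'constant':
            name = tokens[i + 1]
            expr = out[boundary:]
            del out[boundary:]
            out += ['macro', ':', name] + expr + [';', 'forth']
            boundary = len(out)
            i += 2
        else:
            out.append(tok)
            if tok == ';':
                boundary = len(out)
            i += 1
    return out
```

needing to reach backwards over an unbounded expression is the
"swallows a value" problem in a nutshell, so dropping 'constant'
entirely is the cleaner move.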

now, for variables.


     variable abc 

does 2 things: it creates a macro that quotes itself as a literal
address, and it adds code that tells the assembler to reserve a RAM
slot.

maybe i should use 'create' and 'allot' ?
(back to that later)

currently the parsing seems to work, except for the macro/code
separation step. for this i need a stream splitter. in stream.ss i
have '@split', which just splits off the head of a stream, not true
splitting.


status:
- parsing step: ok
- load! step: ok (like previous load, but with macro defs separated)

next:

- remove all side-effecting macros
- change the assembler to take values from macros


remarks: 
  * is dasm-resolve still possible?  (value -> symbol)

status:
- monitor.f -> monitor.hex gives the same code



Entry: cleanup
Date: Sat Sep 29 21:21:54 CEST 2007


core changes seem to be working. the rest is cleanup. TODO:

- fix variable (OK)

- fix interaction constants (OK)

- fix sheepsint (OK)

- extract macros from forth file -> compositions + save as cache (OK)

- fix interaction macros that reduce to expressions

- trick macros into generating their symbol during compilation, and
  value during assembly. (restore disassembly constants)

- clean the assembler name resolver


Entry: storing application macros in state file
Date: Sat Sep 29 22:31:46 CEST 2007

why not?

this solves a lot of problems.. and they are available in source form,
so there's no problem storing them symbolically.



Entry: profiling
Date: Sun Sep 30 03:15:31 CEST 2007

eyeballed rather than measured.. but still quite remarkable. loading monitor.f from source
to S-expressions takes a lot more time than either compiling the
macros or compiling the code to a macro and running it. both are
instantaneous.

ha!

actually, that's very good news. improving the speed of the lexer
seems a lot easier to do than improving the speed of the compiler.

looking a bit further, sheepsint.f seemed to be faster. the reason is
thus the constants. maybe i should just put them back to
s-expressions? they don't change much after all.



Entry: upload speed
Date: Sun Sep 30 03:40:57 CEST 2007

It's quite annoying that the upload speed is so slow. I need a way to
change the speed on the fly.

EDIT:
baud rate: commit goes a little bit faster when baud rate is changed
from 9600 to 38400, so the limiting factor is probably the flash
programming.


Entry: parsing and printing
Date: Sun Sep 30 16:17:09 CEST 2007

there are a couple of places in the brood code where (regular)
parsing and printing are done in a relatively ad-hoc way using
'match'. maybe i should have a look at extending match to provide
better pseudo "algebraic types".

EDIT:
http://www.cs.ucla.edu/~awarth/papers/dls07.pdf (*)
http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt

(*) looks really interesting. also, i need to have a decent look at COLA:
http://piumarta.com/software/cola/

EDIT:
i changed the syntax for the peephole optimizer to something more akin
to algebraic types & matching.. still a bit of a hack, but there's a
better adapted quoting mechanism now.


Entry: deleted from brood.tex
Date: Sun Sep 30 19:25:53 CEST 2007


Some important assumptions I'm making to support the current solution
are that code updates need to be made \emph{while running}, and that
the target is severely \emph{resource constrained} such that all
compilation and linking needs to be done off--target. This excludes
\emph{late binding} of most code.

Another assumption I'm making is that some binary code on the target
will never be replaced, and will drift out of sync with the evolution
of the language in which it was written. An example of this is a
\emph{boot loader}. Such code needs to be viewed as a black box.

This approach violates transparency.

To give this section some context, I have to make my \emph{beliefs}
more explicit. I believe that a compiler is best implemented using
pure functional programming, because it is in essence a
\emph{function} mapping a source tree to a binary representation of
it. This idea is easily extended with \emph{bottom up} programming,
where part of the source tree generates a compiler to compile other
parts of the source tree. In order to make this work, I believe you
need \emph{transparency}. By this I mean that all \emph{reflection}
(compiler compilation) is \emph{unrolled} into a directed acyclic
graph representing a code dependency tree.

On the other hand, I believe that a microcontroller is best modeled as
a \emph{persistent} data structure. A microcontroller is a
\emph{physical object}, and should be modeled as such,
\emph{independent} of the compiler that is used to create the code
comprising the object state. This is what makes Forth interesting: the
ability to \emph{incrementally update} without having to recompile
everything. Due to limited hardware support (flash ROM is not RAM),
\emph{late binding} becomes problematic, and also induces a
significant performance penalty. This makes \emph{early binding} a
reasonable alternative: in the end the objective is to at least
provide the possibility to write efficient code at the lowest level of
the target language tower.

This is the heart of the paradigm conflict. Where do I switch from a
transparent language tower to \emph{dangerous} manually guided
incremental updates? Maybe the question to answer would be: why does
one want to have this kind of low--level control anyway? The real
answer is that at this moment, I don't really know how to create a
transparent system. The real reason for that is that I've been locked
in a certain paradigm.

Let's explore what would happen if we lean towards any of the two
extremes. If the whole system were transparent, the controller code
would need to be treated as a filesystem if incremental updates were
still to be used. After code changes, one could simply recompile,
relink and upload only the parts that changed. This is the sanest
thing to do.


Entry: misc improvements
Date: Sun Sep 30 21:35:40 CEST 2007

note that 'load' as it currently stands doesn't 'commit'. actually,
that's mostly not how it's used! also, automatic commit might be nice
for compile mode..

on the other hand, compile mode is kind of an advanced feature also.



Entry: structures for music
Date: Mon Oct  1 05:48:46 CEST 2007

this is more of a tutorial pre. i saw aymeric was using the stack to
store sequences, which is not a good idea.. i see 2 other ways: flash
and ram. i kinda like the x / . approach for pattern synths. the trick
is to do multiple voices, so i really need some kind of multitasking.

say i have 3 patterns

: bd  o . . . o . . .  bd ;
: sn  . . . . o . . .  sn ;
: hh  o . o . o . o .  hh ;

what do o and . do ?

let's assume that recursion is not allowed in these patterns. what can
we hide in a single invocation? a simple trick is to use the
dictionary shadowing: the words could call some fixed word, which is
re-implemented later.

: instrument   do something ;

: bd   o . . o . .  bd2 ;
: bd2  . . o . . o  bd ;

we could have:

: o instrument yield ;
: . yield ;

hmm.. it's probably better to directly use names instead of this
name-capture thing.

if recursion is disallowed, it should be possible to store each thread
in a single byte, so a lot of threads are possible. in that case, an
explicit interpretation and automatic looping might be better, using
routing macros.
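the byte-per-thread idea, modeled in python (one pattern index per
voice, stored in a bytearray; 'play' stands in for the instrument
word):

```python
def step_patterns(patterns, state, play):
    """One sequencer tick: each voice's state is a single byte, the
    index into its looping pattern.  'o' triggers the instrument for
    that voice, '.' just advances (yields)."""
    for voice, pat in enumerate(patterns):
        i = state[voice]
        if pat[i] == 'o':
            play(voice)
        state[voice] = (i + 1) % len(pat)   # automatic looping

patterns = ["o...o...",    # bd
            "....o...",    # sn
            "o.o.o.o."]    # hh
```

with explicit interpretation and automatic looping like this, a voice
really does fit in one byte, so many voices are cheap.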





Entry: purrr reference documentation
Date: Mon Oct  1 16:13:01 CEST 2007

documentation for each macro. this contains 2 things:

- stack effect (type)
- 1 line human readable doc which possibly points to more information.

so a word's meta info looks like (+)

((type . (a a -- a))
 (doc  . "Add two numbers"))

if i can't do types yet, i should at least put the stack effect in a
form that can be used later to do types. it's also probably a good
idea to add meta-data separately to not clutter the code.

so, how to infer types? from the lowest level (pattern matching
macros) i can infer a lot.

first some cleanups: i'm taking out the 'compiled' field in the word
structure, because it's better to just save the source of macros
before they're being compiled, instead of trying to recover them
later.

what about word-semantics? i forgot the reason why sometimes it cannot
be filled.

been poking in the rpn.ss internals and i guess it's best to have the
state tx take a compiler for backup. but, this doesn't work for some
other reason i can't remember.. tata: spaghetti.

let's see if i can hack around it now by simply providing a language
name for backup.


Entry: i need closures
Date: Mon Oct  1 20:26:22 CEST 2007

yep..

too much crap going on with trying to call from prj -> base and having
to pass arguments.

EDIT:
when i wrote 'compose' i made sure to not allow composition between
words with different semantics. however, i'm not so sure that's a
good idea.. i only want to use closures on functional words, not on
state words. maybe i should let go of this control freakish behaviour
since the source rep is only for debug: it doesn't work reliably for
all words to reconstruct from that source..



Entry: dsPIC
Date: Tue Oct  2 03:46:01 CEST 2007

maybe it's time to try it out, and gently grow it into being. some
challenges:

- 3 bytes / instruction
- 16 bit datapath
- addressing modes

flash block erase size is 96 bytes, but address-wise this counts as 32
instruction words.

  The dsPIC30F Flash program memory is organized into rows and
  panels. Each row consists of 32 instructions, or 96 bytes. Each
  panel consists of 128 rows, or 4K x 24 instructions. RTSP allows the
  user to erase one row (32 instructions) at a time and to program
  four instructions at one time. RTSP may be used to program multiple
  program memory panels, but the table pointer must be changed at each
  panel boundary.

I don't understand why it says 'four instructions at a time' and then
later on talks about 32 at a time: "The instruction words loaded must
always be from a group of 32 boundary."

And the confusion goes on "32 TBLWTL and four TBLWTH instructions are
required to load the 32 instructions."

this looks like a typo.. let's download a new version of the
sheet. got DS70138C now. they're at version E. it's got the same typo.

so assume i need to write per 32 instructions + some magic every
4K instructions (updating a page pointer?). apart from the latter it's
quite similar to the 18f, just a larger row size.

it looks like this thing is byte addressed, but for each 2 bytes,
there's an extra 'hidden' byte! lol

ok, there is a sane way of looking at it: the architecture is 16-bit
word addressed, but every odd word is only half implemented:
instruction width is 3 bytes.
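that addressing model, as i understand it, in python (assuming the PC
advances 2 word addresses per 3-byte instruction, and a 32-instruction
erase row; all helpers are my own sketch):

```python
INSTR_BYTES = 3   # 24-bit instruction words

def pc_address(instr_index):
    """Program memory is 16-bit word addressed, but every odd word is
    only the half-implemented upper byte, so each 3-byte instruction
    advances the PC by 2 word addresses."""
    return instr_index * 2

def image_offset(pc):
    """Byte offset of an instruction in a flat packed image (3 bytes
    per instruction, phantom byte not stored)."""
    assert pc % 2 == 0, "odd word addresses are not instruction starts"
    return (pc // 2) * INSTR_BYTES

def row_base(pc):
    """Base address of the erase row containing pc: one RTSP row is
    32 instructions, i.e. 64 word addresses."""
    return pc - pc % (32 * 2)
```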

it looks like it's best to steer the forth away from all the special
purpose DSP tricks like X/Y memory and weird addressing modes. looks
like an interesting target for some functional dataflow dsl though.

there are 2 kinds of instructions: PIC-like instructions that operate
on WREG0 and some memory location, and DSP-like instructions that use
the 16 registers.

roadmap:
- find a 8bit -> 16bit migration guide from microchip
- partially implement the assembler to PIC18 functionality



Entry: direct threaded forth
Date: Tue Oct  2 07:26:49 CEST 2007

i'm toying a bit with the vm forth. and was thinking: it's not
necessary to go stand-alone. it's much better to test this vm forth as
another target.


Entry: type signatures from pattern matching macros
Date: Tue Oct  2 14:38:47 CEST 2007

It should be possible to mine the 'source' field of pattern matching
macros for the types, or at least the stack effect, of functions.

the first matching rule is always the most specific one: if that fits
a certain pattern.

the REAL solution here is to change the pattern matcher to REAL
algebraic types instead of this hodge-podge. moral of the story:
whenever pattern matching occurs on list structure, what you really
are looking for is algebraic types.

yes... i'm not going to muck around in this ad-hoc syntax. i need a
real solution: something on top of the current tx. i need real
algebraic types.

there is this:
http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt

but for my purpose it might be better to just stick with the current
concrete list representation for the asm buffer.

what about:


(([qw a] [qw b] +)   ([qw (+ a b)]))

->

((['qw a] ['qw b] +)   `([qw ,(+ a b)]))


looks like 'atyped->clause' in pattern-tx.ss is working. it's indeed
really simple to implement on top of the matching clauses.

looks like 'asm-transforms' works too.

i ran into one difficulty though. call it polymorphism. in original syntax:

   (([op 'POSTDEC0 0 0] ['save] opti-save) `([,op INDF0 1 0]))

cannot be expressed in the new syntax. however, this is
exceptional. it's probably a good idea to make this polymorphism
explicit. EDIT: it is possible to use unquote!! a bit of abuse of
notation, but ...

let's write the pic18 preprocessor on top of asm-transforms instead of
compiler-patterns.

ok. done. old one's gone.

now it should be a lot easier to write some documentation or type
inference..

i tried to tackle the 'pic18-meta-patterns' but i don't seem to get
anywhere. current syntax is way too complicated. it really shouldn't be
too hard by taking a more bottom up approach instead of trying to use
'callbacks' that force the preprocessing of some macro's
arguments. write a single generator macro for each kind.

trying again. this is the thing i want to generate:

  (define-syntax unary
    (syntax-rules ()
      ((_ namespace (word opcode ...))
       (asm-transforms namespace
                       (([movf f 0 0] word) ([opcode f 0 0])) ...
                       ((word)              ([opcode 'WREG 0 0])) ...))))
    
from this

  (asm-meta-pattern (unary (word opcode))
    (([movf f 0 0] word) ([opcode f 0 0]))
    ((word)              ([opcode 'WREG 0 0])))


the thing which seems problematic to me is the '...'

more specifically

(pattern template) ...   ->   (pattern template) (... ...) ...

that doesn't seem to work.

it looks like the 'real' problem here is due to the fact that i'm
expanding to something linear.. i'm inserting stuff. i wonder if it's
possible to modify the asm syntax a bit so it will flatten
expressions.

wooo.. macros like this are difficult. i'm currently doing something
wrong with mixing syntax-rules with calling an expander directly. best
to stick with plain syntax-case and direct expansion: that's easier to
get right.

the deal was: sticking with syntax-rules as a result of a first
expansion worked fine, i just needed to put the higher order macro in
a different file for phase separation reasons.

so.. the remaining step is to collapse the compiler-patterns-stx
phase, and add the current source patterns to the word source field,
which would yield decent docs.

ok, done.

> msee +
asm-match:
((((qw a) (qw b) +) ((qw `(,@(wrap a) ,@(wrap b) +))))
 (((qw a) +) ((addlw a)))
 (((save) (movf a 0 0) +) ((addwf a 0 0)))
 ((+) ((addwf 'POSTDEC0 0 0))))
> 

that should be easy enough to parse :)
CAR + look only at qw.

the 'wrap' thing is something that needs to be cleaned up too.. i
tried but started breaking things. enough for today.

this is what i get out for qw -> qw

(((qw a) (qw b) --) ((qw `(,@(wrap a) ,@(wrap b) #f))))
(((qw a) (qw b) >>>) ((qw `(,@(wrap a) ,@(wrap b) >>>))))
(((qw a) (qw b) <<<) ((qw `(,@(wrap a) ,@(wrap b) <<<))))
(((qw a) drop) ())
(((qw thing) |*'|) ((qw thing)))
(((qw a) (qw b) ++) ((qw `(,@(wrap a) ,@(wrap b) #f))))
(((qw a) (qw b) swap) ((qw b) (qw a)))
(((qw a) dup) ((qw a) (qw a)))
(((qw a) (qw b) or) ((qw `(,@(wrap a) ,@(wrap b) or))))
(((qw a) (qw b) and) ((qw `(,@(wrap a) ,@(wrap b) and))))
(((qw a) neg) ((qw `(,@(wrap a) -1 *))))
(((qw a) (qw b) xor) ((qw `(,@(wrap a) ,@(wrap b) xor))))
(((qw a) (qw b) /) ((qw `(,@(wrap a) ,@(wrap b) /))))
(((qw a) (qw b) *) ((qw `(,@(wrap a) ,@(wrap b) *))))
(((qw a) (qw b) -) ((qw `(,@(wrap a) ,@(wrap b) -))))
(((qw a) (qw b) +) ((qw `(,@(wrap a) ,@(wrap b) +))))

i also made a 'print-type' function. for '+' :

((qw qw) => (qw))
((qw) => (addlw))
((save movf) => (addwf))
(() => (addwf))

this might be useful.. but what's more useful is the building of a
framework that enables this for all functions. it works for the
assembler primitives only.


Entry: TODO
Date: Sat Oct  6 20:54:13 CEST 2007

- live command macros
- put live commands in a namespace
- add doc tags: math/control/predicates/...
- write small tutorial:
  * assembler + PIC18 architecture
  * logic, addition and 8 bit programming (hex + binary)
  * the x and r stacks
  * route (DONE)
  * predicates & conditionals
  * run time computations & ephemeral constructs
- fix macro cache re-init + initial state loading (DONE)
- fix quoting in macros
- fix hardcoded paths (rename brood/brood)
- rename compilation stack

Entry: unsigned demodulator
Date: Sat Oct  6 21:30:01 CEST 2007

the pic18 has a hardware multiplier, which is nice. however, computing
signed multiplication takes quite a hit compared to unsigned. i was
wondering if i can do an amplitude-only demodulator using only signed
multiplications.

the entire function is unsigned -> unsigned.

    signal -> mixer -> I / Q -> I^2 + Q^2 -> LPF


[EDIT: deleted a long erroneous entry. the thinking error was about
the commutation of the LPF and the squaring operation. the above
expression just gives the average signal power.]

the correct formula is:

  X -> (I,Q) => LPF -> || . ||^2

that's completely symmetric wrt phase. 
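a quick sanity check of that claim (plain python, nothing to do with
the pic code): a floating point version of the X -> (I,Q) => LPF ->
|| . ||^2 chain, using a window average as the LPF:

```python
import math

def demodulate(samples, f, fs):
    """Mix with a complex carrier at f Hz, average over the window
    (a crude rectangular LPF), and return the squared magnitude."""
    acc_i = acc_q = 0.0
    for n, x in enumerate(samples):
        w = 2 * math.pi * f * n / fs
        acc_i += x * math.cos(w)
        acc_q -= x * math.sin(w)
    n = len(samples)
    return (acc_i / n) ** 2 + (acc_q / n) ** 2
```

for a unit amplitude tone at f the output is 1/4 regardless of the
input phase, provided the window holds an integer number of carrier
periods.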

the LPF is straightforward: a simple 1-pole will probably do if i keep
the bitrate low. a 2^n-1 coefficient is easy to implement without
multiplication.
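what such a multiplier-free 1-pole might look like (an illustrative
python sketch, not the actual Purrr code):

```python
def lpf_step(y, x, n=3):
    """One step of y += (x - y) / 2^n, i.e. a pole at (2^n - 1)/2^n.
    The division is an arithmetic shift, so no multiplier is needed."""
    return y + ((x - y) >> n)
```

feeding a constant input repeatedly makes the state converge to it,
up to the truncation error of the shift.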

the I=XC and Q=XS multiplications can probably be simplified since X =
x-h and C = c-h have no DC components. here h = 2^(bits-1).

  I = X C
    = (x-h) (c-h) 
    = xc - hx - hc + h^2
    = xc - h(x + c - h)
    = xc - h(x - h + c - h - h)
    = xc - h(X + C + h)

    DC
    = xc - h^2

which is quite intuitive: take the average of xc, but remove the dc
component.
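the identity can be checked numerically; a small python sketch (names
are mine, 8 bit words assumed):

```python
h = 2 ** 7  # 'half' for 8 bit words: signed X is represented as x = X + h

def signed_product(x, c):
    """Recover the signed product X*C from the unsigned product x*c
    using the rearranged correction term h*(X + C + h)."""
    X, C = x - h, c - h
    return x * c - h * (X + C + h)
```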



Entry: frequency decoding
Date: Mon Oct  8 00:26:37 CEST 2007

for krikit, the choice to make is to either decode the whole spectrum
(listen to everything at once) or listen only to a single band. this
is a choice that has to be made early on.. some remarks.

* FFT + listening to all bands is probably overkill. it's not so
  straightforward to implement, so the benefits should be big. FFT for
  point-to-point only makes sense when combatting linear distortion.

* single frequency detection is really straightforward. the core
  routine is a MIXER followed by a complex LPF. the output is phase
  and amplitude.

* using a sliding window average LPF together with orthogonal
  frequencies allows for good channel separation. this works for
  steady state only, so some synchronization mechanism is necessary.

* sending out a single message to multiple frequencies: easy to do
  with pre-computed tables for 0 and 1. phase randomisation to avoid
  peaks is possible here.

* i'm afraid of linear distortion due to room acoustics.. maybe FM/FSK
  should be used?

* if non-linear distortion is not a problem, DTMF frequencies are not
  necessary.

* using exact arithmetic, it is easy to update/downdate a state vector
  for rectangular window LPF. this update can be performed at the
  input of the mixer.

* bandwidth limitation for transmission.

http://en.wikipedia.org/wiki/Olivia_MFSK




Entry: network debugging + pic shopping
Date: Mon Oct  8 15:27:25 CEST 2007

- avoid remote reset: use WDT
- central power gives panic switch
- use a standard bus protocol for comm (I2C ...)

The 18f1220 doesn't have I2C, so it might be better to go for a
different component. Lowest pin count is 28. Let's take the one with
the most memory to have some room for tables and delay lines. I'm
thinking about

18f2620:  64kbytes flash, 3968 bytes ram (maxed out)

This is also a nice target for a standalone language. These are the
same, with some things missing.

          EEPROM (b)     FLASH (kb)
18f2620    1024           64
18f2610    0              64
18f2525    1024           48
18f2515    0              48




Entry: 8 bit unsigned mix -> complex 16/24 bit acc
Date: Mon Oct  8 15:32:30 CEST 2007

I've been toying a bit with a mixer + accumulator building block, and
it seems it can be quite simple. Some remarks:

- Perform signed offset correction out of the accumulation loop.

- Perform update/downdate for rectangular window at input due to
  commutation with the mixer.

- As long as the result of accumulation fits in the word length,
  overflow is not a problem.


If a signed number X is represented by an unsigned number x, the
difference is X = x - h, where h = 2^{n-1} is 'half'. Per signed
multiplication there is an offset of h^2 = 2^{2n-2}.

What this means is that once per 4 accumulations, the correction term
disappears due to word overflow if 2 bytes are used. However, the
maximal filter output occurs at full scale input, which will overflow
the accumulator if more than 4 accumulations are used, so maybe it is
better to use a 3 byte state. In any case, if the number of
accumulations is a power of 2, removing the unsigned offset is a
simple bit operation.
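the overflow argument in numbers (a python sanity check, n = 8 bits):

```python
n = 8
h = 2 ** (n - 1)     # 128, the unsigned offset per operand
bias = h * h         # 2^(2n-2) = 16384, constant bias per product

# after 4 accumulations the bias totals 4 * 2^14 = 2^16,
# which is exactly zero modulo a 2 byte accumulator
assert (4 * bias) % (1 << 16) == 0

# a power-of-2 number of accumulations leaves a single-bit bias,
# so removing it from a wider (3 byte) accumulator is one bit flip
k = 32
assert bin((k * bias) % (1 << 24)).count("1") <= 1
```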



Entry: transmission
Date: Mon Oct  8 16:46:26 CEST 2007

Using hardware PWM with 8 bit resolution I can send out at 39kHz,
assuming fosc = 40MHz. This is still well beyond the maximal frequency
at about 3kHz, and won't pass the speaker, so an analog filter is not
necessary. Differential drive (half bridge) could be used.
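checking the 39kHz figure, assuming the standard PIC18 CCP formula
(PWM period = (PR2+1) * 4 * Tosc at 1:1 prescale):

```python
fosc = 40_000_000  # the 40MHz figure from above
bits = 8           # PWM resolution

# full 8 bit resolution means PR2 = 255, so f_pwm = fosc / (4 * 256)
f_pwm = fosc // (4 * 2 ** bits)
assert f_pwm == 39_062  # ~39kHz, far above the ~3kHz signal band
```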

One thing to note is that only ECCP (enhanced) can do multi-channel
PWM. The normal PWM is only single output, and all 28 pin chips have
just 2 x CCP. The 18 pin 18f1x20 has a single ECCP, and the 40/44 pin
18f4xxx also have one.

Looks like that's quite a limitation.. On the other hand, a CMOS
inverter could be used on-board.. Is that worth it? Probably not. A
simple coupling condenser will do the trick.




Entry: self programming 5V
Date: Mon Oct  8 18:01:24 CEST 2007

Something i just noticed in the 2620 datasheet: self-programming works
only at 5V ?


Entry: no apology!
Date: Thu Oct 11 01:43:40 CEST 2007


i tried a couple of times this week to explain the "ephemeral" macro
idea, but it's just insane. i need a real solution:

- macro code needs to know whether a certain word is defined or not.

- if a partial evaluation can't be computed, the error should be:

   * none, if the corresponding library code can be found.
   * "partially implemented literal construct" or something..


what i do need to explain is "Why leaky abstractions are not
necessarily bad." This is a core Forth idea quite opposed to the safe
language ideal. I'm using a lot of that stuff, and I guess it's good
to make a list of these.

looking at the code, this 'need-literal' error only happens in 3
places: toggle set and bit? i just took them out: they refer to code
words now, up to user to implement.



Entry: Purrr semantics
Date: Fri Oct 12 16:25:44 CEST 2007

As explained in Brood, there is only a single semantics of a Purrr
program: it is a compositional, purely functional language. A Purrr
program consists of a set of (recursive) macro definitions, and a
``body'' which defines a compilable function with reduced semantics.

It would be really cool if i could get rid of the explicit
'compilation' step, and make everything just declarative.

What i'd like to do is to apply this approach to scheme. Maybe that's
what PICBIT is doing?



Entry: train notes about syntax, semantics and metaprogramming
Date: Fri Oct 12 17:41:09 CEST 2007

I can identify 3 distinct uses of macros:
 - control flow (begin ... again)
 - optimization (1 2 +)
 - explicit meta (using the words '>m' and 'm>')

The latter is actually the same as the first. The 'm' stack is like
the 'r' stack: it is used to implement nesting constructs.


conceptual problem: jumps, can be solved by writing all jumps as
recursion and using higher order functions (combinators). together
with using only a single conditional statement, the solution is to
enable syntax for quoted macros. this leaves:

    * conditional (IF)
    * quoting operation (LAMBDA)
    * dequoting operation (APPLY)


The core ideas behind the macro language are:
    * purely functional (no side effects)
    * everything is first class
    * purely compositional (no syntax)
    
Then, the target language should inherit as much as possible from
these properties:
    * functional word subset (data stack)
    * possibility of HOF (with/without closures) using byte codes?
    * mostly pure compositional semantics, with a little syntax sugar

Construct a powerful metaprogramming system by starting with a pure
language, and making the transition (projection) from pure/ephemeral
-> non-pure/concrete explicit. In Purrr this is the decision to use
macros or words to implement functionality.

Is metaprogramming a form of message passing? Sending "reconfig"
messages?



MERGE TODO:

- check how PLT classes solve name spaces issues + use this for macro namespace.

- fix macro quoting and nesting. maybe write program as list of macros
  instead of 1 macro now? it's isomorphic, but possible to manipulate.

- don't solve nesting in source preprocessor: that is to remain
  regular, and the parser is to be explicit (compilation 'meta'
  stack). maybe this requires a real extensible descent parser?

- check how Factor implements closures

- make interaction words extensible

- check >> and 2/ simulation and partial evaluation words


Entry: notes remarks
Date: Fri Oct 12 17:43:50 CEST 2007

There are only 2 kinds of distinct primitive macros:

  - partial evaluation macros (written in pattern language)
  - nested structures (written in CAT)

Composite (recursive) macros can combine both. This seems to be the
way to explain how things are going + a way to clean up the code a bit
and reduce the number of primitive nesting macros. Apparently, that's
already accomplished.. on the other hand, these are entry points to a
type inference system.. edit: just re-implemented >c and word>c as
pattern matching words.


I am in trouble: I want to explain why I diverged from explicit
Forth--style metaprogramming to move to compositional macro semantics
with partial evaluation, and why at the same time i'm not going full
length: instantiation is still limited to a subset of the full macro
semantics. The thing is: having metaprogramming constructs in the
language disguised as 'compatible semantics' is a good idea: explicit
primitive macros can be reduced quite a lot. So what's the question??


Entry: debug bus
Date: Fri Oct 12 17:57:41 CEST 2007

      - identical clients
      - ad-hoc 1-wire instead of SPI/I2C/async/...
      - host = master
      - binary tree-like physical structure
      - cables/connectors ?
      - multihost: just use shared terminal

EDIT:
maybe an ad-hoc network is best to avoid at first.. let's get
something simpler working before trying crazy stuff.





Entry: quoting macros
Date: Fri Oct 12 19:53:43 CEST 2007

this looks a bit like the final frontier. currently i can't write
Forth in terms of a compositional language. with the current pattern
matching language, it would be trivial to do so if i had a
representation of anonymous macros. basically i want:

	       [ 1 + ] [ 1 - ] ifte

that's easy enough if '[' and ']' are part of a parser preprocessor.
however, anything defined in terms of those, like 'if' 'else' 'then'
needs to be implemented as parser macros also! this complicates
things.. i see only 2 solutions:

	    - implement all nested words as parser words
	    - figure out a way to unify parsers and macros

what about this: allow the use of syntax '[' and ']' as a macro
quoter, but write words like 'ifte' in terms of Forth, instead of the
other way around.

again: i'd like to have an explicit compilation/macro stack lying
around, however, quoted macros are nice to have. this is
non-orthogonal, but does it really matter? i don't know what to think
about this..





Entry: Haskell
Date: Sat Oct 13 15:32:18 CEST 2007

I've been looking for an excuse to use Haskell for something
non--trivial. The demodulator (and unrelated, iirblep filter) might be
a good problem to tackle. OTOH, the real exercise is probably to write
a prototype in Scheme, test it, and then write a specific compiler to
translate that algorithm into C or Forth. So maybe best demodulator in
scheme (see filterproto.ss) and iirblep in Haskell?



Entry: the purrr slogan
Date: Sat Oct 13 18:45:18 CEST 2007

in order to explain what purrr actually is, it is best to set out
these points:


 * Purrr is a macro assembler with Forth syntax. It is implemented in
   a purely functional compositional macro language.

 * Because of the similarity of the procedural Forth language and its
   meta programming language most metaprogramming can be done by
   partial evaluation, blurring the distinction between the concrete
   procedural language, and the ephemeral macro language. In a sense:
   PE is not just an optimization, but an *interface* to the
   metaprogramming language.

 * The PE is implemented as greedy pattern matching macros (is this
   important?)

Entry: removed from purrr.tex
Date: Sun Oct 14 16:52:10 CEST 2007

\section{The Big Picture}

Purrr can be used in its own right, but it is good to note that Purrr
is part of the Brood system, which is an experiment to combine ideas
from Forth, (PLT) Scheme and compositional functional languages into a
single coherent language tower. Purrr can be seen as an
\emph{introspective boundary} in this language tower: the core of
Purrr is to be the basis of this language tree, but the scope of Purrr
is limited to a low--level language with Forth syntax and semantics
and some meta--programming facilities disguised as Forth macros. For
example, it is not possible to access the intermediate functional
macro representation directly from within Purrr at this moment; this
still requires extension of the compiler itself using the Scheme and
CAT languages. This separation between the Purrr language and its
implementation serves to keep the programmer interface to Purrr as
simple as possible, while the details of the language tower are worked
out to eventually lead to a more coherent whole. Purrr by itself is
reasonably coherent, although it is somewhat limited in full
reflective power by this language barrier. Eventually, Purrr should be
just an interface (with Forth syntax) to the low level core of the
compositional language tower in Brood.

Because Purrr is implemented only for the Microchip PIC18
architecture, there is no tested \emph{standard} machine layer: most
functionality is fairly tightly tied to the PIC18. I am confident however,
that refining the split of the current code base into a shared and
platform specific component is fairly straightforward. Due to the ease
with which an impedance match can be created in a Forth like language, I am
refraining from an actual specification of this standard layer until
the next platform is introduced.  By consequence, the border between
the machine model and the library might shift a bit.

Purrr's macro system is the seed for a declarative functional
language. Such a language would have no explicit macro/forth
distinction as in Purrr.


Entry: new ideas from doc
Date: Sun Oct 14 16:52:21 CEST 2007

It looks like things are getting cleaner: by taking this partial
evaluation thing seriously, CAT primitives can be largely
eliminated. Just the words >m and m>, together with some stack
juggling words like m-swap, are enough to implement the whole
language. I just need to clean up a bit more so this idea can be
sealed as a property: no primitives except for a stack!

For documentation purposes it might now even be a good idea to write
most code in compiler.ss and pic18-compiler.ss in Purrr syntax,
leaving only the true primitives in s-expr syntax.  EDIT: that's a bad
idea until the forth syntax can represent everything the s-expr syntax
can.

The remaining cleanup brings me to the backtracking for/next
implementation. With just quoted macros and a 'compile' that executes
macros, this can be removed from the primitives.


Entry: writing lisp code in emacs
Date: Mon Oct 15 01:38:51 CEST 2007

watching slime screencast

* insert balanced paren: M-( with prefix arg


Entry: quoting macros
Date: Mon Oct 15 17:10:44 CEST 2007

Apparently, it was already implemented. I rewrote the for/next
backtracking so now it's expressed as recursive macros, except for the
part that tests the data structure constraint.

I guess what i have now is that compositional language forth
dialect. The only problem is that my Forth parser doesn't support
it. I just need to write some macros to transform code that uses
literal quoted macros into other constructs. Start with ifte:

   ;; Higher order macros.
   (([qw a] [qw b] ifte)
    ((insert
      (list
       (macro: if   'a compile
               else 'b compile
               then)))))
   

/me got big smile now :)


Entry: practical stuff : starting a new project
Date: Wed Oct 17 14:13:13 CEST 2007

I need to make my old 18F452 proto board work again, so this entry is
a seed for a "getting started" doc: how to get from nothing to a
working project.

EDIT: i'm switching to a 18F2620, so doing it over again.

Assumptions:
 * the project is part of (your branch of) the brood distribution
 * you're using darcs version control


1) Make a directory in brood/prj, and add to darcs

      cd brood/prj
      mkdir proto
      darcs add proto

2) Copy the following files from another project. i.e. prj/CATkit and
   add them to the darcs archive
    
      cd proto
      cp ../CATkit/init.ss .
      cp ../CATkit/monitor.f .
      darcs add *

3) Edit the init.ss file to reflect your project settings.

skip steps 4-6 if you have a chip with a purrr bootloader

4) Edit monitor.f for your chip

   That file includes the support for the chip in the form of a
   statement:

      load p18f2620.f  

   Look in the directory brood/pic18 to see if such a file exists. If
   it does, go to step 5).

   If not, you need to create one and generate a constants file from
   the header files provided by Microchip. I.e.:

      cd brood/pic18
      ../bin/snarf-constants.pl \
      		< /usr/share/gputils/header/p18f2620.inc \
      		> p18f2620-const.f

   The .INC file can alternatively be found in the MPLAB distribution,
   in the MPASM directory. 

   Now you need to create the setup file for the chip. Start from a
   chip that is similar

      cp 18f1220.f p18f2620.f

   And edit the file to reflect changes necessary for chip startup and
   serial port initialization.
   		

   Don't forget to add the files to darcs, and send a patch!

      darcs add p18f2620*.f
      darcs record -m 'added p18f2620 configuration files'
      darcs send --to brood@zwizwa.be http://zwizwa.be/darcs/brood

   In case you can't send email from your host directly, replace the
   "--to brood@zwizwa.be" option with an "--output darcs.bundle" option and
   send the resulting darcs.bundle file.


5) To compile the monitor in the interactive console type this:

      project prj/proto
      scrap

6) Make a backup copy of the monitor state.

      cp prj.ss monitor.ss

   And flash the microcontroller using the monitor.hex file.  In case
   you're using the ICD2 together with piklab, the command line would
   be:
   
      piklab-prog -t usb -p icd2 --debug --firmware-dir <dir> \
                  -c program monitor.hex

   Here <dir> is the directory containing the ICD2 firmware, which can
   be found in the microchip MPLAB distribution.


7) Next when you start the console, go back to the project by typing:

      project prj/proto


8) Now you can start uploading forth files using commands like:

      ul file.f

   This will erase the previously uploaded file and replace it with
   the new one. If you want to upload multiple files, use the 'mark' word
   after upload to prevent deletion:

      ul file1.f
      mark
      ul file2.f

   Now the next 'ul' will erase file2.f before uploading a new
   file. To erase files manually, use the 'empty' word.




--- LIVE MODE ONLY ---
bin/purrr
project prj/CATkit
ping


  

Entry: this is a simultaneous fix/todo log for the previous entry
Date: Wed Oct 17 14:28:38 CEST 2007

- add default entries to dictionary on init
- single baud rate spec? mine it from forth source, or the other way around..
- standard naming for the state file?
- for chips that come with a bootloader: need to save the pristine file
- fix state file rep so it is a standard s-expression tagged with 'project'
- fix absolute path
- add 'serial' tag to port
- add a 'chip erase' or a fake one using "mark empty"

the 3 different state files:

    - init.ss         "most empty" state
    - monitor.ss      state file of bootloader only
    - prj.ss	      current state

these names are set as default, but can be overridden.

ok. done.

'monitor.ss' is never written by the application, so ppl with just a
monitor.ss file can revert to just that file (not implemented yet).



Entry: operations on dictionaries
Date: Wed Oct 17 15:34:26 CEST 2007

I'm trying to factor dictionary operations a bit. I already ran into
'collect' which takes a list of tagged pairs, and collects all
occurrences for each unique pair. Doing this stuff purely functionally
becomes difficult if performance is an issue: naive algorithms are
quadratic. Hash tables could accelerate this. It seems overall that
mutation is the thing to choose here..


Trying to write these hierarchical combination things i'm getting
convinced that it's a bit of a mess.. (name . value) pairs are well
defined, but hierarchical structures require polymorphy. To make the
analogy with ordinary functions, basically you're dealing with a
function that maps a value to a value OR another function..

Maybe the whole abstraction is broken?

I need to think about this.. something profound seems to be hidden
here. I'm going to hack around it for now.

I think I get it.. and it's trivial again.

  A hierarchical hash table (HHT) is an implementation of a finite
  function which maps tag SEQUENCES to values. All operations on HHTs
  have the semantics of operations on finite functions.

From this follows that paths need to be created if a value is
stored. It doesn't make sense to have to create the directory before
storing a value. Otoh, storing a value at a tag sequence where one of
the intermediate nodes is not a hash is an error.
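a minimal sketch of the HHT idea (python dicts standing in for hash
tables; the class and method names are mine, purely illustrative):

```python
class HHT:
    """Hierarchical hash table: a finite map from tag *sequences* to values."""
    def __init__(self):
        self.root = {}

    def set(self, path, value):
        node = self.root
        for tag in path[:-1]:
            nxt = node.get(tag)
            if nxt is None:
                nxt = node[tag] = {}      # paths are created on store
            elif not isinstance(nxt, dict):
                raise TypeError(f"{tag!r} holds a value, not a sub-table")
            node = nxt
        node[path[-1]] = value

    def get(self, path):
        node = self.root
        for tag in path:
            node = node[tag]
        return node
```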



Entry: PIC write protect
Date: Wed Oct 17 20:20:23 CEST 2007

write protection works well and all, but i can't get it undone! i
think it works without problem in mplab, but using the piklab
programmer erasing the chip doesn't seem to work...

what is needed is a full chip erase. it doesn't look like piklab is
doing this correctly. on to installing mplab again..

OK i got it: memory that's protected requires a BLOCK ERASE, and such
an operation needs Vdd > 4.5


Entry: macro nesting
Date: Thu Oct 18 14:15:39 CEST 2007

time for the hairy problem: the syntax-rules -> syntax-case equivalent
for macros. what do i need:

there is only one decent way of doing this: use scheme
metaprogramming. i like forth and all, but for numeric stuff, it's
just easier to have variable names.. let's invent some new construct:

\ load a scheme file implementing macros
load-scheme  filename.ss

been hacking a bit, but i need a plan..

* s-expression files contain scheme expressions, not forth files with
  s-expression syntax. this effectively needs a scheme parser down the
  line, something that can convert the inline atoms to proper
  invocation.


what about this: make it possible to load plt modules from
forth. modules are stored as a single s-expression.

hmm... again.. some questions:

* how to store a module definition in the state file, so it can be
  instantiated?

all macros in the '(macro) dict get evaluated using def-macro!, which
does:

  (define (def-macro! def)
    (ns-set! `(macro ,(car def))
             (rpn-compile (cdr def) 'macro:)))

rpn-compile evaluates `(macro: ,def)

so this won't work to store modules. it's probably best to represent
the macros differently in the state file, so it's just scheme code,
and then create a module: evaluator.


that's the main problem: 

* how to store the source of things that generate macros, in this case
  a scheme module, so they can be re-instantiated from the state file.

* do this without introducing ANY limit on what can be included in the
  scheme file.

* without introducing yet another special case. in fact it's probably
  better to remove a special case driven by this requirement.


let's go back to how macros are parsed. ok. they are included as a
(def-macro: <name> . <body>) expression in the atom stream. i guess
this needs to change to include a (def-module: . <body>) form.

why not change the def-macro: thing to a more general def-scheme:
syntax?


(def-macro: name . body)

-> (def-scheme: (def-macro! name body))

or..

have def-macro! support modules. i guess that's the simplest way.

ok.. changed the tag to "extend:" and changed the function that
implements the extension to "extend!"

i'm running into some bad behaviour.. need to formalize



Entry: forth translation
Date: Thu Oct 18 16:58:21 CEST 2007

Time to formalize the forth parsing. Some notes:

- it's actually just a lexer: no nested structures are handled in this
  stage: all is passed to the forth macros, which use the macro stack
  to compile nested structures.

- FILE: the first stage does only file -> stream conversion. this
  includes loading (flattening the file hierarchy)

- PARSE: the second stage does 'lookahead' parsing: all
  non-compositional constructs get translated to compositional
  ones. this also includes macro definitions.

The problem I run into is the FILE stage, which also needs to inline
scheme files, but gets messed up by the forth parser. I just need to
tag them differently.


Entry: error reporting
Date: Thu Oct 18 22:26:25 CEST 2007

using 'error' instead of 'raise' is a good idea since continuation
marks are passed. the rep.ss struct marks CAT words, so something
resembling a trace can be printed. the cosmetics can be done later,
this is good enough for now.

done. maybe want to convert some exceptions that are clear enough back
to raise so they don't print a stack trace. (reserved-word time-out)



Entry: hardware prototyping
Date: Fri Oct 19 11:22:00 CEST 2007

TODO:
- sine wave generation
- debug network
- connect a modulator and a demodulator

the first one seems rather trivial to me, so let's do the network
today. first thing is to give up on an ad-hoc bus: that's ok for
uni-directional stuff, but bidir is a pain. so let's go for something
standard.

i got the samples in yesterday. got them running on the breadboard
with intosc. if we can pull off the project on 8MHz, we can run on
2xAAA cells: the 18LF2620 needs only 2V, but it needs 4.2V @
40MHz. i'm going to stick to intosc for now.

next: I2C

* 2 lines are used: RC3 = clock, RC4 = data, these need to be
  configured properly by the user. on the 18F2620 their only other
  function is digital IO.

* registers:
   - SSPBUF = serial data I/O register    
   - SSPADD = device address
   - SSPCON1, SSPCON2, SSPSTAT = control registers

* errors:
   - write collision

* firmware controlled master mode: seems it's just more work, so never
  mind..


Entry: TODO
Date: Sat Oct 20 13:30:29 CEST 2007


- get I2C working between 2 18F2620 chips on breadboard at intosc, as
  fast as possible.

- fix purrr.el : stupid broken indentation is annoying the hell out of
  me. clean up the file first, then automate the indentation rules
  generation etc..


Entry: message passing interface
Date: Sat Oct 20 14:00:30 CEST 2007

Since I2C is a shared bus architecture, care needs to be taken to
place operations in a sane highlevel framework. The interface i want is
asynchronous message passing. Messages should either be bytes, or a
sequence of bytes (in which case 'message' contains the size, and the
a/f regs contain the message)

      message address i2c-send

Let's suppose for now there is only a single process per machine, and
build multiple process dispatch on top of single process. 

To do this bi--directionally, an event loop needs to poll for
messages. Dispatching of highlevel messages (internal addresses) can
be done as a layer on top of single message passing. So i need a send
and receive task, and make sure they don't collide

   * it's always possible to RECEIVE, so that should be the background
     task. this simply waits until a message arrives.

   * it's only possible to SEND if the bus is free, so a SEND might
     block.
    
The problem is that a message might come in while waiting to send out
a message. Therefore messages need to be queued. The moral of the
story:
  
     A send can never block a process, only a receive can.

So what is a task? It is a function that maps a single input message
to zero or more output messages. The output can be zero in a
meaningful way, because the task has internal state. So basically, a
task is a closure, or an object.

The driver routine can be a single task, since the hardware is
half-duplex. See pic18/message.f for the implementation attempt.
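what "a task is a closure" means, as a toy python sketch (the counter
task is made up, just to show state surviving between messages):

```python
def make_counter_task():
    """A task maps one input message to zero or more output messages.
    Internal state lives in the closure, which is what makes an empty
    output list meaningful: the state still advanced."""
    count = 0
    def task(msg):
        nonlocal count
        count += 1
        if count % 3 == 0:       # only every third message produces output
            return [('ack', count)]
        return []                # zero output messages
    return task
```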

Something to think about: the ISR needs to be completely decoupled
from the tasks that generate output messages. This is the whole point
of buffering: if there is straight line code from RX interrupt ->
computation task, the tasks that might run a long time will not be
pre-empted. So:

      The RX ISR and the dispatch loop are distinct.

what it looks like (yes i need to pick up hoare's book again..)

\ Message buffering for a shared bus architecture. The topology looks
\ like this:

\                            wire
\                              |
\                              | G
\         A             E      v      F
\  wire ----> [ LRX ] ----> [ LTX ] ----> wire
\                |             ^
\  . . . . . . . | B . . . . . | D . . . . . . .
\                v      C      |
\             [ HRX ] ----> [ HTX ]
\
\ Code above the dotted line runs with interrupts disabled, and
\ pre--empts the code below the line. Communication between the two
\ priority levels uses single reader - single writer buffers. The 6
\ different events are:
\
\ A) PIC hardware interrupt
\ B) RX buffer full condition
\ C) TX buffer full condition (execute task which writes to buffer)
\ D) wakeup lowlevel TX task from userspace
\ E) wakeup lowlevel TX task from kernelspace
\ F) PIC hardware send
\ G) wakeup lowlevel TX task from bus idle event
\
\ A task is an 'event converter'. The 4 different tasks are:
\
\ LRX) convert interrupt (A) to rx buffer full (B) and tx wakeup (E)
\ HRX) convert rx buffer full (B) to tx buffer full (C)
\ HTX) convert tx buffer full (C) to tx wakeup (D)
\ LTX) convert wakeup (data ready: D,E) to hardware send.
\
\ The pre--emption point is A: this causes no problems for the
\ low--priority task because of the decoupling provided by the receive
\ buffer. The only point that needs special attention is the LTX task,
\ which can be woken up by different events D, E and G, and care needs
\ to be taken to properly serialize message handling. To do this, both
\ D and E should invoke LTX with interrupts disabled. For E this is
\ trivial: just call the LTX task, for G it is already ok since it's
\ an isr, so D needs to explicitly disable interrupts.
\


     	     

Entry: todo today
Date: Sat Oct 20 15:52:49 CEST 2007

- write highlevel buffer code and try out with current serial before
  moving to I2C

- write mini 'hierarchial time' tutorial for sheepsint

- check mail just sent to technocore for details of the next couple of
  days.

haha.. did none of them :) i suck at planning. what i did do is to
write a synth tutorial that is an introduction to the hierarchical
time thing + some explanation of a pattern language. what this doc is
leading me to is the need for some kind of dynamic variable binding
for code words: i already have 'hook.f' but something more general
should be used. something which directly deals with variables.


Entry: re-inventing C++
Date: Sat Oct 20 16:11:22 CEST 2007

i'm running into the need for polymorphy: i want to express generic
algorithms in a sane way. because of the philosophy of purrr, this has
to be done in a static way, with dynamic built on top of that later
maybe.

Oops, this is going to lead to a whole lot of doubts about namespace
management.. Let's concentrate on the practical issues first.

EDIT: i'm going for name mangling.. see below.


Entry: hierarchical time
Date: Sat Oct 20 18:13:38 CEST 2007

One thinking error i made is: if a note word is SYNC followed by
CHANGE, then you can't compose words that start at the same sync. as a
result, SYNC needs to follow CHANGE, and the toplevel invocation needs
to provide proper synchronization.




Entry: the 'i' stack
Date: Sat Oct 20 23:14:45 CEST 2007

what about this: i'm using an extra byte stack, and 'x' is a symbol
that's useful in other contexts.. why not call the stack the 'i'
stack, since it's already used as a loop index in for .. next loops?

hmm.. great idea, but not really feasible without an automated
identifier replace.. it's everywhere.



Entry: dynamic words
Date: Sun Oct 21 00:50:39 CEST 2007

basically, i need to find words to properly handle execution tokens.
there are 3 uses for a symbol related to dynamic code:

      * declare
      * invoke
      * change behaviour

if it's a variable, invocation will be explicit: because i don't want
the thing on the stack, an extra level of indirection should do it:

    2variable BLA
    BLA invoke
    : changeit BLA ->  ...... ;

another possibility is to use a parser word, which i'm not so keen on
using.

what syntax is better depends on the usage: do invocations dominate,
or do behaviour changes? i used the "->" word in ForthTV to set the
display task: that's a single vector, invoked in only one place, but
muted in a lot of places. let's go for this approach. results in
vector.f (hook.f is basically the same, left there for forthtv)
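A rough Python model of the vector idea (hypothetical names; vector.f itself is Purrr Forth): one cell of indirection per word, so invocation stays explicit and rebinding ("->") is cheap.

```python
# A "vector" is one cell of indirection holding an execution token.
# Hypothetical model of vector.f: declare, invoke, and re-bind ("->").

vectors = {}

def variable(name, word=lambda: None):
    vectors[name] = word            # declare: 2variable BLA

def invoke(name):
    return vectors[name]()          # BLA invoke

def rebind(name, word):
    vectors[name] = word            # : changeit BLA -> ... ;

variable('BLA')
rebind('BLA', lambda: 'display task A')
print(invoke('BLA'))                # -> display task A
rebind('BLA', lambda: 'display task B')
print(invoke('BLA'))                # -> display task B
```

This matches the ForthTV usage pattern: one vector, invoked in a single place but mutated from many.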


Entry: todo
Date: Sun Oct 21 16:15:32 CEST 2007

- hierarchical time
- highlevel buffer code (requires some polymorphy)
- tonight: fix purrr.el, clean up stuff in doc/



Entry: hierarchical time
Date: Sun Oct 21 16:37:48 CEST 2007

so what's the problem?

you want to have a class of words which "snap to" a timing grid, but
you want to be able to call a collection of fine scale words from
coarse scale words, without messing up the sync. the problem is that
if you do:

  : foo   8 sync-tick bar bar ;
  : bar   7 sync-tick .... ;

there are too many waits: "8 sync-tick" followed by "7 sync-tick"
waits for the next 7-scale tick.

somehow the sync word needs to know that the current time is already
ok. either:

  * assume that the caller does the outer bounds, and have callees do
    only subdivision. this works, but is cumbersome.

  * find a way to see that we're running synchronized.


how can a 0->1 transition in bit n be recognized in the bits < n?
they're all 0. but that's not very helpful.

damn i need coffee.

the question to ask is: did we recently sync? this can be answered by
copying the whole counter register to some place, and computing the
diff. this also allows triggering on edges.

what about this: use some dynamic scoping for syncing. there is only
one word 'sync' which will synchronize on clocks given the current
time scale. 


for each time scale one needs:

        a word that can compute the current phase count. this needs a
        bit offset and the last sync point. bit offset might be easily
        stored as a bit pattern.

global:
	the counter
	the last sync point

\ compute time difference from last saved sync point, using mask to
\ ignore fine scale.
: sync-diff
    sync-counter @
    sync-last @ -
    sync-mask @ and ;
macro
: sync-inphase?
    sync-diff nfdrop z? ;
forth    
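The masked-difference computation above, modeled in Python (hypothetical register names, 8-bit counter assumed): the diff between the free-running counter and the last saved sync point, with the fine-scale bits masked off, is zero exactly when less than one coarse tick has elapsed.

```python
def sync_diff(counter, last, mask):
    """Time difference from the last saved sync point, masking off
    the fine-scale bits (8-bit wraparound counter assumed)."""
    return (counter - last) & 0xFF & mask

def in_phase(counter, last, mask):
    return sync_diff(counter, last, mask) == 0

# coarse grid of 8 fine ticks: mask off the low 3 bits
assert in_phase(0x47, 0x40, 0xF8)      # < 1 coarse tick since last sync
assert not in_phase(0x48, 0x40, 0xF8)  # a coarse boundary has passed
```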

actually that doesnt solve anything.. it's quite easy to wait until a
condition changes, but it's a lot less easy to determine whether the
condition just happened.

really, the only thing i see is to have patterns like this:


_|_|_|_

which can be nested in larger scale patterns like

_______|_______|_______|_______

_|_|_|_ _|_|_|_ _|_|_|_ _|_|_|_


there the first and last syncs are removed, and only the subdivision
is synced to. it's then the responsibility of the caller to turn things
on and off.

it looks to me that this is a real pain to work with.. maybe i should
just write a couple of words and see if it's actually sane to get
something working.. one thing i thought about was to bind the current
sync level to the word "|"

: hihat [[ noise 10 for | next ]] ;

where the [[ and ]] save and restore the synth config on the x
stack. that's 7 bytes per level, which is a bit too much probably, so
stick with manual saving/restoring.


ok. there really is only one decent solution: escape continuations. in
order to make proper use of synchronization, the caller needs to
indicate how long a word is allowed to last.

now, instead of that, think of there being only one voice at all times,
which simply accepts events from a separate entity. so the synth looks like:

[ CONTROL ] -> [ VIRTUAL SAMPLE PLAYER ] -> [ CORE SYNTH ]

each virtual sample is a word that loops forever. this requires multitasking.




Entry: generic functions
Date: Tue Oct 23 16:09:09 CEST 2007

When trying to implement the buffer algorithm, i ran into the need for
abstract objects: each buffer (queue) is going to have the following
interface:

  read
  write
  read-ready?
  write-ready? (maybe.. in case buffer-full condition is used..)

I have enough with a static object system: anything dynamic has to be
handled explicitly on top of that using byte codes (route) or vectored
words. So what is needed is simply a static (compile time) method
dispatch.

Should there be special syntax for messages, or do we just use a
single flat namespace, with some words dedicated as messages? For
example: 'read' could be such a message: always requiring a literal
object. This seems simplest, let's try that first and change it if it
is not appropriate. So:

   - WHERE is 'read' defined
   - HOW is 'read' defined

Suppose we use a 'method' keyword for creating new methods. This
probably trickles down to making the parser also generic. Let's use
CLOS terminology.

So what am I doing?

    I am providing a means for static namespace management so I can
    write generic algorithms (as macros). As of this point NO effort
    is made to implement dynamic generic algorithms: this should be
    built on top of the static version.

My approach is going to be very direct: if more abstraction is needed
i will fix it later. Currently multiple dispatch is not yet
implemented. The interface should be:


    class BLA	          \ create a new object (a macro namespace)
    method FOO	          \ declare a new method object
    BLA method: FOO ... ; \ define a new method FOO of object BLA
    BLA FOO	    	  \ invoke method FOO for object BLA


So, how to implement.. This was the easy part:

   ;; Dictionary lookup.
   (([qw tag] [qw dict] dict-find) ([qw (dict-find dict tag)]))
   

Now the thing to do is to store the dictionary somewhere. This has to
mesh with the macro definition part of purrr.. let's see (using s-expr
macro definition syntax on the rhs)

     class BLA	    == 	       (BLA '())
     method FOO	    ==	       (FOO 'FOO dict-find compile-message)

Here 'compile-message' depends on what's exactly stored in the
dictionary: macro objects or a mangled symbol. It's tempting to just
go with symbol mangling: that way ordinary syntax can be used, and
interface to the rest of the language is really straightforward.

Let's go for the simple symbol mangling, which doesn't even need
dictionaries:

     A class is a collection of methods. Classes are identified by a
     symbol. A method is a macro which dispatches to another macro
     based on the symbol provided.

     class BLA      ==         (BLA 'BLA)
     method FOO     ==         (FOO 'FOO dispatch-method)
     
     : BLA.FOO ... ;

     FOO BLA        ==         'FOO 'BLA dispatch-method
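The mangling scheme sketched in Python (an informal model, not the Scat implementation): a method FOO invoked on class BLA just resolves to whatever macro is named "BLA.FOO", so no dictionaries are needed beyond the ordinary namespace.

```python
# Static dispatch by name mangling: "FOO BLA" resolves to the macro
# named "BLA.FOO". Hypothetical Python model of the scheme above.

macros = {}

def define(mangled, body):
    macros[mangled] = body                       # : BLA.FOO ... ;

def dispatch_method(method, cls, *args):
    return macros[f'{cls}.{method}'](*args)      # FOO BLA

define('BLA.FOO', lambda x: x * 2)
print(dispatch_method('FOO', 'BLA', 21))         # -> 42
```

Since the mangled name is an ordinary identifier, the interface to the rest of the language stays straightforward, which is the whole point.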


Entry: problem in macros defined in forth syntax: quote doesn't work properly
Date: Tue Oct 23 17:43:02 CEST 2007

suppose i want this:

	: broem  ' broem ;   ==  (broem   'broem)

how to do that? currently this just gives an infinite expansion because
the quote is not recognised. why? because inside the 'definition'
parser, the parsing words won't work.. this is probably a good thing,
but quote does need to work.. let's separate parsing words from quote
parsing.

the lex stream should be made a bit more clear.

FORTH -> [load flattener] 
      -> [forth stuff: parsing words + definer environments] 
      -> [quoting] 
      -> SEXP


Entry: locals for macros?
Date: Tue Oct 23 20:08:46 CEST 2007

Once more than 50% of a macro's code is stack juggling words,
something needs to be done about it. The macro below is a typical
'multi-access' pattern: an EXPANSION instead of a CONTRACTION.

\ transfer bytes from one object to another
macro
: need not if exit then ;
: m m-dup m> ;    
: transfer-once  \ source dest --
    swap >m >m

    ' ready? m msg need m-swap
    ' ready? m msg need m-swap
    ' read   m msg m-swap
    ' write  m msg m-swap

    m-drop m-drop
    ;
forth

What i really want is a locals syntax for macros that perform a lot of
expansion:

: transfer-once
     { src dst }

     ' ready? src msg need
     ' ready? dst msg need
     ' read src msg
     ' write dst msg
;
    
The macro system already has a syntax for locals, so i just need to
add this to the parser + choose the right semantics (code or data).


EDIT: also, what about just . (dot) for name binding operation?


Entry: locals
Date: Tue Oct 23 21:53:04 CEST 2007

Actually i did this before. I guess in brood-2 there's a syntax that
takes words like this:

      (a b | a b +)

Resembling Smalltalk's syntax for anonymous functions. i just saw
Factor also uses the vertical bar.

What i could do is to combine this with my special quoting syntax:

(a | a)      == execute
(a | 'a)     == identity

Following the rationale that words are mostly functions, and constant
functions are the exception.

This kind of syntax took me a while to get used to, but it makes a lot
of sense: it has led to a lot of simplified mixing of scheme and cat
code.

So what about combining that with destructuring?

       ((a . b) | 'a 'b +)

Hmm.. Let's leave that as an extension. There's no reason not to
however..

I think I need a dose of good old-fashioned confidence to go for the
quoted approach. What is more important: to stay true to the fact that
symbols are functions, or to go for the lambda-calculus approach of
using symbols as values + explicit application.

Even though it looks strange, the issue is: do i stick with my
previous realization that this is a good thing despite its strange
look. So the choice is either (classic):

      (a b | a b +)   ==  +
      (a | a execute) ==  execute

this has the interesting property that permutations are easily
expressed. or do i go with my approach

      (a b | 'a 'b +) == +
      (a | a)         == execute


What I could do is to use 2 forms of binding, and i guess that's what
i did before. have | do the stuff above and || do the normal thing,
or the other way around.

      (a : a)  == execute
      (a | a)  == id
      (a : 'a) == id

using the ':' has the added benefit of reminding you of a
"definition".
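The difference between the two conventions can be made concrete in Python (an informal model of the binding semantics only, not Scat syntax): in the classic lambda reading a bare bound name pushes its value; in the quoted reading a name denotes a function, so bare use means application and quoting suppresses it.

```python
# Informal model of the two binding conventions (not actual Scat code).

# classic: (a b | a b +) == +   -- bare names push values
def classic(a, b):
    return a + b

# quoted: (a b | 'a 'b +) == +  -- names denote (constant) functions,
# so a bare name means "apply"; the quote suppresses the application
def quoted(a, b):
    return a() + b()

assert classic(1, 2) == 3
assert quoted(lambda: 1, lambda: 2) == 3
```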




Entry: lambda
Date: Wed Oct 24 23:11:02 CEST 2007

Having had a night to sleep on it, i think it's going to be:

       (a b | a b) == id


* Lambda is simply too important to gratuitously do differently.

* Data parameters are used more than function parameters, which in
  turn are easily quoted.

* It is compatible with current stack comment notation.



Entry: implementing lambda
Date: Thu Oct 25 13:20:50 CEST 2007

apparently i need to be careful where to introduce local variables in
the syntax expansion. as long as there's a lambda expression enclosing
a (xxx: a b c) macro, all lexical variables are identified properly,
but in this case they are not:

  (define (bar? x)
    (eq? '\| (->datum x)))
  
  (define (represent-lambda c source)
    (let-values
        (((formals pure-source)
          (split-at-predicate bar? (syntax->list source))))
      #`(make-word
         '#,((c-language-name c) c)
         (quote #,source)
         (lambda
             #,(if (null? formals) #'stack
                   #`(#,@(reverse formals) . stack))
           #,(fold (lambda (o e)
                     (dispatch c o e))
                   #'stack
                   pure-source)))))

the 'dispatch' operation doesn't recognize lexical variables yet,
because the enclosing lambda macro hasn't updated the symbols.. so
lambda syntax should be introduced at a higher level.

i need a shortcut, only for macros, and then work up the abstraction
if necessary. the thing to extend is the 'macro:' form itself.

hmm.. i'm making a bit of a mess of it..

the lexical scoping for the macros is a bit special, and is probably
best handled using the pattern matching transformer stuff: the lexical
variables in macros should be bound to literal arguments in the
assembly buffer.

  (a b | a b +)

   ->

  (([qw a] [qw b] it)  (insert (list (macro: 'a 'b +))))

which is really awkward in the current composition.. it's probably
easiest to make a special purpose matching word as a straight lambda
expression. something like:

  (match stack
         (((('qw b) ('qw a) . rasm) . rstack)
          (let ((a (literal a))
                (b (literal b)))
            (apply (macro: a b +)
                   (cons rasm rstack)))))

	 
Actually.. This is quite universal, except for WHERE to find the
arguments.. Anyways, let's get on with it.

  (make-word
   'macro-lex:
   '(a b \| a b +)
   (match-lambda*
    (((('qw b) ('qw a) . rasm) . rstack)
     (let ((a (macro: 'a)) (b (macro: 'b)))
       (apply (macro: a b +) (cons rasm rstack))))))


The first macro using lexical variables in synth-soungen.f

  macro    
  : sync bit | \ --
      begin yield bit tickbit low?  until
      begin yield bit tickbit high? until ;
  forth    

Subtle ay :)



Entry: theory
Date: Thu Oct 25 21:07:55 CEST 2007

in order to finish brood.tex, it looks to me that type theory is not
really the most important thing to brush up on: partial evaluation
is. there's a lot of stuff here:

  http://partial-eval.org/techniques.html

i need to give it some proper attention, if only to relate my intuitions
to things people have spent some thought on.


Entry: multiple exit points
Date: Thu Oct 25 21:48:31 CEST 2007

instead of writing macros containing 'exit' which are really a loaded
gun, it might be better to write a proper while abstraction that uses
multiple conditions. unfortunately, an 'and' is not very easy to
optimize..

  macro
  : need not if exit then ;
  : m m-dup m> ;    
  : transfer; src dst | \ --
      
      begin
         ' ready? src msg need 
  	 ' ready? dst msg need 
  	 ' read   src msg
  	 ' write  dst msg
      again   

      ;
  forth

why is this complicated: because i don't want to use 'and'. what i
want is a word 'break' which breaks from a loop on a condition. maybe
'transfer;' is good enough: since i already have arbitrary WORD
exit points, i can use this to get any control structure exit point: it
also prevents juggling of the control stack (macro stack).


Entry: move
Date: Thu Oct 25 22:19:09 CEST 2007

for this i need 2 pointer registers. thing is: i'd like to use the x
stack's register to do this a bit efficient, but then i can't use for
.. next !

implementation detail anyway.. 


Entry: buffers
Date: Fri Oct 26 13:04:40 CEST 2007

next on are data buffers. i have some code that uses 14 byte buffers
together with some dirty trick of storing read/write pointers in one
byte for easy modulo addressing. i could dig that up again?

what is a buffer?
     - 2 pointers: R/W
     - base address of memory region (statically known)
     - size (statically known)

suppose i represent it as 2 literal values:  rw-var offset

see buffer.f for draft (committing now)

but..

isn't it wise to write some code for generic 2^n buffers? where a
buffer consists of 2 variables, a mask indicating its size. ok, did
that but it leads to more verbose code.

a different strategy could be to store the read pointer or difference
at the point where W points, this saves a cell that's normally used to
distinguish between empty and full. hack for later..

anyways, i stick with the current: it's probably good enough. i need to
move on.

nibble-buffer.f tested.
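The 2^n variant can be sketched like this (Python model with hypothetical names, 8-bit indices assumed): free-running read/write indices are masked on access, so empty/full falls out of the index difference without any extra flag cell.

```python
class Buf2n:
    """Ring buffer of size 2^n: free-running 8-bit R/W indices,
    masked on access; no separate empty/full flag needed."""
    def __init__(self, n):
        self.size = 1 << n          # must divide 256 for 8-bit indices
        self.mem = [0] * self.size
        self.r = self.w = 0

    def ready_read(self):  return (self.w - self.r) & 0xFF != 0
    def ready_write(self): return (self.w - self.r) & 0xFF != self.size

    def write(self, b):
        self.mem[self.w & (self.size - 1)] = b
        self.w = (self.w + 1) & 0xFF

    def read(self):
        b = self.mem[self.r & (self.size - 1)]
        self.r = (self.r + 1) & 0xFF
        return b

q = Buf2n(3)                        # 8-byte buffer
for i in range(8): q.write(i)
assert not q.ready_write()          # full
assert [q.read() for _ in range(8)] == list(range(8))
assert not q.ready_read()           # empty again
```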

     

Entry: 0= hack
Date: Fri Oct 26 13:29:59 CEST 2007

i'd like to figure out a way to efficiently implement the 0= word,
which turns a number into a condition. the problem is that 'drop'
messes up the zero flag, so i used a 2-instruction movff trick
before.. but using drop should be possible when using the carry flag.

hmm.. nfdrop is only 2 slots.. i don't think i can do better really.


Entry: I2C comm
Date: Fri Oct 26 16:16:20 CEST 2007

how to get this going? the typical 'debug the debugger' problem: I2C
is going to be used for the debugging network, but until that works..


master/slave:

  to preserve symmetry, it might be wise to use a dedicated single
  master node which runs debug code, so all the kriket nodes can be
  identical (slaves).

  ideally, all cricket chips are free from ICD2 and SERIAL ports, and
  have only power, ground, and I2C clock and data.


send/receive:

  let's stick with the ordinary monitor protocol over I2C. the thing
  to do is to make a hub.


Entry: SD-dac
Date: Fri Oct 26 22:07:00 CEST 2007

A Sigma-Delta Modulator (SDM) can be thought of as an
error-accumulation DC generator: given a constant input, it will
generate the correct average DC output, with a quantization
error noise spectrum that is high--pass.

A first-order SDM is an extremely simple circuit: it consists of an
accumulator with a carry flag output: at each output instance, the
current input value is added to the accumulator, and the resulting
carry bit is taken as the binary output and then discarded.
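A minimal sketch of that first-order SDM in Python (8-bit accumulator assumed): accumulate the input each tick and emit the carry; over a full cycle the 1-bit stream averages exactly to the input value.

```python
def sdm(value, n=256, bits=8):
    """First-order sigma-delta: add the DC input to an accumulator
    each tick, emit the carry bit, keep only the low 'bits' bits.
    The bitstream's average duty cycle converges to value / 2**bits."""
    acc, out = 0, []
    for _ in range(n):
        acc += value
        out.append(acc >> bits)     # carry flag of the 8-bit add
        acc &= (1 << bits) - 1      # discard the carry
    return out

bitstream = sdm(64)                 # DC input of 64/256 = 0.25
assert sum(bitstream) == 64         # exactly 1/4 duty over 256 ticks
```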

I had this idea of running an 'inverse interrupt' machine: instead of
losing time in an ISR, just run an infinite loop, but allow at each
instance one primitive to run, which needs to spend an exact amount of
cycles. Probably not worth the hassle, but could be interesting for
a really tight budget.

Anyways, this could be an alternative to PWM for kriket sound
generation. It should in theory give better quality. but probably that
also needs a deeper accu. With fast interrupt it's only 3
instructions:

	movf	OUTPUT, 0, 0
	addwf	ACCU, 1, 0
	rlcf	PORTLAT, 1, 0

assuming it's bit number 0 in port, and the rest of the bits we don't
care about (i.e. are inputs)

the problem here of course is that it's not just output that counts:
the output also needs to be computed.

Looks like it's not really worth it. Best to use PWM interrupt with
plugin generator code. At 2 MHz, to get the carrier above audible
frequencies would put the divider at 64, and the carrier at 31.25
kHz. (The interesting thing here is that it could also be used for
bit-bang midi output at the same time :)

To get this going: best to add a small modification to sheepsint to
switch it into PWM mode.


Entry: FM sheep
Date: Fri Oct 26 23:16:24 CEST 2007

ok.. let's see what's necessary to make an FM (PM) synth in style of
Yamaha oldies. using a proper synchronous fixed time sharing approach
a lot is possible:

1. 31.2 kHz  1  x SDM output
2.  7.8 kHz  4  x 8 bit synth voices
3.  0.98 kHz 32 x envelopes

for all this i have 64 instructions.
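The time-sharing arithmetic checks out like this (a Python sanity check, assuming the 2 MHz instruction rate used elsewhere in this log); the per-envelope rate comes to just under 1 kHz.

```python
# sanity check of the fixed time-sharing budget
f_instr = 2_000_000              # instruction rate (2 MHz assumed)
f_sample = f_instr / 64          # one SDM output per 64 instructions
assert f_sample == 31_250        # 31.25 kHz carrier
assert f_sample / 4 == 7_812.5   # ~7.8 kHz per voice, 4-way shared
assert round(f_sample / 32, 1) == 976.6  # ~0.98 kHz per envelope
```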

one envelope per operator is more than enough. i've been checking out
the code for table lookup, and it can be brought down to 4
instructions

movf PHASE
movwf TBLPTRL
tblrd
movf TABLAT

but i doubt if 8 bit phase resolution will be enough..




Entry: hub board
Date: Sat Oct 27 15:07:51 CEST 2007


make hub board, first for serial, then for I2C. the idea is that a hub
board can be placed in between a normal serial board and a PC host:
its only goal is to provide control over the serial slaves.

the condition is that all slaves have identical code, which means that
host indeed can switch without problems between different slaves:

[ PC ] --- [ HUB ] === [ S1 ] === [ S2 ] === ...

requirements:

  * the interface that implements this should be transparent: there
    should be no need for calling code on the hub directly. (except for
    debugging the hub where the host has just hub's dictionary).

i suggest that to do this we use the next slot of 16 interpreter commands
to pass through monitor commands to the hub.

again: if i manage to get things working this way (async serial hub) i
have no need for I2C to do networking.. in fact, in order to get I2C
working i better build a proper debug network!

and more: if i get this serial passthrough to work, moving to a
synchronous 1-wire approach should be no problem.

ok, i have 50 solutions now..

TODO:
	- make it work for serial = standard
	- use serial to bootstrap 1-wire
	- MAYBE use I2C after that, probably too complicated


Entry: 1-wire revisited
Date: Sat Oct 27 15:47:27 CEST 2007


yes, why not.. it's a cheap hack but might be worth it. and i already
have provisions for it on the CATkit board, so the solution should be
re-usable. (CATkit: COMM is RA4).

let's stick to the ordinary monitor protocol with RPC semantics: (host
asks question, client responds / acknowledges). this is already
half-duplex, so fits nicely in a shared bus context. a simple start
bit, 8 data bit, stop bit could be used for comm using the following
waveform:

	  1 1 X 0 1 1 X 0

with X the 'shared bus' point, we can have a bidirectional link:

    * there's always power in a cycle (at least 50%)
    * bus is high when idle
    * there's a sync point 0->1 for slave sync
    * the send/receive is software control

protocol could be something like:

    * master just sends (start bit, 8 data bits, stop bit)

for the CATkit board, the sync could replace the fixed TMR2. let's try
the following:

    * fix CATkit's no-serial cable detection. (OK)
    * drive a CATkit board with a square pulse
    * use TMR2 to perform timed read or write

next: config RA4
    * open drain output (needs external pullup - master side?)
    * does have protection diodes to both sides

so, in theory it should be possible to feed the chip through the
protection diodes.. but as far as i can see, it doesn't boot
properly. after adding a diode RA4 -> VDD it boots on DC. i don't
understand..

so, on to the controller. from the host side, everything is
synchronous. so timing should not be an issue. driving a couple of
busses in parallel poses no extra problems.

hardware: the dallas 1-wire bus apparently drives the targets through
a resistor, instead of a transistor. i was wondering how to prevent
hazards on the bus, and this is probably it: brief inspection shows
that a faulty client can bring down a network easily by shorting
during charge phase. a resistor also limits the charging current. so i
guess resistors are good. (i wonder if the weak pull-ups can perform
this task.. probably better not.)

pic has quite a large maximum current sink (25 mA), which would
determine the minimal size of the pullup resistor, i.e. at 5V the
minimal is R = 200 ohms.

simplifications WRT dallas 1-wire:

  * one slave per wire: no elaborate synchronization protocol
    necessary: all flow control is done in software using the purrr
    protocol. (host initiates transfer by sending a couple of bytes
    and waits for reply)

  * multiple slaves: they need to be addressed. in that case, some
    protocol is necessary. i.e. addr = 0: broadcast, no
    reply. otherwize: address followed by a couple of data bytes.

  * can use a 4-phase regime 10XY, where the receiver samples
    in between X and Y.

  * in case no comm is needed: master leaves line high: no unnecessary
    drain when pulling the resistor low.
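One plausible reading of the 4-phase 10XY cell, sketched in Python (made-up framing helpers, not the actual hub.f code): a fixed 1,0 preamble per bit provides the sync edge and bus power, the data level is held for the two shared-bus phases, and the receiver samples in between them.

```python
def encode(bits):
    """Encode each bit as the 4-phase cell 1,0,X,X: a fixed 1->0 edge
    for sync/power, then the data level held for two phases."""
    out = []
    for b in bits:
        out.extend([1, 0, b, b])
    return out

def decode(phases):
    # the receiver samples in between X and Y, i.e. at the X/Y boundary
    return [phases[i + 2] for i in range(0, len(phases), 4)]

msg = [1, 0, 1, 1, 0]
assert decode(encode(msg)) == msg
# every cell starts with a 0->1 transition from the previous cell's
# tail, so there is always power and a per-bit sync point
```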



using RA4 on the 18F1220. and for sending? can probably use an 18F1220
as a hub too, if it uses just one output. which output to use? only
RA4. maybe one bus is really enough? this way i could use simple RCA
splitter cables to build a network.

ok, i thought i needed an open drain output. apparently not: just
switching between 0 and Z is enough.


Entry: CATkit/krikit debug board
Date: Sat Oct 27 21:39:41 CEST 2007

* in debug mode: one bidirectional power/clock/data per slave (raw
  byte protocol: no address). this makes it a drop-in for the normal
  async serial io for the monitor. in 'midi' mode the port can easily
  run unidirectional shared. bidirectional shared is a software
  problem that can be solved later.

* using the 18f2620 for driver. the package is small enough to be
  practical. it can run without xtal at 8Mhz and has on board i2c for
  more elaborate networking later on. it has enough pins to add some
  status output. (i.e. RGB led)

* port B is used for communication. RB4:RB7 have interrupt on change,
  so could be used for more elaborate slave comm later. 

* running CATkit on a full line through 1k gives a 2V drop = 2mA.:
  that sounds about right. since this is low bw debug comm, it should
  be possible to just leave the line idle = high. that means no clock
  is coming in.


so what about this: 

 - run CATkit TMR2 at a higher rate, i.e. 31.25 kHz. this would give
    * a decent timebase for SD sample tests
    * a 7.8 kHz bitrate for debug comm
    * ability to send MIDI data from CATkit board


I wrote the code for the network debugger. The 4-phase modulator and
receiver transmitter framing words are done and tested. The remaining
thing is how to switch between receiver and transmitter. Probably
something like this:
    
    - start with receiver
    - receiver gets idle -> check tx buffer -> tx / rx
               gets data -> start rx state machine
    - transmitter stop -> check tx buffer -> tx / rx
                  data -> tx state machine

these can be taken into one loop, and activated depending on rx/tx
flag.
         

Entry: sheepsint urgent todo
Date: Sat Oct 27 23:06:44 CEST 2007

LIST MOVED DOWN


Entry: nasty sub bug?
Date: Sun Oct 28 14:32:32 CET 2007

the following code leads to incorrect asm:

    123 @ 124 @ -

	dup	
	movf	123, 0, 0
	subwf	124, 0, 0

that should be subfw ??

the problem is in "123 @ -"

took the - and -- words out of the 'binary' meta patterns and fixed.


Entry: rtx
Date: Sun Oct 28 16:41:44 CET 2007

looks like it's +- working, at least the transmitter.
one little problem still, if client syncs to 0->1 transition, what
happens when it picks up in the middle of a data stream? suppose #x55
which is just a bunch of:

0100 0111 ...

syncing to the right frame is not a problem: per bit there is only one
0->1 transition to sync to. so the problem is that each client should
start with an idle line. it's the same problem as async serial.

so..

receiver for sheep. let's stick with a RX state machine only. the deal
is this:

- interrupt on change: detect 0 -> 1
     reset TMR2 + RX state machine

all logic from hub.f can be re-used, except for the top sequencer,
which should be
      route   ; ; ; rx-bit ;



Entry: comm on catkit
Date: Sun Oct 28 17:38:26 CET 2007

there are 2 ports left: RA4 RA6

both are not very interesting: no interrupt on change, or interrupt
facility. interrupt pins that can be reused are:

	  RB5 (INT1/TX)
	  RB2 (INT2)     not without cutting traces or removing R8
	  RB0 (INT0)     not without removing last pot
	  RB5-RB7 (KBI)  multiplexed with switches
          RB4 (KBI0/RX)

can it be done using polling only? i.e. manually synchronize on each
start bit or something. need to think a bit more, but it looks like
manually polling is going to be problematic. the easiest thing is
really RB2/INT2: it's a proper interrupt, and its functionality is not
used atm.

maybe i should leave catkit out of it and try to get it to work on
krikit first.. catkit needs an update anyway, and this could be a nice
addition. reminder:

    - ditch AUDIO- for INT2
    - external rectifier diode
    - serial RX 100k pull-down
    - fix pot distance
    - fix switch distances
    - room for LED
    

Entry: Manchester
Date: Sun Oct 28 19:30:21 CET 2007

i'm wondering whether it's not simpler to use Manchester code. (BPSK
with square waves)

symbols are 01 and 10

once synchronized, the signal can be locked by allowing resync in on
the fixed transition at half symbol. syncing can be done on an idle
line, all one (10).

catch: for uni-directional with sender = master this works fine, but
bidirectional is problematic.
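A minimal Manchester model in Python (a sketch of the symbol scheme, not target code): every bit cell contains a mid-cell transition, which is what lets the receiver re-lock its clock on each symbol, and an idle line of all ones is just a steady square wave.

```python
def manchester_encode(bits):
    """Symbols are 10 (for 1) and 01 (for 0): each bit cell has a
    guaranteed mid-cell transition for clock recovery."""
    return [half for b in bits for half in ((1, 0) if b else (0, 1))]

def manchester_decode(halves):
    # the first half-cell level determines the bit
    return [halves[i] for i in range(0, len(halves), 2)]

data = [1, 1, 0, 1, 0, 0]
assert manchester_decode(manchester_encode(data)) == data
# idle line, all ones: a plain ...1010... square wave to sync to
assert manchester_encode([1, 1]) == [1, 0, 1, 0]
```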




Entry: eliminating the pullup resistor
Date: Sun Oct 28 19:40:41 CET 2007

In case there's one slave only, the pullup resistor can be eliminated
by using a current-limiting resistor to prevent short-circuit on
collision.


Entry: slave on krikit
Date: Sun Oct 28 19:59:46 CET 2007

Got one spewing 123, now need another one listening.

Slave uses RB0 (INT0). Apparently i can't pull the line all the way
down.. Probably on-resistance (i'm pulling down 100 ohm..)

Sequencing is an interplay between INT0 and the TMR2.

INT0 -> reset timer phase + call 'rtx-next'
TMR2 -> 'rtx-next'
        

The other one got 123, and some shifted version out of sync. To get
better sync during debugging, bytes could be interleaved with a 10 bit
idle preamble.  This would guarantee resynchronization after the first
faulty reception.



Entry: strong 1
Date: Mon Oct 29 05:17:06 CET 2007

          --- Vdd
           |
          [Ru]    /--[Rl]-o SLAVE I/O
           |      |
MASTER o---o------o--|>|--o SLAVE Vdd
                          |
                         === C
                          |
                         --- GND

0 1 2 3
0 1 X X

phase 1 is 'strong drive' directly from Vdd, not through a pullup
resistor. this avoids strong sink currents and large voltage drop.

during phase 0 and 1, MASTER is OUT. also if it's sending in 2 and
3. when receiving, master is Z, so Ru pulls up the line.

a slave can still mess up by pulling a line high, but the short
circuit is prevented by Rl. 


Entry: intermezzo: macro vs. return stack
Date: Mon Oct 29 15:50:14 CET 2007

actually, this is quite simple. if i change the terminology a bit,
compilation of local labels for jumps and run-time control flow using
execute and exit could be unified somehow.


Entry: about named macro arguments
Date: Mon Oct 29 19:46:11 CET 2007

maybe it's better to stick to prefix syntax to not gratuitously move
away from forth syntax. after all,

: 2@ var |
   var @
   var 1 + @ ;

is not too much different from

: 2@ | var |
   var @
   var 1 + @ ;

It will also simplify the implementation.



Entry: urgent stuff
Date: Mon Oct 29 20:52:04 CET 2007

time flies. i need to get the debug network running today. it should
not be more than patching the interpreter to the rtx: the hub should
just be a loop that polls the serial port, and possibly executes some
special purpose commands. the slave needs a new dispatch table
connecting to rx, tx from the slave rtx.

so todo:
- get this debug patch-through to work: nothing fancy, just repeat
- fix some of the urgent problems


Entry: hub interface
Date: Mon Oct 29 23:25:15 CET 2007

i'd like to do this with changing as little as possible. connect to a
hub just like any other project, but there should be a way to execute
its application without needing knowledge about the dictionary of the
hub device.

let's change interpreter.f to

\ token --
: interpret #x10 min route
             ; receive ; transmit ; jsr    ;
     lda     ; ldf     ; ack      ; reset
     n@a+    ; n@f+    ; n!a+     ; n!f+   ;    
     chkblk  ; preply  ; ferase   ; fprog  ; 
 
     e-interpret ;

the last word should lead to a reset if it's not implemented, or to
the interpretation of an extra set of byte codes. in any case, it is
required to be filled in by specific monitor code.

now: if there's no extension implemented, should invalid commands be
ignored or not? there's no proper way to react to invalid commands,
since they can quote the following bytes, leading to a completely
non-interpretable state... just reset is probably good enough.

another problem: if the hub just passes through, how to control it
after switching to passthrough mode? serial break is an option. need
to figure out how to send that in mzscheme though..

there should be a more elegant solution, but this requires either the
traffic to be quoted, or the new interpreter to actually understand
(parse) the traffic to see what comes through. the latter is not so
easy because of quoted bytes.

a better solution is to completely override the boot interpreter. that
way all traffic can be properly redirected.

i guess i'm making it too difficult. the real problem is: this hub
thingy doesn't fit in my debug or run view: the cable can't determine
whether a boot interpreter should be started or not. let's start there.

next: name for the protocol.. i'm going for E2: it's the binary
representation of 0 and 1: 0100 0111 = E2 with lsb first.
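a quick sanity check of that bit-order claim, sketched in python (just to convince myself; `reverse_bits` is my own name, not part of the toolchain):

```python
def reverse_bits(byte):
    """Mirror the bit order of an 8-bit value (lsb becomes msb)."""
    out = 0
    for _ in range(8):
        out = (out << 1) | (byte & 1)
        byte >>= 1
    return out

# the wire-order bit string 0100 0111, read back msb-first, is #x47;
# mirrored (i.e. sent lsb first) it is #xE2
assert reverse_bits(0x47) == 0xE2
```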



Entry: the big questions
Date: Tue Oct 30 00:25:54 CET 2007

probably the huge shot of caffeine i got today, but i'm in delusion /
big-idea mode again.. i run into a lot of bootstrap problems. today's
bootstrap problem is debugging the debugger. somehow i think
bootstrapping is really the only significant problem.. it's the
"getting there" that's important practically, not so much the staying:
that should be obvious.

i find it a fascinating subject. i need to read more about it:

 * need to play with piumarta's cola stuff: objects and lisp as yin
   and yang (though lisp has its own yin and yang: eval and apply, i
   wonder if this is the case for objects? probably something with
   v-table lookup).

 * need to read about 3-lisp and reflective towers

 * i'm not so sure if writing a proper language bootstrap is valuable,
   but somehow it looks like yes. brood is a bootstrap exercise
   really. i'd like to end up, not necessarily at scheme but at a
   dynamic language to run on small machines.. maybe cola is the way
   to proceed?



another thing i need to read up on is partial evaluation and C code
parsing and refactoring, but that's secondary really.. maybe bootstrap
is indeed the only real problem


Entry: parsing again
Date: Tue Oct 30 04:07:36 CET 2007

* added packrat parser code from Tony Garnock-Jones
  this should "end all _real_ parser woes" when i switch to a
  different syntax frontend.

* for the forth regular parser, i just need to add proper syntax for a
  regular syntax stream pattern matcher: i have no real recursive
  parser need for the forth (really out of principle: to stick to the
  roots and make the language simple to understand. there's something
  to say about a simply parsed language when teaching!)

* the only reason i'm using syntax streams is to be able to recover
  source location information and to use syntax-case. the latter is
  probably not the right abstraction.


what i want to say is something like:

(parser-pattern (macro forth)
   ((macro <stream> forth)    ----))

where the '<stream>' is bound to a syntax stream

from portable-packrat.scm :

       (packrat-parser expr
         (expr ((a <- mulexp '+ b <- mulexp) (+ a b))
               ((a <- mulexp) a))
         (mulexp ((a <- simple '* b <- simple) (* a b))
                 ((a <- simple) a))
         (simple ((a <- 'num) a)
                 (('oparen a <- expr 'cparen) a)))

i read on the wikipedia page that a packrat parser is necessarily
greedy. i'm not sure in what sense..
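the core packrat idea, memoized recursive descent, is easy to sketch outside scheme. a toy python version of the same expr/mulexp/simple grammar (structure and names are mine, not the portable-packrat API):

```python
from functools import lru_cache

def parse_expr(src):
    """Parse and evaluate 'src' (digits, +, *, parens) with the same
    expr/mulexp/simple grammar as above.  Each rule maps a position to
    (value, next_position), or None on failure."""
    toks = tuple(src)

    def at(pos, ch):
        return pos < len(toks) and toks[pos] == ch

    @lru_cache(maxsize=None)        # the memo table is what makes a
    def expr(pos):                  # recursive-descent parser "packrat"
        r = mulexp(pos)
        if r and at(r[1], '+'):
            rest = expr(r[1] + 1)
            if rest:
                return (r[0] + rest[0], rest[1])
        return r

    @lru_cache(maxsize=None)
    def mulexp(pos):
        r = simple(pos)
        if r and at(r[1], '*'):
            rest = mulexp(r[1] + 1)
            if rest:
                return (r[0] * rest[0], rest[1])
        return r

    @lru_cache(maxsize=None)
    def simple(pos):
        if pos < len(toks) and toks[pos].isdigit():
            return (int(toks[pos]), pos + 1)
        if at(pos, '('):
            r = expr(pos + 1)
            if r and at(r[1], ')'):
                return (r[0], r[1] + 1)
        return None

    return expr(0)[0]

assert parse_expr("1+2*3") == 7
assert parse_expr("(1+2)*3") == 9
```

the "greedy" behaviour shows up here too: each alternative commits to the longest match it finds at a position instead of backtracking through all alternatives.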



Entry: finite fields
Date: Tue Oct 30 07:32:21 CET 2007

http://www.lshift.net/blog/2006/11/29/gf232-5

in 8 bit, the biggest prime is 2^8-5 = 251

i'm not sure what this is useful for though.. some error checking /
correcting stuff? the article talks about "a" finite field, as if it
mostly doesn't matter which..

ah.. the wikipedia article on coding theory mentions subspaces of
vector spaces over finite fields. a naive way would be to use e.g. a
3-space in the 4-space over GF(251).
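to make the 2^8-5 remark concrete: arithmetic mod the prime 251 forms a field, so every nonzero byte value below 251 has a multiplicative inverse. a minimal python sketch (function names are mine):

```python
P = 251   # 2^8 - 5, the largest prime that fits in a byte

def gf_add(a, b): return (a + b) % P
def gf_mul(a, b): return (a * b) % P

def gf_inv(a):
    # fermat's little theorem: a^(p-2) mod p inverts a (for a != 0),
    # which is what makes Z/251 a field and not just a ring
    return pow(a, P - 2, P)

# every nonzero element has a multiplicative inverse:
assert all(gf_mul(a, gf_inv(a)) == 1 for a in range(1, P))
```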



Entry: fixing the assembler
Date: Tue Oct 30 16:35:18 CET 2007

made the dictionary into a parameter to move code from internal ->
external definitions. now i need to abstract away the control flow of
the assembler: eliminate assemble-next

let's see, the type of control used is:

      * comma -> expand to list of instructions
      * comma0 -> same, without updating instruction pointer
      * register -> dictionary mutation
      

primitive ones:

      * again (retry) -> retry assembly with updated (dictionary?)
        state

properties:

      * assembling an instruction is 1 -> 0, 1 or more


looks like the major difficulty is in assembler operations that
recurse.. currently it's handled by just pushing an opcode in the
input buffer and calling next. i'm going to make this recursing
explicit.

how should non-resolved symbols be handled? just returning the
instruction seems best. i do need to fix absolute/relative addressing.

what about leaving restart to the sequencer? the idea is to provide
some expansion to plug-in assemblers (asm-find) 


so the point where i need to make some changes is the way 'here' is
used: chicken and egg:

      * can't determine 'here' until all previous instructions were
        assembled.

      * can't assemble an instruction until it is known how far a
        forward reference is.

what about trying to solve this with backtracking? is that overkill?
maybe backtracking with memoization? maybe assembly itself is cheaper
than memoization :)

maybe every instruction should be compiled to a thunk that takes just
the absolute address?

hmm.. need some time to sort it through.. it should be possible to
write this in a lazy way..

roadmap:

 - get it to work like it did before
 - change the implementation of 'here' to a parameter
 - create a graph data structure from 'label'
 - figure out the control flow for some backtracking like thing
 - write some graph opti (i.e. jump chaining)

another remark: having 'labels' as pseudo instructions is bad. they
should really be true graph elements: pointers to instructions.

hmm.. i need a break.

lap.. goes wrong 'somewhere' :)

maybe i should fix the 'here' thing first before trying to get it to
run, since it's somehow messed up. i need to start over:


Entry: here kitty
Date: Tue Oct 30 22:54:39 CET 2007

what with 'here' ?

now that it's separated out a bit more, it's easy to see it is a bit
of a mess: i'm using org-push and org-pop so i can't just eliminate
it..

i need to separate these concerns:

* ORG / ORG-PUSH / ORG-POP   = telling where things go

  it's easier to cut out the intermediate part and handle it
  separately. a bit of a crazy way of doing things..

* 'here'  = using self-referencing blabla

i need a proper way of expressing all these dependencies: once 'org'
stuff is dealt with, and the absolute/relative problem is solved, the
remaining problem is one of relaxation.


Entry: relaxation problem
Date: Wed Oct 31 00:45:26 CET 2007

some choices need to be made before enough information is present, but
instead of completely starting over (backtracking) the form of the
problem is such that the intermediate solution can be updated. as long
as a complete dependency graph is present, the solution is quite
trivial: just recurse over all dependencies.

some hints for finding the right data structure

1. instructions that do not reference code locations, either as jumps
   or just as literal words, are irrelevant and can be ignored.

2. labels point _in between_ instructions

3. keep the cause of events abstract: any instruction that has a
   reference can grow.

4. this is related to functional reactive programming

let's stick to the idea of instruction cells: each cell contains a
single symbolic opcode with arbitrary length.

thinking of this as cells sending messages to each other, there are 2
kinds of messages:

- tell next cell it has moved
- tell cells that depend on a label they need to update

looks ok at first, but for non-contiguous code that doesn't have a
non-decreasing code distance between several nodes, this might not
terminate.. if i make sure code never shrinks, this should be ok
though..

hmm... i need to read a bit about this. i guess in general it's
"linker relaxation".

so most important notes:

   * downward updates from a size change can be eliminated

   * to ensure termination, only expand/contract in one direction:
     that way it will at least stop at the case where all references
     are expanded.

   * if a size change happens as a consequence of an update


Entry: a more traditional approach
Date: Wed Oct 31 03:43:54 CET 2007

http://compilers.iecc.com/comparch/article/07-01-038

    There is a type of assembler that does exactly the same thing on
    every assembly pass through the sourcecode. Pass 1 outputs to
    dev>nul and is full of phase errors, pass 2 has eliminated most
    (or all) phase errors (output to nowhere) and pass 3 usually does
    the job in 99%+ cases whereupon code is output. On each pass
    through the sourcecode (or p-code in your case) you check for
    branch out of range, then substitute a long branch and add 1 to
    the program counter, causing all following code to be assembled
    forward+1, then make another pass and do the same thing again
    until no more branch out of range and phase errors are found do to
    mismatched branch-target addresses.


That doesn't require my esoteric approach and seems a lot simpler
really: just keep it running until the addresses stop changing.
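that fixed-point loop is simple enough to sketch. a toy python version with invented sizes (ops and short branches are 1 word, long branches 2, and a short branch reaches +/- 63 words):

```python
SHORT_RANGE = 63   # assumed short-branch reach, +/- 63 words

def assemble(code):
    """Toy multi-pass relaxation.  'code' is a list of ('op',),
    ('label', name) and ('branch', name) items; ops are 1 word,
    branches 1 (short) or 2 (long) words, labels take no space.
    Returns {branch_index: size_in_words}."""
    # start optimistic: every branch short
    size = {i: 1 for i, ins in enumerate(code) if ins[0] == 'branch'}
    while True:
        # one pass: recompute all addresses with the current guesses
        addr, labels, here = {}, {}, 0
        for i, ins in enumerate(code):
            addr[i] = here
            if ins[0] == 'label':
                labels[ins[1]] = here
            else:
                here += size.get(i, 1)
        # a "phase error": a branch whose guess turned out too small
        changed = False
        for i, ins in enumerate(code):
            if ins[0] == 'branch':
                need = 1 if abs(labels[ins[1]] - addr[i]) <= SHORT_RANGE else 2
                if need > size[i]:     # only ever grow -> termination
                    size[i] = need
                    changed = True
        if not changed:
            return size

code = ([('branch', 'far')] + [('op',)] * 100 +
        [('label', 'far'), ('branch', 'near'), ('op',), ('label', 'near')])
sizes = assemble(code)
assert sizes[0] == 2     # out of range: relaxed to a long branch
assert sizes[102] == 1   # nearby target: stays short
```

note the "only ever grow" rule: it is the same one-direction trick from the relaxation entry above, and it is what guarantees the passes terminate.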

So just do as before, but:

   * keep a phase error log
   * use a generic branch instruction which gives short or long branches
   * every pass is completely new
   * split 'old' and 'new' labels, make new labels mutable?
   * put 'here' in a dynamic variable
   * make a quick scan for labels to find out undefined ones

NEXT:
	prepare assembly code so multiple clean passes are possible:
	- get rid of 'mark' for example.
 	- put 'here' in a parameter
	- remove all dictionary manipulations
	- find a way to handle var and eeprom.. maybe separate pass/filter?
	
the goal is clear enough.. just some disentangling to do first..

different approach: 

  * use the previous approach, but keep the dictionary after every
    pass (clean it inbetween)

  * keep a log of the name registrations to determine phase errors.



Entry: comparators and square waves
Date: Wed Oct 31 05:00:41 CET 2007

before trying anything with sine waves, it makes sense to at least
have a go at pure binary signals spanning the entire bandwidth. i'm
curious as to how far i can completely eliminate amplification, and
use only a comparator?

i do lose all signal presence detection capability, and amplify noise
tremendously. but this does transform everything into a software /
filtering problem. i guess with some good codes i can actually get
things through..


Entry: shopping for opamps
Date: Wed Oct 31 19:59:42 CET 2007

@ maxim for low voltage rail-to-rail.

i can get as low as 2.7V for 
MAX4167   5MHz, 1.3mA      (DUAL)
MAX494    0.5MHz, 0.15mA   (QUAD)




Entry: name spaces and objects
Date: Wed Oct 31 23:32:41 CET 2007

i'm trying to figure out how to make the name mangling work well
enough to create a static metaprogramming interface which supports
generic programming at the macro level.

  * write algorithms in macro form
  * instantiate them statically as many times as you need


what i'm really missing is higher level macros. with those, i can
build anything i want really..

so why is it impossible to have those? i probably need to give up on
forth syntax..

(let me finish my verbose buffer code before i try to answer..)

ok. i don't know really.

let's first try to get things like this out of the way:

: bbf.tx-empty>z bbf.tx buffer.empty>z ;  
: bbf.rx-empty>z bbf.rx buffer.empty>z ;  
: bbf.rx-room>z  bbf.rx buffer.room>z ;  
: bbf.tx-room>z  bbf.tx buffer.room>z ;  
  
: bbf.>tx      bbf.tx buffer.write ;  
: bbf.>rx      bbf.rx buffer.write ;
: bbf.tx>      bbf.tx buffer.read ;         
: bbf.rx>      bbf.rx buffer.read ; 
: bbf.clear-tx bbf.tx buffer.clear ;
: bbf.clear-rx bbf.rx buffer.clear ;

what i want is just

     ' bbf.tx- compile-buffer
     ' bbf.tx- compile-buffer

i can't even do variables since they are macros..

  * yeah i need to be able to generate macros
  * and fix name clashes within a compilation unit: both words and macros.

maybe the trick is really to define 'compilation unit' properly?

in my current approach, a macro can't pop up during expansion of code.


i need to get the philosophy right:

   * a flat namespace is nice for an application: everything is
     concrete. we're "among friends" and last names are not necessary.

   * it sucks for writing library code

the solution in mzscheme that works for me is functions +
modules. local module namespace can be used for small specialized
utility words. i'd like to have something like that in forth.

the problem is: i'm taking a really static stance in which macros play
a central role, not functions. this works as long as macros are
sufficiently powerful, which means higher order macros.

now, let's pull the problems i'm having apart:


i wrote some buffer code, which is just macros. to instantiate a
buffer one doesn't simply do "bla create-buffer" or something, but it
is necessary to specialize a lot of functions manually. that's
completely unacceptable.
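the missing higher-order macro is really just mechanical name generation. in a language with closures the instantiation is trivial; a python sketch of what a 'compile-buffer' word should do (all names invented):

```python
def make_buffer(name):
    """Stamp out one named read/write interface, the way a
    'compile-buffer' higher-order macro would generate
    bbf.>tx, bbf.tx>, ... for each buffer."""
    buf = []   # per-instance state, captured by the closures below

    def write(byte): buf.append(byte)
    def read(): return buf.pop(0)
    def clear(): buf.clear()
    def empty(): return not buf

    # one definition, arbitrarily many named instances
    return {name + '.write': write, name + '.read': read,
            name + '.clear': clear, name + '.empty?': empty}

words = {}
words.update(make_buffer('tx'))
words.update(make_buffer('rx'))

words['tx.write'](42)
assert words['tx.read']() == 42
assert words['rx.empty?']()
```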


Entry: higher order macros
Date: Thu Nov  1 01:06:36 CET 2007

In order to solve some particular template problems, i'd like to have
higher order macros. this amounts to instead of splitting up a source
file as:

        MACROS -> PROCEDURES

splitting it up as

        ... -> MACROS^2 -> MACROS -> PROCEDURES

of course, there should be no limit to the tower.


The real problem is: i have no sane syntax space left! In macros i
can do this:

  macro : make-a-123
            ' a-123 *: 123 exit ;

which is already pretty ugly because of quoting issues. But what am i
going to invent to make higher level expansion work?

One thing is sure: taking out the reflection (making macros
side-effect free) killed the possibility of generating names at
compile time, EXCEPT for function labels. But those are really just
data: it's a hack that doesn't really count.

   So I have a GOOD THING: independent declaration instead of
   sequential variable mutation for creating new macro names, that
   causes a BAD THING: limited reflection due to improper phasing.

Actually, I already knew that, but i'm starting to feel it now:
artificial limits are no good. Even if they serve a higher goal..
Maybe that makes them not artificial?

The limit i created is actually there for a reason: to use partial
evaluation to make it possible to perform compile time operations
without the need for an explicit meta language: without the need for
quotation like `(+ ,a ,b) or its beefed-up syntax-case / syntax-rules
variant.

(funny how the only 'meta' part of the language is the macro stack: it
punches holes in reality somehow ;)

So let's pat myself on the back:

   * the current macro / forth thing is GOOD. it is easy to use, easy
     to understand, and avoids most quotation issues that arise in
     practice by relying on partial evaluation. it gets pretty far
     without the need for an explicit metalanguage.

   * it's NOT GOOD ENOUGH because it's the top level: it can't be
     metaprogrammed itself!


The metaprogramming operations i'm looking for are those that create
new macro NAMES. Creating new macro BODIES should not be so terribly
hard: it is in fact what should be used for the quotation based
language.

So the core of the business should be the question why this works in
scheme:

(define-syntax make-macro
  (syntax-rules ()
    ((_ name body)
     (define-syntax name
       (syntax-rules ()
         ((_) body))))))

box> (make-macro bla (+ 1 2))
box> (bla)
3


Entry: poke
Date: Fri Nov  2 04:27:04 CET 2007

i'm taking a day off.. so technically i'm not allowed to write in this
log. however, i got into PF today, and wrote a rant on the PF list
about mapping and partial evaluation.

maybe it's time to start writing poke, or nailing down the
requirements. the idea behind poke is to have a machine model for DSP
like tasks that can be setup (metaprogrammed) by say a scheme
system. the idea behind an application is this:

	1. a program is compiled for a VM.

	2. a new VM is instantiated (on a separate core/machine)

	3. the VM now runs in real-time: doing its own scheduling and
  	   stack based memory management, being able to communicate
  	   with its host system and other VMs

each VM is a linear tree stack/tree machine.


i'd like to do this without writing a single line of C code: have it
all generated. that's the only way to be serious about generating
*some* code.

it should have an s-expression interface with which it talks to a host
scheme system. this acts as message passing: no shared state
allowed. this syntax should have an easy extension for binary data.

it should be 'ready' for multiprocessing. what i mean with this is:
each processing core should be able to run a single machine instance,
so instances should be able to talk among each other in a simple way,
and there should be a scheduler available on the VMs to handle the
message passing.

i was thinking about a 'binary s-expression' approach to limit
inter-machine communication parsing overhead. data should still be
list-structured though, and word-aligned. for human interface, a
simple front-end could be constructed. arrays can be allowed for ease
of wrapping binary data.

internally, cons based lists are used for all representation. cdr
coding is used to be able to represent programs linearly. memory
inside the machine consists of stacks only. each machine uses a
limited set of data types, making reuse of lists efficient.

aim for the highest possible gcc code generation efficiency: i see no
point in targeting anything else than gcc, so all extensions are
allowed. i just checked (see doc/gcc/tail.c) for tail call support and
it seems to work when putting functions in a single file. it also
works putting the functions in different files apparently. that's good
news. state passed: 3 stacks: DS/RS/AS

the target language should be a pointer-free safe language. this is
going to be a bit more difficult, probably have to split in safe /
unsafe parts.

the 'system' language and the 'inner loop' language are different and
should be treated as such. i probably should start with the latter and
build the control language as a layer on top. the former is a
forth-like language extended with linear tree memory and the latter is
a multi-in-multi-out language to be combined with combinators.


  1. all C code generated: need a generator.
  2. message passing interface using s-expressions.
  3. run-time memory (stacks/trees) is locally managed
  4. other (code) memory is static/readonly, loaded by host
  5. safe target language (from a certain point up)

so poke seems like a really straightforward extension to
forth. getting it compatible with PF will be quite something
though.. all this is pretty low priority. the only difficulty is how
to deal with pointers for optimizing the linear stack/tree data
structures. 'safe poke' :)


Entry: mix
Date: Fri Nov  2 05:31:52 CET 2007

then the thing that could be used immediately in PDP, PF and PD
modules: a language to describe inner loops and iterators, to yield C
code that can be straight linked into the projects.


Entry: instantiating abstract objects
Date: Fri Nov  2 15:19:40 CET 2007

i'm giving myself one hour to think about how to fix the verbosity of
the following code:


macro
: tx         #x100 tx-r/w #x0F ;  \ put buffers in RAM page 1
: rx         #x110 rx-r/w #x0F ;        

: tx-ready?  tx-empty>z z? not ;  
: rx-ready?  rx-empty>z z? not ;  
: tx-room?   tx-room>z z? ;
: rx-room?   rx-room>z z? ;
forth

2variable tx-r/w
2variable rx-r/w  
  
: tx-empty>z tx buffer.empty>z ;  
: rx-empty>z rx buffer.empty>z ;  
: rx-room>z  rx buffer.room>z ;  
: tx-room>z  tx buffer.room>z ;  
  
: >tx      tx buffer.write ;  
: >rx      rx buffer.write ;
: tx>      tx buffer.read ;         
: rx>      rx buffer.read ; 
: clear-tx tx buffer.clear ;
: clear-rx rx buffer.clear ;


the ONLY difficulty here is that i can't generate macros, including
variables. is there an other way to solve the problem?

is it possible to hide everything in one single macro? yes. if
tx-empty>z is never expanded as a function this is actually
possible. then what remains is just:


      
macro
: tx         #x100 tx-r/w #x0F ;  \ put buffers in RAM page 1
: rx         #x110 rx-r/w #x0F ;        
forth

2variable tx-r/w
2variable rx-r/w  

tx >buf
tx buf>

maybe i can somehow make an 'un-inline' function work? like
memoization?

something which gets me half way there is a blocking read/write
operation: only for dispatch loops this then becomes problematic.


conclusion: i guess it's ok to go for this approach:

   On the subject of code reuse, there are 2 options. Either you write
   it as procedure words, or as macros. Using the procedure word
   approach will lead to smaller code size but slower speed (since
   run-time dispatch is probably necessary). 

   Using the macro word approach can lead to fast inline code which
   might not be optimal for code size.



Entry: e2 debugging
Date: Fri Nov  2 17:41:24 CET 2007

current setup: hub (master) connected to krikit (slave) which runs a
loopback. there is communication, but somehow a start bit gets
lost. there are 4 places where it can get lost:


1.    hub transmit (OK: clear on scope)
2.    slave receive (OK: sending #xFF, all ones, gives reply)
3.    slave transmit (OK: reply has start bit)
4.    hub receive

i have no trigger scope or logic analyser so i need to construct a
steady state error condition i can sync my scope to. i can measure
slave transmit if i manage to add some wait code in the hub. such code
is probably necessary for other purposes.

so. running a couple of experiments makes it clear that 1-3 are
ok. the problem is with the hub receive that doesn't see the start
bit.

i don't see the problem. as far as i can isolate it, somehow the start
bit gets missed by:
    - the rx state machine is in the wrong state
    - the rx/tx switch comes a cycle too late
    - ...

i need something that's easier to test. i suspect the rx/tx switching
is the cause, so maybe i can make a better switcher?

i did notice a slightly borked waveform for the startbit
however.. let's see if i can get a better view and see where that's
coming from..

that was wrong. i start over:

     - fixed timer compensation, now at least the signal is stable
     - clear to see that there's a phase problem

i'm wondering if it's not just a speed problem. timer is running every
64 clocks.. well.. it's easy to test by just running it slower
really.

YES! it was.. running 4x slower fixes the problem. time to do some
profiling then!



Entry: e2 + interpreter
Date: Fri Nov  2 23:21:38 CET 2007

i'd like to make 'transmit' and 'receive' late bound. that way it's
easy to switch the interpreter's default I/O. but i need to do it
cheaply: using vector.f and execute.f requires too much boot space.

wait.. that's the case for catkit. for the 2620 i have a lot more
room. maybe i should go that route then, and solve the catkit problem
when it poses itself.

time to make some decisions:
     * allow both serial + e2 ?
     * build e2 in boot loader?

actually, i do need e2 in boot loader. working as a safety
measure. hmm.. let's get it to do what it needs to do first.

ok. i can ping krikit. fixed the saving/restoring of the a reg so i
can access the stack. code upload doesn't work yet. i guess it has to
do with a missed 'ack' due to interrupts being disabled. maybe i
should build in the ack in the fprog?


  NOTE: about saving the a reg. if there are interrupts, the a reg
  needs to be saved anyway (or its use protected with cli), so maybe
  it's best to just always save on clobber? alternatively, always save
  on clobber in isr.

i added an ack to fprog and ferase, but apparently that's not
enough. one line can be written, then it messes up. some code is
needed to properly resync the transceiver after programming so it
picks back up at the next idle INT0.

for debugging purposes, i should make a version that uses polling
only, so it can be used to setup interrupts. thinking about it, i
probably need to modify all opcodes so they give a sync themselves, so
no buffering is required. (uart has 1 byte). hmm.. it's not so simple
really.

actually, it is: all interpreter tokens have RPC semantics: they
return at least one value, except '00' which is a sink, and 'reset'
which can't have a return value. the 'ack' opcode can then be
eliminated, and possibly replaced with 'interpret'.

nop, reset -> no ack
receive	 -> ack
transmit -> value
jsr, lda, ldf -> ack
n@a+, n@f+  -> stream of bytes, no ack necessary
n!a+, n!f+  -> ack
ferase, fprog -> ack
chkblk, preply -> stream of bytes, no ack necessary

this should get rid of the requirement to have buffered io. remaining
timing issues can be handled with appropriate delays.
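the lock-step RPC idea can be mocked up to see why buffering becomes unnecessary: the host blocks for each reply before sending the next token. a python sketch (token names from the table above, behaviour invented for the example):

```python
def run_host(dispatch, tokens):
    """Lock-step RPC: send one token, block for its reply before
    sending the next.  Since every token answers with at least one
    byte, neither side ever needs more than the uart's single byte."""
    replies = []
    for tok in tokens:
        reply = dispatch[tok]()      # stands in for send + blocking read
        assert len(reply) >= 1       # the RPC invariant
        replies.append(reply)
    return replies

# mock slave-side handlers (token names from the table,
# behaviour invented: 'jsr' just increments an accumulator)
state = {'acc': 0}
dispatch = {
    'ack':      lambda: b'\x00',
    'jsr':      lambda: (state.update(acc=state['acc'] + 1), b'\x00')[1],
    'transmit': lambda: bytes([state['acc']]),
}

out = run_host(dispatch, ['jsr', 'jsr', 'transmit'])
assert out == [b'\x00', b'\x00', b'\x02']
```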


an interesting extension when 'receive' and 'transmit' are made
dynamic is to have them read from memory. that way a small program
could execute from ram.


Entry: boot protocol changed
Date: Sat Nov  3 02:27:28 CET 2007

      * fprog and ferase now give an 'ack' themselves. this is
        necessary for receivers that suffer when interrupts are
        disabled.

      * the #x00000000 password is eliminated: with boot code
        protection this isn't necessary.



Entry: separate compilation + name spaces
Date: Sat Nov  3 15:14:27 CET 2007

as a consequence of the way compilation works, it is possible to rely
on the fact that, per compilation unit, names can be overwritten. what
i mean is that it is possible to 'load' the same file twice, but with
different words/macros bound in its environment.

this comes close enough to the 'dictionary shadows' paradigm i'm used
to in PF, and which actually works pretty well: it avoids the need for
a name space mechanism fairly effectively.

an extension to this could be to allow for exports: provide only those
macros and words necessary.

then another extension: why not install the macro source in the target
dictionary? there's no real reason not to, and it makes 'mark' work
for macros (given that i delete and re-instantiate the macro
cache). or.. i could use this as an indicator for using the macro
cache or not.

one thing that has been bugging me: if i define a word or a macro, i
do want it to override the previous word or macro.

i should make a list of the name space trade-offs for writing a forth
really.


Entry: roadmap
Date: Sat Nov  3 15:51:02 CET 2007

  - get programming to work over e2 (restart receiver after fprog: add
    a macro hook to interpreter.f) (done)

  - fix acks in interpreter.f and tethered.ss (done)

  - make it work without interrupts and put it in the boot loader

  - figure out the 'strong power 1' phase, and test with slave.

  - test over longer twisted pair cable.



Entry: no middle road
Date: Sun Nov  4 01:04:57 CET 2007

some thoughts about 'accumulative' code, due to lack of a better word.

in light of the recent remarks about higher order macros, i have the
impression i am mixing 2 paradigms in a not so elegant manner:

    1. functional programming, mzscheme's independent
       compilation with 'unrolled' metaprogramming dependencies.

    2. the accumulative image model, where a language grows by
       accumulating more power, which then can immediately be used to
       define new constructs.

i knowingly took out a part of 2. to get purely functional macros that
could be safely evaluated for their value at interaction time.
however, the interactive compilation does work in an incremental
way. it looks as if i am forced into some middle road compromise.


Entry: embedded programming in 2007
Date: Sun Nov  4 15:53:35 CET 2007


the question i'd really like to answer: without too much bias (toward
the tool i wrote) what is the point of writing static, early-bound
code in 2007, even if we're talking about microcontrollers.

  * is there really a 'complexity barrier' below which one HAS TO move
    to quasi manual compilation and allocation?

  * will this barrier remain in existence, or will better tools make
    a more high-level approach possible?

EDIT: some things i was thinking about yesterday:

  * leaky abstractions are hard to work with. starting from assembly
    and "thinking up", using purrr to help you write the application
    is the right approach. starting from some high-level understanding
    of the language and having to learn all its limitations doesn't
    really work. the problem seems to be the manual resource
    management: time, space, and synchronization between global
    variables, and hardware devices. 

  * it seems i lose most of my time in low-level configuration issues
    which give little feedback on error, and dealing with situations
    that are hard to debug due to dependence on external events. low
    level design really is a debugging problem: setting up experiments
    to try to isolate errors. hence the use of loads of specialized
    (hardware) tools used in professional environments.


Entry: concatenative introduction email
Date: Mon Nov  5 18:44:15 CET 2007

Dear All,

Allow me to introduce myself. My name is Tom Schouten. I live in
Leuven, Belgium and I'm 32 now, if that helps paint a picture. I've
been interested in concatenative programming for a while and lurking
here and there.. To educate myself, I wrote quite a lot of code in the
last couple of years, and I'd like to share some of the results, but
maybe even more the resulting questions. (warning: long post, story of
my life :)

My background is in electrical engineering. My heart lies in music
DSP.  I've been working up the ladder from electronics, to machine
language and C/C++, through Pure Data (a data flow language) to Perl &
Python to finally end up at Scheme and functional programming. I'm
flirting a bit with Haskell, but really just read because most recent
interesting functional programming texts use that language.

http://en.wikipedia.org/wiki/Pure_Data

The problem I'm trying to solve to guide me a bit is "Build tools to
write DSP code, mostly for sound and video processing, in a high level
language." I ran into limits of expressiveness writing video
extensions in C for Pure Data, about 4-5 years ago. Apparently there
are no freely available tools that solve this problem, so I take that
to be my mission.

About 3-4 years ago I started writing Packet Forth (PF) as an
attempt to grow out of my C shoes. It was at the time I discovered
colorForth, and I was wondering if I could create some kind of cross
breed between Pure Data and Forth. PF now looks a bit like Factor on
the outside, though is less powerful. PF uses linear memory management
(data is a tree), with some unmanaged pointers for data and code
variables. PF's point is a to be a scripting language which tosses
around some DSP operations written in C. It doesn't aim to be a
general purpose language. The darcs archive is here:

http://zwizwa.be/darcs/packetforth/

Some more high-level docs aimed at media artists here:

http://packets.goto10.org

I got a bit frustrated with the internals of PF, mostly because there
is still too much verbose C code, and a lot of C preprocessor macro
tricks that could best be done with a _real_ C code generator.

So I dived a bit deeper into Forth, and early 2005 I started at the
bottom again: I wrote an indirect threaded forth for Pure Data (mole),
and started BADNOP (now dubbed BROOD 1), an interactive cross compiler
for the Forth dialect Purrr, an 8-bit stack machine model for Flash
based PIC Microcontrollers.

http://zwizwa.be/darcs/mole
http://zwizwa.be/darcs/brood-1

Mole made me 'get' Forth finally: the first versions of PF were mostly
blind hackery to get to know the problems before the solution. For
mole, I actually followed tradition a bit more (Brad Rodriguez'
"Moving Forth"). This led to a more decent PF interpreter.

The forth I wrote to write the cross-compiler for the PIC Forth was a
mess. I was experimenting with some quotation syntax but realized that
what I was really looking for was lisp, or a lisp--like concatenative
language. At that time, early 2006, I discovered Joy, so I ditched the
compiler and rewrote it in CAT (not Christopher's Cat) which was
written in scheme (BROOD-2). After some refactoring and rewriting due
to beginner mistakes I am now at BROOD-4, with the CAT core written as
a set of MzScheme macros. This CAT is a dynamically typed concatenative
language with Scheme semantics. I consider it an intermediate
language. Currently it is only used to implement the Purrr compiler (a
set of macros) and the interaction system.

http://zwizwa.be/darcs/brood

Purrr is as far as I know somewhat novel. All metaprogramming tricks
one would perform in Forth using the [ and ] words are done using
partial evaluation only. I've tested this in practice and it seems to
work surprisingly well.

I am still struggling a bit with the high level Purrr semantics
though. Concretely, it is a fairly standard macro assembler with
peephole optimization. Nothing special there. On the other hand, its
macro language is a purely functional concatenative language which is
'projected' onto a real machine architecture after being partially
evaluated. I tried to explain these concepts in the following papers:

http://zwizwa.be/darcs/brood/tex/purrr.pdf
http://zwizwa.be/darcs/brood/tex/brood.pdf

(for the latest versions it's always better to use the .tex from the
darcs archive)

The latter needs some terminology cleanup, but it contains an
explanation of the basic ideas, and an attempt to clarify the
macro semantics in a more formal way. I'm interested to learn what I
need to read in order to frame these concepts in proper CS speak... It
looks like I'm either terribly ignorant of something that already
exists (I went through a couple of stages of that already), or I found
a clean way of giving cross-compiled Forth a proper semantics.

On a lighter note, I'm using Purrr to build The Sheep, a retro synth
inspired by 1980's approach to sound generation. It runs on CATkit,
and has been used successfully many times in beginner "physical
computing" workshops, as electronics is called in non-engineering
circles :)

http://zwizwa.be/darcs/brood/tex/synth.pdf
http://packets.goto10.org/packets/wiki/CATkit

(the scary guy in the picture is not me :)




Entry: krikit board design decisions
Date: Mon Nov  5 17:52:11 CET 2007

- 4 x AAA -> need at least 5V. alternatively, use a 9V cell and a
  transistor for speaker output.

- RGB led onboard

- debug connector = battery connector (RCA plug)


Entry: TODO
Date: Mon Nov  5 21:40:23 CET 2007

list has moved to TODO file.



Entry: polling E2 interpreter
Date: Mon Nov  5 23:15:31 CET 2007

choosing a polling interpreter for E2 in the boot code is not
entirely without trade-offs.

PRO: independent of interrupt routines which is useful for debugging
application isrs.

CON: completely synchronous and non-buffered. this requires some
careful coding in order not to miss any data.

Maybe the boot code should contain both versions?

This leads to objects really: a vtable is a dynamic route word.

2variable stdout
: do-stdout stdout invoke ;
: e2-stdout stdout -> route
       rx ; tx ; on ; off ;


hmm.. i messed up slave.f: diff tomorrow..


Entry: macros and procedure dictionary
Date: Tue Nov  6 06:03:54 CET 2007

maybe the trick is to just get rid of the distinction between
procedure words and macros: a single namespace, with procedure words
being equivalent to

: bla 123 compile ;

this combined with a preprocessing step that identifies all labels in
the source text. a single namespace is easier to understand. separate
compilation units gives shadowing, while inside a single compilation
unit circular references are possible.
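the single-namespace-with-shadowing idea can be sketched in a few
lines of Python (an illustrative toy model, not the actual
CAT/mzscheme machinery; all names are made up):

```python
# Toy model of a single namespace with shadowing between compilation
# units: each unit has its own table and falls back on its parent.

class Unit:
    def __init__(self, parent=None):
        self.defs = {}        # name -> definition
        self.parent = parent  # earlier compilation unit to fall back on

    def define(self, name, value):
        self.defs[name] = value   # a later unit shadows its parent

    def lookup(self, name):
        unit = self
        while unit is not None:
            if name in unit.defs:
                return unit.defs[name]
            unit = unit.parent
        raise NameError(name)

core = Unit()
core.define("square", "core-square")

prj = Unit(parent=core)
prj.define("square", "prj-square")   # shadows the core definition
```

inside one unit all lookups go through the same mutable table, so
circular references between definitions work as long as both names
exist before anything is run.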

what i want this to move toward is a more and more static declarative
structure. maybe i should re-implement namespaces and build them on
top of the mzscheme module system. i doubt the solution i can live
with eventually will be significantly different from mzscheme's..
maybe a bit more liberal? or is that just because of current
implementation?

maybe i should make the compiled macros into a real cache, and store a
master version as a s-exp tree..


Entry: redefine
Date: Tue Nov  6 15:59:00 CET 2007


i need to 

  * make it illegal to redefine macros: they use a caching mechanism
    which replaces names with values (procedures).

  * make it illegal to define a label that is already a macro


the real problem is that redefines need a proper semantics in CAT. for
the forth, i think shadowing redefines are best: 'empty' is practical,
and it should work for macros too.

CAT is currently designed so redefines are illegal: this allows the
use of values instead of boxed values. some possible routes out of the
mess:

   - prohibit redefinitions
   - use shadowing + proper cache
   - use boxed values (reset the code inside the 'word' struct)

a deeper question is: why not use mzscheme namespaces for all macros?
answer: because i rely on late linking. is there a way around this? it
probably makes it too complicated, since i need to figure out a way to
map it to BOTH modules and units..

let's stick to the current hash table name space, and go for the
boxed approach: mutate the words themselves, instead of their hash
table entry.
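a minimal Python model of the boxed approach (hypothetical; the real
words are CAT structs): the name table maps to mutable word objects,
and redefinition mutates the word in place, so previously captured
references see the new code.

```python
# Sketch of 'boxed' words: redefining mutates the box rather than
# rebinding the name, giving late binding to anything that captured it.

class Word:
    def __init__(self, code):
        self.code = code

    def __call__(self, stack):
        self.code(stack)

ns = {}

def define(name, code):
    if name in ns:
        ns[name].code = code   # mutate the box in place
    else:
        ns[name] = Word(code)

define("inc", lambda s: s.append(s.pop() + 1))
captured = ns["inc"]           # e.g. compiled into another word
define("inc", lambda s: s.append(s.pop() + 2))  # redefine

stack = [1]
captured(stack)                # sees the new code: 1 + 2 = 3
```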

OK. that seems to work. remaining problem: defining words that are
macros. a way to solve this is to define each word in the dictionary
as a macro, compiling [cw <name>]

let's not.. i've added a warning, which made me realize that i do use
this: macros can call words with the same name as a fallback. that
mechanism might be worth more than a safety net.. no, a safety net is
more important: the delegation can be fixed using a symbol prefix. what
about doing this automatically? the last matching pattern is always
mapped to a runtime call?

i do need to fix dangling macros though. let's see if i can run into
that case again..

ok, it's clear: a dangling macro can be disastrous.

this is a mess..

assume there are 2 classes of macros: CORE and PRJ.

PRJ needs to be flushed whenever the project changes. i am not sure
whether macros from CORE will actually bind to those defined in
PRJ. there is no such plugin behaviour as far as i can tell, but
nevertheless it is possible to go wrong so i should do this:

             flush cache = 
               - invalidate all prj macros (make them raise an exception)
               - detach them from the namespace


looks like i got it now: ns-flush-dynamic-words! + support


Entry: asm rewrite
Date: Wed Nov  7 00:58:54 CET 2007

found another asm bug: variables get allocated on each pass now. this
doesn't seem to be fatal though, just inefficient. sheepsint works, so
it can't be the weird hub.f bug i'm chasing..

[EDIT]

several things might change here, but it could be a good idea to keep
the current operation until i have time to clean it up a bit. cleanup
would be:
  - move 'here' to a separate dynamic variable
  - handle different dictionaries better.

the problem now is that 'allot' gets called multiple times without
reset.. it's probably best to filter it out in a preprocessing step.






Entry: sheep transients
Date: Wed Nov  7 04:45:41 CET 2007

'sound' needs to be a stack: a circular one, initialized with valid
sounds, or a delimited one so a sound can end in 'done' to fill the
rest of a control slice with another sound.

the point of this is to create a concatenation at run time. it is of
course possible to do this at compile time, but the fun would be in
*mixing* sounds..

i think i have the solution there: each pattern tick a 'program' is
erased, and filled with instruments that are played after each other,
with the last tone = silence.



Entry: low impedance signal source
Date: Wed Nov  7 07:00:27 CET 2007

i'm trying to understand the difference between these 2 statements:

 * for a low--impedance source you best measure current, while for a
   high--impedance source you best measure voltage.

 * a current source has high impedance, and a voltage source has low
   impedance.

the deal is that these are 2 different kinds of "measurement" because
of the entirely different scales of energy involved: for sensors, you
want maximum energy transfer, but to "measure" a current or voltage
source, you want minimal energy transfer.

looking at a sensor as a voltage or current source, you want to "max
it out".



implementation:

so, doing this with an opamp is really trivial. bias (+) on Vdd/2,
feed back from (out) to (-) using Ra, and connect the current source
between the virtual ground (-) and (+).

                    R
              /---/\/\/\/\--\
              |  __         |
              | |  \        |
       /--||--o-| - \_______|___o Analog -> uC
   |\  @     _o_| + /
   | ||@    |   |__/   
   |/  @    |          
       \----o          
            |          
o--/\/\/\/--o---/\/\/\/--o 
GND        2.5V         5V


then:
 * connect the speaker to 2 analog inputs, so they can be switched in
   analog high Z mode: not good to bias digital ins at Vdd/2

 * run the opamp and bias network off of digital output

  
let's see if analog Z1/Z2 can be PWM outputs. no such luck.. maybe use
a transistor to shield the detached (Z) driver pin from the Vdd/2 bias
voltage. or just not use pwm...

remaining question is if the opamp, when powered down, can take a
large differential input voltage.


EDIT: the circuit doesn't work without a capacitor: the coil is a
short circuit at DC, connecting (-) and (+). due to nonzero offset
voltage, this saturates the amp.

EDIT: i understand now why measuring current is not a good idea. the
impedance of the device is dependent on frequency: 0 for DC, rising
linearly. if you measure current, the signal will have a strong low
frequency content. however, if you measure voltage through a resistor
that's say 10x larger than the stated impedance, the response is
flattened out since the resistor dominates. so the classic one works
better:

   SPK                Rg
    o          /---/\/\/\/\--\
    |          |  __         |
    |    Rs    | |  \        |
    o--/\/\/\--o-| - \_______|___o Analog -> uC
|\  @        ____| + /
| ||@       |    |__/   
|/  @       |          
    |       |
   === Cs   |
    |       |
    |   Cn  |
o---o---||--o
|           |          
o--/\/\/\/--o---/\/\/\/--o 
GND        2.5V         5V


Here Cn reduces noise by lowering the AC impedance wrt the high DC
impedance point at (+). Rs = 47R and Rg = 100k give decent
results. About 2000x or 66dB.

The value of Rs should be as low as possible to reduce noise. I'm
comfortable now that I understand the trade-off.

EDIT: i switched to using closed loop current measurement again, this
time limiting the overall gain to about 100x (using Rg=1K). followed
by a second stage with 100x gain this seems to work better. i suppose
my original problem was just due to too high gain, running into opamp
limitations.


EDIT: going back to the circuit with Rs: in that one the opamp's input
could be decoupled from what goes to the speaker by switching the (-)
and the top of the bias network to ground.


Entry: e2 hub
Date: Wed Nov  7 17:37:52 CET 2007

now i need to find a way to program the e2 hub. one problem in the
boot protocol is that i have no way to packetize the stream.. what i
want is a hub which is mostly in repeater mode (for the commands
0->15) but responds to other commands itself.

there are 2 alternatives. either write a 'fake' interpreter which
simulates the state machine that parses the debug input stream, or
change the protocol so it is delimited.

the former is a stupid short term thinking hack.. let's packetize.

hmm... this is quite a change again: thinking about optimizing the
problem. oops bad word :)

a way to do this quickly is to just prepend every message with the
size. that way the core interpreter can ignore it, but the repeater
can transfer without being able to interpret.

let's try that first.
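the size-prefix idea can be sketched like this (a toy model in
Python; the real E2 byte layout may well differ):

```python
# Size-prefixed framing: the repeater forwards a frame by reading the
# length byte, without having to interpret the payload.

def frame(payload):
    assert len(payload) < 256    # one length byte in this toy model
    return bytes([len(payload)]) + payload

def repeat_one(stream):
    """Forward exactly one frame; return (frame, rest of stream)."""
    n = stream[0]
    return stream[:1 + n], stream[1 + n:]

msgs = frame(bytes([5, 8, 0])) + frame(bytes([9, 2]))
first, rest = repeat_one(msgs)
```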

notes
  - should put 'ack' at 1, so a stream of ones gives ack messages

hmm.. i chickened out. it's a lot of changes at once. lots of places
to go wrong. will cost me some frustration.. let's find another way,
go for the stupid hack.

if i can make each message's length independent of context, i.e. of a
previous message's length, i can probably derive the lengths manually.

the only problems here are the block transfer words, stuff that comes
back from the uC can be echoed without problems. (i'm thinking about
ping reply..)

ok, made the protocol context-free in the host -> target direction.

so, next:
  * make hub understand protocol (OK)
  * add hub commands
  * move to polling implementation in bootloader

hub commands: these should be an abstract interface for things one
would like to do with a hub. a first, arbitrary list:
 
    set client  (0=hub)
    on client
    off client
    
    
hmm.. i don't see it so clearly. what am i trying to accomplish? by
default i should be in 'hub application' mode, but it should be
possible to switch to hub debug mode too. the latter can be a
permanent switch (requiring access to the dictionary to switch back).

i got it sort of figured out now. 

  TODO: * make hub switch between hub-interpreter and interpreter
          using external resistor. do this when hub is finished.

        * until then, find a way to start the repeater without having
          the hub dictionary loaded.




Entry: how much amplification?
Date: Thu Nov  8 01:22:04 CET 2007

i have 12 bit at my disposal. amplification is mainly determined by
the ratio of distances. i'm measuring current, which should be
proportional to sound pressure, which is 1/r^2

so, suppose i use a gain factor of 64 = 8^2, this gives a range ratio
of 8. say 1 - 8 meters: don't put them closer than a meter, and
further than 8.
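the arithmetic, spelled out (assuming the measured current really
falls off as 1/r^2):

```python
# Gain vs. range: amplitude goes as 1/r^2, so a gain of 64 spans a
# distance ratio of sqrt(64) = 8, e.g. 1 m .. 8 m.

gain = 64
range_ratio = gain ** 0.5

# the far-end amplitude, boosted by the gain, matches the near end
near = 1.0 / 1 ** 2
far = 1.0 / 8 ** 2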


Entry: poke & precompiled loops
Date: Thu Nov  8 06:14:42 CET 2007

i think i need to separate out the c code generator so i can start
generating code for PF and Pure Data, which will not be anything
forth-like so doesn't really belong in brood. in fact, the way i think
of it now (something akin to functional reactive programming) it will
be quite the opposite.


Entry: RS next order
Date: Thu Nov  8 17:42:19 CET 2007

 * linear regulators
 * 9V clips / battery holders
 * transistors?
 * high ohm resistors
 * xtals + caps / resonators
 * blue bell wire
 * schottky diodes
 * small signal diodes



Entry: more modem design decisions
Date: Fri Nov  9 02:44:47 CET 2007

 - modulation or not?

  some modulation is necessary since i can't transfer DC. but the
  frequency response looks too non-flat to go for a wideband
  approach. i need to experiment.

 - FIR or IIR

  what is needed is a decimating filter. i can probably get much
  further with a crude windowed FIR than an IIR.



Entry: demodulator
Date: Sat Nov 10 15:12:58 CET 2007

i'm not going to waste time on trying out a pure square wave
modulation. let's stick with some simple demodulator, and have a look
at the numbers.

i currently have the debug net running at 8kHz. this should also host
the filter tick, which consists of:
   - read adc + update filter state
   - once every x samples, wake up the detector tasklet

what if i start out using FSK, because it requires no
synchronization, and use a square window where the 2 frequencies are
placed at each other's zeros.

so.. square window does have perfect rejection for the harmonic
frequencies. it's only the stuff that lies inbetween that is
problematic. ok, this is obvious.
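a quick numeric illustration of the "each other's zero" placement
(Python sketch with made-up bin numbers): tones spaced an integer
number of cycles per window apart are rejected perfectly by a
rectangular window, while a tone halfway between bins leaks through.

```python
# Correlate a tone against a reference bin over one rectangular
# window of n samples and return the normalized magnitude.

import cmath
import math

def correlate(freq_bin, detect_bin, n=64):
    acc = 0j
    for i in range(n):
        tone = cmath.exp(2j * math.pi * freq_bin * i / n)
        ref = cmath.exp(-2j * math.pi * detect_bin * i / n)
        acc += tone * ref
    return abs(acc) / n

on_bin = correlate(freq_bin=8, detect_bin=8)     # the wanted tone
off_bin = correlate(freq_bin=9, detect_bin=8)    # on-bin spacing: rejected
between = correlate(freq_bin=8.5, detect_bin=8)  # in between: leaks through
```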

the problem can be entirely moved to synchronization and linear
distortion due to transitions. if the receiver listens during a steady
state part of the signal, perfect rejection is possible. so the main
questions are:

  - how to synchronize?
  - how to limit transitions?

which brings me back to PSK.. maybe it is just simpler to use? as long
as the start of a symbol can be detected (threshold) and the phase can
be corrected (preamble) the rest seems not so hard really.

again, from a different angle: demodulating PSK is a synchronous mixer
followed by a low pass filter. i assume that a rectangular window is
going to be good enough as an LPF, which just leaves the problems of
signal detection and synchronization.
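the mixer-plus-rectangular-window receiver can be sketched as follows
(idealized Python model: perfect carrier sync, no noise; the sample
rate and carrier frequency are made-up numbers):

```python
# BPSK: multiply by the carrier (synchronous mixer), sum over a
# rectangular window, and take the sign of the real part as the bit.

import cmath
import math

def tx_bit(bit, fc, fs, n):
    phase = 0.0 if bit else math.pi   # 1 = carrier, 0 = inverted carrier
    return [math.cos(2 * math.pi * fc * i / fs + phase) for i in range(n)]

def demod_bit(samples, fc, fs):
    acc = 0j
    for i, s in enumerate(samples):
        acc += s * cmath.exp(-2j * math.pi * fc * i / fs)
    return 1 if acc.real > 0 else 0

fs, fc, n = 8000, 1000, 64
bits = [demod_bit(tx_bit(b, fc, fs, n), fc, fs) for b in (1, 0, 1, 1, 0)]
```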

if i leave the non-synchronized receiver on constantly, it outputs a
24 bit complex number. during synchronization this needs to move
toward zero. the phasor will rotate once per window length. which
direction? if the direction is known, it's possible to detect a
crossing. the direction is determined by the rotation direction of the
mixer phasor.

i need to up the frequency: 2 MIPS is not enough. maybe i should do
that first.. then the output stage, then the receiver, then a decoder.

next actions:
 * have a look at PSK31 demod code
 * build the output stage (either PWM or SD)
 * build board + move to 40 MHz (monday: can't find xtals, maybe test
 on the dude?)

Entry: PSK31
Date: Sat Nov 10 18:56:01 CET 2007


PSK31: Peter G3PLX
http://det.bi.ehu.es/~jtpjatae/pdf/p31g3plx.pdf

some ideas from the paper:

 * this is a protocol for live communication. Error correcting codes
   introduce delays.

 * use relaxed bandwidth for the filter for smaller delays and lower
   cost.

 * take advantage of high frequency stability of modern HF radios

 * demodulation by using 1 bit symbol delay and comparison. ??? i don't
   get this one.

 * synchronize using the amplitude modulation component!

 * viterbi decoder for convolutional code


Entry: a single port for debugging
Date: Sat Nov 10 20:37:18 CET 2007

wait a minute. if i manage to plug the E2 protocol through to the icd
port, i could standardize on a single set of connectives. however, the
connection is not standard, but it is 4-wire (can run over telephone
cable) is synchronous and has a clock too. what this would solve is
the bootstrap upload problem, which is a nasty one..


Entry: transmission bandwidth
Date: Sat Nov 10 20:51:22 CET 2007

something i never really understood: Fig. 4 in the PSK31 paper shows
the bandwidth for random data. why is this so wide? why is reversal
not the highest bandwidth?

other questions: try to explain what this 'bit delay' demod is + how
the amplitude demodulation sync works.


Entry: BPSK synchronization
Date: Sat Nov 10 21:25:37 CET 2007

there are 2 kinds of synchronization necessary: carrier
synchronization, and bit clock synchronization. the former can use a
PLL, the latter can use the 1->0 transition.

suppose the following bit encoding: 8N1, with 1 = idle, and 0 = start
bit. during idle the phasor needs to be predictable. this is either a
fixed value, or an oscillation between 2 signal states. picking the
former this gives

   1 = carrier
   0 = inverse carrier

during idle, the synchronizer works: this is a PLL state machine which
turns a single phase increment left or right depending on which
quadrant the phasor is in. there are 3 bits determining quadrant.

there needs to be an AGC which reduces the 24 bit phasor to an 8 bit
phasor for easier demodulation and synchronization.
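a toy version of that steering loop in Python (simplified: it steers
on the sign of the imaginary part of the phasor rather than full
quadrant logic; step size and iteration count are made up):

```python
# Bang-bang carrier PLL: nudge the local phase one fixed increment
# left or right so the measured phasor is driven toward the positive
# real axis.

import cmath
import math

def pll_step(local_phase, incoming_phase, step=0.05):
    phasor = cmath.exp(1j * (incoming_phase - local_phase))
    if phasor.imag > 0:
        return local_phase + step
    if phasor.imag < 0:
        return local_phase - step
    return local_phase

true_phase = 1.0
phase = 0.0
for _ in range(200):
    phase = pll_step(phase, true_phase)

# residual error, wrapped to (-pi, pi]: bounded by one step
err = abs(math.remainder(true_phase - phase, 2 * math.pi))
```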


Entry: so why not use AM?
Date: Sat Nov 10 21:50:43 CET 2007

somehow both FSK and PSK seem too complicated. maybe i should start
with AM, then later (never) continue down the road and try FSK (double
AM) and PSK (with synchronization).

the most important interference we're going to find is bursts. these
should be possible to eliminate using stop bits: 1 = on, 0 = off, which
means a burst will probably lack a stop bit.

algo: continuous square window filter with signal detector feeds into
simple sampler. if the sampler is not active, every 1->0 transition
will wake it up.
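a bit-level sketch of that sampler (toy Python model, one list
element per bit period): a 1->0 transition wakes it, and a frame is
only accepted when the stop bit is present, which is what rejects a
truncated burst.

```python
# Burst-rejecting sampler: wake on 1->0, shift in 8 data bits, then
# require the stop bit to be 1 before accepting the frame.

def receive(bits):
    out, i = [], 1
    while i < len(bits):
        if bits[i - 1] == 1 and bits[i] == 0:    # start bit: wake up
            frame = bits[i + 1:i + 9]
            stop = bits[i + 9:i + 10]
            if len(frame) == 8 and stop == [1]:  # stop bit check
                out.append(frame)
            i += 10                              # resume after the frame
        else:
            i += 1
    return out

idle = [1, 1]
good = [0] + [0, 1, 0, 1, 0, 1, 0, 1] + [1]      # start + data + stop
burst = [0, 0, 0]                                # glitch, no stop bit
frames = receive(idle + good + idle + burst + [1, 1, 1])
```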

before starting with AM, i can just use some noise modulated
protocol. hell, anything that can get a 1 accross.



Entry: roadmap
Date: Sat Nov 10 22:16:08 CET 2007

EDIT:

  * try strong phase and run it off E2 (OK)
  * level detector, use the RGB led.

  

Entry: E2 next
Date: Sun Nov 11 01:02:47 CET 2007

apparently the E2 signal interferes quite a bit with the amplifier,
which is not such a big surprise. so i guess it's time to mature the
debug network a bit: 

  * switch to idle mode (keep high) when there's no host -> target
    comm.

  * find out what the initial 'missed ping' is all about.

i'm going to add stop bit checks to at least eliminate 1->0 bus
glitches as a source of errors.

that wasn't the problem... something is wrong with bus startup. maybe
i need to make sure 'off' will actually switch on power state?

looks like the error is with the slave init.. i get a predictable
reply to a ping after bus-reset:

> 13 >tx
> rx-size p
3 > rx> px
2D > rx> px
AD > rx> px
F7 > rx-size p


i made a little progress here: 

> hub-init
ERROR:
time-out: 1
> rx-size p
0 > 5 >tx 8 >tx 0 >tx
> rx-size p
1 > rx> p
131 > 9 >tx 2 >tx
> rx-size p
2 > rx> px
FF > rx> px
D0 > 

this sends the commands for fetching the 2 bytes at rom address
#x0008, which indeed should be #xFF, #xD0

this is reproducible. so i can conclude that the bytes get received
properly. something goes wrong in either the slave transmitter or the
host receiver..

i give up.. i can't find it. a workaround which seems to be stable is
to send an 'ack' which will send back a garbled byte.

apparently, unplugging and replugging the E2 connector gives the same
behaviour: first byte coming from slave is corrupted. so it can't be
host side..



Entry: amp notes
Date: Mon Nov 12 15:29:42 CET 2007

i changed the circuit back to 1K input impedance, 100K feedback in
first stage. the second stage has 1K + 100nF, and 100K feedback, and I
have no idea why this works: less noise, and it seems to have a good
response in the intended range..

maybe because most sounds have a 1/f response? i don't know... it
responds well to whistles, which is nice.

this is a 10 kHz pole... so it's basically set up as a differentiator?
maybe because i have GBW rolloff this works? i'm puzzled.

i tried with a LM358N and it gives a lot more noise.

i tried TL072CN and it gives too much bandwidth! so, i use a
compensated integrator with 1K/100n in the source and 220K/4.7n in the
feedback section. looks like this is final enough.. maybe beef up the
amp just a tiny bit more..

PARTS LIST:

2 x 220K
1 x 100K
2 x 10K
3 x 1K

2 x 15pF
1 x 4.7nF
2 x 100nF
2 x 10uF

1 x 10MHz
1 x 18F2620
1 x TL072CN
1 x LED(red)
2 x 6 PIN HEADER
                                           C2 4.7nF 
                                        /-----||----\
                                        |           |
   SPK            Rg 220K               |  R2 220K  |
    o          /-/\/\/\--\              o--/\/\/\---o
    |          |  __     |              |  __       | 
    |   Rs 1K  | |  \    |       R1 1K  | |  \      |  
    o--/\/\/\--o-| - \___o--||--/\/\/\--o-| - \_____o LINE
|\  @        ____| + /     C1 100nF     __| + /
| ||@       |    |__/                   | |__/
|/  @       o---------------------------/          
    |       |
   === Cs   |
    |  10uF |
    |       |
    |   Cn  |
    |  10uF |
o---o---||--o
|           |          
o--/\/\/\/--o---/\/\/\/--o 
GND        2.5V         5V


First stage gives 220 x amplification.

TL072 (TI version, i'm using ST version) has a GBW of 3 MHz, with 220
x amplification this gives rolloff at 13 kHz. so for the first stage
i'm good.

Second stage is a band pass filter with 22 x amplification:

G . . . . ._________
          /.       .\
         / .       . \
        /  .       .  \
         1/t2     1/t1

t1 = R1 C1 = 100us  -> 10kHz
t2 = R2 C2 = 1000us ->  1kHz


because f1 > f2 the gain is not R2/R1 but R2/R1 * f2/f1 = C1/C2,
i.e. about 22x.

a bit quirky, but it works.. maybe i should try with exchanging the
time constants so f2 > f1. 

looks like these changes keep the transfer function the same, with a =
sqrt(10)

R2 -> 1/a R2
C2 -> 1/a C2
R1 -> a R1
C1 -> a C1

so, there's a reason to do it like i did! the capacitors are
smaller. so where's the trade-off? maybe noise due to large resistors?
however, when f2 > f1 the gain is independent of the capacitors.

let's make this a bit more intuitive. what happens when C1 is made 10x
larger, so f1 = 1kHz, and C2 is made 10x smaller, so f2 = 10kHz? the
gain is now 10x more, so then the gain can be reduced by making R2
10x smaller, which again requires C2 to be 10x larger. so the net
effect is:

R1 -> R1
C1 -> 10 C1
C2 -> C2
R2 -> 1/10 R2

this gives a 1uF capacitor. so alternatively R1 can be made 10x
larger, which requires C2 to be made 10x smaller. giving 10K and 470pF
respectively. (EDIT: this is what i did. works fine).
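as a numeric sanity check on the 22x figure (using the parts-list
values, so t2 = 220K * 4.7n is ~1034us rather than the rounded
1000us), the mid-band gain indeed collapses to C1/C2 =
(R2/R1)*(t1/t2), about 21x:

```python
# Magnitude of the inverting band-pass stage:
#   |H(w)| = R2*w*C1 / sqrt((1 + (w*t2)^2) * (1 + (w*t1)^2))

import math

R1, C1 = 1e3, 100e-9      # input leg:    t1 = R1*C1 = 100 us
R2, C2 = 220e3, 4.7e-9    # feedback leg: t2 = R2*C2 ~ 1034 us

def gain(f):
    w = 2 * math.pi * f
    num = R2 * w * C1
    den = math.sqrt((1 + (w * R2 * C2) ** 2) * (1 + (w * R1 * C1) ** 2))
    return num / den

mid = gain(500)           # roughly the middle of the passband
expected = C1 / C2        # = (R2/R1) * (t1/t2), about 21x
```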

makes more sense now. so is there a circuit that has independent
frequency and gain?

hmm.. i just tried the LM358N again, and it gives good results
also. guess the TL022 was just too low bandwidth? yup. 0.5 MHz. hmm,
the LM358N is only 1 MHz?






Entry: building the first krikit boards
Date: Mon Nov 12 15:41:18 CET 2007

 - not using E2, use serial + icd2 instead
 - xtal 40 MHz operation

pins to determine:
     * opamp + bias power
     * analog input (maybe first stage also)
     * speaker out

and figure out if the opamp can take 5V when it's powered down,
otherwise it needs an extra pin to pull the (+) input to ground also.


Entry: new sheepsint default app
Date: Thu Nov 15 20:00:15 CET 2007


something like this:

buttons:
* noise on/off
* xmod
* reso
* reset = silence

take button state from ram, uninitialized, so it survives reset.

xmod control uses 2 x 2Hx - 20kHz log
reso needs robustness for reso freq < main osc freq

3 x frequency knobs

2 knobs left.. maybe some modulation? osc 2 frequency + modulation
index. (formant / noise frequency)


Entry: fake chaos
Date: Sat Nov 17 02:27:51 CET 2007


following the same line as the formant oscillator, a fake chaotic
filter could be made. such oscillators (some / all?) contain
unstable oscillations that are 'squelched'. the points where such
squelching happens are randomly distributed, but the bursts themselves
are quite stable, leading to an approximation as randomly spaced fixed
bursts.

does this work with the current setup? no.. it uses a random
period. that's different.

so... using the reso algo, it boils down to randomizing p0 with fixed
p1 and p2.

randomizing could be fixed + variable. the question is when to update
the period. the easiest is a fixed rate..

continuous updates seem to work. now p0 is modulated with a uniformly
distributed value, taking care not to over-modulate, which would make
p0 wrap around. everything is moved to prj/CATkit/demo.f
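the period update might look like this (Python sketch with made-up
constants; the real code is Purrr on an 8-bit target):

```python
# Fake chaos: period p0 = fixed base + uniform random part, with the
# modulation depth chosen so the 8-bit period can never wrap.

import random

BASE = 100    # fixed part of the period
DEPTH = 50    # modulation depth; BASE + DEPTH must stay < 256

def next_p0(rng):
    return BASE + rng.randrange(DEPTH)

rng = random.Random(42)
periods = [next_p0(rng) for _ in range(1000)]
```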



Entry: amplifier noise
Date: Mon Nov 19 21:25:32 CET 2007

for the next iteration of the krikit board, it might be a good idea to
improve the amp a bit. there are 2 things to consider:

  * input stage noise (impedance)
  * filter/amp stage capacitor values vs. noise and power consumption



Entry: shopping
Date: Mon Nov 19 23:57:44 CET 2007

AITEC:
 - perfboard?

VOTI:
 - perfboard

RS:
 - perfboard (RS: 206-8648, manuf: RE 200 HP)
 - oscillator
 - 9V battery holders + linear regulators
 - 8pin sockets
 - small signal diodes
 - high ohm resistors
 - blue,black bell wire


Entry: krikit todo
Date: Tue Nov 20 17:58:26 CET 2007


 - determine pins: analog in, opamp enable, 
 - output transistor: speaker out pin.
 - debug net: E2 / serial: minimal slave complexity solution

also for catkit: it might be best to connect the E2 bus to the serial
TX pin, which is multiplexed with an INT pin. (also for 2620? no, but
can be connected externally.)



Entry: ditch E2 ?
Date: Tue Nov 20 18:16:27 CET 2007

simple TTL serial with a bit of careful programming to ensure enough
'on' time (basically, a large enough cap or some extra '1' bits in the
data) might be a better approach, since it doesn't require a special
decoder in the target chip.

i could use a 'standard' here: the stereo minijack used in some A/V
equipment. ftdi sells them too apparently:
http://www.ftdichip.com/Images/TTL-232R-AJ%20pinout.jpg

or: leave the choice between E2 and serial open. given a bit of delay
on the client side when sending, and a proper 'listen' phase on the
host, a serial protocol can be used using the same hardware as the E2
bus:


         1K
TX  o--/\/\/\/\----\
                   |
RX  o----------o---o---o BUS
               | 
       /--|<|--/
       |
VDD o--o
       |
      === 10uF
       |
GND o--o---------------o GND


another thing to do would be to make the E2 protocol compatible with
the hardware uart. the problem here is the factor 5...


          (+)                      (-)

SERIAL    client + hw simple       4 wires 
SER+POW   client simple            3 wires
E2        client complex           2 wires


hmm...  the thing which i find most attractive is the possibility to
have a POWER socket that can be used as a comm. the rectification
diode also acts as a protection diode this way, and diode drop is not
really a problem when powered from 3V-5V.

so the main question is: how to make SERIAL run over 2 wires, given
the setup above? this is a software problem: how to synchronize. the
question is whether this extra synchronization effort will lead to more
slave complexity, with the bound being the E2 rx/tx.

 POW: works as long as cap is large enough
 RX:  always works
 TX:  works as long as host leaves room on the cable..

so the problem is for host to create a window. this shouldn't be too
big, for POW reasons.. to solve the timing issues here, it looks to me
that complexity will be a lot higher. i guess it's best to stick to
E2, but keep open the possibility of unidirectional serial comm.

conclusions:

     (4) serial + separate power over 4 lead telephone wire
     (3) serial, power from data, using stereo audio cable
     (2) E2: 2 wire, power connector

which one for krikit?




Entry: problem chips
Date: Wed Nov 21 20:40:14 CET 2007

18LF2610-I/SP doesn't seem to want to program..
ok. that was stupid. they're not self-writable.



Entry: krikit pins
Date: Wed Nov 21 22:31:09 CET 2007

input works seemingly without problems. output is going to be a bit
more problematic.. i think it's not a good idea to drive the speaker
with the pin directly.. for 2 reasons: 8 ohms is too heavy a load and
the drive point needs to be tolerant for analog voltages (a CMOS input
is not, and i'd like to use the PWM)

with the current setup, a PNP switch is probably best.

so, design variables:

    * PNP / NPN  (cap to ground or Vdd)
    * suppression diode?
    * feed from battery (9V) or Vdd (5v regulated)

EDIT:
1K with PNP on 13/RC2/CCP1

EDIT:
ok. make sure the speaker is not full-on, the transistor gets really hot.

EDIT:
running into a problem: i'm using high = off, which apparently the PWM
doesn't like: it gives a single spike. so i need to explicitly turn
off PWM.

EDIT:
the chip resets unexpectedly. trying now with ICD2 attached: seems to
be stable. so something's wrong with my reset circuit probably. could
be power supply stuff. some spikes..


Entry: standard 16 bit forth
Date: Thu Nov 22 04:53:21 CET 2007

i keep coming back to a standard forth for sheepsint. purrr18 is there
to stay as a low level metaprogrammed machine layer, but teaching it
is a real pain..

maybe the time is there.. maybe a safe language is not the way to
go. maybe a simple forth is more important? maybe standard is
important after all? i have a lot of design choices to make, like
building the interpreter on top of a unified memory model or
not..

 [ mostly triggered by ending up at the taygeta site (from e-forth) ]

more questions. if i want to make a standard forth platform, wouldn't
it be better to go for the 18f2620 with a resonator and a linear
regulator, and add a keyboard in and video out while i'm at it? why
not the dspic then? ( because i didn't port to it yet, hey! )

so possible projects for january:
  - portable forth on top of purrr18
  - linear safe language on top of purrr18
  - dspic assembler + compiler
  - a home computer based on 18f2620

strategically, portable forth seems to be the best option, since this
solves most of the documentation issues.. dspic and the home computer
are more of a lab thing. the linear safe language is something i need
to figure out how to do first in PF context.

portable forth could use 'code' words to switch to purrr? maybe it's a
good exercise all in itself to try to write a standard forth, and not
care too much about optimization etc.. i have my non-standard forth
now, so it's good to aim for the average.


Entry: the circuit again
Date: Thu Nov 22 18:59:33 CET 2007

because the input impedance is so low, the 10uF cap is really not
negligible! in fact, this gives a 10ms time constant with a 1K
resistor. that's 100Hz, but the filter cuts off at 1kHz, so it's ok..
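putting numbers on this (a quick python check, component values from
the paragraph above): both the 1/tau figure and the -3dB corner of the
RC section sit well below the 1kHz cutoff, so the conclusion holds.

```python
import math

R = 1e3     # 1K series resistor
C = 10e-6   # 10uF coupling cap

tau = R * C                         # 10ms time constant
f_tau = 1 / tau                     # ~100 Hz, the figure quoted above
f_corner = 1 / (2 * math.pi * tau)  # ~15.9 Hz, -3dB corner of the RC
print(f_tau, f_corner)
```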

it's not ok for what i wanted to do, which is to use only a single
transistor to drive the speaker without a capacitor. this might
reverse polarize the cap: it's probably best to keep the switching
frequency high enough so this doesn't need to happen.. i wanted to
replace it with a ceramic one, but that needs at least 1uF.

wait a sec.. maybe it's just not possible to drive the cap to a
negative voltage? yup.. the + side of the cap will be at saturation
voltage.

                           
                  Rg 220K  
               /-/\/\/\--\
               |  __     |
        Rs 1K  | |  \    |
    /--/\/\/\--o-| - \___o     o Vdd
    |           _| + /         |
   === Cs      | |__/           >|
    | 10uF     |                 |---/\/\/\---o SPK
    |          o Vbias          /|
    |                          |
    o--------------------------/
    |
|\  @
| ||@
|/  @
    |       
    0 
   GND


i wonder if it's possible to make the circuit such that the transistor
doesn't blow up if SPK is driven low for too long. check this:
http://www.winbond-usa.com/products/isd_products/chipcorder/applicationbriefs/apbr21a.pdf

somehow the DC path needs to be blocked or at least limited.

hmm.. i think there's really no better way than to use switching: that
produces the least amount of heat in the transistor. just be careful
to not drive it too long, and use minimal DC: start wave 'touching'
ground instead of symmetric around 2.5V.

i'm ordering some BC640, which are TO92(A), 1A drop-in for A1015. i
had a BC516 PNP darlington on my list, but the BC639 is 1A. for the
switching loads i care about, i don't need high beta.


Entry: crap.. transistor won't switch off
Date: Sat Nov 24 20:50:04 CET 2007

using 78L05 regulator for the chip, but wanted to drive the speaker
straight from the 9V battery. problem is, i can't switch off the
transistor if i'm not using open drain output and a pull-up
resistor..

so i guess just stick with connecting the speaker to the regulator
output, and up the regulator a bit... though it's 100mA, maybe it can
take a bit of peak current?

do i have everything now? i guess.. maybe an on/off switch, but that's
easy to do later. also, if possible, add a connector for the led, so
it can be brought out to the box.



Entry: got the first carrier on the mic amp
Date: Sat Nov 24 13:03:50 CET 2007

sort of a little milestone.. but no time for celebration yet. it's at
600Hz, which seems low.. putting it higher gives less response. and
it's quite distorted. the higher the frequency, the more distortion.

ok..

so the resonance frequency of the speaker, measured by moving it over
the table, is about 625 Hz. which is of course the reason why i get
such a good response at 610Hz :)

this will be really hard to get out, so why not use it? use either fm
or pm at that frequency, and adapt the filter / amp accordingly.

so what about this:

make the amp go from 450Hz to 1kHz, and use the resonance frequency of
the speaker as carrier wave.

22K 100nF (450Hz) - 1M 1nF (1kHz) - gain = 45

  init-analog pwm-on 1 freq ! 2 att ! 0 nwave


half of that seems to work fine too.. 305Hz. what about a golden ratio
FSK modulation scheme? that way the harmonics of the lower one won't
interfere with the higher one..

EDIT: sticking to one carrier seems best in light of this resonant
peak.




Entry: direct threaded forth
Date: Thu Nov 29 11:02:39 CET 2007

a couple of days of rest doing admin stuff.. going to amsterdam today
for the final sprint. some things that crossed my mind:

DTC FORTH VM

  * purely concatenative virtual machine code: implement literal as
    i.e. 8 bit literal, and 8 bit shift + literal: code never accesses
    IP. same for jumps.

  * unified memory model is probably more important than speed: it
    allows for other memory mapped tricks, since memory has a single
    access point. the real problem is that instruction fetch is built
    on top of the memory model. maybe this can be optimized somehow?


DEMODULATOR

  * bitrate ~ bandwidth^(-1)

    this can easily be seen in the response of a sharp filter: it
    rings a lot, so can't accommodate much time variation.

  * AM (data=power) -> PM (signal + data=phase). 

    i'd like to stick to a single carrier. the reason is that the
    resonant peak of the speaker is something that best can be used
    instead of fought. i didn't measure it, but it at least looks and
    sounds quite sharp. comparing waveforms on the scope, i would
    guess 12dB through both sending and receiving speakers.

  * sampling rate is only dependent on bit rate, not on carrier
    frequency: aliasing sampling can be used. this means that in order
    to accommodate more processing power on the same chip, the bitrate
    can just be lowered.


and the combination of both: since this is going to be quite
math-intensive, it's probably best to choose a somewhat more
high-level approach: construct a couple of decent abstractions, maybe
some easy to use fixed point math routines, instead of perfectly
optimal ones.. the chip runs at 10MIPS. if i aim at 100bps, that's
100k instructions per bit; at 10bps, a million.. the idea is to get it
to work first.



Entry: math routines
Date: Fri Nov 30 12:21:49 CET 2007


time for math routines. some design decisions:

     * signed/unsigned
     * bit size
     * saturated/overflow

it would be nice to be able to reuse these later in the DTC standard
forth as math routines. i do have special need here, in the sense that
the input is only 8 bit.

the main problem is the multiplication routine. the standard has a
16x16 -> 32 signed multiplication.



2 approaches for the filter:
  * simple: 2nd order IIR bandpass
  * matched FIR filter as in PSK31


i have enough memory to perform FIR filtering. let's focus on trying
to understand the PSK31 demodulator. till now i only found code
examples, no highlevel pseudocode or diagrams.

here Peter G3PLX talks about AFC (automatic frequency correction):
http://www.ka7oei.com/fsk_transmitter.html#FSK31_Explained

with PSK apparently the frequency correction doesn't need to know
anything about the data, since the spectrum is symmetric around the
carrier. 

i'm not sure whether AFC is necessary in my scheme: all receivers and
transmitters are stationary, and there's no wind. on the scope however
i did see some slight variation in period, but this was probably due
to motion of the speaker/mic (just sticking up by its pair of
connecting wires).

 "To get in sync. the PSK31 receiver derives it's timing from the 31Hz
  amplitude modulation on the signal. The Varicode alphabet has been
  specially designed to make sure there's always enough AM to keep the
  receiver in sync. Notice that we can extract the AM from the
  incoming signal even if it's not quite on tune. In PSK31 therefore,
  the AFC and the synchronisation are completely independent of each
  other."

So it's not completely true that the AFC doesn't need to know anything
about the data: data needs to be 'rich enough'. But the trick of
getting the AM straight from the signal is interesting. This means i
can probably proceed nicely from AM -> PM

Some alarming notions here:
http://www.nonstopsystems.com/radio/frank_radio_psk31.htm

 "Like the two-tone and unlike FSK, however, if we pass this through a
  transmitter, we get intermodulation products if it is not linear, so
  we DO need to be careful not to overdrive the audio. However, even
  the worst linears will give third-order products of 25dB at +/-47Hz
  (3 times the baudrate wide) and fifth-order products of 35dB at
  +/-78Hz (5 times the baudrate wide), a considerable improvement over
  the hard-keying case. If we infinitely overdrive the linear, we are
  back to the same levels as the hard-keyed system."

What i saw on my scope is a strong 2nd harmonic, probably due to
non-linearity caused by the DC bias in the speaker. Using some kind
of feedforward correction based on a measurement it is probably
possible to correct this when it becomes a problem: the transmitter is
simple enough so all kinds of wave shaping corrections could be
introduced there.

 "The PSK31 receiver overcomes this (ED: side lobes due to square
  window) by filtering the receive signal, or by what amounts to the
  same thing, shaping the envelope of the received bit. The shape is
  more complex than the cosine shape used in the transmitter: if we
  used a cosine in the receiver we end up with some signal from one
  received bit "spreading" into the next bit, an inevitable result of
  cascading two filters which are each already "spread" by one
  bit. The more complex shape in the receiver overcomes this by
  shaping 4 bits at a time and compensating for this intersymbol
  interference, but the end result is a passband that is at least 64dB
  down at +/-31Hz and beyond, and doesn't introduce any
  inter-symbol-interference when receiving a cosine-shaped
  transmission."

 "PSK31 is therefore ideally suited to HF use, and would not be
  expected to show any advantage over the hard-keyed
  integrate-and-dump method in areas where the only thing we are
  fighting is white noise and we don't need to worry about
  interference."

So maybe it's not necessary yet? Since we're using a single frequency
in the first attempt, a demodulator that rejects nearby signals might
not be required.

Anyway. Conclusion: i need to have a look at the exact algorithm used
for matching + synchronization.


Entry: demodulator.f
Date: Fri Nov 30 21:07:31 CET 2007

i had some bottom up code (what can be done efficiently) using
8x8->16 unsigned multiplication and 24bit accumulation. this works
well for rectangular windows, but not so much for non-rectangular.

maybe rectangular is enough since we don't have interfering signals?
anyway, it might be wise to look at how to do a windowed one..

i guess the idea is like this: make the window obey some kind of
average property that can be removed using maybe a separate
accumulation of the signal.

it doesn't look that hard: ** is inner product

[ s(t) + s_0 ] ** [ w(t) + w_0 ]

so there are 3 correction terms:

   s(t) ** w_0  == 0
   w(t) ** s_0  == 0
   w_0 ** s_0

so the only variable component is the average signal s_0, which needs
to be scaled with the window DC component (can be 2^...) plus a fixed
offset.

so i can basically use the same unsigned core routine for general
complex FIR filters: renamed the macros to mac-u8xu8.f, and added
complex-fir.f
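the correction terms above can be checked numerically. a python sketch
of the identity, taking both DC offsets to be a #x80 centering
(sample values are made up):

```python
# with x = s + 128 and y = w + 128:
#   sum(s*w) = sum(x*y) - 128*sum(x) - 128*sum(y) + n*128*128
# i.e. one unsigned MAC plus two plain accumulations recovers the
# signed inner product.

xs = [130, 120, 200, 55, 128, 99]    # unsigned samples
ys = [128, 255, 0, 77, 130, 128]     # unsigned coefficients
n = len(xs)

mac_u = sum(x * y for x, y in zip(xs, ys))       # what the inner loop computes
corrected = mac_u - 128 * sum(xs) - 128 * sum(ys) + n * 128 * 128
signed = sum((x - 128) * (y - 128) for x, y in zip(xs, ys))
print(corrected == signed)   # True
```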




Entry: drop dup
Date: Fri Nov 30 23:02:59 CET 2007

optimization: drop dup -> movf INDF0


Entry: implementing the filter loop : complex-fir.f
Date: Sat Dec  1 10:27:37 CET 2007

i have it down to about 31 instructions for an unsigned multiply
accumulate operation 8x8 -> 24, and an accumulation. both can be
combined after the loop to correct the offset.

offset compensation is implemented now, and all sharable code has been
moved to macros in mac-u8xu8.f

routines are tested and seem to work just fine. so:

    filter coefficients are centered at #x80, but the accumulator will
    shift one position to the left to compensate, so the filter
    coefficients behave as s.7 bit fixed point with inverted sign bit
    if the accumulator is seen as s.15.8

this means that: 

    11111111 -> +0.1111111
    00000000 -> -1.0000000
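in other words, a coefficient byte b stands for (b - 128)/128. quick
python check of the boundary cases:

```python
def coef(b):
    # coefficient byte -> s.7 value, offset-binary around #x80
    return (b - 128) / 128.0

print(coef(0xFF))   # 0.9921875  (= +0.1111111 binary)
print(coef(0x00))   # -1.0
print(coef(0x80))   # 0.0
```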


Entry: -!
Date: Sat Dec  1 11:49:04 CET 2007

\ value addr --
: -! >r negate r> +! ;


this subtracts the number on the stack from the variable, not the
other way around. note that this has the argument order of '!' not of
'-'

the reason for doing it like this is that this occurs the most:
subtract a value from an accumulator variable.


Entry: subsampling
Date: Sat Dec  1 13:43:35 CET 2007


A baud rate that sounds like 16th notes would be nice, which is about
8 Hz. A carrier of 600 Hz, this gives a ratio of 75. The sampling rate
needs to be > 16Hz, so let's take the one in the neighbourhood of
32Hz.

Care needs to be taken though when using aliasing: the frequency error
will amplify. Let's see.. using 10 Mhz, the subdivisions become:

2^20 -> 9.5 Hz     baud rate
2^18 -> 38.1 Hz    sampling frequency
2^14 -> 610.4 Hz   carrier frequency
2^12 -> 2.44 kHz   4 x carrier
2^7  -> 78.1 kHz   PWM frequency

the carrier/baud rate here is 2^6 = 64
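the table can be reproduced directly from the 10MHz clock (python
check):

```python
F_OSC = 10e6   # 10MHz instruction clock

# the binary subdivisions above, reproduced from the clock
for k, role in [(20, "baud rate"), (18, "sampling frequency"),
                (14, "carrier frequency"), (12, "4 x carrier"),
                (7, "PWM frequency")]:
    print("2^%2d -> %10.1f Hz  %s" % (k, F_OSC / 2 ** k, role))

# carrier / baud rate = 2^(20-14) = 64
```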

going from carrier -> sampling frequency is a subdivision of
16. what's the error of the oscillators?

CSTLS10M0G53 has 0.5 % precision. times 16 that becomes 8.0 % which is
quite a lot.. so probably it does need continuous phase compensation??

another reason to not subsample is to get better noise performance and
better frequency rejection due to longer integration time. using 4 x
carrier frequency is still only 2.4 kHz which is at 2^12 subdivision,
or 4k instructions per sample, which is absolutely no problem. this
gives 2^8 samples per symbol.

another thing to think about: synchronization. this can be implemented
using time shifts or phasor rotation. the latter is probably not a
good idea due to problems with filter matching. so actually, the
carrier needs to be significantly oversampled, or at least mixed.. i
think i need to make a new table with variables..


Entry: synchronization
Date: Sat Dec  1 14:34:11 CET 2007

if there are enough symbol alternations present this causes
significant AM modulation which makes synchronization easy: sync to
the zero crossings. this means the preamble needs to be 01
transitions.. probably best to use simple async with 1 = idle =
transition and 0 = no transition.

next: 

 * AM send
 * AM receive


Entry: multiplication again
Date: Sat Dec  1 15:16:13 CET 2007

funny how this is starting to be an exercise in multiplication
routines :)

since i'm using unsigned multiplication for the filter for efficiency
reasons, i have to implement signed multiplication in cases where
correction can't be moved to outside of a loop, which is the generic
case.

the 16bit multiplication performs correction using conditional
subtraction. for 8 bit it's probably easier to use conditional
negation, since that doesn't require extra storage.

this sucks.. -1 * -1 overflows to -1..  maybe it's better to use a
representation that can actually encode 1, even if this means giving
up one bit of precision?

s1.6
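a python illustration of the overflow, and of why s1.6 avoids it
(wrap8 just reinterprets the low byte as signed):

```python
def wrap8(v):
    # reinterpret the low 8 bits of v as a signed byte
    return ((v + 128) & 0xFF) - 128

# s.7: -1.0 is the byte -128; (-1)*(-1) should be +1 but +1 is not
# representable, so the product wraps back to -1.0
prod7 = wrap8((-128 * -128) >> 7)
print(prod7)          # -128, i.e. -1.0 again

# s1.6: one fraction bit less, but 1.0 (= 64) is representable
prod6 = (-64 * -64) >> 6
print(prod6)          # 64, i.e. +1.0
```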


Entry: userfriendly
Date: Sun Dec  2 10:37:42 CET 2007

i need to weed a bit in the user-friendliness and SIMPLIFY the way
some things are used, because it seems as if some combinations cannot
be made. i wanted to create scheme code that generates forth macros,
but it looks like this is not so easy!

another thing is 'splitting' the host and target, so the host can run
some kind of query program in cat or scheme.. maybe the 'current-io'
parameter should be set back again in prj and scheme modes?


Entry: clicks
Date: Sun Dec  2 13:03:06 CET 2007

need ramp-up and ramp-down to prevent clicks. ramp-up time should be
in the order of 20ms = (50Hz)^(-1)

after ramp-up, carrier fade in should be used. this can use the
'attenuation' variable. OK. using 25ms ramp to bias.

now, how to initialize the phase?

  OOK: can start envelope with -128 (amp = -1)
                 carrier with -64 (amp = 0)

  BPSK: needs carrier fade-in.


looks like BPSK sounds smooth enough without envelope fade-in when
starting the carrier at phase = -PI/2

doing the same now for AM, so there's no problem with envelope
frequency = 0.



Entry: transmitter
Date: Sun Dec  2 15:08:19 CET 2007

time to get the transmitter sorted out, so i can make a standalone
device that sends out a known data sequence:

  * combine the framed rx/tx with sending/receiving
  * figure out OOK and BPSK transition based send modes

return to zero, i don't see the point in that, so transition based
seems good enough. let's say 1 = trans, 0 = notrans. this has the
advantage that an idle line is the richest signal, good for sync
purposes.

transition based is easiest to implement using the current code.  in
case transition based is not desired (i.e. because it accumulates
error), this can still be pre-coded as long as a transmission starts
with a known oscillator phase.
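the transition coding is easy to sketch in python (encode/decode are
my toy names, not words in the code):

```python
# 1 = phase flip, 0 = no flip; decode compares consecutive phases.

def encode(bits, phase=0):
    out = []
    for b in bits:
        phase ^= b
        out.append(phase)
    return out

def decode(phases, phase=0):
    out = []
    for p in phases:
        out.append(p ^ phase)
        phase = p
    return out

msg = [1, 0, 1, 1, 0, 0, 1]
print(decode(encode(msg)) == msg)   # True: round trip
print(encode([1] * 6))              # idle line toggles every symbol
```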


Entry: signal rates revisited
Date: Sun Dec  2 18:50:57 CET 2007

4 independent frequencies:

  * PWM TX rate: determines high-frequency + aliasing noise
  * carrier:     only important for path (i.e. speaker reso)
  * baud rate:   bandwidth -> noise sensitivity
  * RX rate:     selectivity (related to FIR length) 

(EDIT: carrier and baud rate are not independent wrt data filter
quality. see below)

important for the receiver are :

 - baud rate, which limits the maximal integration time (dependent on
   symbol length).

 - RX rate: enables longer filter lengths, which gives more
   selectivity and noise immunity.


it doesn't make sense, for constant baud rate, to up the RX frequency,
but keep the FIR length constant, so:

        FIR =  k . (RX / BAUD)

        RX = Fosc / OPS

where k is the number of symbols the FIR spreads over, probably 1 or
2. and OPS is the amortized number of operations per sample
(processing and acquisition). the filter is 32 in the current
implementation.

this gives about 300kHz at 10MIPS. looks like we have some
headroom..

anything more than 8kHz is probably not going to make much sense.
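checking the numbers with the formulas above (python; OPS = 32 per the
current filter, other rates taken from the divider table):

```python
F_OSC = 10e6   # 10 MIPS
OPS = 32       # amortized instructions per sample (current filter)

rx_max = F_OSC / OPS
print(rx_max)        # 312500.0 -> "about 300kHz"

# with the divider-table rates: RX at 4 x carrier, baud at 2^20
RX, BAUD, k = F_OSC / 2 ** 12, F_OSC / 2 ** 20, 1
fir = k * RX / BAUD
print(fir)           # 256.0 taps, i.e. one symbol of samples
```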


( I was thinking about noise and dithering: at this high frequency,
  in the absence of noise, there will be no 'extra' sensitivity from
  dithering at levels close to the quantisation step, but there
  probably will be some due to pwm effects. So it looks like all the
  small bits help.. )


EDIT: another variable i forgot to mention is symbol rate
vs. carrier. using a mixer, it is desirable to have large separation
between the two so a simple data filter can be used.


Entry: matched filter
Date: Sun Dec  2 21:22:33 CET 2007


differential BPSK data stream:
.   .___.   .   .___.
 \ /     \ / \ /
  X       X   X  
./ \.___./ \./ \.___.
  1   0   1   1   0

Using cosine crossfading as implemented in modulator.f is effectively
the same as using symbols 2 baud periods wide with a 1 + cos
envelope. this wavelet is the output filter which maps a binary +1,-1
PCM signal to the shaped BPSK signal.

This output filter needs to be matched in the receiver.

Now, about matched filters..

A matched filter in the presence of additive white gaussian noise is
just the time-reverse of the wavelet: one projects the observed signal
vector onto the 1D space spanned by the wavelet's vector in signal
space. This gets rid of all the disturbances orthogonal to the
wavelet's subspace.

When the noise is not white, the noise statistics are used to compute
an optimal subspace to project onto, such that most of the noise will
still vanish.

I don't have noise statistics, and I'm not going to use any online
estimation, which leaves me to plain and simple convolution with the
time-reversed signal.
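a toy python version of exactly that: correlate the signal against the
wavelet (the same thing as convolving with its time reverse) and look
for the peak. numbers are arbitrary, no noise added:

```python
wavelet = [1, 3, 5, 3, 1, -2, -4, -2]
offset = 11
signal = [0] * 32
for i, w in enumerate(wavelet):
    signal[offset + i] = w

def correlate(sig, tpl):
    # slide the template over the signal, inner product at each shift
    n = len(tpl)
    return [sum(sig[i + j] * tpl[j] for j in range(n))
            for i in range(len(sig) - n + 1)]

out = correlate(signal, wavelet)
print(out.index(max(out)))   # 11: the peak recovers the wavelet position
```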

I do wonder what all this talk is about 'designing' matched filters
for PSK31...


Entry: phase synchronization
Date: Sun Dec  2 22:27:44 CET 2007

I'm confusing 2 things:

  * bit symbol synchronization
  * carrier phase synchronization

Using a complex matched filter, phase synchronization can be done
entirely by using an extra phase rotation operation: it doesn't really
matter what comes out, as long as:

 - the matched filter's envelope is synchronized 
 - we're using I and Q filters

It's clear that mismatching the symbol clock has a lot less effect
than mismatching the carrier phase. When compensating the carrier
phase, we compensate what's aliased down after subsampling at symbol
rate.

This still needs 2 separate synchronization methods: bit symbol
synchronization (which sample point to start filtering) and carrier
frequency/phase compensation.



Entry: recording sound
Date: Mon Dec  3 10:35:52 CET 2007

let's go for this: a symbol is 256 samples. This allows easy buffer
management. This fixes the sample rate at 4.88 kHz

recording seems to work ok: it's at 8x the reso frequency of the
speaker: scratching it over a newspaper gives a nice saturated wave
with period about 8.

maybe it's time to add gplot in the debug loop.



Entry: IIR or FIR
Date: Mon Dec  3 15:39:49 CET 2007

IIR:
      * mixer + lowpass
      * sync mixer to carrier
      * start bit detection = zero crossing
      * no messing with blocks
      * only approximate matching

FIR:
      * possible to construct optimal matching filter
      * no phase distortion
      * synchronization more complicated (filter freq is fixed?)
      

It looks like IIR + PLL is really simpler to implement (same at every
sample block, no buffers necessary).

NEXT: have a look at how to implement a PLL.

( Actually... It should be possible to mix the asymmetric tail of a
  stable IIR filter in the transmitter! Though not simple due to
  rounding.. Something like that can probably not be computed exactly,
  so this would require a bit more expensive transmitter.. )

Using a PLL it's probably best to first try to synchronize to a clean
carrier. Since a mixer is necessary as part of the processing chain,
that can be used to perform the correction.

                                     
-> [ MIX ] -> [ LPF ] ----> I --> [ AGC/HIST ] -> bits
      ^               --o-> Q
      |                 |
      \-------[ OSC ]<--/

The quadrature component can be used as an error feedback. This always
works, since it's not present in the signal.
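a toy python model of the diagram (the gains, rates and the sign-based
phase detector are my guesses, not tuned values): the Q arm, gated by
the sign of I so BPSK data would cancel, steers the oscillator.

```python
import math

def track(samples, f_nom, kp=0.05, ki=0.001, a=0.05):
    phase, step = 0.0, 2 * math.pi * f_nom
    i_f = q_f = 0.0
    for s in samples:
        i_f += a * (s * math.cos(phase) - i_f)     # mix + LPF, I arm
        q_f += a * (-s * math.sin(phase) - q_f)    # mix + LPF, Q arm
        err = q_f if i_f >= 0 else -q_f            # phase detector
        step += ki * err                           # frequency correction
        phase += step + kp * err                   # phase correction
    return i_f, q_f

# carrier at 1/8 of the sample rate (as in the divider table), with an
# initial phase offset the loop has to pull in
sig = [math.cos(2 * math.pi * n / 8 + 0.7) for n in range(4000)]
i_f, q_f = track(sig, 1.0 / 8)
print(i_f, q_f)   # I settles near 0.5, Q is driven toward 0
```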

reading this:
http://rfdesign.com/mag/radio_practical_costas_loop/

two things are mentioned to perform carrier recovery:
    * squaring + division
    * costas loop

note that's about an analog implementation.

so not all roses in the IIR world.. what about taking the best of
both? perform carrier recovery using a mixer + PLL and use a similar
approach for the data sampling recovery.


a nice place to go back to this paper:
http://www.argreenhouse.com/society/TacCom/papers98/21_07i.pdf

where the signal is sampled using a 1-bit dac, and the mixer has
values {-1,0,1}. after integration, an adaptive rotation is performed.


Entry: simplified
Date: Mon Dec  3 16:56:50 CET 2007

 * AGC: OR together the absolute values of a symbol buffer + compute shift count
 * INTEGRATE: sum the entire buffer (no sideband rejection)
 * sample at say 8 points per symbol

since there's no filter other than the analog 450-2kHz this should
perform pretty bad. but i guess it's time for a fail-safe.. use noise
modulation first :)

NM -> AM (async) -> PM (sync)

a genuine problem doing this all experimentally is the program
sequencing.. there's a huge difference between being able to do
something per sample and having to store some for later..

EDIT: it doesn't make sense to write an AM demodulator without
thinking about the BPSK that will follow, so i need to do AM with a
separate mixer + LPF.

mixer seems straightforward. the remaining problem is the LPF. if i
can make that work with simple shifts, we're as good as there..




Entry: triangular window
Date: Mon Dec  3 17:33:12 CET 2007

however.. it is probably possible to use triangle windows and
'recompose' things later, since a triangle window is self-similar!

given a number of sample points, from this construct 2 numbers: one
weighted with ramp up, one with ramp down. these can be easily
combined, so one could shift the center of the window and recompute
easily.
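python check of the recomposition (N and block contents are arbitrary
example values): keep per block the plain sum S and the ramp sum R; a
triangle peaking at the boundary of two adjacent blocks is then R of
the first block plus N*S - R of the second.

```python
N = 8
blk_a = [3, 1, 4, 1, 5, 9, 2, 6]
blk_b = [5, 3, 5, 8, 9, 7, 9, 3]

def sums(blk):
    # plain sum and ramp-weighted sum of one block
    return sum(blk), sum(i * x for i, x in enumerate(blk))

Sa, Ra = sums(blk_a)
Sb, Rb = sums(blk_b)
tri = Ra + (N * Sb - Rb)   # recomposed triangle-windowed sum

# the same number, evaluated directly with the full triangle window
window = list(range(N)) + [N - i for i in range(N)]
direct = sum(w * x for w, x in zip(window, blk_a + blk_b))
print(tri, direct)
```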



Entry: interrupts
Date: Mon Dec  3 17:46:08 CET 2007

looks like the real question is whether or not to use interrupts. doing
this as state machines leaves too little room for block-based FIR
techniques. i'm also not very convinced about trying the AM first,
because i'm already trying to optimize the layout for that algo: i
need to go for mixer + IIR LPF, and implement AM in that framework.


Entry: data filter coefficient
Date: Mon Dec  3 18:31:07 CET 2007

The constraint is: we don't care about the delay, but attenuation
shouldn't be too big. What about this: pick the pole at half the bit
rate, and round upto the next power of 2.

  1/sqrt(2) = (1 - 2^(-p)) ^ t

EDIT: how to pick p ?

it's easier to use this approach, where we require the decay time to
be such that the response will drop below the 1/2 threshold in one
symbol time:

  (1 - 2^(-p)) ^ t < 1/2

where t is the number of samples in a symbol. this is equivalent,
since the t in the previous formula is related to half the baud rate.

if t is large (in our case it's 64), the linear term is the one that
dominates the lhs, so the above can be approximated by

  1 - t 2^(-p) < 1/2,  i.e.  t 2^(-p) > 1/2

which gives an expression:

  p = log_2 (2t)




Entry: AM vs PM
Date: Tue Dec  4 11:38:18 CET 2007

something i missed yesterday: demodulating AM with a non-properly
tuned mixer might give trouble.

no, this is not the case as long as both the I and Q components are
computed: it only gives a problem for PHASE (which will rotate on
mismatch) not AMPLITUDE.




Entry: data filter implementation
Date: Tue Dec  4 12:03:34 CET 2007


the easiest way to keep precision is to never lose any bits. the data
filter has the form:

   x = a x + (1 - a) u      (i.e.  x += 2^(-p) (u - x))

where x is state and u is input, and a = 1 - 2^(-p)
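in shift-only arithmetic the update is x += (u - x) >> p. a python
sketch, not the target code (note the truncation leaves a small DC
bias, one reason to keep the state wide):

```python
def lpf_step(x, u, p):
    # one-pole lowpass, shift only: x += (u - x) >> p
    return x + ((u - x) >> p)

p = 8
u = 1000 << 8   # input pre-scaled by 2^8 so the shift keeps fraction bits
x = 0
for _ in range(5000):
    x = lpf_step(x, u, p)
print(x >> 8)   # 999: settles just under the input (truncation bias)
```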

the current settings give t = 512 (5kHz sample rate and 9Hz symbol
rate). which means p = log_2 (1024) = 10 as the approximation of the
bound. speeding up the filter by a factor of 2 gives p = 9.

it might be worth relaxing it even further to 8, so shifts are
eliminated.

( so, just out of curiosity.. is it possible to use unsigned
  multiplication? just doing this without thinking introduces a scaled
  copy of the original modulated signal in the output. if the lowpass
  filter allows, this might be not a problem: requirements are just 2x
  as strict. )

problem with signs: it might be simpler to work completely with
unsigned values since signs make multi-byte arithmetic more
complicated (need sign extension). a simple solution is to run the
multiplication as signed (to get rid of the component at the carrier
frequency) but run the filter accumulation as unsigned. the DC
component in the result is completely predictable and can be
subtracted later.

first experiment i measure something: noise is at around 5 and maximal
measured signal is around 150. that's a significant difference. now
it's time to map the 24bit range to something more manageable.
now to be careful not to overflow the filter input: it seems
reasonable to ignore the lower byte.

looks like i have a bug in the signed 16bit multiplication
routine. EDIT: yep.. type TOSL replaced with TOSH



Entry: better debugging tools
Date: Tue Dec  4 15:07:04 CET 2007

i need a way to print reports from ram.. before this can be done in a
straightforward way, the interaction language (which will need to be
cat or scheme) need to be defined properly + some way of adding code
like this to the project needs to be defined.

what i need now is a way to inspect 24 bit numbers.. what about adding
inspectors to the code? these are forth words that send out data in
the form of a buffer. i could then make inspectors for any kind of
thing.

EDIT: yes.. i really have a good excuse to make proper debugging
tools. just fixed the prj console to be able to connect to the
target. was thinking about properly specifying interactive commands as
an 'escaped' layer over the target interaction.. basicly every
possible 'island' in the code needs to be extensible.. most
importantly: macros, interactive words, prj words, scheme code, ...

EDIT: considering the amount of time i'm losing to get this thing
going, it might be wise to standardize on a method.. i.e. all 16bit
signed fixed point or something.



Entry: double notation
Date: Tue Dec  4 16:15:02 CET 2007

there's some things to distinguish:

    1 x 1 -> 1      word (standard words)
 
    2 x 2 -> 2      _word (16bit variants of standard words used in DTC)

    1 x 1 -> 2      2word, 3word etc.. nonstandard, any combination
    1 x 2 -> 2      that makes sense
    1 x 3 -> 3
    1 -> 2
    ...


Entry: costas loop
Date: Tue Dec  4 23:09:21 CET 2007

Have a look at the HSP50210 datasheet. It gives a nice general idea
about how a PSK receiver would work: 3 tracking loops
(AGC,carrier,symbol), user selectable threshold, matched filter (RRC
or I&D), soft decisions.



Entry: saturation
Date: Wed Dec  5 12:27:13 CET 2007

It's important to be able to prevent wrap-around distortion. Some kind
of saturation mechanism might make this easiest: it's easier than
carrying around high precision data. So where to saturate? Most
straightforward is the LPF, but at first glance it's better done at
the point where power is calculated, since LPF seems to have enough
dynamic range.

The properties of a non-saturated word are:
    * the sign word is #x0000 or #xFFFF
    * both words have the same sign bit

This can be reduced to:
    sign word + lower sign byte == 0

OK
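sketch of the check in python (using a plain wide integer instead of
explicit words: the value fits in 16 bits iff everything above bit 15
is a pure sign extension; otherwise clamp):

```python
def sat16(acc):
    hi = acc >> 15          # 0 or -1 for a non-saturated value
    if hi not in (0, -1):
        return 32767 if acc > 0 else -32768
    return acc

print(sat16(1234))     # 1234
print(sat16(70000))    # 32767
print(sat16(-70000))   # -32768
```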


Entry: weird LPF output
Date: Wed Dec  5 15:42:12 CET 2007

i think i need to focus on building some more debug tools
today.. something's going wrong and i can't find the cause. the
problem is amplitude modulation in the LPF power output, going from
100 -> 400, with a period of about 35 = 140Hz. and a component at 4 x
that frequency, not locked, which is probably the carrier.

i measure this with the modulated signal, and with an unmodulated
carrier.

go one by one: it's probably best to try to eliminate the DC offset,
so at least that is not drowning the signal component, which is a lot
smaller.. EDIT: this is already happening: sample is converted to
signed then multiplied.

ok.. questions

 * why is there a 1/8 Hz component in the power output? i would expect
   the power to be smooth.. not modulated -> this is just noise. the
   level is really low, so it's the accumulation of (2^(-8) * u).

 * why is there a 1/64 Hz component in the power output?
   EDIT: the frequency is a mixer mismatch = 1/8 - 1/8', where 8' is
   the not quite =8 measured carrier frequency.

 * what does the filter input look like?

did some input signal measurement, and the first thing i notice is
that the carrier frequency is quite off. i get 28/4, i.e. T=7 instead
of T=8. which would give a beat at 1/56. that might explain a lot...

ok.. i get it. the convolution of these 2 spectra:

     |  .  |     cos(w1 t)
    |   .   |    cos(w2 t)
        0  

gives:

 |     |.|     | 1/2 [ cos(wd t) + cos(ws t) ]
        0

with wd = w2-w1 and ws = w2+w1

the sinewave that gets folded near 0 will interfere with the signal
data! so this approach just doesn't work without synchronization!       
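the identity behind those spectra, checked in python: the product of
two cosines is half the sum of difference and sum cosines, and with
T=7 vs T=8 the difference term lands at 1/56 of the sample rate:

```python
import math

w1 = 2 * math.pi / 8   # nominal mixer, T=8
w2 = 2 * math.pi / 7   # measured carrier, T=7

# cos(w1 t) cos(w2 t) = 1/2 [ cos(wd t) + cos(ws t) ]
for t in range(50):
    lhs = math.cos(w1 * t) * math.cos(w2 * t)
    rhs = 0.5 * (math.cos((w2 - w1) * t) + math.cos((w2 + w1) * t))
    assert abs(lhs - rhs) < 1e-12

beat = (w2 - w1) / (2 * math.pi)
print(beat)   # 1/7 - 1/8 = 1/56 of the sample rate
```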

it looks like the only way to do this is to either have proper
synchronization, or use a band pass filter, not a mixer.
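
the folding can be checked numerically with the product-to-sum
identity (a quick python sketch, not target code):

```python
import math

# mixing cos(w1 t) with cos(w2 t) yields components at the difference
# and sum frequencies:
#   cos(w1 t) cos(w2 t) = 1/2 cos((w2-w1) t) + 1/2 cos((w2+w1) t)
# with an off-frequency carrier the difference component folds near
# DC, right on top of the signal band.
w1 = 2 * math.pi / 8   # nominal carrier, T = 8
w2 = 2 * math.pi / 7   # measured carrier, T = 7
wd, ws = w2 - w1, w2 + w1

for t in range(200):
    product = math.cos(w1 * t) * math.cos(w2 * t)
    folded = 0.5 * math.cos(wd * t) + 0.5 * math.cos(ws * t)
    assert abs(product - folded) < 1e-12
```

note that wd comes out at 2*pi/56, matching the 1/56 beat measured above.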

AM: first order lowpass with complex coefficient, followed by output
    power computation.

PM: requires AGC or cartesian->polar conversion for properly scaled Q
    -> phase feedback.

the quick and dirty way is to just filter the absolute value of the
input. then add a more selective filter. hmm.. i still need to kick
out DC, so better go for the frequency-selective filter.



Entry: 1/8 or 1/4 frequency filter?
Date: Thu Dec  6 17:12:32 CET 2007

it's probably easier to separate the problem in 2 parts: (1 -1 ; 1 1)
with sqrt(2) amplitude, compensated by a single arbitrary
multiplication to get the gain to 1-2^(-8). this requires at least
16bit. incorporating the scaling factor in the matrix seems to lead to
the same precision problem, but requires 4 multiplications instead of
one.

so what about a 1/4 filter? that's even simpler, and doesn't require a
sqrt(2) scaling factor, so the (1-2^-8) scaling can be done without a
multiplier.

so.. the lpf filter i had before can be re-used. the only thing to add
is to cross-add the filter states, and add in the input signal.
rotating the signal can be done using a 4-state state machine, which
will add/subtract the signal to/from one of the states. +x +y -x -y

given this approach it's probably also possible to reduce the LPF
state from 24 to 16 bit. check this. in a stable regime, using 2
bytes, the high byte will have the amplitude of the input at that
frequency, so at least for strong signals it would be stable (gain =
1). looks like it only has an effect on noise and rejection.


Entry: too much carrier drift
Date: Thu Dec  6 19:16:16 CET 2007


so, to get a bit of full-circle understanding: why not mix a signal to
DC and filter its absolute value? looks like the thing i did wrong was
not the mixer, but the place where the smoothing is going on.

or: 
  * mix + filter: isolates a frequency region
  * full-wave rectify + filter.

2 filter operations are essential here, so it's probably easier to do
only one, and instead of full-wave use the amplitude/power of a
filter.


But but but... maybe the filter is actually too sharp? I measured the
carrier at 1/7Hz, expecting it at 1/8Hz.. i'm missing a parameter:
bandwidth and time decay are related, but increasing the sample rate..

look: this is just a shifted one-pole filter: it's the equivalent of
passing the difference signal 1/7-1/8 to the lowpass filter. that
probably won't survive.. so i'm stuck with the same problem: the
carrier shift is much more than the bandwidth of the signal!

it's about 80Hz at 600Hz, while the signalling frequency is around
9Hz. this means i have to do something about it... it's either going
to be manual tuning, or adaptive tuning.. synchronous demod is
starting to look like the only solution. or i should just use the 2
filter approach of above:

  * wide filter to eliminate noise: it should be wide enough to
    capture the carrier tolerance.

  * narrow filter to perform demodulation after full-wave rectify.

it's starting to look like synchronous is going to be a lot less
hassle.. again.. what do i need? an AGC to normalize the Q output such
that i can use it as feedback to phase offset.

go over this again.. something's wrong.



Entry: cordic
Date: Fri Dec  7 09:36:31 CET 2007

the most elegant solution seems to be to use a cordic I,Q->A,P
transform, so both the AGC and PLL have proper data to work on.

For use in the demodulator, the constant scaling factor is not a
problem. What I would like to do is to perform sequential updates: use
Q to update I and then use the updated I to update Q. With a = s * 2^(-n)
this amounts to:

| 1  a |   | 1  0 |    | 1 + a^2   a |
|      | * |      | =  |             |
| 0  1 |   | a  1 |    | a         1 |

Which is no longer a scaled rotation. Correcting this looks like more
hassle than just performing the update in parallel.
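
the asymmetry is visible in a two-line matrix product (python sketch;
the step size a is a hypothetical value):

```python
# sequential update = product of two shears. the result carries an
# asymmetric 1 + a^2 term, so it is not a scaled rotation, unlike a
# parallel update of the form [[1, a], [-a, 1]], which is a rotation
# scaled by sqrt(1 + a^2).
a = 2 ** -4  # hypothetical step size

def matmul2(m, n):
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

seq = matmul2([[1, a], [0, 1]], [[1, 0], [a, 1]])
assert seq == [[1 + a * a, a], [a, 1]]
```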

I don't need a lot of phase resolution. 8 bit is definitely enough.

Hmm.. Is going to be a lot of work.. 


Entry: simplified PLL
Date: Fri Dec  7 11:21:40 CET 2007

What about using a 2 bit phase detector which just detects the
quadrant and accordingly adjusts the frequency?

 -2 | -1
----+-----    
 +2 | +1

With + meaning counterclockwise.  Since we're not using the Q
component, both directions of I should be allowed, so a better
approach is:

 +1 | -1
----+-----    
 -1 | +1

Filtering this signal and using it to increment the frequency gives
the right amount of feedback near the lock. In the phase diagram, what
needs to be done is to slow down the oscillator. The design parameters
here are:

     - smoothing of the phase error
     - gain of the phase error

I'm not too sure about oscillations though.. Maybe linear error
response is an essential element?

I guess i'm missing some experience here. Gut feeling says it should
be possible to design a PLL by filtering a 2 bit phase detector. Gut
feeling also says that this will lead to oscillations.

I'm off track again. These are the choices to make:
  - go for CART->POLAR transform with high resolution (i.e. 8 bit)
  - use AGC and Q component for feedback.

The latter seems simpler. Maybe i should try that first. Cordic isn't
as straightforward as i thought since it needs a barrel shifter. Which
could be implemented using the multiplier, but then why not use proper
coefficients?

So.. AGC.

Stick to the mixer algo, but figure out how to perform variable gain
so the error signal used to drive the phase adjustment is properly
scaled.

Estimate the gain using a filtered sum of absolute value of the I and
Q components.


Entry: PLL analysis
Date: Fri Dec  7 13:32:07 CET 2007

Using linear system theory: around the error=0 point, the system is
linear and behaves like a controlled integrator. We control frequency
(velocity) and out comes phase (position) which is the integral of
frequency. Such a system with a proportional controller is stable
because it is first order with negative feedback. It can be sped up by
increasing the gain. However, faster also means more susceptible to
noise of the control signal (in the PLL case the Q signal)

This is in absence of a disturbance signal. This can be modeled by a
signal d which drives the integrator directly. In the PLL case this is
the frequency mismatch. This will result in some permanent error. The
ratio between the 2 is determined by the error amplification.
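
this settling behaviour can be simulated in a few lines (python
sketch with hypothetical gain and disturbance values):

```python
# the loop is a discrete integrator (phase accumulates frequency)
# with proportional feedback u = -k * phase and constant disturbance
# d (the frequency mismatch). it settles at a permanent error d / k:
# more gain means a smaller residual error, but also more sensitivity
# to noise on the control signal.
k, d = 0.25, 0.01     # hypothetical loop gain and frequency mismatch
phase = 1.0           # arbitrary initial phase error
for _ in range(500):
    phase += d - k * phase
assert abs(phase - d / k) < 1e-9
```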

Questions:

  * add or subtract from rx-carrier-inc ?
    -> depends on whether one wants to sync to +I or -I

  * how to prevent mixer drift?
    -> looks like the DC component of the error should not have any influence?


Entry: discrete control systems
Date: Fri Dec  7 14:00:14 CET 2007

Looks like the thing i'm confused about is the difference between
analog control systems and digitial ones. An analog 1st order
proportional control system can never overshoot, but a naive
discretization of this can!

The problem here is instability of integration methods.
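
a minimal illustration of the difference (python sketch: forward
euler discretization of dx/dt = -gain * x):

```python
# the continuous first-order decay never crosses zero, but its naive
# discretization x -= gain_dt * x overshoots as soon as gain_dt > 1,
# and diverges for gain_dt > 2.
def simulate(gain_dt, steps=50):
    x, trace = 1.0, []
    for _ in range(steps):
        x -= gain_dt * x
        trace.append(x)
    return trace

assert all(x > 0 for x in simulate(0.5))    # smooth decay, no overshoot
assert any(x < 0 for x in simulate(1.5))    # overshoots through zero
assert abs(simulate(2.5)[-1]) > 1.0         # unstable oscillation
```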



Entry: the problem with the frequency offset
Date: Fri Dec  7 14:48:07 CET 2007

i think i found it: really stupid.. first i thought it was an
oscillator problem. didn't occur to me to try with 2 different boards
to see if that's actually the case. anyways, after trying, i got
exactly the same result. looking at the code i find this:

: sample>
    16 for wait-sample next
    0 ad@
    ;

which, if the processing takes longer than 1/16 of the clock period,
is wrong of course!

the solution is to use the timer, or to perform the sampling in an
isr. let's try the postscaler. OK.

need a break.. what i'm doing wrong is to use the integral of the
error to compute the frequency.. frequency should be just F_0 - e.

after the break.. looks like i'm still making too many mistakes: of
course, if i just restart the tracker at a random point of carrier
phase, chances are that there are going to be some transient
phenomena. i just need to run it longer probably.

OK: sync works to plain carrier.


Entry: synchronization to modulated carrier
Date: Sat Dec  8 11:30:05 CET 2007

i tried the following: use the sign of I to steer the direction in
which the feedback works. works ok for clean carrier, but in full
reversal this leads to problems.

looks like a conceptual problem.

maybe the synchronizer should be slowed down? in a sense that a symbol
transition, which moves through a zero feedback point (in which the
carrier is effectively not controlled), has no noticeable effect on the
setting of the tuner, but when this transition is complete, full
feedback is in effect to pull the oscillator in sync again.


using just Q feedback, the PLL seems to stabilize around Q = -120,
with an amplitude of about 30.

say -128, that's -#x0080

#2000 -> #1F80

it's 1/64 th of the frequency, which is exactly the difference between
symbol rate and carrier: the PLL locks to another attractor..

a simple solution seems to be to limit the PLL frequency correction.

anyways, the sign stuff is necessary.



Entry: symbol synchronization
Date: Sat Dec  8 12:07:52 CET 2007

because i'm using locked synthesis and not non-synced downmixing,
the symbol synchronization can be derived from the carrier
synchronization. so maybe i should forget about syncing to the
modulated carrier?

pulling the oscillator in sync using a plain carrier might help a lot
actually. let's try a 7/1 test tone.



Entry: first packets: pll and reversals
Date: Sat Dec  8 13:25:20 CET 2007

apart from some problems related to gain (probably too much drive
which kicks the PLL out of sync: moving the things apart gives better
results.) it seems to work just fine.

looking at an I,Q plot i suspect the slow rise of the I signal is not
due to filtering, but due to loss of sync: Q gets thrown off, and the
PLL needs to re-sync. maybe it's more important to filter the error
feedback..

aha! it seems as if the PLL switches to the negative frequency
attractor. indeed. with wide spaced reversals it is clear that Q moves
from around -13 to +13

the problem is that suddenly moving from subtract to add changes
the frequency of the oscillator from bias+corr to bias-corr. how to
solve this? aha.. maybe it's not necessary to flip the sign, since a
phase reversal in the I plane doesn't change the Q component?
actually, it does. switching the sign compensation resynchronizes
the oscillator on transition to I = +A.

it looks like i need a controller with a zero error, which effectively
means a PI instead of a P controller. note that i already had an I
controller, but that's unstable.

i'm measuring an error of about #x10 / #x2000 = 0.2 % -- the spec
sheet says 0.5 % max. looks normal.

thinking about this PI controller: P + lowpass can't work, because
there is no zero-error. so i need an integrator. the problem is the
time constant / gain factor.
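
the P vs. PI difference against a constant frequency offset can be
sketched like this (python, with hypothetical gains):

```python
# with a constant disturbance d, a pure P controller settles at a
# permanent error d / k. adding an integral term drives the error to
# zero, at the cost of an extra state and possible ringing -- the
# time constant / gain trade-off mentioned above.
k, d = 0.25, 0.01     # hypothetical proportional gain and offset

def run(ki, steps=4000):
    err, integ = 1.0, 0.0
    for _ in range(steps):
        integ += ki * err             # integral state
        err += d - (k * err + integ)  # integrator plant + feedback
    return err

assert abs(run(0.0) - d / k) < 1e-9   # P only: permanent error d / k
assert abs(run(0.02)) < 1e-9          # PI: error goes to zero
```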

the error (Q) does seem to go to zero now. however, there is still a
transition at the reversal.

now that i have a zero error, it's maybe best to multiply the I and Q
to obtain the error signal? for after dinner: i'm stuck with yet
another scale problem.. fixed point without a barrel shifter is
madness.. it might have been better to just implement the tools
necessary, even if they are inefficient: it is definitely doable
(which is what i wanted to prove really..) but it's difficult.

NEXT:
    - I * Q
    - AGC

preferably combined such that I * Q and error feedback become simple.

i just saturated the error output to +-127.. i get nice results for I
amplitude around 100-150. but still: the Q component wiggles when the
phase transforms.

reading the costas-loop paper mentioned above: the 3rd multiplier is
called a phase doubler. its only point is to make +-180deg both
stable lock points.

so, i'll write up the problem below.



Entry: more questions
Date: Sat Dec  8 15:04:31 CET 2007

why does the PLL response oscillate? the analysis by linear
approximation i made above showed it was first order.. something's
wrong there.


Entry: generic lowpass filters
Date: Sat Dec  8 16:24:02 CET 2007

it's no longer manageable to have these special 1 * 2^(-8) filters.. i
need a special purpose 16bit lowpass filter, with saturation,
operating on proper 16 bit signed values, with possibly 8 bit
coefficients in a decent range.

it looks like there's plenty of room to do it in a proper
object-oriented fashion.

not doing it in proper object-oriented fashion, but a macro operating
on 4-byte state: 3 byte signed filter state, and 1 byte unsigned
filter coefficient: .00AA
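
a behavioral sketch of such a filter (python; this models the idea
with a = A/256 and an 8-bit shift -- the exact .00AA scaling of the
purrr18 macro differs):

```python
# one-pole lowpass y += a * (x - y) in fixed point: a wide signed
# accumulator holds y scaled by 256, the coefficient is a single
# unsigned byte A, so a = A / 256. the 16-bit output view is
# saturated.
def lpf_step(state, x, A):
    y = state >> 8                           # current output view
    state += A * (x - y)                     # update scaled accumulator
    y = max(-32768, min(32767, state >> 8))  # saturated 16-bit output
    return state, y

state = 0
for _ in range(200):
    state, y = lpf_step(state, 1000, 64)     # a = 64/256 = 0.25
assert y == 1000                             # settles at the DC input
```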





Entry: AGC
Date: Sat Dec  8 22:24:17 CET 2007

it's not so straightforward, since it needs a division
operation.. currently, with the multiplication doubler (also with the
sign doubler) locking seems to work fine around 100-150 amplitude.



Entry: lock problems on transition
Date: Sat Dec  8 22:40:44 CET 2007

i still get the same problem: on transition, the phase is messed up
again. maybe the oscillator phase should rotate too? i'm
confused... at the point where the I component goes into transition,
the Q component gets kicked off.

the integrating controller works well: error goes to zero
eventually. i just need to figure out why the phase bumps..


something strange tho..  the Q spike only happens on a +1 -> -1
transition. the -1 -> +1 transition is clean. this smells like some
kind of wrap around bug..

sending #x01 bytes instead of #x11 bytes seems to contradict this:
spike on every transition.



Entry: emergency solution: AM
Date: Sun Dec  9 00:02:13 CET 2007

tomorrow it looks like the best thing to start with is gain control,
to find an optimal feedback coefficient for the PLL. once this works i
can try to find a bitrate that works with the phase error still
happening. then i could hand it over and try to fix the
sync/transition error.

normalization:
  * agc (division + filtering)
  * arctangent

previous conclusion about cordic arctangent was that it's hard to do
without a barrel shifter.. i can probably unroll most of this by using
the multiplier and double buffering.

good thing is that this can be used for AM also, without the need for
quadratics.

EDIT:
actually, i should really just do AM by measuring the power. the
previous error (large carrier mismatch) is solved.



Entry: articles
Date: Sun Dec  9 09:15:49 CET 2007


R De Buda "Coherent Demodulation of Frequency-Shift Keying with Low
Deviation Ratio" -- IEEE Transactions, 1972, COM-20 pp 429-435.

S Pasupathy, "Minimum Shift Keying: A Spectrally Efficient Modulation"
-- IEEE Communications Society Magazine, July 1979, Vol 17, pp 14-22.



Entry: AM
Date: Sun Dec  9 10:13:57 CET 2007

i got very nice reception it looks like.

what about the following algorithm:
  * set threshold to an estimate of the noise threshold (say 50)
  * wait until something comes in: interpret it as a start bit
  * find the max amplitude during the start bit
  * start sampling 9 bytes, by waiting for half a symbol length, and
  compare to half the dynamic threshold


looking at some sampled data of #x55 + start and stop bits, which is
01010101, with 0 = ON, 1 = OFF, it seems that putting the threshold at
half is not a good idea.. also, the time it takes to get from
crossing the noise threshold (50) to the peak of the start bit is
exactly the symbol length.

maybe it should be compared with a lowpass envelope?

tried this, but looks like LPF delay is going to be a
problem. however, it should be possible to keep the same filter, but
perform the comparison with delayed versions?

another possibility is to just save the sample points, and perform the
filtering at a later stage.

or.. it could just be compared to the previous sample point? if lower
it's the reverse? that will probably work just fine: this might give a
problem for stable 0 or 1..

next algo:

   * start sampling s - s_0 after detecting a start bit. s_0 = rise
     time to threshold level.

   * collect 10 samples.

   * postprocess


what i'm doing: s_0 = 0, and watching the output of the sampling with
a #x55 byte. it looks pretty decent. now trying the number station.

next approach: 

   * compare with previous (differentiate)
   * maybe hysteresis? 

differentiate is no good.

i'm probably fighting something else.. maybe the data rate is just too
fast? i had to move from 512 samples to 256.. so looks like something
else is going on..

what about this: change the special purpose lowpass filter so it takes
16 bit coefficients, and then reduce the filter pole a bit.


Entry: confused
Date: Sun Dec  9 16:22:43 CET 2007

let's see.. there's something wrong with my symbol rate. i thought it
was 512 samples, but it's 256. corrected for this, i can receive
signals. however, it seems the bandwidth is mismatched. so i have 2
calculations that are probably erroneous:

  * necessary bandwidth -> filter coeff
  * symbol rate at the transmitter

doing some manual experiments, i got the filter pole fixed at #x0100,
which gives very nice waveforms. making it bigger only increases the
noise, but doesn't seem to influence the shape too much.

so everything looks pretty good, but it seems there is too much
inter-symbol interference due to the asymmetry of the receive
filter. i could try to hack around this by doubling each bit, but
keeping the envelope constant.


got now:
    * halved symbol rate (transition + stable)
    * 3 x bandwidth (100 -> 300)

now at least the filter makes it roundtrip from 0 to max amp.

now try to subtract startbit from each bit, then use sign.

this works!

reception seems quite robust. at least when it's not receiving
bogus. so i need a way to eliminate the worst kinds of noise, which
are transients that trigger a start bit. these could then be used as
human input.

it's not very robust tho.. probably i need to compute the maximum, and
use half of that (or less..) as a threshold.

  looks like this is relatively robust:
        threshold = 1/4 of maximal power
        translated to amplitude this is 1/2
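
the detection rule boils down to (python sketch; a hypothetical model
of the thresholding, not the target code):

```python
# slice a bit by comparing its filtered power against 1/4 of the
# maximal power seen during the start bit -- in amplitude terms that
# is a threshold at 1/2. returns 1 when the carrier is present.
def slice_bit(power, start_bit_power):
    return 1 if power > start_bit_power // 4 else 0

assert slice_bit(90, 100) == 1    # well above threshold: carrier on
assert slice_bit(10, 100) == 0    # below threshold: carrier off
assert slice_bit(25, 100) == 0    # the boundary counts as off
```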


Entry: next
Date: Sun Dec  9 18:32:53 CET 2007

i think it's ok to forget about synchronous stuff for a while... also
speeding it up is for later maybe. what i need first is:

  * extra stop bit to eliminate transients
  * cleanup code for blocking send & receive.


Entry: krikit -> reflections on Forth and DSP
Date: Sun Dec  9 22:52:16 CET 2007

looking at the code i write, it is full of global variables (temporary
storage for multiple fanout). and inlined early-bound math ops,
operating directly on memory instead of the stack. also macros that
unfold to criss-cross variable access are much more useful here than
compositional forth.

the problem with DSP is that speed matters, and it's easy to get to
order of magnitude savings by early binding. so macros are
important. algorithms are often not terribly complicated. the stress
is more on mapping things to hardware.

now, i realize i'm stretching it trying to do DSP on a PIC18. it
misses essential elements like a barrel shifter, large accumulator,
and rich addressing modes. these things REALLY make a huge
difference. but, keeping data in memory (registers) makes things
relatively fast on a PIC18.

if the specs are clear (if the algorithm doesn't change)
implementation can be straightforward, though a manual endeavor. but,
as i've learned, experimentation REQUIRES more highlevel
constructs. i lost too much time and energy in mapping to hardware
before things actually worked.

which leads me to the following strategy: if experimentation on the
hardware is essential, experiment on hardware that is 10x faster, or use
data rates 10x slower such that high level abstractions can be
used. for purrr on the PIC18 and 16/24 bit DSP operations this means:
USE A DTC FORTH! when it's done, core routines can be done in purrr18
or in machine language. what i need is:

  * confidence that 10x speedup is possible
  * confidence that slowing down ACTUALLY WORKS
  * patience and discipline to get it to work FIRST and THEN speed it up  

what i missed in this project is the availability of an easier to use
16bit forth, and a policy for doing fixed point math. the former would
have made the latter easier to use.

and second, it is probably a good idea to start looking for a dataflow
language: one that 

  * automates the allocation of temporary buffers (variables).
  * enables abstract boxes (made of networks of abstract boxes)
  * automates iterated boxes (+ possible 'folding')
  * separates registers from functions (all feedback = explicit)
  * frp vs. static sequencing ?

so i'm not so sure anymore if forth is really useful for the
dsPIC. maybe in the sense that it should map to the 16bit arch just
like purrr maps to the 8bit arch, but leave the dsp-ishness alone:
provide only an assembler.



Entry: local names
Date: Sun Dec  9 23:15:27 CET 2007

which brings me to macros and local variables.. i'm using the wrong
tool for the job: i can't bind new names to old ones, like in
scheme. for example:

: bla state |
    state 3 +

...

the 'state 3 +' can't be bound to a single new name. this really
screams for a new language syntax and semantics. or at least enable
local macro definitions (there's no real reason why not..)

: bla state |
    : foo  state 1 + ;
    : bar  state 3 + ;

    foo @ bar ! ;

but.. that's getting ugly. what i want here is some form of
pre-scheme. downward closures.

(bla : state |
   (foo : state 1 +)
   (bar : state 3 +)

   foo @ bar !)

another thing: when allowing local variables, it makes more sense to
put them in front of the name, to correspond better to how they are
used.

  (square | dup *)
  (x square | x x *)

and i need to figure out how to solve the anonymous function
problem.. i.e. 'define' vs 'lambda'... these are conflicting.. what
about using

   : in a context that requires a named definition. i.e. a global
     definition or a local let

   | in a context that requires an anonymous definition, i.e. the
     argument to ifte. (| ...) is then equivalent to (...)

  (x square : x x *) vs (x | x x *)
  



a function definition can then be something like

  (a b c superword :
     (e : a 1 +)
     (f : b 1 +)

     a b + e +)

where local definitions are possible at the beginning of a definition



Entry: dtc forth
Date: Mon Dec 10 14:21:35 CET 2007

a unified memory model is not so hard to implement efficiently. but, a
point that could make a huge difference is to use this memory model
inside the interpreter. a trade-off between speed and flexibility. i
can imagine it being interesting to be able to test code in ram before
flashing it.. at the least, the option should be kept open.


Entry: RGB led
Date: Mon Dec 10 17:57:57 CET 2007

trying to figure out where to put the LED. 

  * all connected to analog ports
  * one extra digital connector with 220R resistor
  * all connected to pins, so they can be reverse biased for light detection

pinouts: (common anode)


 |
 ||
||||
4321

 4       3       2    1
 |   R   |   B   |    |
 o--|<|--o--|>|--o    |
         |            |
         o--|>|-------o
             G

the region that's free is between 21 and 26. the anode needs to be
connected to a pin that can be switched to analog. on the board the
best option here is 23/AN8. leaving pin RB0/INT0 free for debug net
might be a good idea. AN9-10-11 are then all digital to control the
LED cathodes, AN8 is analog to tolerate the analog voltage. this also
won't conflict with the necessary digital outputs already on the
board. the anode resistor could go to.

the RGB led is connected like this:

26 RB5      o--[220]---o
                       |
25 RB4      o-----o    |
                  |    |  
24 RB3      o--o  G    |
               B  |    |
23 RB2/AN8  o--o--o----o
               R
22 RB1      o--o



Entry: dsp language
Date: Tue Dec 11 09:33:07 CET 2007

what is necessary? i could take the PD sound processing as a model.

  * box = primitive | composite

  * composite = box + interconnect

  * things should be parametrizable in grids (from which an iteration
    structure is defined)

  * can we have lexical scope?

  * don't force serialization

  * don't force naming of intermediates, but don't restrict it
    either. (box combinators)

  * allow scheme (expression trees) to be a subset of the
language. the extension is no more than a way to abstract 'parallel
    scheme'.

it would be nice not to go too far away from lambda abstractions. the
problem is multiple outputs. these could be multiple functions. so
what about common subexpressions? keep it manual for now..

maybe use scheme-like syntax based on 'values' but called
'output'. the latter will be more general than values: it can be
re-arranged in time. it's an essential observation.

not forcing the naming of intermediates can be problematic, since it's
the whole point: dsp code is very graph-like, and naming is more
efficient for this.. it looks like naming IS essential.

brings me to composition: a new box consists of 'node' sections which
name nodes. 'lambda' could be replaced with 'in' since it will name
the external inputs. all other nodes have to be named. 'not forcing
naming' can be implemented by special purpose box combinators.

nodes are different from locally created 'specialized' boxes.

names can be replaced by box expressions if they are tree-like (return
a single value) otherwise they need to be named in a 'node'. similarly
'out' can be discarded in a definition. this allows the use and mixing
of scheme functions.

(in (a b c)  ;; 'in' is the parallel equiv of 'lambda'
   (box (mula (x) (* a x)))  ;; create local specialized box (like 'define')
   (box (mulb (x) (* b x)))

   (nodes                    ;; naming intermediates
     ((q r) (div/mod a b))

     (out
      (+ (mula c) (mulb c))
      (- (mula c) (mulb c)))))


so, concretely:

  'in'    is like 'lambda' but it has parallel outputs
  'nodes' is like 'let-values' 
  'box'   is like a local 'define'
  'out'   is like 'values' but defines parallel outputs


so the principles:

  1. the ONLY point of the language is to extend the many->one lambda
     calculus that can create expression TREES to something that can
     create expression GRAPHS.

  2. it is important that the lambda calculus is a subset which uses
     it's original lisp tree notation.
      * 'out' is redundant for single outputs
      * intermediates from single output boxes do not need to be named

i'd like to extend this to grid processing: systolic arrays etc: box
compositions that connect boxes in several dimensions, such that
iterators can be derived from a highlevel description.



Entry: driving led
Date: Tue Dec 11 11:17:14 CET 2007

driving the led during reception is going to happen at 5kHz, which
when using PWM is probably going to be too little. say 256 steps gives
about 20Hz. so what about using SD modulation? i wanted to try this
for a while, maybe now is the time.

yup. works like a charm. since red is less bright, i give it a double
time slot, which leads to a 4 phase state machine.

at receive sample rate there's some noticable flicker at low
intensity, at about 5Hz. it's easy to avoid by introducing a minimum
of 5 or 6 as color values.



Entry: more state machines
Date: Tue Dec 11 12:27:35 CET 2007

the send and receive functionality should also be implemented as state
machines. or.. stick to a single application thread, and run the other
state machines from the blocking operations? maybe that's easiest.

sending and receiving are mutually exclusive. currently there's only
the LED that works in parallel.



Entry: rx/tx interference
Date: Tue Dec 11 16:50:53 CET 2007

there seems to be interference with driving the led and reception. i
added "red blink" in the demo app whenever there is a bad reception,
however, this seems to completely throw it off..  (edit: not the led
but tx)

so i need to add some pauses probably. which brings me to: there is no
generic pause word, so i'm going to use just a double 0 for loop.

the interference seemed to be due to the absence of 'ramp-off' :
before switching to rx-mode the speaker was still being driven. i
added those and some pause, now it seems to work.


Entry: project scheme extensions
Date: Wed Dec 12 09:38:21 CET 2007

i need to move away from loading scheme extensions as individual
macros, but towards associating them to a project. they are
different. the distinction to make is:

 * macros from forth code: incremental, can be redefined
 * brood extensions: fixed per project

this of course leaves in the dark brood extensions as libraries.. it's
a hodge podge. what i could try is to keep the target namespace
management intact: typical forth style shadowing for both words and
macros and allow it to call scheme code.

what about a unified dictionary:
      * macros stored as symbolic code
      * ram addresses stored as macros

+ macros are allowed to postpone expansion if they reduce to single
constants?

it looks like the seed of the plan is there: it's simple and i can't
see any problems. the main difficulty lies in the difference between
the way the cat namespace works (declarative: no re-definition, all
names defined at once) and the purrr one (shadowing, incremental)



Entry: TODO list cleanup
Date: Wed Dec 12 09:55:06 CET 2007

DONE:

* fix the assembler: i'm running into word overflows, code is getting
  too big. maybe use a trick: whenever a word overflows, just add some
  new code after the code chunk, jump to there, and have that chunk
  jump to the original word with a far jump. as a quick fix: at least
print the names of offending symbols so they can be manually patched
to long jumps.

* switch the assembler to a mutating algo so proper jump graph opti
  can be performed easily. i see no point for pure algos there.. asm
  is a black box anyway.


IMPOSSIBLE:

* if 'invoke' is a macro anyway, why not combine it with execute/b ?

  ANSWER: it's awkward to set the return stack to the word after
          invoke without using a call. that call might as well be
          execute/b

* nibble buffer is not interrupt-safe: the R/W thing is
  shared.. probably need separate R/W pointers! (FIXED)


REMARKS:

* make it possible for a macro to create a variable. more specifically:
  make it possible to create any couple of words and variables
  together. (this means a macro can create a macro.. probably means
  re-introducing some reflection).

  if the macro dictionary is merely a cache of a linear dictionary,
  with the linear dictionary containing macros, this kind of
  reflection should be possible to introduce without the disadvantage
  there was before: mutation in the dictionary hash.. there would only
  be shadowing, and 'mark' could handle macros too. syncing cache
  means (lazily) recompiling the macro cache.


Entry: mzscheme slow text
Date: Wed Dec 12 10:34:26 CET 2007

i just tried:

(define (all)
  (define stuff '())
  (let next ()
    (let 
      ((c (read-char)))
      (if (eof-object? c)
          (reverse! stuff)
          (begin
            (set! stuff (cons c stuff))
            (next))))))


(printf "~s" (length (all)))


tom@del:~$ time bash -c 'cat ~/brood/doc/ramblings.txt | mzscheme -r /tmp/text.ss'
606700
real	0m0.332s
user	0m0.319s
sys	0m0.012s

so it's at least not read-char..

maybe i need to write a fast tokenizer for forth using just read-char
instead of the yacc clone from mzscheme? probably the same goes for
sweb.

tokenizer has 3 states:
    * whitespace
    * comment
    * word

easy enough to just do manually.
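
a minimal model of the 3-state tokenizer (python sketch; handles only
backslash line comments, and no source locations yet):

```python
# three states: skipping whitespace, skipping a comment, collecting a
# word. only character-at-a-time reads, as a model for the read-char
# version.
def tokenize(text):
    tokens, word, in_comment = [], "", False
    for c in text:
        if in_comment:
            if c == "\n":
                in_comment = False       # comment runs to end of line
        elif c == "\\":                  # forth line comment
            in_comment = True
        elif c.isspace():
            if word:                     # whitespace terminates a word
                tokens.append(word)
                word = ""
        else:
            word += c
    if word:
        tokens.append(word)
    return tokens

assert tokenize(": bla state | \\ comment\n state 3 + ;") == \
    [":", "bla", "state", "|", "state", "3", "+", ";"]
```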

it could be implemented as a 'read-syntax' word which adds source
location information to the symbols and comments read. a syntax-reader
is essential since they can be pluggen into the module loader system.



Entry: incremental static binding
Date: Wed Dec 12 12:40:22 CET 2007

about static binding, redefine and linear dictionaries: it's better to
have something that is predictable, but a bit rigid, than something
that's flexible but harder to use.

what i mean is redefining lowlevel words: it's possible to do so, but
dependency management then becomes manual. the rule is: later code can
never change bindings in earlier code, but it can redefine behaviour
for future code. this is dirty, but the simplicity is very manageable
and it allows for predictable hacks. the only real alternative is a
proper dependency management system and name space isolation. david
and goliath.


Entry: sane conditionals
Date: Wed Dec 12 16:15:42 CET 2007

time to give up on the crappy >? constructs.


Entry: conditional optimization
Date: Wed Dec 12 16:28:22 CET 2007

what i need is a way to optimize away a conversion from flags ->
number -> flags, but without hindering the construction of proper flag
bytes.

the macros like '=?' can still be used as optimizations that need to
combine with 'if' immediately, but the others should definitely
produce flag bytes.


Entry: >z
Date: Wed Dec 12 17:29:29 CET 2007

i wonder why i'm not just using flag>c instead of >z.. since carry
flag is unaffected by drop. maybe to save carry flag some places?

0 -> carry = 0
any other -> carry = 1

that's just "255 + drop"

well.. it doesn't matter so much in that it's never inlined.
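quick check of the "255 +" trick in 8-bit arithmetic (python sketch):

```python
# "255 +" maps 0 -> carry 0 and any nonzero byte -> carry 1,
# because n + 255 >= 256 exactly when n > 0.
def flag_to_carry(n):
    assert 0 <= n <= 255
    return (n + 255) >> 8       # the carry out of an 8-bit add

assert flag_to_carry(0) == 0
assert all(flag_to_carry(n) == 1 for n in range(1, 256))
```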


Entry: then opti
Date: Wed Dec 12 17:46:02 CET 2007

it looks like this is mostly broken. maybe since the introduction of
'drop save' elimination. i see that "z? if drop 123 ; then" doesn't
eliminate to one instruction.. see 'swapbra' and extend it to other
conditional execution macros.



Entry: dtc primitives
Date: Thu Dec 13 10:54:01 CET 2007

towards a standard forth.

 1. get it to crosscompile
 2. write a kernel in itself

the important thing to note about the implementation is that it is
concatenative: there are no 'parsing codes', meaning, there is no
lookahead.

  * every word is an instruction
  * 'return' is marked by a bit

as a consequence, each word has only 14 bits of payload. two bits are
reserved to distinguish between data and code, and implement the
return instruction.

now the criticism: maybe it's best to ditch the return bit, since it
limits the addressable memory. with 14 bits only 16k words can be
addressed. it also prevents easy access to primitives by just reading
them from code.

i'm not sure where this can bite, but using the LSB as tag bit
(0=data, 1=code) and making execute ignore the tag bit allows the use
of 15 bit numbers, which can represent addresses.
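the LSB tagging could look like this (python sketch, hypothetical
encoding, just to see the bit budget):

```python
# LSB tag bit: 0 = data, 1 = code. execute ignores the tag, so a
# cell carries a 15 bit payload, enough to represent addresses.
def tag_code(addr):    return (addr << 1) | 1
def tag_data(n):       return (n & 0x7FFF) << 1
def is_code(cell):     return cell & 1
def untag(cell):       return cell >> 1    # drop the tag bit

w = tag_code(0x1234)
assert is_code(w) and untag(w) == 0x1234
d = tag_data(42)
assert not is_code(d) and untag(d) == 42
```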

maybe it's not such a good idea.. i'm a bit uncomfortable with not
having 16 bit width.

the trade-off needs some statistics. a return bit only makes sense if
the words are expected to be short. padding is an option, but awkward,
since every label needs to be prepended by a nop if it's not aligned.

rebuttal: tail recursion. this is the thing that's handled with the
return bit.. i forget a lot of thought already went into this
thing. tail recursion justifies the inconvenience of handling the
extra bit.

remark: a tagged data system can be built on top of this forth. i'm
not comfortable with giving up a 16 bit data/return stack in favour of a
14 or 15 bit tagged system.


Entry: signed/unsigned comparisons
Date: Thu Dec 13 12:45:45 CET 2007

two issues. are they the same or not, and what should the default be?

they are not the same: 

     pos neg >
       * always true in signed
       * always false in unsigned

unsigned: carry
signed: sign of result (might overflow)

it's a bit silly, but i think it's time i admit i don't fully
understand it.. carry in addition is simple. carry in subtraction is
also not so difficult, since subtraction is addition with negative.

a carry on addition means overflow: the word's not big enough. simple.

but what is a carry on subtraction? let's isolate some cases.

            result carry sign overflow
10 3 -        7      1     0    0
3 10 -       -7      0     1    0

100 -100 -  -56      0     1    1
-100 100 -   56      1     0    1


http://en.wikipedia.org/wiki/Overflow_flag

  The overflow flag is usually computed as the xor of the carry into
  the sign bit and the carry out of the sign bit.

In other words: addition adds one extra bit to the representation. In
order to not have overflow, for unsigned addition/subtraction this bit
needs to be 0, and for signed addition/subtraction this needs to be
the same as the sign bit.

So, for a signed comparison, take the sign bit of the result, and
assume there is no overflow. For unsigned take the carry bit.
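the table above can be reproduced with an 8-bit model (python sketch;
carry is PIC-style, 1 = no borrow):

```python
# 8-bit subtract a - b. overflow means the true signed result does
# not fit in 8 bits (equivalent to the xor-of-carries rule above).
def sub8(a, b):
    sa = ((a + 128) % 256) - 128                 # signed view of a
    sb = ((b + 128) % 256) - 128                 # signed view of b
    r = (a - b) & 0xFF
    carry = 0 if (a & 0xFF) < (b & 0xFF) else 1  # 1 = no borrow
    sign = r >> 7
    overflow = 0 if -128 <= sa - sb <= 127 else 1
    signed = r - 256 if sign else r
    return signed, carry, sign, overflow

assert sub8(10, 3) == (7, 1, 0, 0)
assert sub8(3, 10) == (-7, 0, 1, 0)
assert sub8(100, -100) == (-56, 0, 1, 1)
assert sub8(-100, 100) == (56, 1, 0, 1)
```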



Entry: dtc remarks
Date: Thu Dec 13 14:34:51 CET 2007


 * size or speed? in the end it should run on CATkit, which has little
   flash memory, so i should really go for size.

 * FOR..NEXT is not standard, so i can just make something up?


can't get for..next going.. debugging return stack stuff is
hard. wanted to have a quiet simple puzzle day, but it requires 'real
work' :)

about size vs speed. the primitives need to be fast, so they can be
used in STC code with the VM eliminated, but the VM needs to be
SIMPLE. the return stack really should contain the same stuff as can
be found in straight line code.

i'm going to eliminate some macros. hmm.. too much thinking because
it's already too much optimized.. i find it difficult to throw this
kind of stuff away.

what to optimize:
  * inner interpreter loop
  * maybe math primitives (used elsewhere)

not so important:
  * enter/leave + RS (once per highlevel word)


Entry: eForth / tail recursion + concatenative VM
Date: Thu Dec 13 16:05:24 CET 2007

why is not optimizing so difficult? i see factors of ten everywhere..

the vm-core.f i have is nice, but i'm still quite stuck at trying to
solve multiple problems at the same time:
  * interoperability between STC and DTC: both primitives and brood
  * tail recursion

it needs to be simplified a lot.. in the same way that PF needs to be
simplified to get to a proper VM architecture: it's the same problem.

i can do with primitives what i want, but all CONTROL FLOW needs to be
based on 2 simple instructions: _run and _?run - the duals _execute
and _?execute are only for primitives.

so what's the definition of _run, such that it can be turned into a
jump..

IMPORTANT:

   conditional run is not the same as conditional branch..  this
   points to an inconsistency: things that JUMP are incompatible with
   the exit bit.


another problem is that 'immediate' won't work: no compile time
execution: a simplified forth. can i have a macro mode? before i can
implement these i really need to take a look at putting back
incremental extension in the language, this time without implementing
it using mutation.. (it starts to look like this cutting of the
reflective wire was a really bad idea..)


Entry: macro code concatenation
Date: Thu Dec 13 19:20:09 CET 2007

what i'd like is to postpone expansion of constants until assembly. but,
i can't influence the meta functions from forth code.. this is another
one of those arbitrary complications.

what about:
  - putting macros in the project dictionary
  - by default, they are expanded
  - when present in data positions, they are evaluated

i can't see a reason why this wouldn't work. the only concern is
stability: each invocation needs to reduce. i.e. '+' in meta dict is
special because it's different from the '+' in macros (the latter
can expand to symbolic code containing '+')

  the problem i'm trying to solve is to get a minimal symbolic
  representation of things that are constants by delaying their
  evaluation, or by somehow recombining?

i.e.: if there is a macro

  : foo 1 + ;

i want the code "123 foo foo" to expand to the machine code:

  (qw (123 foo foo))

instead of

  (qw (123 1 + 1 +))

the thing that decides what to do here is '+' but can this decision
somehow be transformed to the point where 'foo' executes? if every
macro inspects its result, and if the result is ONLY the combination
of constants->constants, this combination can be made symbolic, since
it can be re-computed at assembly time.

i.e.

   (qw a) (qw b) foo -> (qw (c d e f))

can be replaced by

   (qw (a b foo))

because probably "c d e f" is not going to be very helpful to
understand where the constant came from.

this would enable the unification of:
   * constants
   * variables
   * macros
   * meta words
   * host code

does the subset of these macros need to be explicitly defined?
probably not. they are just macros, and qualify if they map qw's to
qw's.


Entry: partial reduction
Date: Fri Dec 14 10:16:22 CET 2007

maybe macros should be made greedy, such that when completely expanded
they reduce. what i mean is that "1 2 +" -> (movlw 3) but "abc 1 +" ->
(movlw (abc 1 +)). combined with the mechanism described above, this
could be the key to unification.
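a sketch of that greedy reduction (python, with a made-up stack
representation instead of qw atoms):

```python
# greedy macro reduction: fold "+" when both operands are literal
# numbers, otherwise leave a symbolic expression for assembly time.
def plus(stack):
    b, a = stack.pop(), stack.pop()
    if isinstance(a, int) and isinstance(b, int):
        stack.append(a + b)              # "1 2 +" reduces to 3
    else:
        stack.append((a, b, '+'))        # "abc 1 +" stays symbolic
    return stack

assert plus([1, 2]) == [3]
assert plus(['abc', 1]) == [('abc', 1, '+')]
```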

as a result, macros will be the only evaluation mechanism, which just
need to be provided with a symbol lookup. there are 2 phases of macro
execution:
  
  - phase 1:  compile to literals + instructions, names symbolic
  - phase 2:  compute literal values using resolved names


it looks like making the effect of 'meta' into a local effect is the
way to go. it would be nice to find a way to fix the 'postponing'
operation first, so at least generated assembly code looks nice.


Entry: meshy finished?
Date: Fri Dec 14 15:59:31 CET 2007

looks like we're at the end. got 8 devices talking to each other. so
time to make a "what learned?" section..


  * for DSP, use a dsPIC instead of a PIC chip, OR write a highlevel
    (but slow) set of primitives on PIC. i spent too much time in
    writing "fast" code that eventually didn't get used, or
    extensively modified to destroy the optimizations.

    DSP apps have the property that a lot of the code volume needs to
    be fast, which screams for a SEPARATE algorithm design and
    implementation/optimization phase. the problem here is on-target
    debugging. as long as the app scales time-wise (rate reduction
    without changing other variables) optimization can be postponed.

  * get it to work FAST, and start with the most difficult part, even
    if it means dirty hacked up proof of concept, then incrementally
    improve while keeping it working. don't spend time on things that
    solve needs that are not immediate if there are other immediate
    needs.

       - debug network: eventually didn't get used
       - the hardware layer: it delayed everything else

    the mistakes had quite severe consequences in the end. i could
    have gained 2 weeks by not making the debug network.

    the causes of the mistakes seem to be 

       - mismatch in skill (no analog electronics hands-on experience,
         and dusty theoretical understanding) but mostly misplaced
         confidence in non-tested skill.

       - underestimation of importance of debugging.

  * debugging deserves its own bullet. ironically, i lost a lot of
    time building a debugging tool. building that tool was a good
    idea, but i forgot a couple of steps:
       
       - underestimated the difficulty in getting the debug net
         working properly. this actually required an intermediate
         debugging phase to monitor the behaviour of both send and
         receive. i didn't anticipate these problems, which was a
         mistake. lesson to learn is to never underestimate the
         problems that can arise, even if the application seems really
         trivial.

       - doing high-bandwidth work (DSP) requires high-bandwidth
         debugging tools or at least a large storage space on chip for
         traces and logs. a solution here would be to make a separate
         circuit only for logging, or use a high-bandwidth host
         connection. an example could be a circuit that records to
         a flash card, or a USB connection to host.

       - need better host side software extension system for
         special-purpose debugging tools. it should be the same as the
         way the host system is written, so that tools can be moved
         into the main distro when polished. to make this easier, the
         number of extensible points needs to be limited such that
         they are better accessible. i.e. the consoles need to be
         programmable.



so, to summarize:


     DESIGN then IMPLEMENT

     don't optimize and design at the same time if there is a lot of
     opportunity for optimization (i.e. DSP app on PIC18 where an
     order of magnitude of speed gain is easy to find). as long as
     time-critical cores are small, this is ok, but when the core is
     all there is, you need to get it to work first using a highlevel
     approach, and ONLY THEN make it fast.


     ELECTRONICS is DEBUGGING

     do not underestimate the difficulty of getting something right in
     reality, even if the logical model is trivial. programming
     problems seem to be about managing complexity, while electronics
     problems are about managing external influences, non-ideal
     behaviour, and tons of exceptions and hacks. these are entirely
     different. programming = abstraction, electronics = debugging.


Entry: meshy presentation -- technical
Date: Fri Dec 14 16:54:00 CET 2007

hardware

  goal = as simple as possible
     - 40mm speaker used as mic
     - input:  2 opamp mic bandpass amplifier + 8 bit A/D
     - output: switching transistor (PWM)
     - PIC18 @ 10 MIPS
        - prototype uses large chip (64kb - 4kb - 28 pin PDIP)
        - possible to downscale a lot (8kb - 256b - 18 pins SMD)
     - RGB led (single resistor, S/D alternated pulsed)
     
lowlevel software

  - purrr
     - Forth dialect
     - simple but powerful
     - bare metal vs. abstraction mechanisms
     - interactive (debugging!)
     - bottom up programming
     - metaprogramming (scheme)
     - emphasis on debugging

  - sound modulation:
     - OOK  (on-off keying)
     - BPSK (binary phase shift keying)
     - 10 baud framed bytes: 1 start, 8 data, 2 stop
     - 610Hz carrier (speaker reso)
     - speaker driven with 7 bit PWM @ 78kHz

  - demodulator
     - input sampled at 5kHz
     - downmixer (cross modulator) + lowpass filter
     - OOK: asynchronous, power detect
     - BPSK: synchronous costas loop


Entry: simplex LEDs
Date: Sat Dec 15 11:24:21 CET 2007

the most efficient way (wire-wise) to connect a bunch of LEDs is to
place them on the midpoints of simplexes, where you connect the
simplex points to +/- drive points: this makes it possible to switch
on 1 hop vertices, but 2 or more hop vertices stay off since they will
not reach threshold voltage. this structure is also called a "complete
graph".

http://mathworld.wolfram.com/Simplex.html
http://mathworld.wolfram.com/CompleteGraph.html

mapping this to a 2D or 3D structure in a nice symmetric way is not
that trivial. however, the most symmetrical planar arrangement is:

place the points in a circle. if the number of points N is odd, you
get (N-1)/2 concentric circles each containing N points, with a
criss-cross network below it. even works similarly, only one of the
circles has half the elements.

this structure can be wrapped around half a sphere. wrapping it around
a full sphere gives easy access to the control points, and gives a
spherical or cylindrical structure.

the coverage grows ~ n^2 so taking more points is relatively more
efficient. however, overall connection might get too complicated. a
different approach is to take some kind of 'primitive circle' which
can be unfolded in a line, for example the pentagram with 10
LEDs. transporting then could be done using a bus. i.e. a ribbon
cable. maybe it's possible to use a ribbon cable with pins?

using a linear solution, it might be possible to make something that
is composable. i.e. take an N solution, add a wire and some N
primitives and make an N+1 solution.

this turns out to be just cyclic permutations. for example, starting
with the 2-terminal primitive L2, it can be extended to a 3-terminal
primitive L3 by means of the primitive 3-permutation P3, and adding an
extra wire to P2, so:

L3 = L2 P3 L2 P3 L2 P3 = (L2 P3)^3
L4 = (L3 P4)^4

   
   in general: L_N = (L_{N-1} P_{N}) ^ N

this is probably a lot easier to do than networking, since it's basically
braiding. a linear projection is easy to control, but i'm not sure if
it's really a good approach for construction.. if i find an easy way
to solve the permutation problem, then yes, it's a good thing.
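sanity check on the permutation part (python sketch; P_N modeled as a
plain rotation): applying the N-cycle N times is the identity, which
is what makes the (L P)^N construction come out straight:

```python
# P_N as a cyclic permutation of N wires: applying it N times is
# the identity, so after the full construction every wire ends up
# back at its starting terminal.
def rotate(wires):
    return wires[1:] + wires[:1]

def apply_n(wires, n):
    for _ in range(n):
        wires = rotate(wires)
    return wires

assert apply_n([0, 1, 2], 3) == [0, 1, 2]
assert apply_n(list(range(5)), 5) == list(range(5))
assert apply_n([0, 1, 2], 1) == [1, 2, 0]
```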

simplification: it's probably ok to leave out the last permutation,
and compensate for it in software.

now, permutations and braids: they are not the same. transpositions
have no direction, and are self-inverting. a twist on the other hand
has a sign, and is not self-inverting.

braids can implement permutations while giving structural integrity.
for example the most typical 3-strand braid:

__   ____
  \ /
   \
_/  \   _
     \ / 
      /
_____/ \_

implements a 3-element cyclic permutation as a right crossing followed
by a left crossing. (nomenclature: rotate the image 90 degrees
counterclockwise and progress upward; the direction is the strand that
passes over the other one.)

compare this to a double right crossing:

__   ____
  \ /
   \
_/  \   _
     \ / 
      \
_____/ \_

this is a simple twist and provides no structural integrity, but
implements the same permutation.

can this somehow be used as a building block for the other cyclic
permutations? sure.. as long as you work with twists from left to
right, and make sure the twist pattern gives you structural integrity,
the same logic applies: the result is just a cyclic permutation.



Entry: interactive mode
Date: Sun Dec 16 10:13:47 CET 2007

from interactive.ss :

  The end goal of Purrr is to have only 'live' and 'macro'
  interactions: the system should be powerful enough so excursions to
  the underlying prj: code is not necessary. This gives a separation
  between 'tool development' and 'tool usage'.

I've come to believe that this is not a good idea in general. It is OK
to be able to access the most basic host code, such as compilation,
upload and inspection, but for real work you'd want to automate those
and have a 'real' programming language behind it. In other words:
access to prj or scheme code is necessary.

  * it's ok to have a small collection of host words in interaction
    mode which are hidden using prefix parsing.

  * this set of mappings (parsing words) should be extensible: prefix
    parsing needs a simpler definition form.

  * the functionality behind those words should be extensible

Concretely this requires interactive.ss to be adjusted so it can
accommodate parsing code in a different way. Maybe it can be made
extensible together with the other parsing words.. The problem right
now is that it is a single method, and the way it's defined is
difficult to make dynamic (it's a scheme macro).

Actually, compile mode forth parsers are already registered in the
global namespace tree, so making them extensible can be done
incrementally by adding some more name spaces.


Entry: extensible interactive parsers
Date: Sun Dec 16 10:52:30 CET 2007

two conflicting views here:

  * currently interactive parsers are isolated functions, which is
    nice and clean.

  * what is required is extensibility and re-use.

the solution seems to be to put the components in a global name space,
which is used as the unified extension mechanism, and replace the
function with a stateful one that refers to the name space.

key elements here are 'with-member-predicates' and
'predicates->parsers'. these form a construct that needs to be
attached to the global namespace tree.

the former creates a collection of membership predicates.

the latter creates a map (finite function) from atom -> parser.

the problem with the current approach is the generality of the
parsers: they don't just map names to functions, but also create
'classes' with similar behaviour, so there is a level of indirection
that needs to be captured. the live parser map is

   * symbol -> parser  (parser primitive)
   * symbol -> symbol  (parser class)
 
if they are stored in this way, interpretation is quite
straightforward. the approach is:

   * provide alternatives for 'with-member-predicates' and
     'predicates->parsers' so they postpone their behaviour and store
     it in the global namespace.

   * provide an interpreter.

OK. implemented + tested.
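the interpreter over that two-level map boils down to something like
this (python sketch with made-up word names):

```python
# live parser map: a symbol maps either to a parser (a function) or
# to another symbol naming its 'class' parser. lookup follows the
# one level of indirection.
def find_parser(table, name):
    entry = table.get(name)
    if callable(entry):        # parser primitive
        return entry
    if entry is not None:      # parser class: delegate to named parser
        return find_parser(table, entry)
    return None

table = {'see': lambda toks: ('inspect', toks[0]),  # primitive
         'disassemble': 'see'}                      # class: same as 'see'
assert find_parser(table, 'disassemble') is table['see']
assert find_parser(table, 'see')(['foo']) == ('inspect', 'foo')
```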

Some further cleanup. Maybe it's best to not store symbols in the
dictionary, but parsers: use cloning instead of delegation? This way
the dictionary IS the finite function. The real problem is that macros
have a delegation method (function composition) but parsers (and
assemblers for that matter) have not.

so:

  Forth syntax parsers (lookahead) have no composition
  mechanism. Therefore cloning is used to give some form of code
  reuse. It used to be delegation, but this gives dynamic behaviour
  which contrasts with the static, declarative intent of the global
  name space, regardless of its implementation as a hash table.

and about ns:

  The global namespace is used as:
   * declarative symbol table (single assignment, mutual refs)
   * cache (forth macros should eventually be defined in state file)

Maybe forth.ss should be separated into generic forth style parser
macros and functions and the definitions of the parser words.



Entry: static composition and extension
Date: Sun Dec 16 11:19:16 CET 2007

i chose for a hierarchical dictionary as the main means of program
extension. the way it is used is not dynamic binding, but 1. postponed
static binding and 2. cache of a linear dictionary.

as a consequence, it can probably be completely replaced by mzscheme's
module composition approach, together with some means (units?) to
solve circular dependencies and plugin behaviour.

however, i see no point in changing this until the dependency on the
method that implements this linking part can be abstracted
away. currently that seems problematic, because the name store is
everywhere: it is the backbone of the system.

i find it very difficult to see what is the right thing to do
here. 1. i'm not using abstraction mechanisms provided by mzscheme to
do namespace management, which makes me miss some static/dynamic
checks, and is in general just a bad idea. 2. my approach is more
lowlevel so flexible to shuffle it around and find the right
abstraction. the thing is i'm not sure yet if i need this flexibility
(over the built in functionality).

the only way to really resolve the ignorance is to implement a toy
project which doesn't use the global namespace, and only uses mzscheme
units and modules.

Entry: future dev
Date: Sun Dec 16 15:48:03 CET 2007

  * fix problems in TODO (mostly peval)
  * finish 16bit DTC
  * dsPIC forth
  * lisp-like dsp functional dataflow language for PDP/PF/dsPIC
  * CATkit 2
  * sheepsint 8-bit synth engine (envelopes + FM)
  * E2 debugging
  * CATkit midi
  * USB






Entry: inspecting macro output
Date: Mon Dec 17 10:07:27 CET 2007

finding a common tail in 2 lists looks quadratic at first, though
aligning the two lengths first makes it linear. but i probably don't
need the general case anyway, since i'm looking only for common
subtails in substacks.
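actually, aligning the lengths first gives a linear common-tail scan
(python sketch):

```python
# longest common tail of two lists: drop the longer list's extra
# prefix, then scan both in lock-step, remembering the last index
# where they differ; everything after that is the common tail.
def common_tail(xs, ys):
    if len(xs) > len(ys):
        xs = xs[len(xs) - len(ys):]
    else:
        ys = ys[len(ys) - len(xs):]
    start = 0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if x != y:
            start = i + 1
    return xs[start:]

assert common_tail([1, 2, 3], [9, 2, 3]) == [2, 3]
assert common_tail([1, 2], [3, 4]) == []
```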

i'm still looking for a good description of the problem.. the problem
of finding the common tail seems to be the one to give insight.

what about this:
 1. split input and output 'qw' atoms off
 2. check if remaining tail is the same

this is the only behaviour that's valid. once this data is obtained,
it could be peeled to isolate the behaviour of a macro, at which point
it could be decided to 'unevaluate' it.

now, what does unevaluate mean?

... (qw 1) (qw 2) +   ->  ... (qw 3)

this could be replaced by (qw (1 2 +))

this is always the case: since the evaluation can be performed again
later. the only information that is extracted at this point is whether
the macro does anything else.

the change in macro code seems to be here:

    (([qw a ] [qw b] word)         ([qw (wrap: a b 'metafn)]))

the 'wrap:' form needs to be replaced by something that might return a
value if the variables contain numbers.


running into a small namespace problem.. trying to use scheme names,
but it might be better to leave the meta dict in there to do this kind
of stuff, but only call it from the macros. basically, the stuff after
wrap: should be symbolic if the parameters are symbols, and computed
if both are numeric.


Entry: benchmarking
Date: Tue Dec 18 16:21:16 CET 2007

the current reader is problematic.. it's slow, and i don't understand
the reason. i don't think it's usage of streams, since it was slow
before, and it's not read-char, since i tried that..

so... 

  1. make a test for the current reader
  2. replace it with a new reader
  3. build 'read-syntax'


first test: the problem seems to be somewhere else..

  (define f (forth-load-in-path "monitor.f" '("prj/CATkit" "pic18" )))

is virtually instantaneous like it should be..

so where did i get the idea that this is slow?

indeed:

  '(file monitor) prjfile prj-path forth-load-in-path

is instantaneous also.

otoh, 'forth->code/macro' isn't instantaneous at all..

compiling the code 'code/macro!' is instantaneous also. i think i got
it. why is the code/macro splitter so slow?

tracking down to forth.ss : forth->macro.code which uses
@forth->macro/code which uses @moses

it can't be @moses since that's just a filter.. so it's probably down
the stream in the macro processor. need to test that separately.

running into some inconsistencies.. probably best to switch everything
to syntax objects, including a syntax-reader.


Entry: read-syntax
Date: Tue Dec 18 17:54:50 CET 2007

from 

http://download.plt-scheme.org/doc/371/html/mzscheme/mzscheme-Z-H-12.html#node_chap_12

  (datum->syntax-object
   ctxt-stx v [src-stx-or-list prop-stx cert-stx])

converts the S-expression v to a syntax object, using syntax objects
already in v in the result. Converted objects in v are given the
lexical context information of ctxt-stx and the source-location
information of src-stx-or-list. If v is not already a syntax object,
then the resulting immediate syntax object is given the properties
(see section 12.6.2) of prop-stx and the inactive certificates (see
section 12.6.3) of cert-stx. Any of ctxt-stx, src-stx-or-list,
prop-stx, or cert-stx can be #f, in which case the resulting syntax
has no lexical context, source information, new properties, and/or
certificates.

If src-stx-or-list is not #f or a syntax object, it must be a list of
five elements:

  (list source-name-v line-k column-k position-k span-k)

where source-name-v is an arbitrary value for the source name; line-k
is a positive, exact integer for the source line, or #f; and column-k
is a non-negative, exact integer for the source column, or #f;
position-k is a positive, exact integer for the source position, or
#f; and span-k is a non-negative, exact integer for the source span,
or #f. The line-k and column-k values must both be numbers or both be
#f, otherwise the exn:fail exception is raised.

(datum->syntax-object
 #f word
 (list source-name
       line
       column
       position
       span)
 #f #f)


EDIT:

why do i run into the need to have a port object that can put back a
character? scheme needs this too, so maybe the port objects need to
support putback?

it's the other way around: scheme ports support a peek operation.


looks like it works now, and the code looks clean.
next: create syntax objects.

this seems to be rather straightforward by using
'port-count-lines-enabled' and 'port-next-location'.

ok. seems to work now.


Entry: syntax cleanups
Date: Sun Dec 23 13:57:33 CET 2007

what about the '|' character for lexical variables?

things to be aware of:
  don't break code / or break it verbosely
  
again, i want to write a state machine.. i need to think a bit about
the abstractions used in forth.ss

'parser-rules' works well. the rest is hard to read. the problem seems
to be parsers that segment data, instead of taking a fixed amount of
data from the stream. these need state machines.

let's rewrite the def: parser as an example.

basicly this is forth-lex.ss, but then recursively.

OK. i've got a definition parser working which produces name, formals
list and body. now this needs to be passed upstream somehow. looks
like that is the next part to cleanup: macros can have formals, and
they need a symbolic representation for this, i.e. in the state file.

now the question is: should this be the (a b | a b +) syntax, which
requires another lexing step, or should it be an s-expression with
explicit formals list?

what about this: make lexing steps easier, and just use more lexing
steps. forth handles parsing (recursive) at a later stage than
lexing.


Entry: regular grammar
Date: Sun Dec 23 22:00:27 CET 2007

the essential property of a regular grammar is that each production
rule produces at most one non-terminal. intuitively, this means there
is no "recursive" tree structure, only a sequential one: there is no
"replication gain".

so it looks like i need a way to express some of the state machine
parsers as simple regular expressions based on membership functions,
instead of the more specialized character classes.

 (vaguely related: note that the Y combinator is essentially a copy
 operation)



Entry: regular expressions
Date: Tue Jan  1 15:48:20 CET 2008

the data is a stream of tokens, so regular expressions can be
constructed in terms of membership functions and modifiers like '*' or
'+'. symbols can be converted to membership functions.

that should be enough? not really. need some form of abstraction: a
pattern can be a composition of patterns.

so maybe it is better to stick with the lexer language in mzscheme?
since what i am going to re-invent is ultimately going to be a generic
regexp tool. EDIT: looks like it's really character-oriented. maybe it
is a good exercise to try to write a lexer generator? can't be that
hard.. also, i run into this problem so many times with low-level bit
protocols that it might be a good idea to take a closer look: white
space is essentially the 'stop bit' in async comm.

  which brings me to the question: i think i read on wikipedia (i'm
  offline now) that regular expressions and FSMs are somehow
  equivalent. how is this?
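  (partial answer, from memory: yes, by Kleene's theorem regular
  expressions and finite automata describe exactly the same class of
  languages; Thompson's construction compiles a regexp into an NFA.
  a hand-compiled DFA for a(b|c)*, as a python sketch:)

```python
# DFA for the regular expression a(b|c)*: state 0 is the start,
# state 1 accepts, and a missing transition is the dead state.
TRANS = {(0, 'a'): 1, (1, 'b'): 1, (1, 'c'): 1}

def matches(s):
    state = 0
    for ch in s:
        state = TRANS.get((state, ch))
        if state is None:        # fell into the dead state
            return False
    return state == 1

assert matches('a') and matches('abcb')
assert not matches('b') and not matches('')
```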

how about forth-lex.ss: a specification not as production rules of a
regular language, but as regular matching patterns? what is the
problem i am trying to solve? find a function (or macro) that maps

  lex-compiler : language-spec -> token-reader

stream = token stream | EOF
token = word | comment | white

at the same time, i am trying to stay true to the forth syntax: simple
read-ahead. (keyword + fixed number of tokens) or delimited read
(keyword + tokens + terminator).

  note: there seems to be a difference between reading UPTO a token,
  or reading UPTO AND INCLUDING a token. is standard forth always of
  the latter form?

to answer the question partially: the current forth-lex.ss performs
segmentation, and thus is not of that form: it cuts IN BETWEEN tokens.
but forth is. can i learn something from this? yes: cutting AT a token
makes the automaton simpler, since it doesn't require peek. let's call
that 'delimited' until i know the technical term.

i think the important lesson is that:

  1) forth should be delimited: this simplifies on-target lexing
  2) exception: first stage tokenizer in brood = segmentation

the latter is an extension to make source processing in an editor
(like emacs) easier by preserving whitespace and delimiting
characters. BUT, it should not introduce structures that 1) can't
interpret.

it looks to me that before fixing the higher level compiler and macro
stuff, the lexer should be fixed such that it can be replaced by a
simple, reflective state machine (true parsing words). looking at
forth, there are 2 reading modes:

   - read upto and excluding character
   - read next word (= upto and excluding whitespace)

by fixing some of the syntax (comments and strings) editor tools can
be made exact: a list of DELIMITED words will read upto and including
a delimiter.
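the two reading modes as a sketch (python on a string; the real thing
would sit on a scheme port):

```python
# forth's two reading modes on a character stream:
#   read_word:  skip whitespace, then read up to and excluding it
#   read_until: read up to AND including a delimiter character
def read_word(text, pos):
    while pos < len(text) and text[pos].isspace():
        pos += 1
    start = pos
    while pos < len(text) and not text[pos].isspace():
        pos += 1
    return text[start:pos], pos

def read_until(text, pos, delim):
    end = text.find(delim, pos)
    if end == -1:
        raise EOFError('missing delimiter')  # EOF mid-construct is an error
    return text[pos:end], end + 1            # consume the delimiter too

assert read_word('  hello world', 0) == ('hello', 7)
assert read_until(' foo" bar', 0, '"') == (' foo', 5)
```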



Entry: rethinking forth-lex.ss
Date: Tue Jan  1 18:00:25 CET 2008

a proper markup language is necessary. one that will not throw away
information, but gives perfect parsing of source code. note that in
order to transform source code to markup, a tokenizer is
necessary.

the tokenizer is a form of 'unrolled' parser: it describes a
segmentation that CAN be parsed by a reflective delimited
parser. ('reflective' means words have access to the input stream and
can thus influence the grammar).

in order to make the right decision, it is necessary to have a look at
the standard word ." which quotes a string up to but excluding the "
character and prints it: this word interprets the first whitespace as
the delimiter, and any subsequent whitespace is part of the string. in
order to properly segment code, this behaviour needs to be respected.

instead of (pre word post) a different segmentation is necessary which
can properly encode eof. what about a word/white distinction?

(word    pos string delimiter)
(comment pos string delimiter)
(white   pos string)
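a minimal segmentation along these lines, sketched in python
(comments left out, names hypothetical): every character lands in
exactly one token, so the source can be reconstructed exactly, which
is what the editor integration needs.

```python
def segment(src):
    """Cut source into (tag, pos, string[, delimiter]) tuples."""
    tokens, i = [], 0
    while i < len(src):
        start = i
        if src[i].isspace():
            while i < len(src) and src[i].isspace():
                i += 1
            tokens.append(('white', start, src[start:i]))
        else:
            while i < len(src) and not src[i].isspace():
                i += 1
            # record the char that ended the word; '' encodes EOF
            delim = src[i] if i < len(src) else ''
            tokens.append(('word', start, src[start:i], delim))
    return tokens
```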

another question: is EOF an error or not, when it follows a word? i
think
the answer should be YES: otherwise it violates concatenation of files
= file.

got forth-lex.ss simplified now.. it looks really familiar ;)
i need to give it the standard names, but this looks like it.

NEXT: add delimited parsing to parser-rules. this should capture all
parsing need, since there are no more non-delimited constructs. i.e.

  (parser-rules ()
    ((_ macro : name words ; forth)
        ---------))


Entry: declaration mode
Date: Wed Jan  2 19:15:54 CET 2008

embedded in standard Forth syntax is a "declaration mode" where all
definitions are interpreted as macro definitions instead of
instantiations of words.

i'd like to express the state machine that implements this mode using
an extension of the 'parser-rules' syntax, one that implements (a
limited set of) regular expressions.

let's start with a summary of current constructs (-> means "depends on")

  parser-rules -> @syntax-case -> @unroll-stx + syntax-case

where 'parser-rules' creates a function with parser prototype (stream
-> stream,stream) and @syntax-case is like 'syntax-case' but
applicable to the head of streams.

most of the real action is in forth.ss, where i'd like to eliminate a
number of constructs. the current way to collect a number of
definitions is using 'def-parser' which creates a definition parser
parameterized by a type tag. recently i wrote this as a straight state
machine. this i'd like to replace now with some regexp based matching
approach.

the key elements in a def parser are:

    * a definition is of the form   
         : <name> (optional | <formal> ... |) <word> ... ;

    * a list of definitions is terminated by the word 'forth'

previously i came to the conclusion to only allow delimited
constructs, which are clearly marked with a start and stop
marker. these constructs require no lookahead, and thus have a simpler
automaton implementation.

i'd like to use the '...' construct to indicate zero or more, just
like the syntax-case macro, but necessarily limited by a fixed marker
symbol. a '...' at the end of a match means pattern recursion.
optional constructs can be handled by multiple match rules. this makes
a def parser look like:

   (parser-rules (: ; | forth) 
     ((: name | formal ... | word ... ; ...) ((def name (formal ...) (word ...))))
     ((: name word ... ; ...)                ((def name () (word ...))))
     ((forth)                                (())))

can this form of ellipsis be mapped to the default meaning of multiple
occurrences? this looks like an important question: a core difference
between tree and sequence matching.

question: what is better?
  * special meaning of '...' at the end of a sequence (self-recursion)
  * explicit recursion?

the def parser could be constructed as a 2-phase machine: one that
dispatches between staying in the mode and calling a single def parser
or exit the mode, and the def parser itself.

'...' could vaguely mean "multiple times", but there's a difference
between: multiple times upto XXX, or infinitely many. it looks like
explicit recursion is better than looping, so i'm going to drop the
special meaning. this brings a single def parser to:

   (parser-rules (: ; |) 
     ((: name | formal ... | word ... ;) ((def name (formal ...) (word ...))))
     ((: name word ... ;)                ((def name () (word ...)))))

now, what i can use is this:

  (syntax-case #'(a b c end bla) (end) 
    ((stuff ... end r) #'(r stuff ...)))

=> (bla a b c)


yep.. it looks like there's a fundamental difference between the tree
matching and sequence matching problem. maybe i need to give it a
special symbol. let's take *** to mean: collect upto following
terminator, so ... can still be used for tree matching.

   (parser-rules (: ; |) 
     ((: name | formal *** | word *** ;) ((def name (formal ***) (word ***))))
     ((: name word *** ;)                ((def name () (word ***)))))
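the '***' idea reduces to one operation: collect tokens up to the
next literal in the pattern. a throwaway python sketch of that
matcher (flat patterns only; '?'-prefixed items are variables,
everything else is a literal; the real thing lives in the scheme
parser-rules machinery):

```python
def match(pattern, tokens):
    """Match a flat pattern against a token list; return bindings or None."""
    bindings, ti, pi = {}, 0, 0
    while pi < len(pattern):
        p = pattern[pi]
        if p == '***':
            stop = pattern[pi + 1]        # a terminator must follow '***'
            collected = []
            while ti < len(tokens) and tokens[ti] != stop:
                collected.append(tokens[ti])
                ti += 1
            bindings.setdefault('***', []).append(collected)
        elif p.startswith('?'):           # variable: binds one token
            if ti >= len(tokens):
                return None
            bindings[p] = tokens[ti]
            ti += 1
        else:                             # literal: must match itself
            if ti >= len(tokens) or tokens[ti] != p:
                return None
            ti += 1
        pi += 1
    return bindings if ti == len(tokens) else None
```

match([':', '?name', '***', ';'], [':', 'sq', 'dup', '*', ';']) binds
?name to 'sq' and collects ['dup', '*'] up to the ';'.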



what about a simpler approach? the only thing that needs to be done is
to collect syntax object between marks into lists. these lists are
easy to process with a @syntax-case parser later on. so the thing
that's necessary is a way to construct a stream parser that collects
up to a certain predicate. sounds familiar?

ok.. this leads to simpler code. i could use the current 'def-parser'
as a template for a more general delimited parser expression.

i think i can ditch '@split' now. it leads to convoluted code.

ok. 'mode-parser' is written as an explicit recursion now. this
probably means i can start throwing out some stream processing
code. wait.. need to check the macros-with-arguments thing..

OK. fixed. commented out a lot of code from stream.ss that was related
to chunking/splitting.

so.. the lesson:

     * linear streams: use explicit delimiters for embedded sequences:
       simplifies parsing: no lookahead necessary.

     * convert delimited sequences to lists + use scheme's tree
       matchers


Entry: next?
Date: Thu Jan  3 00:48:28 CET 2008

* connect the syntax reader to the parsing/loading code.
* unify all evaluation to execution of macros + manage evaluation time


Entry: moving to stx objects
Date: Thu Jan 31 12:49:59 CET 2008

what needs to be done now is to:

* replace all compile words so they accept syntax object in addition
  to lists.

* convert all generators to syntax generators

* add print routines for them


so.. start in badnop.ss: string->code/macro (for compile mode, which i
can test now). i'm replacing forth-string->list with forth-string->syntax

got string->syntax stuff working. now trying the path/file
loader. this needs @syntax-case instead of @match.

except for the weird problem below which i worked around, it seems to
work now. printing works out of the box (snot).



Entry: weird @syntax-case problem
Date: Thu Jan 31 13:54:43 CET 2008

the 'load' symbol in this doesn't want to work. if i replace it with a
different name, it does.. what's that about?

    (@syntax-case
          stream tail (load-ss load)
          
          ;; Inline forth file
          ((load name)
           (begin
             (printf "load\n")
             (@append (@flatten (f->atoms (stx->string #'name)))
                      (@flatten tail))))
          
         ....


Entry: possible cleanups
Date: Thu Jan 31 15:56:06 CET 2008


 * asm buffer from tagged list -> abstract type?

   there's a lot of room for improvement in that department. it would
   allow some kind of instruction annotation that's not possible right
   now. i think were i to start from scratch, i would build it around
   this..
 
 * macro unification

   (from the TODO)
   
   unify dictionaries: put macros in the main dict as lists, store ram
   addresses as variables, and find a way to postpone compilation of
   macros to their corresponding values if they reduce to values (are
   constants/variables/labels...)

the former is cosmetics (atm), the latter is a tough problem, but can
lead to a gigantic simplification.



Entry: target name space unification
Date: Thu Jan 31 16:01:12 CET 2008

name space unification would mean that the dictionary stored in the
.state file contains not only addresses, but also macros (in a form
that's specific enough to recompile).

this form needs to include lexical variables. so a dictionary item is
either a number, or a macro. target words are then just macros:

((abc 123)            ;; literal / constant / ram variable / ...
 (go  3235 execute)   ;; code
 (bla abc def))       ;; any macro code

taking into account lexical variables this can be simplified to a
single format:

((abc () (123))
 (go  () (3235 execute))
 (bla () (abc def))
 (arg (a b) (a b +))) 

where the first parens are the macro lexical variables.
code that has no lexical variables is purely concatenative.
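modeled in python (illustration only, hypothetical names), the
unified format is just name -> (formals, body), and one expansion
step is formal substitution:

```python
# the four example entries above, as (formals, body) pairs
dictionary = {
    'abc': ((), ('123',)),              # literal / constant
    'go':  ((), ('3235', 'execute')),   # code word: address + execute
    'bla': ((), ('abc', 'def')),        # plain concatenative macro
    'arg': (('a', 'b'), ('a', 'b', '+')),  # macro with lexical variables
}

def expand(name, args=()):
    """One expansion step: substitute the macro's formals by args."""
    formals, body = dictionary[name]
    assert len(args) == len(formals)
    env = dict(zip(formals, args))
    return tuple(env.get(tok, tok) for tok in body)
```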


this requires quite a deep cut, but should lead to great
simplification.

fork point is here.



Entry: declarative namespace + cached linear dictionary
Date: Thu Jan 31 16:53:39 CET 2008


make dictionary abstract? maybe the most important point to ensure is
cache consistency. on one end, there is a symbolic representation of a
dictionary, on the other end there is a compiled version, which
resides in the NS (macro) part. how to ensure these are never out of
sync?

so the next step is to define what the NS object actually is. it is a
collection of namespaces, where each element is STATIC. the
IMPLEMENTATION allows mutation, but the use should be restricted to
single assignment. otherwise the cache is invalid.

the main function the NS object provides is PLUGIN behaviour: late
binding of some identifiers to allow the system to be composed of
several individual pieces, without needing the strict tree-based
structure of mzscheme's module system. maybe units are the right way
out, but right now i'm stuck with this more low-level model. what's
necessary is to define some proper interfaces to this:

  1) NS as graph binding (single assignment)
  2) NS as cache object for target macros

i made this remark before.

the first access pattern is easily enforced: never overwrite
anything. the second one is more difficult. need to google a bit,
looks like a popular pattern: cache association list with a hash
table.


Entry: caching an association list
Date: Thu Jan 31 16:54:05 CET 2008


the problem can be solved by making the operations abstract.

association list:
   * push
   * pop
   * find

as long as the access pattern contains no pops, the caching mechanism
is quite simple. on pop, one could re-generate. this is effectively
what i'm already doing, however, it's not guaranteed synchronized.
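the guaranteed-synchronized version is simple enough to sketch
(python, just the shape of it, not the NS code): push updates both
representations, pop regenerates the cache wholesale.

```python
class CachedAList:
    """Association list mirrored in a hash table.  The alist is the
    authoritative representation; the cache is derived from it."""

    def __init__(self):
        self.alist = []        # (key, value) pairs, newest binding first
        self.cache = {}        # key -> value of the newest binding

    def push(self, key, value):
        self.alist.insert(0, (key, value))
        self.cache[key] = value            # newest binding shadows

    def pop(self):
        key, value = self.alist.pop(0)
        # regenerate: walk oldest -> newest so later bindings win
        self.cache = {}
        for k, v in reversed(self.alist):
            self.cache[k] = v
        return key, value

    def find(self, key):
        return self.cache.get(key)
```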

so.. the elements: 2 dictionaries:

  (macro)         ;; defined in core, and untouched by prj
  (macro-cache)   ;; cache of prj macros

for this to work, the code in (macro) should NOT depend on the code in
(macro-cache). this means the core macros are not allowed to have
pluggable code. this is only allowed in the static load part.

let's rephrase: macros are subdivided in 2 parts:

  1) declarative with cross-resolve (pluggable components)
  2) linear dictionary extension on top of this

does this in any way interfere with local name re-definitions?

i think i just need to try it out..

re-iterate the model from the forth side:

each compilation unit has a name space that can shadow/extend the
previous one. all extensions in one unit need to be unique.  this
model resembles incremental compilation per word (strict early
binding), but allows for cross-reference within one unit.
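that model is just chained frames: unique definitions within a unit,
lookup falling through to the enclosing unit. python sketch
(hypothetical names):

```python
class Unit:
    """One compilation unit: a frame that shadows its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}

    def define(self, name, value):
        if name in self.names:             # extensions must be unique
            raise ValueError("duplicate in unit: " + name)
        self.names[name] = value

    def lookup(self, name):
        frame = self
        while frame is not None:           # fall through to parent
            if name in frame.names:
                return frame.names[name]
            frame = frame.parent
        raise KeyError(name)
```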

path: 
 * get rid of constants.
 * get rid of ram dictionary.
 * move macros to target dictionary.

constants already were eliminated. they can still occur in rewrite
macros that generate asm code though.

the ram dictionary is more problematic. it's probably best now to move
to abstract access methods for the dictionary. it does look like
that's the way out. pulling those changes through the assembler will
shuffle things quite a bit. macros can follow quite easily from there.

maybe it looks like this: in assembler.ss -> 'label 'word 'allot
represent the points where the dictionary is augmented. what will
happen here is that macros can be defined also, no?

there seems to be a conflict between allowing the definition of labels
(ram or flash) and allowing those of macros, when they are all
unified..

there is a difference however: as long as the thing which creates a
new macro definition, only dumps it in the assembler buffer, there is
no problem.. the entire buffer will be assembled with the current
macro definitions.. wait, there's something warped about that.

pushing through some changes, i arrive at the assembler. it might be
best to turn the running variables (rom and ram top pointers) into
real variables, and use the dictionary as a stack.


going to try to do some things at once:
 - allot needs to be rewritten in terms of ptr@ ptr!
 - adding new dictionaries won't work any more

fading out.. next = (code . 0) (data . 0) etc.. data is missing.

ok, cleaned that up a bit.. also made the running pointers mutable.


Entry: macros in dictionary
Date: Fri Feb  1 12:28:01 CET 2008

that's the next step.  now i need to think hard about where this can
go wrong, with the semi-separation i have.

basically, the preprocessing step SORTS all names, to make sure macros
are active before the rest of the code is compiled. this shouldn't
give any trouble.

the thing to look at next is the path macro definitions
travel. probably it's best to parse everything in one go: formal list
(empty for concatenative macros). forth.ss is again the place to
be. looks like make-def-parser is the function to modify.

that modification seems to work. now adjusting badnop.ss and
macro-lambda-tx.ss to build a compiler function that uses the parsed
representation to build a macro.

the problem here is that it doesn't really fit in the rpn-compile
framework.

so.. i made it fit. the "body" for macro-lex: compilers consists of 2
elements. a list of formals and a body. this is the standard format
used in the state file. md5 sum still checks.

NEXT: move the 'macro dict into the normal dict.

ouch.. can't have "123 execute" as macro.. or can we? maybe that's one
that should be delayed.. i need sleep. this smells like the beginning
of something new.. a proper way to organize the code.

a question to answer: why did i violate source concatenation by
introducing locals? the answer is of course out of convienience, but
is there a real disadvantage? the macros themselves are still
compositional.. this is just about source.


Entry: name change
Date: Sat Feb  2 12:12:30 CET 2008

it's time to start thinking about a name change for the cat
language.. problem is of course cat-language.com

i have 2 alternatives: KAT and SCAT. the problem with KAT is that it
sounds the same as CAT. the problem with SCAT is the same as the
problem with SNOT.. do i really care though? programming in scat could
then become scatology. i still think that's humor ;)


Entry: reflection
Date: Sun Feb  3 10:43:19 CET 2008

i was thinking yesterday about macro unification, and wondered whether
it might be better to go back to the accumulative model for name
resolution / redefinition.

the main problem before was that compilation of code had side-effects
(definition of new macros in the NS hash), which made it impossible to
evaluate code for its value only. however, there is probably a way to
put this accumulative behaviour back, by taking the assembler into the
loop: let the asm 'register' the macros.

the REAL problem i'm trying to solve is still macro generating macros
and the generation of parsing words. both are opposed to the
declarative code model, but in the end, the model isn't declarative at
all.. it's
a bit of a mess in my head now.

GOAL:

      i need macro generating macros: limiting the reflective tower in
      any way will always feel artificial.

how to do that?

      * accumulative (image model) is the simplest, and the original
        way of dealing with this problem. however, it doesn't give a
        static language.

      * declarative (language layer model) is the cleanest way of
        doing this, but requires some overhead that might look as
        overkill.


can we have both? the declarative approach needs s-expr syntax to be
manageable. it won't be Forth any more..

let's see.. image model: simplest, highly reflective forth
paradigm. declarative: cleanest for metaprogramming purposes.

i guess i need to isolate the exact location of the paradigm
conflict. what do i want, really? 

GOALS:

  * generating new names (macros) should be possible within forth
    code. currently, the only way are the words ':' and 'variable'.

  * cross reference should be possible. this currently works for
    macros, because they use a two-pass algorithm (gather macros
    first, then compile the code) and works for procedure words, also
    because of a two-pass algorithm (ordinary assembler).

  * linearity in chunks should be possible, which is the current
    model.

questions from this:

  - is it possible to unify the 2 different ways of employing a 2-pass
    algorithm for cross-references?

  - how to move from a fixed 2-layer architecture (macros + words) to
    an n-layer architecture. is this doable without a language tower?
    is it desirable? (is reflection really that bad? does it conflict
    with automatic cross-reference?)


the more i let this roll around, the more a certain light goes to this
solution: split the problem in 2 languages. use a reflective forth
which 'unrolls' into a layered language description, and a static
layered s-expression based language that uses the same macro core.

this gives the convenience to use forth syntax and the reflective
paradigm, and at the same time the flexibility to use the language
tower when reflection is too difficult to get right, or the automatic
layering doesn't work..

so, the current question becomes: can the GOALS be kept by moving back
to a completely reflective machine (including parser!) which unrolls
automatically?

remark: it looks as if i really need the equivalent of 'define' which
would be really 'let'.. it all seems to boil down to scope (Scope is
everything!). a forth file should be transformable into a collection
of definitions and macro definitions. it probably makes a lot more
sense to see the dictionary as an environment which implements the
name . value map of a nested lambda expression.

let's see.. 

   the current model (macros are compositional functions) is really
   good. the remaining problem is scope: when to nest (let*) and when
   to cross-ref (let-rec).
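the let* / let-rec distinction in a nutshell, translated to python
purely for illustration: sequential binding only sees earlier names;
cross-reference needs all names in scope before any body runs, which
works because the bodies are delayed.

```python
# "let*": each definition may only use the ones above it.
def sequential():
    one = 1
    two = one + 1          # sees `one`, defined earlier
    return two

# "letrec": mutually recursive definitions; both names are in scope
# inside both bodies, because the bodies are delayed (functions).
def mutual():
    def even(n): return True if n == 0 else odd(n - 1)
    def odd(n):  return False if n == 0 else even(n - 1)
    return even(4)
```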

another idea.. instead of looking from the leaf nodes and building a
dependency tree, what about starting from the root (kernel) node, and
build an inverse dependency tree? the linear model is the intersection
between the two.


Entry: future CATkit
Date: Wed Feb  6 13:43:32 CET 2008

some possible roads to travel with CATkit, and associated problems:

* boot loader programmer: instead of going with the USB TTL cable, it
  might be more interesting to create a complete solution for
  programming with brood: one that can program any of the target chips
  straight from the factory. it's pretty clear to me now that freezing
  the bootloader spec is going to be really problematic: they are
  project-specific. building a single all-in-one programmer/debugger
  solution is the way to go. maybe the E2 ideas can be unified with
  this too?

* to make the programmer doable, it might be wise to start using
  available Microchip C code: which means being able to link Purrr
  code to a MPLAB or Piklab project. also for ethernet based pics this
  might be wise. time to get a bit less radical if i want to get
  things done..

* a fairly standard 16bit Forth language. i'm far removed from this if
  i first want to fix the internal representation back to a more
  reflective approach with automatic unrolling into nested namespaces,
  and integrated parsing.. (EDIT: not true.. since the Purrr18
  language should remain fairly stable, writing the Forth while doing
  the macro changes might work out just fine.)

* pre-assembled kits for Forth-only workshops. what is necessary there
  is to work for minimal cost: basically shrink and eliminate
  through-hole components. however.. the big cost is really not the
  board if it has pots on it. the deal is: there's no point in
  competing with arduino.


Entry: overall design changes
Date: Fri Feb  8 11:45:31 CET 2008

assembler

  it's been fun, but it might be good to start outsourcing code
  assembly. especially regarding the future use of different
  architectures, and interfacing with object code formats. it fits
  better in C code generation too.


interaction

  this needs some thought, but at this point an abstract interface
  between the compiler and the target system is necessary. the road
  towards this consists of writing a double backend: one for PIC18,
  and one for ARM (philips) or MIPS (microchip 32bit). i'm thinking
  about moving most of it back to scheme, and phase out the cat code
  in prj.ss and badnop.ss


forth language

  i'm a bit in a ditch here.. the current attempt to unify the
  namespaces into a single nested macro name space brings up questions
  about maybe unifying the parser too.. however, looking at radical
  forth changes like colorForth, a move towards a rather fixed parser
  can be observed. in my approach, the parser takes out a lot of dirty
  forth-isms while at the same time keeping the syntactic convenience
  they bring, at the price of not being so extensible.. the core idea
  is still: the current functional macro approach is good, i just need
  to figure out how to organize the name space and keep everything as
  declarative as possible (relationships, not state changes).


Entry: CATkit 2
Date: Fri Feb  8 16:53:48 CET 2008

Keeping the current code in Purrr18 as the implementation language,
moving to an on-target interpreter seems like the only sane way to
decouple the CATkit community project from the evolution of
BROOD. CATkit/Sheep core could still be done in Purrr18, but the
availability of a straight no-hassle Forth would make things a lot
simpler. Clear separation of kernel / user also serves as a good
psychological barrier.

This has huge implications for the architecture. The 18F1320 won't be
enough. Probably a move to 18F2620 is necessary because of memory
requirements.

Using the current architecture though, there is a possibility to take
the following path:

 * create a different debug bus over the ICD2 connector
 * use the serial port for Forth console

Actually, that's not really necessary.. All this can be multiplexed
over serial. Another question is: does it make sense to have an
intermediate dtc layer like i have now, which essentially uses a
double implementation of the compiler (macros): one in brood and one
on the target? Really, the only thing to do is to replace machine code
with Purrr18 and for the rest build a standard console based Forth
machine.


Entry: stand-alone Forth
Date: Fri Feb  8 17:14:43 CET 2008

rationale:
  * more standard (documentation)
  * no dependency on Brood (decoupled from scheme + emacs)
  * no double implementation of compiler (host + target)

roadmap:
  - look at Flashforth and Retro Forth.
  - start building dictionary -> interpret mode -> compile mode
  - possible on 18f1320 ?
  - macro/immediate?
  - tail recursion?


Entry: goals
Date: Sat Feb  9 10:24:59 CET 2008

to prevent ending up in a random walk, it's time to clearly state some
goals on the PIC18 front.

  BROOD core + PURRR18: target audience is mostly myself, or people
  with assembler/electronics background. most important features are
  flexibility (focus on macros and code generation), speed and code
  size. BROOD is a tool for the "kleine zelfstandige" (the small
  independent craftsman).

  stand-alone PURRR: target audience is much broader. less emphasis on
  absolute control, more on simplicity, language stability and
  compatibility across platforms. it's the "configuration
  language". i'm thinking ANS + tail recursion + concatenative VM.

non-PIC18 things are quite open still. core needs more modularity (see
entry://20080208-114531)


Entry: pragmatics of macro namespaces
Date: Sat Feb  9 15:25:56 CET 2008

what about this: 

  * design an s-expression syntax that has all the desired properties.

  * make the name-value binding explicit and unique: this gives
    problems with multiple entry and exit points.

  * write a translator from forth syntax

  * regenerate the macro cache, each time the language nesting level
    changes.


(language <macros> <words>)

(language
 ((a () 1 2 3)
  (b () 4 5 6))
 ((help a b)
  (broem b b b)))

nested syntax: at each point the current language sees the enclosing
macros. a compilation step compiles code into macros containing the
addresses.

<macros> <defs>  ->  <macros+> <code>

each macro block begins a new language layer.

time is not right yet. maybe i should do the forth first?

no.. i need to start breaking things and building them back up to get
more insight on how to disentangle before changing the current code.



Entry: breaking macro storage
Date: Sat Feb  9 17:40:04 CET 2008


simply replacing '(macro) with '(dict) now..

  secondary: prj.ss is really hard to understand. maybe more of the
  cat code should be moved to scheme? or at least to a more functional
  approach.. the state management is still difficult to understand.

looks like this just works for the monitor. now why is that? i
expected it to break somewhere..

it indeed breaks somewhere: interactive mode. looking up words doesn't
work. time to move that to a more abstract implementation in target.ss

next thing that broke is 'mark'.

  prj.ss: is so dirty because there's a lot of mutation going on, and
  the naming of words is really inconsistent. this really needs
  cleanup.

  another hidden assumption about "org" in bin->chunk. the problem
  seems to be that absence of 'org' leads to problematic asm blocks.

what about structured asm? i read something about this in olin
shivers' comments about a summer job he did implementing a scheme
compiler.. maybe that's what i need to go to? anyways.. there's a lot
lot lot of work cleaning up data representations.

  the whole ifte/s and run/s business is a bit ridiculous.. it doesn't
  feel natural, and requires deep thought each time. i think it's time
  to ditch the way state access works, and move most code to
  functional programming with prj.ss doing nothing but state
  management (no control logic!)


Entry: state management / the point of prj>
Date: Sun Feb 10 14:50:08 CET 2008

something really smelly about it. i think i'm better off with true
mutation in the scheme sense, instead of working around it the way is
done in prj.ss

the base line is: this prj> mode should be usable for DRIVING THE
TARGET. the whole functional state business is overkill: most code can
really be made functional, and possibly more understandably written in
scheme. whenever state recovery is necessary, it can be moved to the
functional domain (i.e. assemble and compile as they are now..)

the problem i'm trying to solve is discipline: not gratuitously using
global state. maybe i should read some haskell tips, since this is the
way haskell programs seem to be written: a bulk of pure functions and
a central state management module through monads.

let's see some important properties:

  - the interactive forth layer translates to prj scat code
  - the macro code is purely functional code with a threaded asm state
  - staying close to scheme keeps things simple


other remarks:

  - base and prj are different. this is clumsy.
  - there are 2 namespaces: NS and the prj state namespace.
  - prj already behaves as true mutable state. is permanence necessary?
  - atomic failures

preliminary conclusion is: scat code is important as intermediate
layer between scheme and forth, both for interactive and compile time
use. the compile time part needs to be functional because it makes
computations easier: compilations should be really just functions. the
interactive part however is intrinsically stateful: ultimately it
manages the state of the target and the current view (debug UI).

the only place where current scat/state approach is useful is atomic
state updates. these however, can be replaced by purely functional
code and a transaction based approach: each command is a state
transaction and either fails or succeeds. compositions of transactions
should maintain that property. aha, holy grail identified:

   COMPOSABLE TRANSACTIONS


maybe i just need to start reading again. this is very related to COLA
(combined object lambda architecture) and the recent transactional
memory stuff in haskell.


Entry: transactions
Date: Mon Feb 11 09:35:00 CET 2008

the way it works now: every console command that updates the state
store in snot.ss is a transaction. if it fails, the previous state is
maintained. something like that can be implemented differently.
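the shape of it, sketched in python (illustrative, not the snot.ss
code): a command is a function from state to state, a transaction
commits only on success, and a composition of commands is again a
single all-or-nothing transaction.

```python
import copy

def transact(state, command):
    """Run `command` on a copy of `state`; commit only on success."""
    new = copy.deepcopy(state)
    try:
        command(new)
        return new                      # commit the updated copy
    except Exception:
        return state                    # roll back: keep old state

def compose(*commands):
    """A sequence of commands as one all-or-nothing transaction."""
    def composed(state):
        for c in commands:
            c(state)
    return composed
```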

what i'd like to avoid is to have to copy NS in the current
implementation. a possibility is to transparently replace part of the
NS tree with an association list. then parameters can be used to make
a copy.

it looks like the let* / letrec problem wants to propagate deep into
the structure of the entire program.. why is that?

maybe i should start using a persistent object model for the store?

ok.. this is shaking up the roadmap again. TODO:

  - fix the problems with macro unification
  - implement reverse macro lookup properly
  - think about making evaluation time concrete (entry://20071217-100727)
  - work towards a cleaner state representation


about haskell and monads: looking at state management, monads somehow
solve the bookkeeping of 'current' data. this can take many forms, but
two crystallized constructs are: global and dynamic environments,
which in scheme would solve most problems involving the passing of
data outside of function arguments. thanks to the type system in
haskell, the red tape can be hidden, and all is implemented using just
functions. 

EDIT: being able to use state restore on failure on the command line
level is really nice. this should not be given up. however, once the
target is being modified, errors can't be fully recovered.



Entry: variables
Date: Mon Feb 11 10:17:08 CET 2008

running into trouble with recursive variable expansion. the problem is
that a variable is this:


 #`((extension: name () 'name)  ;; macro quotes name 
    'name #,n buffer)   

which uses:

 (([qw name] [qw size] buffer)   ([variable name] [allot 'data size]))

and this in the assembler:

  (define (variable symbol value)
    ;; FIXME: no phase error logging?
    (dict-shadow-data (dict) symbol value))

so eventually, the name will get shadowed. the problem now seems to be
that there's some recursive lookup that messes things up?

lets try a test case.

   variable broem
   broem  \ <- infinite loop

ok.. conceptual error or just small bug?
just small bug: forgot parens around 'name in (extension: name () ('name)
which gave (quote name) -> recursive call


Entry: intermezzo -> snot + interrupt
Date: Mon Feb 11 10:22:48 CET 2008

this is getting on my nerves. it's been fixed a while ago in mzscheme
cvs, but maybe i should just go for 3.99 atm? see if it breaks
things..

went pretty well. had to replace some reverse! by reverse, and use
mutable pairs in the decoder.ss

another thing that changed is manual expansion of user paths
(tilde). this is a bit more problematic.

another thing that gets on my nerves is the absence of stack
traces.. what am i supposed to do with this:

  ERROR:
  car: expects argument of type <pair>; given {#f . #<procedure>}

ok.. it is pretty deep: the srfi-45-promise uses mutable pairs.
fixed + fixed the plt sandbox code and sent mail to plt-scheme list
fixed break stuff in brood + snot.
breaks work now.



Entry: more fixes
Date: Mon Feb 11 16:14:53 CET 2008

the 'empty' needs to be fixed. something wrong there.  doing reverse
asm would be an interesting next step + moving some code to hex
printing.


Entry: moving more code to scheme in tethered.ss 
Date: Tue Feb 12 13:57:00 CET 2008

  * mzscheme with modules is quite a nice namespace management tool to
    write nontrivial programs. the big flat namespace with
    late-binding plugin behaviour in brood is a bit messy. maybe i do
    need the extra bit of mz handholding, and move plugins to
    parameterized code?

  * i really miss closures when writing cat code. names and nested
    scopes are important, and trading in a bit of conciseness for
    names (and absence of stack juggling!) is a good idea. with
    closures and macros, scheme is malleable enough to reduce red tape
    where necessary. my personal preference is shifting: cat is not a
    good implementation language compared to scheme.

  * the cat intermediate language is interesting to simulate
    interactive forth: translation is really straightforward. gluing
    scheme and forth together, this layer serves well: adding scheme
    functionality to cat is straightforward + translating forth to cat
    is too.

this leaves me with the following problem to fix: ts-stack is a word
that is used to plug in the target stack bottom + pointer location. do
i keep it like that?

it looks like these things are best solved using parameters: that way
the scheme code will work too. maybe i should make a list:

  * connection (lazy-connect.ss)

candidates:

  * stack location
  * flash program/erase size
  


Entry: porting to mz v4
Date: Fri Feb 15 10:27:21 CET 2008

yeah, reading docs can bring clarity ;)

  doc/release-notes/mzscheme/MzScheme_4.txt

i got a bit confused about the whole scheme and scheme/base thing
while reading some web server docs. the biggest change seems to be the
use of optional and keyword arguments in lambda expressions.

do i make a full port? probably best to not keep too much legacy in
the brood core.. i need the upgrade for sandbox.ss fixes, so maybe
it's time to jump to 4 completely. as expressed in the release notes,
the keyword arguments can be problematic for legacy code..



Entry: big changes
Date: Fri Feb 15 10:52:12 CET 2008

OK..

i think i know what i need to do, but it's a big job: i need to get
rid of the NS namespace, and split the code into:

   * purely functional
   * parameterized

the line between the two isn't clear-cut. parameters are things that
are "mostly constant", e.g. communication ports, file paths, ... to me
it looks like this is the most important line of namespace management
in scheme code. (in haskell, the problem of code parameterization as
automatic threading of data is solved using monads)

the problem with parameters is that they break referential
transparency, which is a great property for testing.. i think in most
cases, a transparent function can be wrapped in a parameterized
one. i just need some moderation here: every use of a parameter, deep
in the code (like 'here' in the assembler) makes things more specific,
but might be the right thing to do.
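a minimal sketch of that wrapping idea (all names here are made up
for illustration, nothing from the actual brood code):

```scheme
;; a transparent core function: all inputs are explicit arguments,
;; so it's trivial to test in isolation.
(define (upload/port code port)
  (printf "uploading ~a via ~a\n" code port))

;; a "mostly constant" input becomes a parameter...
(define current-port (make-parameter "/dev/ttyUSB0"))

;; ...and a thin parameterized wrapper restores the convenience.
(define (upload code)
  (upload/port code (current-port)))
```

tests call upload/port directly; interactive code sets current-port
once (or parameterizes it) and just calls upload.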

so, basically, code can be dynamically layered: the assembler, for
example, doesn't USE the target dictionary as a parameter, but gets it
passed as an argument by the interaction system (which, for example,
does have it as a parameter). in contrast, the assembler, internally,
might use the dictionary as a parameter, but the code outside of the
assembler doesn't need to know that.

getting rid of NS namespace, and moving to module name management
instead means:

  * more code is static (tree dependencies)
  * plugin behaviour (graph dependencies) needs to be solved explicitly
  * simpler: map everything straight to scheme compilation, with names:
   - lexical
   - module-local (with prefix to separate from scheme)
   - toplevel (might be used for plugin behaviour / units?)

that looks significant enough to call it brood-5


Entry: eliminating the state dialect
Date: Fri Feb 15 11:20:19 CET 2008

anything that can be done on brood-4 before making the jump to
abolishing NS? yes: moving to parameterized project data while keeping
the transaction-like workflow intact + solve transaction thing with
target memory maps.

ROADMAP:

  - move more compiler code to scheme.

  - eliminate the prj <- state implementation, but make sure
    transaction behaviour is maintained (association lists or
     hash tables?)

  - move assembler and parser to separate dictionaries (or keep them
    in NS till later?)  

  - move CAT code to module based namespace.



Entry: plt scheme study
Date: Tue Feb 19 16:44:25 CET 2008

maybe it's best to take a closer look at the plt scheme language now
that V4 is coming out. some things i'd like to know more about are:

  * mixin class system
  * delimited control

mentioned on http://en.wikipedia.org/wiki/Plt_scheme

in addition, it would be nice to get more of the drscheme
functionality in snot, such as proper stack traces, module browser,
syntax-level refactoring.

i'll take http://zwizwa.be/darcs/sweb as the case study for
this. brood's a bit too hairy atm.

trying to make sense of:
http://www.cs.utah.edu/plt/delim-cont/

it looks like understanding this will bring me closer to understanding
the problem in brood with "undo" at the console, and the transaction
based model i'm chasing after. yeah, vague..

reading the paper. chapter 2: the operators: shift, control, reset.
hmm.. i'm missing a lot of muscle to read that one..

ltu to the rescue:
http://lambda-the-ultimate.org/node/606
http://lambda-the-ultimate.org/node/297

  "Good stuff! But keep in mind that, as the cartoon in the slide
   says, control operators can make your head hurt..."

no shit..

to summarize vaguely what the 2 points are about:

  - delimited control: partial continuations: don't jump outside of
    context.

  - mixins: somewhat related to generic functions.

about the delimited continuations, it might be best to read the plt
doc on "prompt" and some related things on continuation marks and
stack traces. for mixins, i'm reading this:

http://www.cs.utah.edu/plt/publications/aplas06-fff.pdf

from a quick skim i don't see how it's related to generic functions
though.. mixins seem interesting, but i don't see the difference from
multiple inheritance. maybe the inheritance is linear instead of
tree-structured?



Entry: expression problem
Date: Thu Feb 21 23:23:46 CET 2008

http://groups.google.com/group/plt-scheme/browse_thread/thread/3aaacdc5169e5889

Mark's reply was pretty clear, and this:

  The PLT folks have used the expression problem as a springboard for
  thinking about big issues like, what does it mean to be a software
  component, and what are appropriate ways for reusing and extending a
  software component.

The answer is then modules/units/classes/mixins..

Swindle might be indeed a good thing to have a look at next. The whole
deal of multiple dispatch, so central to PF, is in the end something i
need to understand better.

about multimethods: Cecil is mentioned here:
http://tunes.org/~eihrul/ecoop.pdf

http://citeseer.ist.psu.edu/219067.html
compression of dispatch tables?
(about PF: there's probably a way out using small number of types or
compile time type inference..)


I'm reading ``Modular Object-Oriented Programming with Units and
Mixins'' now.

The slogans make a lot of sense:

  * UNITS: Separate a module's linking specification from its
    encapsulated definitions.

  * MIXINS: Separate a class's superclass definition from its
    extending definitions.

Maybe i should give it a try?




Entry: units
Date: Fri Feb 22 01:06:40 CET 2008

looks like units + modules are going to be enough to organize brood
without the need for an NS hash table. how to exactly chop it up is
still a bit of a mystery. maybe start with the plain CAT code, then
organize the macros in a similar way, then find a way to translate
forth code straight to s-expressions.

what if i start with separating out the assembler as a unit? in the
end i'd like to be able to use externally provided assemblers / C
compilers.

in doing so, abstracting the data types that are passed between
assembler and linker might be necessary. these are assembly opcodes,
dictionary and compiled target words + linker data.


Entry: call by need
Date: Fri Feb 22 12:18:56 CET 2008

was trying to quickly hack up a solution in scheme that emulates
makefiles, and realized it's actually call-by-need, which is again
the same as the dataflow serialization problem (pd), which can be
extended to early reuse by transforming it into a linear language
(i.e. forth).
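a minimal sketch of the makefile-as-call-by-need idea using mzscheme's
built-in promises (the target names are made up):

```scheme
;; each "target" is a promise: built at most once, on demand,
;; pulling in its dependencies by forcing them.
(define target-a
  (delay (begin (printf "building a\n") 'a)))

(define target-b
  (delay (begin (printf "building b\n")
                (list 'b (force target-a)))))

;; (force target-b) builds a, then b; forcing either again
;; rebuilds nothing, just returns the cached value.
```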


Entry: delimited continuations
Date: Tue Feb 26 13:26:36 CET 2008

best to start here:
http://pre.plt-scheme.org/docs/html/reference/Evaluation_Model.html#(part~20prompt-model)

i think i sort of get it.. the analogy of stack frames, but more
general since they can be tree-structured (just like
environments). all the operations on continuations are then
compositions of these trees, with restrictions on how far back in the
tree continuations can be captured, and rules on composition that
make sense in light of these restrictions.
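a small sketch with the scheme/control operators (assuming the module
behaves as its docs describe):

```scheme
(require scheme/control)  ; prompt, control, abort

;; abort discards the computation up to the nearest prompt,
;; but nothing outside it:
(+ 100 (prompt (+ 1 (abort 5))))        ; => 105

;; control captures that same delimited slice as a plain function,
;; here k = (lambda (v) (+ 1 v)), composable as often as wanted:
(prompt (+ 1 (control k (k (k 10)))))   ; => 12
```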


Accessing a tree as if it were a stream and ``updating'' in-place
without mutation..

http://lambda-the-ultimate.org/node/969


Entry: errortrace
Date: Thu Feb 28 11:50:17 CET 2008

http://pre.plt-scheme.org/docs/html/errortrace/installing-errortrace.html

this works when used like this:

Welcome to MzScheme v3.99.0.13 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
> (require errortrace)
> (enter! (file "/tmp/test.ss"))
 [loading /tmp/test.ss]
 [loading /usr/local/mz-3.99.0.13/collects/scheme/base/lang/compiled/reader_ss.zo]
 [loading /usr/local/mz-3.99.0.13/collects/syntax/compiled/module-reader_ss.zo]
> (a)
error: bla
/tmp/test.ss:8:12: (error (quote bla))
/tmp/test.ss:6:12: (b)


 === context ===
/tmp/test.ss:7:0: b
/tmp/test.ss:6:0: a
/usr/local/mz-3.99.0.13/collects/scheme/private/misc.ss:63:7

the file /tmp/test.ss is:

#lang scheme/base
(provide a)
(define (x) #f)
(define (a) (b) (x))
(define (b) (c) (x))
(define (c) (error 'bla))


now, to incorporate it in snot, it looks like there's a combination
needed with prompts. indeed.. the error printing works fine when
wrapped in 'prompt', and execution continues thereafter.

http://pre.plt-scheme.org/docs/html/reference/cont.html#(mod-path~20scheme~2fcontrol)

first thing to note: 'prompt' and 'abort': i can add those in sweb
instead of the current combination of parameters and call/ec.

second: prompt is readily applied in the repl in brood, at run/error
in host/base.ss

it works for host/purrr.ss by replacing the toplevel error printer by
a prompt. probably can do the same in snot.

hmm.. it's not in snot that the prompt should be. i did add some
marking to the code that prints 'language-rep-error' in case the
underlying rep (provided by the program!) doesn't print the error
itself. so in brood the error should be printed, and preferably INSIDE
the box context.

"console.ss" is loaded in the snot context from "snot.ss". the latter
file registers the different languages using the 'register-language'
snot function present in snot's toplevel. ("snot.ss" is not 'require'd
but 'load'ed)

what i'm interested in is frames that run up to the sandboxed
evaluator, so maybe it should be implemented in snot/box.ss ? see snot
ramblings for more..


Entry: continuation marks
Date: Thu Feb 28 17:20:17 CET 2008

http://www.cs.utah.edu/plt/publications/icfp07-fyff.pdf

currently continuation marks are used to make some kind of scat
language trace through the code. basically, i can put anything there i
want. it's reassuring that the basic mechanism is available. (also,
this idea is very related to some dynamic variable hack i tried in
PF.. don't remember if it's still there..)

something strange that i didn't know about exceptions: apparently the
handler is executed in the context of the 'raise' call! that explains
a lot. no.. this is not the case:

(define param (make-parameter 123))
(with-handlers
    (((lambda (ex) #t)
      (lambda (ex) (printf "E: param = ~s\n" (param)))))
  (parameterize
      ((param 456))
    (begin
      (printf "B: param = ~s\n" (param))
      (raise 'boo))))

gives:
B: param = 456
E: param = 123

ok: i'm confusing the lowlevel 'handle' with the highlevel 'catch'.
the paper mentions how to implement 'catch' on top of 'abort', but
also talks about interference of prompts, and the use of tagged
prompts to work around that.

so the bottom line: exceptions and prompts do not collide, because the
prompt tag used to implement exceptions is not accessible. this does
mean that an exception can jump past any arbitrary prompt.

question: how does this work in sandbox? apparently sandbox re-raises
exceptions: see the internal function 'user-eval' in 'make-evaluator*'
in scheme/sandbox.ss : the value that comes from the channel is raised
if it's an exception.

something i still don't understand about mixing prompts and
exceptions: if i don't wrap a prompt around the evaluation in
host/purrr.ss exceptions will terminate the program, so the prompt
seems to terminate propagation and trigger the printing of the
error. however, doing this down the chain in snot doesn't work like
that..

a prompt with the default tag wraps the toplevel, so the whole
continuation is also a partial continuation (up to that prompt).

hmm.. then i read this:

  "The default prompt tag is also part of the built-in protocol for
   exception handling, in that the default exception handler aborts to
   the default tag after printing an error message."

note this says 'default exception handler'. so if there's one above
the prompt, that one will be called instead of the default handler.


Entry: roadmap
Date: Thu Feb 28 14:45:25 CET 2008

adjusted roadmap:

  * get base language working without NS + put in separate module.
  * figure out how to use units for plugin behaviour

then follow up with entry://20080215-112019

it looks like understanding the namespace issue by first moving the
core component to a more native namespace management system is a key
element. the rest should then be mere disentanglement.


TODO:
  separate SCAT as a different project
  separate it from NS



Entry: eval vs. require
Date: Sat Mar  1 19:56:18 CET 2008

the key insight (finally) seems to be that the current 'eval' based
approach needs to be replaced by 'require', or an underlying mechanism
that allows module based namespace management. everything that now
goes through the NS hash can be done with module namespaces.




Entry: module namespaces
Date: Mon Mar  3 00:27:20 CET 2008

everything reduces to scheme code in modules, which makes things
easier to extend. (also for parsers?)

(define increment
  (lambda s
    (apply base.+
           (cons 1 s))))


the idea is that 'increment' can be imported as 'base.increment', or
anything else, using prefix imports. there's no need to specify the
target namespace unless there are clashes between scheme and the
functions defined in the module, which can be avoided by not importing
scheme bindings, and separating definition of base. primitives (which
has scheme available) from definition of composites. composite modules
then only contain definitions which map some namespace ->
(un)prefixed.
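the prefix idea, sketched with the v3.99-style prefix-in require spec
(the module path "base.ss" is hypothetical here):

```scheme
#lang scheme/base
;; import every export of the base module under a "base." prefix,
;; so nothing collides with plain scheme bindings.
(require (prefix-in base. "base.ss"))

;; a composite definition built only from prefixed base words:
(define increment
  (lambda s (apply base.+ (cons 1 s))))
```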

to this end, a similar approach can be used as with the 'find' plugin in
rpn syntax currently used for NS linking. the 3 elements: syntax,
source ns and dest ns can be specified like before. (just make a
namespace translator?)

problem solved? probably only units left: plugin behaviour needs to be
handled explicitly.


Entry: language tower
Date: Mon Mar  3 00:38:37 CET 2008

scheme
base    snarfed scheme functional rpn
state   macro primitives
macro   forth machine wrappers
forth

why so many?

they all solve a single problem in a very straightforward way. base
snarfs functionality from scheme, state is a lifted base + threaded
state, and macro implements the greedy machine map + peephole
optimizer using a threaded state model.

misc hacks from plane notes:

- auto snarf through contracts
- use #lang scat/base for base->base maps (is a purely declarative
  language possible?)
- decouple module as unit to speed up compilation during incremental
  dev. (fake image based dev)
- get rid of @ stx for streams (scribble) / find standard streams lib
  / use lazy scheme. (brood is pure FP so why not)
- use parameters for compiler object (also for NS stuff?)



Entry: parameterized transformer
Date: Wed Mar  5 14:30:34 CET 2008

instead of using a compilation object, it might be more convenient to
use parameters in the transformer environment to define functionality
for the basic syntax operations.

maybe best to write the rpn code from scratch in scat/rpn/



Entry: scat ready
Date: Thu Mar 20 08:38:42 EDT 2008

looks like the lowest layer of rpn code + namespace management is
done. made a nice extension that allows parsers to be written as
syntax transformers (like it should!).

until the representation part is finished and ready to be ported to
brood, the process is documented in the dev log at
http://zwizwa.be/ramblings/scat






Entry: BROOD-5: initial move from BROOD-4
Date: Fri Feb 29 12:39:38 CET 2008

This ramblings file is a merge between BROOD-4 and BROOD-5. The new
version is codenamed SCAT, and is a complete rewrite of the core
representation and name space handling code. The darcs archive has
been flushed as has happened before. The old histories are still
available at:

http://zwizwa.be/darcs/brood-4
http://zwizwa.be/darcs/brood-2
http://zwizwa.be/darcs/brood-1

(brood-3 didn't have a history flush, and is present in brood-4)


Entry: utilities = language ?
Date: Fri Feb 29 13:50:02 CET 2008

Splitting brood into 3 components: brood, scat and zwizwa brings up the
problem of code bundling. there are 2 views of modules:

  - what they provide.

    this is the most important form of organization. there's a
    spectrum with 2 extremes: one object, and everything. the latter
    is a utility module, which is akin to a language. the former is a
    component module: an abstracted collection of code with a very
    limited interface.

  - how they are used.

    using component modules is straightforward: since they are often
    highly specialized, dependencies between components can be clean
    and understandable. using utility modules is not: granularity is
    much finer, and they behave more like "background noise": stuff
    you need to know about, but can assume to just "be there".


therefore, when using utilities in a project, like scat, it's maybe
best to take a single file and make sure it exports a non-colliding
set of tools. so the purpose of that single file is to be a decoupling
point, providing a language to the client, and importing small
utilities and components from all kind of different sources.

  so the approach i take is to have one collection of utilities
  (zwizwa-plt), and have a single file in each project that uses a
  base language with (a subset of) these utilities present.

an organic analogy:

     GRASS = all permeating language (base lang + utilities)
     TREES = specialized program components


Entry: scat without ns
Date: Fri Feb 29 14:26:39 CET 2008

how to proceed? this needs abstraction of definitions in 'composite'
macro and abstraction of 'find' in code bodies. the latter is already
worked out.

TODO: make base.ss independent of ns.ss

might be a good opportunity to start documenting. maybe try out
scribble?

scribble is quite nice.

disentangling ns is not going to be simple though. there's a problem
that i didn't think about: BROOD is fraught with occurrences of
defining one language in terms of another one (e.g. primitive
macros). will this still work? i do need different namespaces. should
they also be just prefixed? => this is a core problem and needs a
proper interface!

also.. why not use real objects for the rpn-tx.ss plugin behaviour?
maybe it is overkill.


Entry: for-template and scheme/base
Date: Wed Mar  5 16:44:26 CET 2008

setting: 2 modules
  test.ss   (require (for-syntax "rep.ss"))
  rpn-tx.ss (require (for-template mzscheme))

when test.ss is #lang mzscheme, or #lang scheme, this works. however,
for #lang scheme/base i get an error:

/home/tom/scat/scat/rpn/test.ss:8:2: compile: bad syntax; function application is not allowed, because no #%app syntax transformer is bound in: (lambda (stx) (syntax-case stx () ((_ . code) ((rpn-represent) (syntax code)))))

after adding a 'for-syntax mzscheme' or 'for-syntax scheme/base' in
test.ss it works.

what works is to add (for-template scheme/base) in rep-tx.ss and
(for-syntax scheme/base) in test.ss

looks like there's different phase 1 bindings for mzscheme and
scheme/base.. or something.. i don't really understand.

i try to explain:

tom@del:~/phase-test$ cat tx.ss 
#lang scheme/base
(provide gen-code)
(require (for-template scheme/base))
(define (gen-code) #'(+ 1 2))

tom@del:~/phase-test$ cat use.ss 
#lang scheme/base
(require (for-syntax "tx.ss" scheme/base))
(define-syntax gen
  (lambda (stx) (gen-code)))

why are both requires of scheme/base needed?

EDIT: take a look at these expands:

box> (expand-syntax #'(module broem scheme/base (define foo 123)))
(module broem scheme/base
  (#%module-begin (define-values (foo) '123) (#%app void)))
box> (expand-syntax #'(module broem mzscheme (define foo 123)))
(module broem mzscheme
  (#%plain-module-begin
   (#%require (for-syntax scheme/mzscheme))
   (define-values (foo) '123)))

mzscheme has mzscheme included in phase +1, while scheme/base does
not. (what's the difference between #%plain-module-begin and
#%module-begin?)



Entry: snarfing
Date: Wed Mar  5 18:27:53 CET 2008

got the syntax part working. next = snarfing from scheme to scat
names. 

got compositions working too.

do need to take a good look at repl toplevel vs. module-local
names. how to distinguish between undefined names in module context,
and use of a toplevel name in repl context?

EDIT: actually, it's not necessary. it might be easier to just map to
a name when it's not lexical if the distinction between module-local
and other isn't necessary, and let the scheme name resolver take care
of it.


Entry: namespaces
Date: Thu Mar  6 13:22:21 CET 2008

it's a lot simpler now. The dispatch routine interprets two kinds of
identifiers: 

   lexical -> used as is
   other   -> prefixed

if a finer grained control is necessary, the rpn-global (and possibly
rpn-lexical) parameters can be overridden.

  NOTE: it indeed makes a lot of sense to do this with parameters: it
  falls into the "printing" design pattern: code that transforms a
  description into a "document" according to a set of global
  parameters.
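a toy instance of that "printing" pattern (the parameter name and
word printer are made up for illustration):

```scheme
;; a global "document setting": the radix used when printing numbers.
(define current-radix (make-parameter 10))

;; deep printing code consults the parameter instead of taking
;; an extra argument everywhere.
(define (print-word n)
  (printf "~a\n" (number->string n (current-radix))))

;; a hex dump just parameterizes around the same printing code:
(parameterize ((current-radix 16))
  (print-word 255))   ; prints "ff"
```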



Entry: language names
Date: Thu Mar  6 16:02:48 CET 2008

is that still necessary? the only reason it's there is to interpret
the source code, but if that interpretation isn't always possible, why
keep it? it's also pretty awkward to fill the parameters everywhere..

with the risk of breaking things, i'm going to take out the language
names. probably best to have the source code field represent an
expression that evals to the object, and let the debug code that uses
the source field interpret the whole expression.


Entry: state syntax
Date: Thu Mar  6 18:25:20 CET 2008

scat/state doesn't do anything other than switch namespaces for
quoted programs. symbols need to be imported explicitly. so the
question is: how to import a bulk at once?

EDIT: forgot immediates! ok.. seems to work now.

maybe this is it:
(namespace-mapped-symbols
 (module->namespace 
  '(file "/home/tom/scat/scat/base/base.ss")))

not so difficult: see module->names in ns.ss

now the question is, how to map this to a 'require' form?

this doesn't seem to work, i get some strange errors that might be due
to the fact that i'm using dynamic-require into the current
namespace. what really needs to be done is just determine the exports
of a module: the module then needs to be compiled, but not
instantiated.

maybe 'module-compiled-exports' is a better way to get to the exports?

from scribble/search.ss:
        
 (module-compiled-exports
   (get-module-code (resolved-module-path-name rmp)))])

ok.. from syntax/modcode i can use get-module-code to do this:

(define (get-exports path)
  (map car
       (car
        (call-with-values
            (lambda ()
              (module-compiled-exports
               (get-module-code
                (resolved-module-path-name
                 (make-resolved-module-path path)))))
          list))))

which works on:
(get-exports (string->path "/home/tom/scat/scat/base/ns-tx.ss"))

now i just need a way to make this work on ordinary require specs.
+ solve some problem in ns.ss (%app)

looks like i've got something that works from toplevel.. now need to
get it going in modules.

almost.. the rest is for later.


Entry: modules
Date: Sat Mar  8 17:12:01 CET 2008

got stuck yesterday at translating require specs -> module file
location. going to leave it, and try to get the symbol snarfing
working.

box> (define-lifted (state)
             (base)
             state-lift "base/base.ss")

compile: bad syntax; reference to top-level identifier is not allowed,
because no #%top syntax transformer is bound in: module

i have no idea what's going on here..

what this means is that:
 * the compiler maps undefined symbols -> toplevel refs
 * there's an undefined symbol mapped to toplevel refs
 * there's no toplevel


Entry: start from scratch
Date: Sun Mar  9 14:43:10 CET 2008

probably need to read a bit about modules, namespaces and
compilation. for example, what does this code actually do?

(module test mzscheme
  (provide foo boo)
  (define-syntax (boo stx)
    (syntax-case stx ()
      ((_ . args)
       (begin
         (printf "compiling\n")
         #`(+ 1 2)))))
  (define foo (boo)))
    

when evaluated, it declares a module, compiling its code.

before i can understand things, i need to see the relation between:

  - compilation handlers
  - load/use-compiled
  - get-module-code
  - namespace
  - namespace's module registry (namespace-attach-module)
  - code inspectors

the get-module-code approach works, but somehow is context-dependent
(current namespace / module registry?). it would be interesting to
find a method independent of context.


so...

a namespace is something that maps names -> things, to be used by
'eval'. it's a generalization of the standard scheme toplevel.

each namespace has a module registry. modules declared in a namespace
will attach to this registry, to be referenced by an identifier.


Entry: getting at the names..
Date: Mon Mar 10 23:06:32 CET 2008

got something that works:

;; Get to the exported names by requiring the module into an empty
;; namespace which has the base module attached to its registry.
(define (get-names path)
  (let ((n (make-base-empty-namespace)))
    (parameterize
        ((current-namespace n))
      (namespace-require/expansion-time path))
    (namespace-mapped-symbols n)))

still stuck at some #%app problem further down the line, but the names
come out.

wait.. what a mess! the previous one did work, and the current one
doesn't (needs absolute paths).. get-module-code is ok.


this seems to be problematic:

(define (define-ns-tx stx)      
  (syntax-case stx ()
    ((_ ns name val)
     (let ((mapped (ns-prefixed #'ns #'name)))
       #`(define #,mapped val)))))

the name created here 'mapped' is not recognized as a module-local one.



----------- broem.ss
#lang scheme/base
(require 
 (for-syntax
  scheme/base
  "broem-tx.ss"))
(provide foo)
(define-syntax (foo stx)
  (foo-tx stx))

----------- broem-tx.ss
#lang scheme/base
(provide foo-tx)
(define (foo-tx stx)
  #`(+ 1 2))



this gives
box> (require "broem.ss")
box> (foo)
broem-tx.ss:4:4: compile: bad syntax; function application is not allowed, because no #%app syntax transformer is bound in: (+ 1 2)

 === context ===
/usr/local/plt-3.99.0.12/collects/scheme/sandbox.ss:445:4: loop


the thing that's missing here is the (require (for-template
scheme/base)) in broem-tx.ss


OK..
so it seems that now the remaining problems are because some names are
expanded to toplevel form because they are not visible somehow?



------- lala.ss

#lang scheme/base
(require
 (for-syntax scheme/base))

(define foo 123)

(define-syntax (broem stx)
  (printf "compiling broem\n")
  #`(define lala #,'foo))

(broem)

(provide foo broem lala)



that doesn't create the lala symbol.. why?


Entry: more confusion
Date: Tue Mar 11 12:27:28 CET 2008

The module-path as passed to require is resolved as:

 ((current-module-name-resolver) '(file "asdf") #f #f #f)

actually, the docs explain it quite well: the name resolver also loads
the module and places the name into the current registry when 4th arg
is #t.. so we have the connection:


current-module-name-resolver -> current-load/use-compiled

EDIT: also look at 'expand-import' from scheme/require-transform 12.4.1


Entry: the problem
Date: Wed Mar 12 09:16:25 EDT 2008

so.. what about making a test case for the actual problems?

  module a: exports a couple of names
  module b: reads module a's source, extracts names, and uses them

  why can't you export a name created by a macro? i think i did this
  before, but the experiment above doesn't work.. what's up?


Entry: again
Date: Thu Mar 13 09:07:52 EDT 2008

why is this so confusing? what i try to do is symbol capture. or not?
2nd question first. let's look at how structures are implemented: they
create new names. maybe 'syntax-local-introduce' is necessary?

i clearly have to stop tackling this in a half-assed way. what's going
on here seems to be me being confused about the actual inner workings
of the module system, syntactic forms and namespaces. it's at the very
core of the language, nothing to try to random-walk around.. time for
some discipline.

revised questions:
  * how to export a name generated by a macro?
  * is my approach (snarfing names from modules) 'the right one'?

roadmap: read about core syntax and hygiene.
2.3 Expansion (Parsing) -> very useful ;)

the #%app error usually means that an identifier is not syntax, but a
function application. if phase 1 doesn't have a language instantiated
(which defines #%app) this is an error. the #%top error probably means
that an identifier is not defined. in modules, the #%top identifier is
not defined?

3.3 #%top can be used to override lexical bindings -> "Such references
are disallowed anywhere within a module form"

evaluating undefined abc as (#%top . abc) in toplevel gives:
  reference to undefined identifier: abc

in module level:
  reference to an identifier before its definition: abc in module: "/tmp/test.ss"


Entry: defining names from macro
Date: Thu Mar 13 12:27:21 EDT 2008

#lang scheme/base
(require
 (for-syntax scheme/base))

(define-syntax (foo stx)
  #`(define bar 123))
  
(define-syntax (broem stx)
  (syntax-case stx ()
    ((_ name)
     (printf "expanding (broem)\n")
     #`(define name 123))))

(broem lalala)
(foo)

;;---

this defines 'lalala' as a symbol, but 'bar' is not accessible. looks
like a hygiene issue where hygiene has to be broken explicitly?

a better approach is maybe to expand to a 'module' form, so all
symbols are introduced in the same place, and no capturing is
necessary?

something like:

(module state-snarfs "snarfer-lang.ss"
   (snarf-from "../base/base.ss" (state) (base) lift-state))

where 'snarf-from' is a macro provided by snarfer-lang.ss which
expands to a #%plain-module-begin form.

rationale: what is a module? it is a finite map (name -> stx/value).
in that sense, a snarf is maybe indeed better exposed as a module
transformer, mapping modules -> module instead of modules -> expressions.

again: the alternatives are:
  * variable capture for define forms (non-hygienic)
  * whole module expression generation


Entry: datum->syntax
Date: Thu Mar 13 14:59:33 EDT 2008

this worked:

(define-syntax (foo stx)
  (datum->syntax stx '(define bar 123)))

it defines 'bar'. replacing stx with #f doesn't work (gives the 'no
#%app' error). the same thing with #' doesn't work either.. apparently,
the latter creates some lexical bindings..

so, the proper way to break hygiene is using 'datum->syntax'.

the thing to look at is build-struct-names from syntax/struct


this seems to work for the current state.ss

update:
        using (->syntax stx <symbol>) with stx the stx object that's
        passed to define-lifted-tx, it seems to work fine. looks like
        the problem was (->syntax #f <symbol>)

in short: the stx object seems to have knowledge of the module's
namespace, and can tell whether names are module-local?

Entry: module path
Date: Thu Mar 13 17:41:54 EDT 2008


more confusion: 

  resolve-module-path could maybe be used in combination
  with (syntax-source-module) to resolve relative references?


* have a look at module index :

EDIT:

----- test.ss:

#lang scheme/base
(require
 (for-syntax
  scheme/base))

(define-syntax (foo stx)
  (printf "~a\n"
          (module-path-index-resolve
           (syntax-source-module stx)))
  (datum->syntax stx ''bla))

(foo)

gives:

image> (require (file "/tmp/test.ss"))
'test
bla

so it returns a symbol instead of a path..


EDIT:

incredible.. this also works, but only in the dynamic extent of syntax
transformers (why?) :

(define (module-mapped-symbols module-path)
  (let-values
      (((base-phase template-phase label-phase)
        (syntax-local-module-exports module-path)))
    base-phase))



Entry: what's in a stx object?
Date: Fri Mar 14 06:50:58 EDT 2008
  
looking at 2.3 expansion, it seems that some information at least is
context-dependent. the question is: is this context stored in the stx
object, or set by parameters? (looks like that implementation is
internal : accessible by 'syntax-local-context')

another: how is the lexical info stored?


Entry: carefully tuned api for the compiler
Date: Fri Mar 14 07:33:18 EDT 2008

all-in-all, it's quite nice, once the basic design elements are
understood and all over-simplifying assumptions are cleared out by
actually reading the docs (!). there is a huge emphasis on macros and
modules (static), instead of "building new interpreters" as in SICP.


Entry: on with the real work
Date: Sat Mar 15 07:38:50 EDT 2008

though this made me think a bit.. this snarfing business, is it really
necessary? or will it work at all? currently it uses some form of
delegation to implement functionality: if it's not in (state), take it
from (base). what is necessary is to delay this 'fill up' to the last
stage, after the declaration of the (state) functions.


Entry: the design choices
Date: Sat Mar 15 09:09:15 EDT 2008

* use syntax objects for everything that represents code. this gives
  the best match to PLT scheme: it allows using most of the
  underlying machinery in a way that integrates well.

    - representation: lambda + redefinable language structure
    - namespaces: modules and units
    - compile/interpret: use the scheme expander instead

  in retrospect, this took a long time for me to understand, but it
  looks like i'm finally getting it. identifier scope management
  (lexical, dynamic, module, unit) is hard. if somebody does it for
  you, then use it!

* syntax streams: are these really necessary? they complicate things
  due to having to deal with both lists and streams. maybe, when
  everything is syntax and usable as scheme macros, streams are no
  longer necessary?


it looks like the more important choice is going to be to have a
complete map from text -> binary code / assembly code that lives in
the syntax domain.


Entry: quoted programs
Date: Sat Mar 15 10:01:51 EDT 2008

in a code quotation, it should be possible to override the
language. if the first identifier in a program is a syntactic form,
then use it to transform the expression, otherwise use default.

not so straightforward. actually, it is: using

 (define (transformer? id-stx)
   (syntax-local-value
    id-stx (lambda () #f)))

maybe the possibility to do this recursively would be nice? that way
all kinds of syntax extensions can be implemented, and forth could be
represented as-is.

this can probably be handled in dispatch?

almost. it's a partial fold, while currently represent is a full
(left) fold. fold up to encountering a transformer, then pass the
remainder of the expression to the transformer (who might call
represent again).
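
to make the idea concrete, a self-contained toy version of the partial
fold (a sketch: here transformers live in a plain alist and 'compile'
just conses, whereas the real lookup is the syntax-local-value trick
above; the hand-off protocol, rest + accumulator, is an assumption):

```scheme
;; sketch: fold word by word until a transformer shows up, then hand
;; the remainder of the code to it together with the fold state.
(define transformers
  (list (cons 'stop (lambda (rest acc) (list 'handed-off rest acc)))))

(define (transformer? w)
  (cond ((assq w transformers) => cdr)
        (else #f)))

(define (represent-fold code acc)
  (cond
    ((null? code) acc)
    ((transformer? (car code))
     => (lambda (tx) (tx (cdr code) acc)))   ; hand off the remainder
    (else
     (represent-fold (cdr code) (cons (car code) acc)))))
```

without a transformer in the code this is just the full left fold;
with one, the fold stops and the transformer takes over.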

first: cleanup some things. move the symbol mapping to a single
function.

second: having syntax in the middle of an expansion isn't really
sound, right? what would (a _b c) do, with _b syntax?


(represent (a _b c)) ->
(lambda s (represent/step (a _b c) s)) ->

(represent/step (a _b c) s) ->

(lambda s 
  (represent/step
    (_b c)
    (dispatch a s)))

well.. it does make sense. if a is not syntax, the result is an expression

ok.. i got something working: string things together with
'rpn-compile' until a transformer is encountered, upon which the
rest of the code is handed off.

however: still can't use '(lambda <f> <b>)' because there's already a
lambda wrapper.


i wonder however: why not allow true parsing words? the loop over a
code body could be made more explicit: give words access to the syntax
objects. the traditional 'immediate' words could be used. no.. they
need to be macros: need to be available at expand time, and function
bindings aren't. so i'm on the right track.


Entry: lexical tricks
Date: Sun Mar 16 14:51:04 EDT 2008

maybe these lexical tricks are more of a nuisance than anything
else.. if lexical capture from scheme is required, why not just prefix
those names? what is lost here is an abstract way to access the
namespace, but what's gained is more clarity of mechanism +
readability of code. maybe that's worth more?

in that case, a simpler non-intrusive prefix might be desirable, or a
let-form that abstracts the prefixes.

ok: took out the automatic lexical tricks:

  * all identifiers in rpn code are now mapped in the compiler
  * unquote works in 
      - quoted () code   -> unquote value interpreted as a function
      - quoted '  data   -> unquote value placed in data structure           

looks better this way.


Entry: typed writer monad
Date: Sun Mar 16 15:58:10 EDT 2008

the state extension used in brood works well for macros, but is very
hard to use with the dictionary in prj. why is this?

the conflict is about what the meaning of quotations should be:

    if they're state functions, all code that takes programs should be
    redefined such that they pass on the state. this makes sense for
    state functions, but is infeasible unless it can be done
    automatically.

    passing state code to non-state code doesn't work.

so what if code were typed? the problem is that i'm trying to
implement some half-assed monad thing without having a type system to
make it convenient..

so... base can't run state code, but state code can be converted to
base code if it doesn't contain any real state actions. probably state
needs a 'run' that can distinguish between the two types.

can the functionality of run be somehow 'replaced' when 'inside' state
code? in order to answer, need to define what 'run' means.


Entry: run
Date: Sun Mar 16 16:38:27 EDT 2008

since 'run' is the ONLY PRIMITIVE word that accepts code, any code can
be made to run by overriding its behaviour. is this leading
anywhere?

if there are 2 languages (base and state), there should also be two
language types when code in these languages is represented as data AND
two STATE types.

once these things are in place, it should be clear what the behaviour
of run should be: dispatch on code and state type and do the right
thing:

\___ code      STACK      STATE
data\___

STACK          apply      error

STATE        apply/lift   apply


in order to not have to change everything, the base stack type can be
just a list. anything built on top though, needs to be typed.
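
the table above could be sketched as a dispatching 'run' (struct and
function names here are hypothetical, just to pin the idea down):

```scheme
;; sketch: dispatch on the type of the code and the type of the data.
;; 'state' wraps a stack plus hidden data; 'state-word' marks code in
;; the state language. plain procedures / lists are the STACK types.
(define-struct state (stack data))
(define-struct state-word (fn))

(define (run code data)
  (cond
    ((and (procedure? code) (list? data))    ; STACK code, STACK data
     (code data))
    ((and (state-word? code) (state? data))  ; STATE code, STATE data
     ((state-word-fn code) data))
    ((and (procedure? code) (state? data))   ; STACK code lifted to STATE
     (make-state (code (state-stack data))
                 (state-data data)))
    (else
     (error 'run "state code needs a state"))))
```

the apply/lift case is the interesting one: a pure stack function runs
on the embedded stack while the extra state is passed around it.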

let's start with the state type. OK. seems to work. this looks like a
generic 1-arg language (in scat-state.ss). looks like it's factored in
the right way now: didn't take any changes to core to change the rep
completely.

next: word structures. maybe take out source rep first: if it's
syntax, i can probably find source rep from source text instead of
storing it..

DONE.

the fact that SCAT words are procedures, is it a feature? or is a more
abstract interface required? now base and state have different reps,
maybe 'apply' (run) should be made abstract?

or.. each function object should carry a run method, which accepts
different states/stacks/arguments ?

i'm being pushed towards a statically typed language..



Entry: lifting applicators
Date: Mon Mar 17 10:45:46 EDT 2008

so.. splitting the stack/state objects themselves into different types
is straightforward. now, how should a language type be implemented? as
a run method? this is probably the most straightforward way: the
default could be apply, and anything smarter than the base language
simply overrides it.

this is confusing. if a word has an explicit 'apply' method, it's no
longer a procedure (which has an implicit, universal apply method).

dead end? yes.. a word structure is either a procedure, or not. why
should it be an abstract type? ->
  * to add annotation 
  * to distinguish from other procedures


let's reach for the bottle: can i solve this with dynamic binding?
no.. all the words that somehow are to operate on the state need to be
lifted. it's not that the functions deep inside some dynamic extent
need to be updated, it's that the state itself needs to be
accessible. it's probably possible to solve part of it with dynamic
binding, but that will not play nice with closures..

so.. to come back to a comment in brood: the problem is not that prj
state language is a bad abstraction, it's that the lifting of
operators from base -> state is not so straightforward : anything that
passes or duplicates the whole state needs to be handled.

eventually, this boils down to 'run' and anything built from it. maybe
the compilation should handle this?

the next step to the solution is: 'run' should be isolated: no-where
should there be 'apply': i could keep the base representation for
debugging purposes, but everywhere else it should be abstract

maybe 'base-control' should be separated out again, like it was in
BROOD-2? am i going in circles? not really: but w.r.t. lifting,
control operations are different.

next: split off control.ss which contains all the code tainted with
'run'.

so, why is dynamic binding for state so bad? suppose there's a
transition from scat/state -> scat/base, and something needs to access
the state inside the scat/base extent. if this is allowed, the whole
base language is no longer referentially transparent.

let's see..
  ifte : (choose run)

is it possible to lift 'ifte' if the source is not available?

maybe i just need continuations?

there's something right in my face here i just don't see..



Entry: lifting
Date: Tue Mar 18 08:46:21 EDT 2008

the thing that bugs me is that currently, the only way to do the
lifting is to manually update and duplicate all the code: i found an
operation that doesn't compose easily. this means that the whole of
'control.ss' needs to be parameterized, so it can be included in the
state language.

what this control needs is:
  * a way to apply a function to the abstract state
  * a way to push/pop to the stack in this state

should i give up the concreteness of the representation? currently,
the only place where i use it is in debugging (and that's ok) or in
lexical binding of functions (which can be dealt with using macros).

when what currently is 'apply' becomes abstract, a genuine
compositional language can emerge.

OK: got the stack abstract now. the change wasn't so deep, which means
i'm getting better at choosing the right abstractions..

to summarise: the base language has the following components:

   * RPN TRANSFORMER
   * STACK DATA TYPE
   * NAMESPACE IDENTIFIER MAPPER

this is extended easily to STATE DATA TYPE


Entry: representation
Date: Tue Mar 18 11:33:20 EDT 2008

so... what about: represent everything with 
  - procedures (closures)
  - structure types
  - tagged lists

tagged lists can be used to implement structure types, but closures
can too. (the example of 'cons' in SICP). iirc, in Haskell, structure
types are unevaluated functions.

basically, these are all interchangeable, and are merely about
implementation. however, in PLT scheme, using structure types seems to
be more efficient.

i see myself moving more from the "wow, everything is a list!" and
"eval is amazing!" attitude towards the static undertone in PLT
scheme, which seems to be more based on the language tower model
(macros are no longer 'accumulative' -> everything is unrolled into a
tree structure) and structure types: more abstract than tagged
lists. everything is more static without losing any power, except for
a little bit of quick-hack power..


Entry: sidestepping the problem
Date: Tue Mar 18 12:17:33 EDT 2008

what if i sidestep the whole issue, and define base to have a void
state? that way, the skeleton can be kept, but arbitrary data can be
threaded through code. all control code just passes on the state,
without being able to modify it.
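
a minimal sketch of that (made-up names; a two-value convention is
assumed): every word takes and returns both a stack and the opaque
state, and pure words just hand the state through:

```scheme
;; sketch: thread an opaque state value through every word.
;; pure words never inspect it; composition passes it along.
(define ((lit x) stack state)
  (values (cons x stack) state))

(define (add stack state)
  (values (cons (+ (car stack) (cadr stack)) (cddr stack))
          state))

(define (compose-words . ws)
  (lambda (stack state)
    (let loop ((ws ws) (stack stack) (state state))
      (if (null? ws)
          (values stack state)
          (let-values (((stack+ state+) ((car ws) stack state)))
            (loop (cdr ws) stack+ state+))))))
```

composing (lit 1), (lit 2) and add on an empty stack yields (3), with
whatever state went in coming out untouched.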

how is this really different from an 'environment' value to be passed
along?

if you look at how 'invisible state data' is used, it is quite related
to lexical variables. for example in (lambda (x) <exp>), the 'x' might
be used very deep inside <exp>, so for all the nesting in between, this
'x' is not used. the equivalent of a deeply nested expression in an
rpn language is a long composition.

let's give it a try..

seems to be the right thing to do. next problem = to abstract control.


Entry: control
Date: Tue Mar 18 13:29:05 EDT 2008

control operations are things that access the data stack, and are able
to apply functions to state, collect state, etc.. without having
access to the state themselves.

maybe this is a nice occasion to start using structure types and a
modified match with

(struct <tag> (<var> ...)) -> (<tag> <var> ...)



Entry: dynamic trick
Date: Tue Mar 18 15:55:40 EDT 2008

instead of making a lot of control words for each dynamic invocation,
there's a mechanism now that takes a function which accepts a thunk,
and evaluates the thunk in the dynamic environment. the thunk
represents the rpn continuation.
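
roughly like this (sketch, made-up names): the control word receives a
wrapper from the program, packages the remaining composition as a
thunk, and lets the wrapper run it inside whatever dynamic context it
installs:

```scheme
;; sketch: one generic control word instead of a word per dynamic
;; construct. 'k' stands for the rest of the rpn program; the wrapper
;; decides in which dynamic environment the thunk runs.
(define radix (make-parameter 10))

(define (with-hex thunk)              ; an example wrapper
  (parameterize ((radix 16))
    (thunk)))

(define ((dynamically wrapper k) stack)
  (wrapper (lambda () (k stack))))
```

e.g. ((dynamically with-hex (lambda (s) (cons (radix) s))) '()) runs
the continuation with radix bound to 16.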



Entry: comprehensions
Date: Wed Mar 19 10:00:21 EDT 2008

today's excursion into plt land is about comprehensions. the main
reasons: 

  * for-each: the concatenative program interpreter
  * for: number -> list maps, for low-level ops

hmm.. leave it for later. i fixed it like it was, but turned
interpret-list into a function (was syntax).

looks like the language is ok now. it's simpler, and hidden state is
easier to implement. purity is guaranteed by just not passing in any
state.

so can the whole state namespace be ditched then? namespaces are still
necessary for the brood macro language, but not any more for prj. the
important thing is that there's no need for lifting, since the base
and state language syntaxes are the same now.

let's call it all scat. looks like this solves a lot of problems.




Entry: the scat machine model
Date: Wed Mar 19 12:22:24 EDT 2008

all scat functions take a data stack, and a hidden parameter used to
implement any hidden data that bypasses the computation. the reason
for implementing it like this is that the code operating on the data
stack (the SCAT language) is orthogonal to extensions that implement
hidden state.

the reason that the pure functions ALSO pass this state around is to
avoid the problem of lifting pure functions to state passing
functions. if purity is desired: simply do not pass any interpretable
state into a computation. to check purity: pass something that can
only be interpreted locally (checks read), and check if it's the same
when it comes out (checks write).
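
the check could be sketched like this (assuming the two-value word
convention; 'pure?' is a made-up name):

```scheme
;; sketch: probe a word with a fresh sentinel as its hidden state.
;; nothing else can construct or interpret the sentinel (checks read),
;; and getting it back eq? means the word never replaced it (checks
;; write).
(define (pure? word stack)
  (let ((sentinel (gensym 'state)))
    (let-values (((stack+ state+) (word stack sentinel)))
      (eq? state+ sentinel))))
```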

why is hidden state necessary? anything can be solved with
combinators, but for some problems, bypass can be a tremendous
simplification. relate this to:

  * lexical variables (bypass through random access in environment)
  * monads (bypass bookkeeping implemented by 'bind' and 'return')


functions can then be classified into 3 groups that do

   1) not know about state (define-word).
   2) know about state, but merely pass it on (define-control, make-state)
   3) modifications on the state data (like 2, but with data accessors)


i'd like to clearly separate 2 and 3, but see no way to do this now other
than giving group 2 no means to interpret the data, and require
functions in group 2 to never replace the data. the latter could be
guaranteed with an assert, the former is namespace management.

i'm done ;)


Entry: from here
Date: Wed Mar 19 13:47:33 EDT 2008

i guess it's time to start rebuilding brood on top of this
structure. the core changes are:
  * create the macro language
  * rewrite prj.ss

going to keep macro.ss in scat for now.

default semantics seems to work fine, but name space mapping needs to
make the distinction between defined macros, and undefined ones which
default to calls. maybe this is a nice point to lift target namespace
management to scheme by requiring all words to be defined as macros?

box> (define-ns (macro) abc (postponed-word 'abc))

next?

core is working fine. the remainder is namespace management and
rewriting parsing words.


Entry: parsing words
Date: Wed Mar 19 16:40:59 EDT 2008

what about this: all scheme transformers are unary functions. nothing
prevents me, though, from adding binary functions to the transformer
environment. these could then be reserved for calls from within
'next', because they don't fit the scheme transformer type.

this so, because it's illegal to call them directly anyway: they use
dynamic state set by the compiler macros.

this seems to work pretty well.


Entry: forth / macro mode
Date: Wed Mar 19 18:41:12 EDT 2008

the next thing to implement is the forth / macro mode. this is a key
issue, since i'd like to change this to a set of LHS = RHS expressions
where names are clearly defined.

part of this is delimited parsing. (EDIT: solved)

to make this transition as smooth as possible, the basic syntactic
form which defines both macros and words needs to be defined. the core
problem is a tough one: allowing identifiers to be overridden requires
some lexical structure. the second problem is to retain the 'inner'
names after a compilation. maybe the enclosing structure can be saved
and incrementally extended?

what i'm talking about is this:

(let-ns (macro)
    ((foo (macro: 1 2 3))
     (bar (macro: 456)))
  <forth> )

->
(let-ns (macro)
    ((foo (macro: 1 2 3))
     (bar (macro: 456)))
    (let-ns (macro)
        ((word (target-word 123))
         (burp (target-word 567)))
        <hole>))

basically, the target dictionary is a nested lexical structure like the
above, which has a 'hole' in it. on the next compilation step, this
hole can be extended. this way all name management can be delegated to
the scheme expander.

to make this workable, the state should be stored in a form that's
like the above, but flat.

see target-tx.ss for the resulting code.

to summarize:

  * target state = dictionary = listof listof (name . code) each
    element in the top dictionary list consists of an association list
    for one compilation level. names within a level need to be unique
    and can be used recursively.

  * each forth <-> macro transition in forth code introduces a new
    nesting level.


remaining q: how to represent non-macros? what about using 'forth:' to
mean something that generates a macro which compiles an address.

what about treating words as macros, but mark them as 'instantiated?'
this might be interesting, since it allows some flexibility for code
processing (i.e. code inlining).

one thing at a time..
what's next?

need to define some interfaces. most likely the assembler. where does
it fit in? what does a forth file represent?




Entry: representation
Date: Thu Mar 20 08:40:18 EDT 2008

with scat ready, the next real problem is representation of forth
code. time for some "what is" exercises.

macros are not really the problem, but what is instantiated forth code?

(let-ns (macro)
    ((foo (macro: a bar))
     (bar (macro: c d e foo)))
  (let-ns (macro)
      ((baz  (forth: foo bar))
       (shma (forth: wikki wakki)))))

in the end, after compilation, this leads to:

(let-ns (macro)
    ((foo (macro: a bar))
     (bar (macro: c d e foo)))
  (let-ns (macro)
      ((baz  (macro: 123 compile))
       (shma (macro: 456 compile)))
    <forth>))

so: a dictionary is a nested let-ns expression with a hole in
it. compilation is filling this hole to get a new nested let-ns
expression.

it looks like the key extension is to move the assembler to the syntax
level also.

so what does the 'forth:' form do?

  * it creates a macro which compiles a reference, and saves code for
    later compilation in that lexical environment (closure) bound to
    that reference.

so the essential part is to separate the creation of new names from
evaluation of the righthand sides + complete unification of forth
words and macros. (a forth word has a macro associated which compiles
its body).

testing this, but i'm missing an essential part of brood (compilation
stack pattern matching) to get this working. EDIT: worked around it
using base language.

what i got now:
(define empty-state
  (make-state '() '()))

(define (run-macro macro)
  (state-data
   (macro empty-state)))

(define-syntax forth:
  (syntax-rules ()
    ((_ . code)
     (let ((word
            (delay
              (run-macro
               (macro: . code)))))
       (base: ',word compile)))))

(define xxx
  (letrec-ns
   (macro)
   ((abc   (macro: 1 2 3))
    (def   (macro: 4 5 6))
    (broem (forth: 123))
    (lala  (forth: abc def))
    (shama (forth: lala)))
   
   (macro: lala)))

looks like a pretty good implementation. it even has a means to only
compile what's necessary by adding an 'export' word: determine which
functions get exported into the namespace.

this structure creates a graph in which the nodes are forth words
(instantiated macros).

Entry: code structure: line or graph?
Date: Thu Mar 20 10:43:43 EDT 2008

now.. look in to the 'structured assembler' (something from an Olin
Shivers writeup about a project he worked on doing dataflow
optimization for..)

if assembly is just a serialized expression graph, why not keep it a
graph longer? there are some features used in purrr that assume
serialized code (fallthrough). is this really necessary? or should
such code be grouped somehow as a single function with multiple entry
points. it's interesting as low-level control, but a pain to do code
transformations.. the rep mentioned above already has this.

now, with this structured asm thing, maybe all local label issues can
be solved that way too?

NEXT:
  * unify macros and forth words (meaning of ';')
  * defining LHS/RHS and ':'
  * variables

Entry: definitions
Date: Thu Mar 20 16:59:55 EDT 2008


a problem popped up: either ':' is a terminator (exclusive: up to it),
or ';' is a terminator (inclusive).

the real problem: if ':' is a macro, it should be able to introduce
new names. how to do that? macros are parsed inside the body of a
toplevel 'represent'. instead of doing it like that, rpn-next could be
called directly.

ok. got it working by abusing the compiler a bit and storing continue
points in a dynamic variable. i tried without side-effects (using
prompts), but can't get that to work.

now with compilation mode built in.


Entry: forth rep
Date: Fri Mar 21 15:33:11 EDT 2008

seems to be fixed now. see the files forth.ss forth-tx.ss and
forth-rep.ss : the result of evaluating a (definitions . <code>) form
is an assembly code graph.

now, how to save the macro definitions so incremental compilation can
be performed? the 'definitions' macro should be extended with an input
assoc-list of defined words (and macros?).



Entry: incremental compilation
Date: Sat Mar 22 09:12:25 EDT 2008

an intermediate representation is necessary which preserves the macros
in some form so they can be re-instantiated, but also preserves the
forth words in some form. is it possible to somehow grab the source
code (after lexing) of each word? that way original macro source can
be preserved, and forth source can be translated to abstract rep (only
addresses).

basic idea: can't use source code to save current language
state. maybe save the environment functions together with the forth
structs that are generated? i.e. if address is filled, return that,
otherwise return word struct.

maybe the macro from yesterday needs to be factored a bit?
OK. incremental updates are implemented as a simple nesting of
letrec-ns forms.

so, how to represent the target state? i prefer to have this in a
readable form, preferably one with macro source intact, and forth
words resolved to numbers.

first: separated out some things: there's a 2-level nesting with 'old'
not being collected. need some factoring to reset parameters.

or.. i could make the macros ephemeral, so they have to be included
explicitly in each source file?

or.. the whole thing runs at stx-transform time and expands into a
module that defines a number of forth macros and a binary code chunk?

maybe it's best to collect per level. then later, code that's not
necessary can be ignored.

can collection be done statically?

the thing is: macros need to be saved in some syntactic form, not as
an object. from this, it's not necessary to evaluate macros, except
for things that generate code. instantiated live target code is
represented by macros.

so, ground rules:
 * evaluation of code gives a list of nodes in a code graph
 * saving of state = saving of macros as syntax.

problem = how to save macros?
each word needs to be evaluated in its lexical context.


Entry: what is compilation of forth words?
Date: Sat Mar 22 19:20:59 EDT 2008

essentially, in the current context, it's a source code map which
translates all forth words to macros that compile a call:

macro : abc 123 ;
forth : xyz abc abc ;

 ->

macro : abc 123 ;
      : xyz #x0010 execute ;


so what is the essence?  it's source transformation. i probably need
to perform some operation twice: one to construct some syntax, and
another one to evaluate functions. maybe first separate syntax and
semantics better?

let's try to catch the code first. it's probably easier to move from
using a 'continue' thunk to just recording the point of the next
definition.

catching code is problematic: there might be a macro that generates
several words or macros.. this cannot be cleanly cut out by cutting
out each definition. the only real thing is what comes out in the end:
that's what needs saving: the fully expanded nested let expression.


looking at this:

box-macro> (forth-words : abc 1 2 3 def : def 4 5 6 abc)
(forth-words-incremental (: abc 1 2 3 def : def 4 5 6 abc))
box-macro> 
(forth-definitions
  (lambda (collect)
    (letrec-ns
      (macro)
      ((abc
        (mode-forth
          'abc
          (lambda (state)
            (macro/def ((literal 3) ((literal 2) ((literal 1) state)))))))
       (def
        (mode-forth
          'def
          (lambda (state)
            (macro/abc ((literal 6) ((literal 5) ((literal 4) state))))))))
      (collect))))

actually gives the solution.
instead of expanding to something which includes dynamic binding, just
have them pass in as an argument. in this case: ditch the
'forth-definitions' and make 'mode-forth' and 'mode-macro' parameters.

this gives a very clean separation of function and structure: the
structure is just the expansion of all macros. function can be plugged
in later.

yep. works like a charm.

box-macro> (forth-rep (macro : abc 123 forth : def 345))
(lambda (forth macro collect)
  (letrec-ns
    (macro)
    ((abc (macro 'abc (lambda (x) ((lit 123) x))))
     (def (forth 'def (lambda (x) ((lit 345) x)))))
    (collect)))


Entry: syntax: going further
Date: Sun Mar 23 09:38:32 EDT 2008

what i'd like to save is not only this structure, but a means to
transform it into something that 
  * has addresses bound
  * can be extended

made the 'lambda' part parametric.. is this necessary?

so.. maybe move away from storing the delayed computation to something
more concrete? like the rest of the syntax stream? kept it, but
cleaned up the access a bit.

so.. it's essential to have 2 functions:
  * the syntax transformer
  * the asm graph evaluator

now, when the assembly step has finished, the original syntax needs to
be updated (words replaced with address refs) and saved such that it
can be reused later to build new syntax expression.

so what is what?

a dictionary is a collection of frames from 'compilation units'. (a CU
is a single level in the final nested letrec expression.) composition
of CUs is composition of syntax transformers.

ok.. separated out the core form: code->reps

made an extra macro called 'dictionary' which enables a slightly
lighter concrete s-expression representation, so it is easier to edit
when replacing words with macros compiling addresses.

maybe add another called 'word' which transforms the names so expanded
names don't have the prefix? this will probably clash with
macros.. nope: there's a middle way: i need this only in the
representation, which is after macro expansion. 

EDIT: had to update rpn-tx.ss to expand forms returned by
rpn-map-identifier before determining if the resulting identifier is a
macro.

so.. we get this:

box-macro> (forth-rep dictionary (: foo 123 : bar 4) (: baz foo bar))
(lambda (forth macro collect)
  (dictionary
    ((foo forth (lambda (x) ((lit 123) x)))
     (bar forth (lambda (x) ((lit 4) x))))
    (dictionary
      ((baz forth (lambda (x) ((word bar) ((word foo) x)))))
      (collect))))

where 'dictionary' and 'word' are macros that make this code a bit
more readable. updating a dictionary to replace words with references
goes like this:


      (baz forth (lambda (x) ((word bar) ((word foo) x))))
->
      (baz macro (base: #x0123 compile))

maybe this deserves a little macro of itself to turn it into      

      (baz macro (address #x0123))

the next step is to turn this dead representation into a live function
that can be composed to generate new syntax transformers.


so why a raw lambda, and not some form that has concatenative code?
the problem is that this requires uncompiling: the lambda is the
result of all syntax that might be defined in forth code, which can
contain arbitrary scheme forms: there might be no 'simple'
concatenative form.

on to extension transformation.
  rep + name.address -> rep

Entry: not currently transforming
Date: Sun Mar 23 13:52:01 EDT 2008

PROBLEM: not being able to run the transformers because they depend on
some expander environment is problematic. i'm using some functions
that use context only available in `official' transformers. fix that.

EDIT: this is not just 'some expander environment'. it's the lexical
environment of the expression being transformed. also see entry about
calling macros directly. entry://20080325-144330


in other words: is it possible to turn a function into a transformer?
this requires some level shifting voodoo i don't see how to perform..

(define-syntax transformer (lift tx-fun))  ?? too simple to see ??

EDIT: looking at plt/src/mzscheme/src/env.c -> now_transforming
(syntax-transforming?), this predicate is derived from the presence of
a scheme_current_thread->current_local_env value.

in eval.c -> expand the function _expand is called with a new expander
environment.

hmm.. too complicated to quickly browse. what i did see is that the
environment during an expansion is either dynamic, or attached to the
stx objects. the latter makes sense, but i can't see any obvious
refs.. the magic is done in resolve_env, which is quite
complicated. it looks as if the info is not tied to the objects..

so guesswork: the expander has a dynamic environment which contains
the lexical environment of an expression.



Entry: what is dict?
Date: Sun Mar 23 18:36:38 EDT 2008

the incremental compilation has type (DICT,SRC) -> (DICT,BIN)

so what type is DICT?

It needs to be something that can be serialized, so what about plain
s-expressions? Is there information that cannot be captured?

One problem is that arbitrary extensions might depend on external
code. Representation should probably be a 'module' form which
explicitly states its dependencies. Problem solved. Module forms are
neatly representable by s-expressions since they have no free
variables.
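e.g. a dictionary could then serialize to a plain datum along these
lines (the require path and body are made-up placeholders; only the
module-with-explicit-dependencies shape matters):

```scheme
;; a closed datum: no free variables, all dependencies are stated
;; by the module language and the require forms.
'(module dict-state scat/forth-dict
   (require "target-words.ss")      ; hypothetical dependency
   (state (dup . #x0123)))          ; hypothetical payload
```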

Now how to go about this?

A dictionary is a module that exports a function 'update' which takes
forth source code, and outputs the expression of another module and
binary code.

This is interesting: it involves writing a "module quine" ;)

EDIT: when you can modify the language in which to write a quine, the
problem is trivial. the reason why some quines are interesting is the
length it can take to express one in some language / system, or the
extent to which detours can be taken..


It almost works, except for some constant redefinition errors.

OK. got it working with arbitrary payload and update expression. some
interplay between macros and s-expression code generation: on each
iteration:

  * apply the update expression to the state
  * pass on the update expression
  * send the output
  
very straightforward iterative system once the boilerplate generator
is in place. this can be minimised by re-defining %plain-module-begin
or something..

updates: can probably standardize the module name, since only one is
necessary and it can be loaded into a sandbox.


stylized the reflection loop a bit: now using #%module-begin macro and
a minimalistic module spec which also carries over the body code.

cleaned it up to a single macro:

(define-syntax (module-begin stx)
  (syntax-case stx ()
    ((_ (tick state) . forms)
     (let ((name
            (syntax-property
             stx 'enclosing-module-name)))
       #`(#%plain-module-begin
          (provide update)
          (define (update input)
            (let-values (((state+ output) (tick 'state input)))
              (values 
               output
               `(module #,name scat/forth-dict
                  (tick ,state+) . forms))))
          . forms)))))

so state update / storage is solved.


Entry: dictionary update
Date: Mon Mar 24 09:21:37 EDT 2008

the incremental compilation has type (DICT,SRC) -> (DICT,BIN)

so, what does the module storage part need to implement?
 - name binding (i.e. add special require forms)
 - implement a transformer binding

it will be used as a syntax include.


Entry: lexical syntax annotation
Date: Mon Mar 24 10:33:01 EDT 2008

still i don't fully grasp the notion of lexical information in
syntax. the macros 'let' and 'lambda' annotate syntax with lexical
properties, when they are expanding. so does this work?


(define (make-expr body)
   #`(lambda (x) #,body))
((eval (make-expr #'x)) 123)

=> 123

so it looks like indeed, building syntax like that does perform
'capture'.

my problem is this

(define (wrap stuff)
   #`(lambda (x) #,(stuff #'x)))

(define (stuff stx)
   #`(let ((x 123)) #,stx))

(eval ((wrap stuff) 1)) 

=> 123

so the inner let captures the #'x
there's nothing special about defining the formal parameter x and the
reference x in the same function 'wrap'.



(define (test-stx stx)
  (syntax-case stx ()
     ((_ (a) b)
      (bound-identifier=? #'a #'b))))

(test-stx #'(lambda (x) x)) => #t

so 'bound-identifier=?' does do some lookup: it doesn't need to expand
the syntax?

EDIT:

what i wonder is: is it possible to construct a function 'wrap' as
above which can guarantee that the 'x doesn't get captured? this
requires some kind of symbol rename.

looks like it's a legitimate question:

(define (wrap stuff)
  (let ((x (car (generate-temporaries #'(x)))))
    #`(lambda (#,x) #,(stuff x))))

(define (stuff stx)
  #`(let ((x 123)) #,stx))

these are constructed using interned symbols, not gensyms. let's
update the lambda code.

so what about syntax marks? 2.3.5

(define-for-syntax (wrap stuff)
  #`(lambda (x)
      #,(syntax-local-introduce
         (stuff (syntax-local-introduce #'x)))))
(define-for-syntax (stuff stx)
  #`(let ((x 123)) #,stx))
(define-syntax (test stx)
  (wrap stuff))

((test) 456)
=> 456

works..  so, if i understand: marking is an on/off operation: the net
result is that syntax introduced by the transformer is marked, and
thus not the same as syntax introduced elsewhere.

EDIT: this does give problems with names introduced by 'letrec-ns'
separately: they are no longer caught. so.. in the end, am i better off
doing everything non-hygienically? that doesn't sound right..


Entry: cleanup
Date: Mon Mar 24 11:58:57 EDT 2008

i'm not in a great design mode today, so going to do some maintenance
and simplification.

changed some names in forth-tx + took out the mode symbols and
hardcoded them to 'forth and 'macro -> lambda will capture them. (see
the remarks about bound-identifier=? and the local transformer
environment). the question that arises here is: can forth code shadow
the 'forth', 'macro' and 'collect' names? yes: but only locally within
one forth file.

maybe it's also better to flatten the dictionary representation that's
stored in the modules? let's postpone this a bit..

TODO:

'with-forth' and the scat macro: collapse syntax and namespace. i
tried to add a similar 'word' macro but apparently it doesn't do
that.. maybe add syntax-parameters?

break.


Entry: dictionary serialization
Date: Mon Mar 24 18:38:34 EDT 2008

but.. with these things going on, the representation is no longer
serializable as an s-expression. so maybe i should look into
flattening out everything to a simple associative list..

there's a bunch of problems there that need to be ironed out. maybe
it's time to take a couple of days' break from it? from what i've been
reading today, serialization in anything but compiled format is going
to be a problem due to hygiene: converting expansions to s-expressions
is not a good idea..

compared to previous implementation, what do we have here:
  * no ambiguities for names
  * can obtain all macros from loading source code

i guess it's time to start writing down some requirements and work
from there.. 


Entry: basic structure
Date: Tue Mar 25 08:44:07 EDT 2008

maybe i should just stick to using an opaque representation, and try
to use this to gain access to the data.

there are 2 models to use this:

  * all code is available in source form: intermediate representation
    doesn't matter much, since it can be redefined.

  * some code is closed. in that case, the representation does matter,
    because it becomes an interface.

it's the latter problem i'm trying to solve.

if this could be almost human readable, but mostly independent of
mzscheme's binary representation, that would be great.


requirements:

  * a dictionary needs to contain binary code + macro code + reference
    to the compiler version / library it was made with.

  * a dictionary needs to be opaque, and at the same level of scheme
    modules.


what about separating the incremental model from the library model? the
incremental model is mostly for developing. they are two different
namespace models.

  * kernel: uses mzscheme's module name management system.
  * incremental: extends the flat public interface.

maybe the next question is: what is a forth file?

      how to make "forth file" == "scheme module"

and build the incremental compiler on top of that. it's very
straightforward for macros. but what about words? every module that's
compiled contains forth words the same way as the nested let.

yes. it's better this way.


Entry: the ':' macro
Date: Tue Mar 25 13:00:05 EDT 2008

whenever forth->definitions is called, it needs to be done in an
environment where level -1 has the definition macro (':')
defined. this can be ensured by requiring it for-template.

hmm.. something's wrong here. got some dependencies tangled up.. can
it be made automatic?

separate things: forth-lang.ss now gives a macro 'forth:' that
produces toplevel forms from forth code. this can then be wrapped for
module usage.

module works, but it doesn't want to export the name symbols. calling
the transformer directly should solve this: then no marks are
added. alternatively, we could mark ourselves?

aha: the 'provide' statement needs to have the same lexical context as
the names, then it works.

(define-syntax (module-begin stx)
  (syntax-case stx ()
    ((_ code ...)
     (let ((name
            (syntax-property
             stx 'enclosing-module-name)))
       #`(#%plain-module-begin
          #,(datum->syntax stx '(provide (all-defined-out)))
          (printf "FORTH:\n")
          (forth: code ...))))))

gathering forth code works too now: everything is dumped in a list
which is named according to the module name.


Entry: calling transformers directly
Date: Tue Mar 25 14:43:30 EDT 2008

isn't really good style. why? they all run in the same lexical context
as the transformer they are called by.

need to have a better look at local-expand and friends to see if
there's no better way to handle this. in other words: check where is
it (not) necessary to have the same lexical context.

i guess it's ok for the RPN code representation: a piece of RPN code
produces a single lambda expression, which is a single lexical
environment anyway..

but i don't see if it's always harmless.
maybe i should just try to break it?

because of the way that early expressions are on the deepest nesting
levels, introducing new names into the expansion only influences code
BEFORE a certain point, so i see no reason to do so.



Entry: syntax-case implementation
Date: Tue Mar 25 19:59:59 EDT 2008

let's see: 
http://www.cs.indiana.edu/~dyb/pubs/tr356.pdf

  two identifiers are bound-identifier=? only if they have the
  same name and are present in the original program or are introduced
  by the same macro.

  free-identifier=? determines if two identifiers WOULD refer to the
  same binding.

  generate-temporaries creates a list of temporary names. not because
  renaming is necessary (never is!) but because it might be convenient
  to insert (a list of) names.
  
macro implementation:

  for each expansion, a mark is created. input and output syntax is
  marked, and double marks are dropped. the net result is that
  identifiers introduced by the macro are marked once.
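a small test of that net effect (a sketch; run inside a transformer so
marks actually apply):

```scheme
;; the user's x is unmarked; the x introduced by the macro body
;; carries the expansion's mark, so the two are not bound-identifier=?
;; even though they have the same name.
(define-syntax (mark-test stx)
  (syntax-case stx ()
    ((_ user-x)
     (with-syntax ((same? (bound-identifier=? #'user-x #'x)))
       #'(quote same?)))))

(mark-test x)  ;; => #f
```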
  

Entry: onward
Date: Tue Mar 25 20:29:12 EDT 2008

with the basic representation intact, i'm going to leave the
incremental dictionary stuff for later and concentrate on bringing
over other constructs from brood.

  - reader syntax OK
  - the pattern matcher OK
  - variables and constants OK
  - code chunks / anonymous words OK
  - exit / jump-to-end OK
  - local variables OK



Entry: reader
Date: Wed Mar 26 14:58:39 EDT 2008

stole syntax/module-reader and adapted it for reading forth code.


Entry: splitting scat and forth
Date: Wed Mar 26 15:14:20 EDT 2008

let's call 'scat' the rpn language with functionality and namespace
management, and grow this project into brood-5 instead of trying to
port it 'into' brood. i see no reason to release scat as a different
project yet, but there are good reasons to separate the scat code into
a single collection that's accessible through scat.ss and scat-tx.ss
interface files.

maybe time to combine some files.. there are a lot.


Entry: pattern matcher
Date: Wed Mar 26 17:38:14 EDT 2008

looks like it's working: had to correct some hygiene issues: names
lost their lexical context in name-stx->symbol

time for more porting. next: constants and variables.

trying to port some pattern macros to scat, and i run into the use of
macro-find/false. this is the first occasion of dynamic namespace
access, which requires a bit of thought to solve..

the deeper question is: are quoted symbols still allowed? or do they
always refer to macros? i guess the answer is no. that's what making
things static is all about. it could still be added later, but let's
not do so until there's a compelling reason.

ok.. it's getting a bit more serious now: true cleanup. no more hiding
behind reflection ;)


Entry: name generating words
Date: Wed Mar 26 19:58:11 EDT 2008

it's all about names now.. the next things to tackle are name
generating words. let's start with 'constant'.

this needs to create a macro eventually. i eliminated this before
because of the awkward reflection loop. is that still a problem?

yes. it's a phase mix that can't be resolved, because "free-range"
code is no longer allowed. so, no constant: use macros instead.

then 'variable'. there's little trouble here, except that it requires
address resolution, so qualifies as a word.

so.. what does variable expand into? maybe a forth-word? let's see if
forth-rep.ss can be extended to represent variables.

maybe it should get its own representation, next to forth and macro?

a variable is a macro that compiles a reference (as literal) to a word
structure. so in some sense, it is a forth word: the defining property
of a forth word being: need for allocation of memory.

so what does a variable look like?

(make-word (scat: ',var-struct literal))

this var-struct should also be a word struct, so it can be registered
in the same way as forth words. however, the normal 'forth' wrapper
expression isn't really useful here. so it might be necessary to
change that a bit.


Entry: delimited words
Date: Wed Mar 26 21:31:10 EDT 2008

if there's a 1-1 relation between names and words, some care needs to
be taken to solve conditional jumps etc..

this is exactly the kind of trouble i got into when trying to make a
purely concatenative VM for the catkit on-target forth dialect.



Entry: forth-tx.ss and macro-lang.ss
Date: Thu Mar 27 08:27:14 EDT 2008

note that 'macro:' is really for brood style macros (postponed target
words) but the forth-tx in itself is more general. maybe move it to
separate directories? in short: those 2 need to be bound, at the
forth-lang.ss level, (which should be purrr), because they are
orthogonal upto there.

yes, it's a good time to decide how to make behaviour pluggable. for
microcontroller targets, most of the forth code can be shared.

forth uses 'compile' and 'literal' from the underlying target. so
should it be a unit? it's only 2 names, let's first pass them in
through function/macro.

let's reach for the bottle: dynamic binding. it solves all your
problems! ;)

but, in this case it might make sense. units are really overkill, and
to have some default is interesting for testing. i guess if the
bindings themselves are isolated, they are easy to change later.

so what to separate:
   - macro    (representation of postponed code + pattern matcher)
   - forth    (forth syntax on top of macro)
   - purrr18  (code specific for PIC18)

so.. got macro separated. now trying to make a Forth layer on top of
SCAT. note that here 'imperative' code should be possible! that's
something to fix later. until then only declarative code.

OK. scat-forth.ss is working!



Entry: duplicate module instances
Date: Thu Mar 27 11:22:56 EDT 2008

i ran into a problem where the rep.ss module is loaded twice when
requiring test/fafa.f and scat.ss (the latter to get at the scat:
symbol)



Entry: dependencies between subprojects
Date: Thu Mar 27 12:12:27 EDT 2008

maybe it's best to just have one file for both the main and tx words.
did that. looks a lot cleaner now. also made the test cases pull in
all code.


Entry: variables
Date: Thu Mar 27 13:17:08 EDT 2008

So, let's represent variables by

(define (wrap-variable name size)
  (let ((word
         (make-target-word
          name #f size)))
    (values
     (scat: ',word literal)
     word)))


This probably requires a compiler extension since it's different from
macro and forth modes.

Got it working: the trick was to add a special variable mode that
evaluates macros as literals, and a 'buffer' word that behaves as ':'
to define that macro. This then leads to the substitution macros:

(substitutions
  (macro)
  
  ((variable name)  (buffer name 1))
  ((2variable name) (buffer name 2)))

see macro/target-rep.ss and forth/forth-tx.ss for implementation.


Entry: control structures / conditional jump
Date: Thu Mar 27 16:17:00 EDT 2008

there's an opportunity now to write the forth-style control words in
terms of higher order abstractions. is this possible? or are the more
lowlevel forth constructs necessary? 

probably things like for .. next are going to lead to
trouble. basically, i need the equivalent of 'label' and a way to
emulate fall-through.

the problematic part is the conditional jump.

conceptually, it joins 3 parts: the part before, and the 2
branches. let's just do if:


: bla if do-it then go-on ;

->


: bla   ' l0 ift go-on ;
: l0    do-it ;

ha! 

forth-tx.ss has a 1-element stack. turn this in an arbitrary length
stack and all branches can be postponed and compiled after the word is
done.

that's the mechanism: making this work so the current 'inline'
branches are still used should be straightforward. maybe it's just a
'swap' on that stack?

again. the idea is to make temporary at each brach point.

let's try the compilation stack thing.
OK. implemented. now what does it mean if there is more than 1
continuation waiting?

ok. i know what this is!
quoted code ;)

expression nesting should be part of the parser..

but the stack's not a stack but a queue: quoted defs come AFTER
current def.

i got a bit of a paradigm clash here: the scheme lexer has support for
s-expressions, so is a better candidate for building a syntax for a
language that supports code quotations. however, this is not
compatible with forth syntax. the question is: how to map

  if <1> else <2> then    ->     [ <1> ] [ <2> ] ifte
  begin <1> again         ->     [ <1> ] forever
  
doesn't look like it's a good idea to write a second recursive
expression parser.. really, stick to s-expressions for that. part of
the purrr kernel could be written in this different syntax: all the
machinery to manage that is available now.

rephrase the question: how can we keep the illusion of straight-line
code? it's very convenient to have as a low-level tool, but for
optimizations it's better to have a graph structure.

maybe forth chunks should just be lined up?

what about using an assembler instruction for this? fallthrough?

problem is that this doesn't collect all the variables.. maybe
variables should have them too? or variables represented as an 'allot'
opcode?

so, what about:
  * forth words have reversed asm code stored
  * the head of the list (last instruction) is 'fallthrough', which
    points to the next word.
  * compilation interface exposes 2 words: register and compile.


Entry: variables again
Date: Thu Mar 27 18:01:01 EDT 2008

if variables are represented by the opcode 'allot', then they can
probably be generalized to quoted words: this is basically
'create'. do i need to distinguish between ram/rom/eeprom allot? for
now, let's keep it simple.


Entry: semicolon
Date: Thu Mar 27 19:22:22 EDT 2008

let's say that 'exit' always means return, even in macros. but ';' means 

   macro: jump to end
   forth: exit

how to implement "jump to end" ?

this requires labels.

so.. what is a label? an entry point. a word.

let's see.. there are 2 kinds of fork points:

  * conditional goto   (not call!)
  * entry / label

so.. a macro can split a word?


  x x x then y y y

this turns the ys into a different word. the place where this would
happen is in target-rep.ss in macro->code:

that call might return multiple words. the first one being the
original word, but other ones made of chunks, where each chunk is an
entry or fork point.

note that this is almost the same as the splitter for parsing forth
definitions. why are they not completely the same? the forth parsing
requires a function return because of 'rpn-next', while the code
splitter doesn't..

so.. this needs to be brought to the level of words (which have
names), not code lists. OK done that.

so.. what needs to be a word? jump targets. the reason is simple: jump
sources are clearly visible (ideally, a word would be ONLY jump
sources), but targets are not. maybe in a later step also eliminate
sources? anyway, there are 2 kinds of jump targets: forward references
(if .. then) and backward references (loops).

OK: it's basically like it was before, but labels are now references to
word structures (created when the label is created) and created by the
'label' macro. these are used by the 'split' word to start filling the
word structure with code.



Entry: hindsight
Date: Sat Mar 29 09:39:50 EDT 2008

Making BROOD more static is a way to bring the early exploratory phase
into a more fixed structure. It seems the overall design is good
enough to be stabilized. from that perspective it makes sense to cast
it in stone.

What did actually change over the last couple of weeks? Basically,
symbols are disappearing. The only place where they are still left is
in the assembler, but that's easily changed. Symbols are replaced with
identifiers, and are a (scheme) compile time object. The Forth
compiler is now implemented using Scheme's exposed compiler API (the
macro system).

What this change did for the structure of the program is to point out
places where reflection was used without justification. It's now using
PLT Scheme's approach of 'unrolled reflection'. As a result, more
things can be checked at compile time + name handling is completely
handed off to scheme.

This is the natural extension of 'lamb / brood 3'. The step through
brood 4 was necessary to get familiar with the language layer
approach. Giving up the NS hash table and run time evaluation hacks is
the final step in trusting this layered module system: when names are
identifiers, scheme can do the management.

   so to restate one of the goals of brood: to do Forth the PLT Scheme
   way: built on top of its hierarchical macro/module system.

things i'm adding:
   - RPN: rpn syntax for scheme
   - SCAT: an embedded rpn language with scheme semantics
   - PAT: a typed concatenative pattern substitution language
   - MACRO: a language for expressing postponed forth semantics
   - FORTH: building forth syntax on top of scat
   - PURRR: macro language with forth syntax

MACRO has postponed semantics because the output of brood is assembly
language. If there were a simulator, it would make sense to bring the
static implementation down to that level. Instead, that part is still
interpreted. This also allows easier integration with external
assemblers.

Central to brood is the structure of the MACRO language. It's a
concatenative language that operates on 2 stacks: the SCAT data stack,
which is used as the MACRO compilation state stack, and a 2nd stack
containing the assembly code, which the PAT language interprets as a
typed concatenative language: the one that implements partial
evaluation in PURRR.

PAT's matching can be translated to compile time if at one point it is
decided to give up on the symbolic asm representation, but instead use
a (typed) abstract one. This probably needs some cleanup in the
assembler first.


 
Entry: exit: jump to end
Date: Mon Mar 31 08:59:52 EDT 2008

now for 'exit'. what i want to support is words/macros like this:

    ... if ; then ...

the ';' is exit for ordinary words, but it's jump-to-end for
macros. the 'end' is defined in the environment that executes a single
macro. there's no way to do this except for wrapping each macro.

ok. so, split it in 2 parts. 

   * meaning of ';'. for macros, it means 'exit-macro', for words it
     means 'exit'.

   * implement 'exit' and 'exit-macro'.


note that local macro exits always occur within some conditional
construct which ALSO introduces splits. can this be used somehow? or
is it better to optimize this away when eliminating empty words?

aha. the label is for the code AFTER the exit, not the code it
exits TO! they're different!

seems to work. note: this allows dead code elimination: anything
present in a word's code body AFTER exit or jw can be eliminated. this
allows turning jw/nop -> jw, making code re-arrangement possible. dead
words are then eliminated by just not being reachable.

the next problem that pops up is interference between 'exit' and how
it is used in optimizations. basically, 'exit' needs to be a parameter.


seems to work.
one more problem: if the last word in a macro is ';', it doesn't need
to split: would create too many spurious words.

maybe the ';' thing can be checked at (scheme) compile time? the
problem is that this really needs semantics.. so, no.. maybe propagate
source location information to at least give proper error message?

ok. wasn't so difficult: srcloc is now passed from compile time to be
stored in the target word structure, or used in macro error reporting.



Entry: eliminating the meta language
Date: Mon Mar 31 11:11:48 EDT 2008

maybe it's possible to eliminate the scat: meta layer altogether?
maybe makes things simpler, but could lead to some circular refs. scat
is really only necessary to implement 'm>' and '>m', the rest can be
implemented with the pattern language.

hmm.. let's keep it in there for a bit.



Entry: local variables
Date: Mon Mar 31 13:54:41 EDT 2008

.. and then i'm done.

: foo | a b c | c b a ;

first step is to make an anonymous version of the pattern language.

took some time, but got it. needed to clean up pattern language
transformer intermediate state.

the interesting thing now is that we basically get multiple
occurrences for free. if locals captures the 'state' argument of

  (lambda (stack)  (expr stack))

this is pretty straightforward:
  * close the expression collected up to that point
  * apply it to the input state
  * collect locals from this state
  * bind locals to variables
  * bind wrapper macros
  * expand the rest of the code in this augmented lexical env.

EDIT: yes, but it took some detours ;)
it's working in the most generic version, with as much as possible in
the form of runtime support.
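the binding step boils down to something like this toy version (the
real one threads the scat state; which local gets the top of stack is
glossed over here):

```scheme
;; '| a b c |' pops one stack value per name and binds it lexically,
;; then expands the body in the augmented environment.
(define-syntax with-locals
  (syntax-rules ()
    ((_ () stack body) body)
    ((_ (name more ...) stack body)
     (let ((name (car stack)))
       (with-locals (more ...) (cdr stack) body)))))

;; (with-locals (a b c) '(1 2 3) (list a b c)) => (1 2 3)
```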



Entry: interpolation in ellipsis
Date: Tue Apr  1 08:51:51 EDT 2008

var ...  ->   (1 2 3 var #,(bla) 4 5) ...

doesn't seem to work. trying to work around that with local syntax
(let-syntax).

this works, but it made me run into an interesting problem: the local
environment of the transformer is lost when doing something like

(let-syntax
  ((rep: (lambda (stx) ((rpn-represent) (stx-cdr stx)))))
 ...)

now, i tried to capture that environment, but apparently that doesn't
work because the RHS is evaluated in a different phase.

basically, the transformer in which the let-syntax expression occurs and
the RHS are independent.

i don't see a way around this, but it might be interesting to think
more about it. i had some small bad feeling about parameters, and this
is where it goes wrong..




Entry: time to start porting brood
Date: Tue Apr  1 12:09:15 EDT 2008

looks like all the machinery is in place, except for incremental
compilation. time to drag library stuff over, then think about
incremental stuff.

the next problem might be 'load'. which has to be replaced by a module
based interface. the thing to solve here is 'require' in forth. this
probably requires the function that moves forms to the toplevel..

maybe 'begin-lifted' from (lib "etc.ss")

i got it working by manually collecting requires in the purrr-lang.ss
wrapper, but that's not the best way..

/usr/local/plt-3.99.0.12/collects/scat/purrr/purrr-syntax.ss:41:10: require: not at module level or top level in: (require "purrr-bla.f")

now.. how to collect code from different requires?


looks like i need to get at the require before names are expanded.

OK. fixed it by expanding straight to a #%plain-module-begin


Entry: collecting words / incremental compilation.
Date: Tue Apr  1 14:55:15 EDT 2008

suppose we're building a kernel. that kernel is represented by a
single module. when instantiating that module, we get access to the
exported words. these are linked to structures that might not be
provided explicitly.

all dependencies are handled, and the required target code can be
computed by flattening the call graph given the entry points.

this means the problem of getting a linked kernel with limited entry
points is separate from building a library of macros accessible in
other programs. maybe time to start working on incremental
compilation?



Entry: purrr18 / redefine
Date: Wed Apr  2 09:15:33 EDT 2008

maybe leave the incremental compilation bit till later, and try to
port the core purrr18 language first, then figure out how to modify
the assembler. maybe the latter should really be kept separate so i
can target external assemblers.

ok.. the next problem is the use of a lot of undefined bindings in the
previous pic18 spec. it needs a proper mechanism for target plugin
behaviour. one way to solve it is to sweep it under the carpet and move
it to the assembler, since that's symbolic. but this probably won't
work for everything. i.e. if a DUP is defined in the kernel, it needs
to be replaced in all the code that uses it. it's true late binding.

there are 2 paths to take, the static one (units + explicit linking)
and the dynamic one (just redefine the macro structs). since the
core language is macros only, that's ok.

there is one problem though: the core language's code should be
independent of target. if some target decides that a core macro needs
to change, core shouldn't have to know that. that means it has to
provide an exact specification of all words that can be
redefined.

this looks like too much of a hassle, so let's go for redefine using
mutation of word structs + some mechanism to at least keep track of
changes.

what about this: allow for macros to be redefined by checking for the
availability of the identifier. if they are defined, bind the old
functionality to a name SUPER, so one could do things like:

   (([drop] dup)  ([movf 'INDF0 0 0]))
   ((dup)         (macro: SUPER))

this makes sure that words are at least defined, and it also makes the
hierarchy between redefines clear (= same as module hierarchy).

what model is this? it's not late binding (at least one binding needs
to be present).

TODO:
       * fix 'insert' syntax to something simpler. OK
       * add redefines to the macro syntax

think more about why this is a bad idea.


redefine can be a postprocess step by having it refer to itself first,
then to swap the old and new implementations.

why are assignments bad? no sharing is possible. that's where
parameters are better in some cases: at least the extent of the side
effect is limited. so should i just make each word a parameter?


  - if the name exists, it doesn't get redefined, but the word that's
    returned by the wrapper is swapped with the one defined


(define <name> <body>) ->

(letrec ((macro/super <body>))
  (swap-word! <name> macro/super))

can this be handled by define-ns ?

no.. it needs to be at the module language level: that's where the
define-ns macro is inserted. nope.. it needs to be deeper than that.

got it working with this:

(define (define/swap!-ns-tx define stx)      
  (syntax-case stx ()
    ((swap! ns name val)
     (let ((id (ns-prefixed #'ns #'name)))
       (if (identifier-binding id)
           ;; introduce 'super' as temporary self-ref
           (let ((super
                  (ns-prefixed
                   #'ns (datum->syntax #'val 'super))))
             #`(letrec ((#,super val))
                 ;; swap to undo self-ref
                 (swap! #,super #,id)))
           #`(#,define #,id val))))))

tested with:

(define/swap!-ns word-swap! (macro) dup (macro: super super))
-> expands to
(letrec-values (((macro/super) (macro: super super)))
  (word-swap! macro/super macro/dup))

(macro/dup (make-state '() '((qw 123))))
-> #(struct:state () ((qw 123) (qw 123) (qw 123)))

the module level thing didn't work because the 'require' statements
were not expanded yet, so bindings were not there.

now, the test case with forth lang doesn't work. FIXED: problem was
the introduction of 'super' from syntax context != source.


Entry: parameters
Date: Wed Apr  2 20:39:46 EDT 2008

some words about 
  1) bottom-up VS. late binding. 
  2) augmentation (permanent) or 
     default + specialization (temporary redefine)

(the interesting thing about writing brood/scat is that it brings up
issues that seem quite important on the level of larger scale program
organization)

so, is this specialization of deep components just a dirty hack? 

  + bottom up design with static binding at least solves the
    'undefined' errors: the lowest layer's linking is statically
    checked. having a core that's bottom up makes it easier to develop
    and test: there are a lot of macros and transformers in there. the
    defaults for low-level components could be just for testing
    partial eval.

  + allowing vectored code (making everything a parameter) solves
    having to explicitly _declare_ things as a parameter: some
    pluggable behaviour is necessary in the current approach.


are parameters necessary in the core language? they make sense for rpn:
used in scat/macro/interactive. 

do the macros need a similar approach?
is this all a consequence of the solution, or the problem?

seems the most important question is: OK to let higher level modules
modify parameters, instead of setting them with 'parameterize'?

what about making re-definition explicit in the 'compositions' macro?


Entry: standardizing interface names
Date: Thu Apr  3 07:25:57 EDT 2008

The goal is to make the macros used to define rpn words handle
redefinitions. The previous approach of mutation will be replaced by
parameters. Before that, let's give an appropriate name to toplevel
macros:

'compositions'
'patterns'


Entry: every macro a parameter?
Date: Thu Apr  3 07:39:22 EDT 2008

What about defining extensions as functions that install environment
modifications and run a thunk? This keeps at least the most basic
functionality intact, and allows specific extensions to be represented
by an object.

To come back to yesterday's remarks: i'm a bit uncomfortable with
extension of low-level components without being able to undo
those. Carefully building a bottom-up structure and then starting to
poke around in its innards without 'undo' mechanism seems wrong..

So what is the real reason for needing this poking? The macros employ
target specific optimizations.

This should solve it:

(define (make-macro . args)
  (make-parameter (apply make-word args)))

Hmm.. where to introduce? a first naive replacement violates
something: a lot of code assumes macros are functions.

Maybe in 'make-word' ?

Maybe using dynamic-wind is still the best approach. The problem is
that my representation type isn't abstract enough: i really would like
it to be a procedure mapping state->state.

So.. is dynamic-wind thread-local? Maybe that's the big difference.

So what about this: keep word interface like it is, but provide a
words-parameterize form: if it can't be solved with straight
parameters because they are procedures, solve it on the other side of
the interface.
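as an aside, the dynamic-extent redefinition i'm after can be sketched
in python using context variables: a "word" whose implementation lives
in a ContextVar gets parameterize-style redefinition with automatic
undo. all names here ('dup', 'parameterize', the list-as-stack rep)
are made up for illustration, nothing to do with the actual scat api:

```python
from contextlib import contextmanager
import contextvars

# hypothetical redefinable word: the implementation is stored in a
# context variable, so a redefinition has dynamic extent and is
# undone automatically on exit (like Racket's parameterize).
dup = contextvars.ContextVar("dup", default=lambda stack: [stack[0]] + stack)

def run_dup(stack):
    return dup.get()(stack)

@contextmanager
def parameterize(var, impl):
    token = var.set(impl)
    try:
        yield
    finally:
        var.reset(token)  # undo: restore the previous implementation

print(run_dup([1, 2]))          # default behaviour: [1, 1, 2]
super_impl = dup.get()          # keep the old one around as 'super'
with parameterize(dup, lambda s: super_impl(super_impl(s))):
    print(run_dup([1, 2]))      # temporary redefinition: [1, 1, 1, 2]
print(run_dup([1, 2]))          # restored: [1, 1, 2]
```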

EDIT:

some auto-upgrade is added so it is not NECESSARY to specify that
words are parameters when the auto-upgrade words are
available. however, it is possible to PREVENT words from becoming
parameters by not exporting the auto-upgrade functions.

this looks like it's flexible enough. now to adjust the compositions
and patterns macros to use this.

got 'with-compositions' working, but it needs a 'super' too.

got 'super' working too.

so now the mechanism for extending the compiler is in place: every
word can be replaced in dynamic context + a mechanism for at least
limiting some replacement can be easily installed.

EDIT: got pic18 compiling with redefined core words.  now.. get rid of
the parameters in target.ss and make those into re-definable words
too. then it should be mostly done.

todo:
  * remove target-postpone-* parameters in macro-lang.ss and replace
    them by parameterized words. OK
  * same for split / label? -> NO: special api


 
Entry: uneasy feeling
Date: Thu Apr  3 16:28:44 EDT 2008

something doesn't feel right with this parameterization. however, it
does look simpler..

maybe some mode should be added to automatically extract which
parameters are redefined? but then, 'super' might not do anything.. i
need to think a bit about this.

EDIT: so why is this parametrization necessary? it's a cross-cutting
concern: OPTIMIZATION. it's used not (necessarily) to change
functionality, but to change implementation.

the thing which makes it a bit half-assed is the way in which
responsibilities of the core and the extension (pic18) are
distributed. is there a sound abstraction hidden here?

OK.. so what about automatically collecting all the extensions at
compile time, but leaving them unspecified in the code. what about
doing it the other way around: specifying ALL target specific macros
as an extension, and have them define the parameter if it doesn't
exist yet.



Entry: simplifying ns-tx
Date: Fri Apr  4 10:48:55 EDT 2008

a lot of syntax-case macros get simpler by using let-syntax: this
allows ellipsis to be used instead of explicit list manipulations.

the core 'ns' functionality is really to transform the names in a
symbol introducing macro like let. abstracting this now.


box> (letrec-ns (macro) ((a 123) (b 345)) #f)
scat/ns-tx.ss:94:19: compile: access from an uncertified context to unexported variable from module: "/home/tom/scat/scat/ns-tx.ss" in: make-let-ns-prefixer

after exporting the variable it worked..



Entry: what is an extension?
Date: Fri Apr  4 12:09:02 EDT 2008

so.. i'd like to keep the hierarchical module approach, which means
that several language modules can be built on top of the same code,
without the need for different module instances. this requires at
least that the FUNCTIONALITY of the module's data structures (words)
is not modified. in that respect the current approach with parameters
is OK.

so, an extension is a PARAMETERIZATION of the core compiler, such that
it generates target-specific code using abstractions provided by the
core.

currently, each extension needs to know which words are to be extended
using 'with-patterns' or 'with-compositions' macros.

now, does it make sense to have 'with-xxx' automatically define names
if they do not exist yet, with 'super' bound to an exception?

or should i just forget about all this nonsense and go back to
multiple instances of modules, with permanent mutation of word
structures?

let's read the doc about module instances first.



Entry: module instances
Date: Fri Apr  4 12:27:26 EDT 2008

looks like i'm stuck at a fundamental misunderstanding.. each time
'require' is called, the RHS of define expressions is evaluated. so
ALL the expressions are instantiated! this happens EVERY TIME AGAIN
when code is required.

however, this process is fast if the code is already COMPILED.

how did i work around this with the ns hash before? it looked as if
there was only a single instance.. maybe because i was working in a
single module environment?

in the light of the re-definitions and parameters mentioned before,
what i'm trying to do doesn't really make sense: suppose i want a
PIC18 and PIC30 forth. if i want both of them in the same module,
parameters would be a good approach.

but having them in separate modules would work just as well. they
would represent different specialized instances of the core compiler,
but would have no sharing of data. they could be explicitly put in
different namespaces. in such an approach, modifying the
word-structures in-place is a perfectly legitimate approach.

the parameter words weren't for nothing however: they can still be
used for local parameterizations, like ";".

OK. 'compositions' and 'patterns' now will re-define words, with
'super' bound to the previous implementation.

i guess now it's time to see how to compute code instances?


i tried this:

tom@del:/tmp$ cat A.ss 
#lang scheme/base
(require "B.ss" "C.ss")
(printf "A\n")

tom@del:/tmp$ cat B.ss 
#lang scheme/base
(require "C.ss")
(printf "B\n")

tom@del:/tmp$ cat C.ss 
#lang scheme/base
(printf "C\n")

box> (require "A.ss")
C
B
A

box> (define NS (make-namespace))
box> (parameterize ((current-namespace NS)) (namespace-require "A.ss"))
C
B
A


so, what happens is: each toplevel require pulls in what is necessary,
but instantiates the phase 0 expressions only once. the same require
in the same namespace will not do anything, but in a different
namespace it will instantiate again: values are not shared between
namespaces, but compiled expressions are shared (the global module
registry).

The manual has this to say:

  In a top-level context, require instantiates modules (see Modules
  and Module-Level Variables). In a module context, require visits
  modules (see Module Phases). In both contexts, require introduces
  bindings into a namespace or a module (see Introducing Bindings).

So what's the difference between 'instantiate' and 'visit'? Visit just
looks at the phase 0 names, and evaluates any phase 1 expressions, but
doesn't evaluate phase 0 expressions. Got it.

This brings new light to collecting code. All modules should require
the central code registry, which then defines a *code* variable. Got
every module's code annotated by its name too: this gives all the code
compiled + a possibility to take only what's necessary.


Entry: next
Date: Fri Apr  4 14:36:09 EDT 2008

so what's next?

get the monitor to compile with 'require' instead of 'load' +
provision of symbols. this might introduce some trouble with undefined
symbols. also, the assembler needs some fixing.

that's a bit boring.. what's the most interesting thing to do next?




Entry: beautiful vs. interesting
Date: Fri Apr  4 17:16:35 EDT 2008

i was thinking about beautiful vs. interesting today. if beautiful
means simple on a superficial level, then interesting means simple on
a deeper, hidden level. in programming, things are usually interesting
before they become beautiful.

boring then means non-compressed complexity: that you don't understand
the problem.. it's either simple or interesting. boring can always be
made interesting on a meta-level, right?




Entry: poke
Date: Fri Apr  4 17:22:15 EDT 2008

what about starting poke, as a temporary relief from the pic18 guts?
poke should fit nicely on top of macro. it might also be a way to
improve on what 'assembler' means.

let's port the C code generator first.

the cgen has a special purpose transformer. let's map that to macros,
starting from the bottom. hmm.. doesn't really work that well since
this prints to strings, not s-expressions.

OK.. got indentation working as parameters. now.. go from
s-expressions -> syntax objects so 'syntax-case' can be used. before
that, first needs to be defined what an expression transformer is:

    stx -> string

ok. got it: syntax-case instead of match, but not using the scheme
expander.



Entry: compiler's NEXT
Date: Sun Apr  6 11:27:33 EDT 2008

this should be (source, exp) -> (source+, exp+) instead of returning
exp only when done: that would eliminate the hoop-jumping in
forth-tx.ss

ha: the continuation thunk used doesn't work well with the current
continuation passing approach. -> fixed it by returning a thunk
instead of just the syntax: some extra information was needed (a ':'
means 2 things: 1. end previous def 2. start new one)

makes me wonder if i can fix the splitting in target-rep.ss in a
similar way.


Entry: the assembler
Date: Sun Apr  6 14:43:02 EDT 2008

keep the road open to have assembler opcodes as syntax, to carry over
source code information.

so what does the assembler do?

  * provides chunks of (fully linked) target code representation given
    a namespace of primitive assemblers.

  * resolves target code/data addresses (using a multi-pass algo)
  

what does the assembler not do?
  
  * symbol resolution: all symbols should be resolved by the compiler,
    so there is no need to do any namespace management.

  * code order is determined by the compiler: assembler gets a list of
    word structures.
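the division of labour above can be sketched in python: the assembler
takes an ordered list of words whose jump operands are already
resolved to word objects (no symbol lookup), assigns addresses, and
encodes. the Word class, the one-cell-per-instruction assumption and
the opcode names are all illustrative, not the real data structures:

```python
# minimal two-pass sketch of the assembler's job: assign addresses,
# then encode, resolving word-object operands to their addresses.
class Word:
    def __init__(self, name, instructions):
        self.name = name
        self.instructions = instructions  # list of (opcode, operand)
        self.address = None

def assemble(words):
    # pass 1: assign addresses (one cell per instruction for simplicity)
    addr = 0
    for w in words:
        w.address = addr
        addr += len(w.instructions)
    # pass 2: encode; operands that are Word objects become addresses
    out = []
    for w in words:
        for op, arg in w.instructions:
            if isinstance(arg, Word):
                arg = arg.address
            out.append((op, arg))
    return out

main = Word("main", [])
blink = Word("blink", [("call", main), ("return", None)])
main.instructions = [("jump", blink)]
print(assemble([main, blink]))  # [('jump', 1), ('call', 0), ('return', None)]
```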

should it be called 'assemble!', and see it as a graph update
function?

not boring at all: there's a problem that needs to be solved, and
quite a deep one: what about symbols? well actually, they might
evaluate straight to word structures, which are accessible!


Ha!

evaluation of expressions in the assembler is dependent on 2 different
contexts: whether they are part of call instructions, or part
of literal loads. this seems to solve a big problem, but i can't quite
express it yet...

TODO:
  * evaluation of symbolic code
  * think about side effects OK/not ?
  * where to store assembler opcodes?


Entry: quoting and meta-code
Date: Mon Apr  7 07:36:13 EDT 2008

the problem: quoted labels. somewhere down the line, quoted words
lose their quoted tag, such that macro evaluation doesn't give what's
needed.

' abc 1 + jump

what about getting rid of the symbolic part, and starting with clean
semantics?

 (([qw a] [qw b] +)    ([qw (macro: ',a ',b +)]))

this doesn't work of course, because it recurses.. so what would work?
should it be scat code?

 (([qw a] [qw b] +)    ([qw (scat: ',a ',b +)]))

what is the semantics of a quoted scat word? can it be just a thunk?

 (([qw a] [qw b] +)    ([qw (lambda () (+ a b))]))

thunks are most flexible, but would it be good to somehow limit the
semantics to scat words or macros?

let's restate the goals:
  * obtain a value at assembly time.
  * allow easy composition of meta-code at compile time
  * allow meta code inspection
  * simplify definition of meta-ops (snarfing)

 (([qw a] [qw b] +)    ([qw (scat: ',a ',b +)]))

maybe i need to give up on inspection, and solve that
later. concentrate on semantics first.

   [qw <thing>]

what is <thing>?

it's any VALUE that makes sense at compile time, to be passed around
between macros, but eventually, it should end up as a number.

what is composition of meta code?

in the previous approach, this was done syntactically: just
concatenate lists. is this still a good approach? isn't a more general
abstract approach better?


 (([qw a] [qw b] +)    ([qw (meta: a b +)]))

now, what does 'meta:' do ?

  * produces a single (delayed) numerical value
  * the '+' comes from scat namespace
  * the 'a' and 'b' are lexical parameters.

ok. got it in macro/meta.ss : simple layer on top of scat code which
appropriately quotes lexical variables, and wraps results in a meta
structure to chain evaluation.

seems to work with the pic18 stuff too.

now: meta annotation. something that might come in handy is to figure
out where assembler literals come from. in the old brood, code was
just symbolic. here it needs to be annotated explicitly, because that
information is lost.

problem: meta code has to be thunks: the value can depend on numerical
addresses of words, which might change during the relaxation phase of
the assembler.
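the thunk requirement in a tiny python sketch: a literal whose value
depends on a word's address can only be forced after relaxation has
fixed the addresses. TargetWord and address_of are illustrative names:

```python
# delayed address reference: built at compile time, forced at
# assembly time, after the word's address has been settled.
class TargetWord:
    def __init__(self, name):
        self.name = name
        self.address = None

def address_of(word):
    def thunk():
        if word.address is None:
            raise ValueError(f"unresolved word: {word.name}")
        return word.address
    return thunk

foo = TargetWord("foo")
lit = address_of(foo)   # compile time: address still unknown
foo.address = 0x40      # relaxation settles the address later
print(lit())            # 64: forcing the thunk now succeeds
```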

Entry: tick
Date: Mon Apr  7 10:57:55 EDT 2008

so, with meta quoting out of the way, the real problem can come back
now: computing with word labels. this probably boils down to giving
TICK the proper semantics.

     ' foo

this will produce a literal with a quoted macro. all symbols in
macro/forth code need to be macros, and quoting symbols needs a
different tick.

what does it mean to quote a name? it produces a literal value that
supports an 'unquote' operation.

in addition: it MIGHT support POINTER MANIP if it is a macro that
wraps a call to a word.

so: the previous approach of treating symbol names as word addresses
IMPLICITLY dequotes it to yield a numeric address value.

anonymous macros might be convenient. anonymous words also. what's the
difference?

let's see:
      
      ' foo compile    ==   foo

this is an important issue, and needs some more thought. the
difference between "execute" and "compile" should be cleared up also.

looks like i really need to be careful with AUTOMATIC changes between
macros and words.

NOTE: macros can't survive to the assembly phase, so everything that
used to be a symbol, now needs to be a target-word struct.


does this solve it?

 ;; Get the address from the macro that wraps a postponed
 ;; word. Perform the macro->data part immediately (as a type check)
 ;; and postpone the address evaluation.
 (([qw macro] address)
  ([qw (let ((word (macro->data macro 'cw)))
         (meta-delay
          (let ((pointer
                 (target-word-address word)))
            (unless pointer
              (error 'unresolved-word "~s"
                     (target-word-name word)))
            pointer)))]))


looks like it: 'run' and 'address' are now separate. 'run' doesn't
need to know if the quoted macro represents a target word. 'address'
does need to know that, and fails if it is not.


Entry: optional library code
Date: Mon Apr  7 12:19:22 EDT 2008

Then, the more general problem of requiring runtime words only if
necessary. For example:

 (([qw macro]  run) macro)
 ((run)             ([cw 'runtime-run]))

Since symbols are no longer allowed, this form of late binding needs
to be handled differently. The most straightforward solution is to
have the default throw an exception, and rely on targets to implement
the word.


Entry: bugs
Date: Mon Apr  7 13:03:29 EDT 2008

something went wrong with forth / macro mode: check test/purrr-broem.f
-> fixed: current-mode evaluated at the wrong time.


Entry: assembler
Date: Mon Apr  7 20:08:01 EDT 2008

using a prompt-tag to abort from meta-force: this somehow needs to
give the smallest or the largest instruction, depending on whether
assembly uses a growing or shrinking relaxation.


Entry: more assembler
Date: Tue Apr  8 09:12:35 EDT 2008

About the relaxation algorithm: as far as i understand, it is
necessary that individual instruction sizes move only in one direction
(grow/shrink) to prevent oscillations in the relaxation phase.

2 questions:
   * is this correct?
   * how to ensure?

ok.. i'm going to assume it's necessary to limit size changes. how to
implement this? for this to work individual instructions need to be
tagged somehow.

let's put the responsibility at the machine assembler end, and provide
only a mechanism to record the previous result.

on the other hand, we could pad with nops. OK. got it.
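the grow-only relaxation loop, sketched in python. the branch encoding
(short = 1 cell if the displacement fits in +/-2 cells, long = 2
cells) is made up to keep the example small; only the fixpoint
iteration and the never-shrink rule are the point:

```python
# relaxation: iterate address assignment until stable; a branch that
# ever needed the long encoding never shrinks back, ruling out
# oscillation between the two encodings.
def relax(words, max_passes=10):
    # words: list of (name, target_name), one branch per word
    size = {name: 1 for name, _ in words}        # optimistic start
    for _ in range(max_passes):
        addr, addresses = 0, {}
        for name, _ in words:
            addresses[name] = addr
            addr += size[name]
        changed = False
        for name, target in words:
            disp = addresses[target] - addresses[name]
            need = 1 if -2 <= disp <= 2 else 2
            if need > size[name]:                # grow only, never shrink
                size[name] = need
                changed = True
        if not changed:
            return addresses, size
    raise RuntimeError("relaxation did not converge")

addresses, size = relax([("a", "e"), ("b", "e"), ("c", "e"),
                         ("d", "e"), ("e", "e")])
print(size)  # {'a': 2, 'b': 2, 'c': 1, 'd': 1, 'e': 1}
```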


Entry: relative addressing
Date: Tue Apr  8 13:41:28 EDT 2008

so.. the relative addressing is a bit of a hack. is it possible to
move address resolution down to the assembler opcodes? sure. just have
them depend on 'pointer-get'.

let's port the pic18 assembler, and see if the generation can be
improved a bit.

porting asmgen and trying to get relative addressing, which now
already has overflow detection, to use absolute input.
Maybe meta-catch-undefined can be eliminated by setting undefined
words to 'here', so they compile to a small relative jump
instruction. Maybe just leave that out: it's an optimization, not
essential.

OK. pic18 seems to work too.


Entry: .f -> .bin
Date: Tue Apr  8 17:11:08 EDT 2008

Time to define the purrr18 language, which assembles straight to
binary. Maybe target-word structures should have a 'bin' slot?

ok.. now for assembling on the spot.  or is that not a good idea?

(assemble! (apply append (map cdr *code*)))

maybe it's time to start doing the "load in namespace" thing?



Entry: workspace
Date: Wed Apr  9 13:37:21 EDT 2008

so.. all the static stuff seems to be in place, now for the dynamic
workspace. there are some issues still with multiple instances.

so. why would one want to use a namespace? to use 'require', 'eval'
and 'compile' in a controlled fashion.


Entry: outstanding issues
Date: Wed Apr  9 13:44:48 EDT 2008

  - dead code elimination OK
  - opti-save / pseudo OK
  - variable allocation OK: words got realms now
  - splitting: OK
  - jump chaining
  - org
  - code serialization



Entry: assembler bugs
Date: Wed Apr  9 13:53:52 EDT 2008

something wrong with error handling on eval..

(allot data 123) -> the 'data part is not allowed: only numbers and
meta promises that evaluate to numbers


Entry: multiple compiler passes
Date: Wed Apr  9 14:45:03 EDT 2008

something i forgot about: there is the 'pseudo' and 'opti-save' pass
that goes over the code after the first pass. it might be a good idea
to formalize this a bit.

the real question: why not postpone all optimisations till later, and
have the core language be as simple as possible?


Entry: words vs macros
Date: Wed Apr  9 16:21:56 EDT 2008

now that i've got the target word data structure in my hands, it's easy to
see that these are completely separate from the macros that generated
them. they can be serialized with evaluated meta expressions, possibly
augmented with some annotations as to where the computations came
from. to get at macros, simply load all the source code, but discard
the *code* variable.

so next question: how to serialize a graph of structs? and while we're
at it: what about graphs and functional programming?



Entry: conditional jumps -> more static assembly rep
Date: Wed Apr  9 18:13:49 EDT 2008

something to think about: is it possible to find a common primitive
for 'for .. next' and other loops?

ha.. something else: assembler constants need to be bound: no more
symbolic magic. maybe also a good time to require assembler opcodes to
be bound names + perform arity checks?

also added an opcode check: it's probably best to just replace that
with module name bindings in the (asm) namespace though, so all checks
can be automated. however, that does require either moving the
assembler to compile time, or using namespaces + eval. a big hurdle is
the implementation of the pattern matcher: it uses symbols. since the
assembler namespace is flat, and not so big, and quite constant, it
doesn't really need to be managed.. let's keep it like it is, but add
an arity check.

it would be nice to have some things available at compile time though,
like arity checks. might combine both the symbolic matching AND some
static name binding?



Entry: static vs. dynamic
Date: Wed Apr  9 19:57:49 EDT 2008

i don't know whether this is mostly bias, but it seems that using a
bottom-up approach instead of an ad-hoc, late-bound approach makes
things easier to understand.

i.e.: i didn't realize that the "delay evaluation until assembly time"
is really ONLY about values of target code and data addresses:
everything else can be evaluated earlier. previously this was handled
in a sort of ad-hoc way with evaluation of macros and assembler
labels.

so, what about the assembler? should it be static (which needs some
magic in the pattern transformer) or stick to the symbolic approach?
maybe some middle road: names handled statically, and the rest done
with structure instances.



Entry: matcher
Date: Thu Apr 10 01:14:34 EDT 2008

it's probably best to:
  * represent assembler instructions with structs + explicit type info
  * write a special purpose matcher that takes into account bitfield
    widths.

the idea is this: the asmgen goes from the textual rep -> symbolic rep
-> procedure rep. what about stopping that compilation somewhere and
leaving a little bit of interpretation of the proto?

(define (proto->assembler . proto)
  (match proto
         ((name formals . operands)
          #`(make-asm
             (lambda args
              (parameterize
                  ((asm-error
                    (proto->asm-error-handler '#,proto args)))
                (apply
                 (lambda #,formals
                   (list #,@(map assembler-body operands)))
                 args)))
             '#,proto))))

proto looks like this:
(movlw (k) ((14 . 8) (k . 8))))

moved the asm-error parameterization to assembler.ss resolve/assemble
   
ok. arity check works.

let's move on to making a pattern matcher. the thing to do is to make
it match instances.


need to revert to find other bug. the problem was with '(list-rest',
which behaves differently than the dotted notation in the other
matcher. as far as i can see the following was legal in match:

   #`(#,@'() . rest)  -> rest

however, in the alternative syntax this becomes:

   #`(list-rest #,@'() rest)  -> (list-rest rest)

which isn't the same:

box> (match '(a b c) ((list-rest bla) bla))
match: no matching clause for (a b c)

tricky business

ok, so now the plt matcher is in place, and it should be possible to
start matching struct instances instead of symbols. on the other hand,
it's not so essential: got plenty of checking implemented now.. maybe
move on to real work?



Entry: mature optimizer
Date: Thu Apr 10 13:28:53 EDT 2008

in order to keep the optimizer tractable, it has to be factored a
bit. lets see what we got now:

   (1) compilation + optimization of non-jump instructions
   (2) jump optimizations on intermediate code
   (3) elimination of intermediate reps + save opti

the last step is target specific atm.


Entry: meta eval annotation
Date: Thu Apr 10 13:32:08 EDT 2008

annotation is easily made by replacing undefined addresses with
symbolic references..

ok, not easy, but at least straightforward: target values now have 2
thunks: one that produces real values, and one that gives expressions
referring the target word names.

maybe.. it's better to just use an s-expression language here?
targeting external assemblers' expression languages will probably be
easier using a nested format, instead of a linear one.. for the
built-in assembler, there's no need, since the evaluation mechanism is
abstract.


Entry: conditional jumps
Date: Thu Apr 10 17:22:33 EDT 2008

these are special.. but how exactly?

one of the things i'd like to try is to isolate loop bodies so they
can be optimized. the previous 'amb' based approach (for .. next) is a
bit of a dirty hack, and doesn't work very well with non-flat code as
before..

the for..next opti checks to see which of these produces the smallest
loop body.

       for .. next
       dup for drop .. save next drop

so, what is the pattern here?
  * execute a couple of simultaneous paths
  * choose the best one

this probably needs a purely functional split loop, so continuations
can be used without trouble. let's try that.

what does 'split' do? it calls 'next' and then continues. so it needs
a true continuation.

remarks:

  * split doesn't need call/cc: it just produces a value, and that
    value isn't all that interesting.

  * in the loop body, no assignments can be made: when a split occurs,
    just cons the word and code list together, and perform mutation
    AFTER everything is done.


OK, i think i got it written down.. not sure whether it will work
though: afraid that the continuations in the macro evaluation will
somehow interfere with the update loop..

it should really be seen as 2 tasks communicating.. anyhow. more
later.

i can't get this to work.. probably i'm discarding things i'm
collecting when calling the continuation. maybe composable
continuations work here? but i don't really understand them yet.. it's
like stuff pushed to the return stack..

so why can't this be solved in a monadic way? actually, this is the
same problem as the one i'm trying to solve with passing data
alongside the normal stack: because there's no room besides 'data' i
can't just tuck away more stuff..

looks like this is getting me closer to how to implement the core
mechanism for monadic threading... probably going to learn a thing or
two here. let's concentrate.

  - a macro is a map  (stack,asm) -> (stack,asm)

  - i'd like to extend this to a map (words,stack,asm) ->
    (words,stack,asm) which has a single mixing operator 'split', and
    all the other operations are lifted.

how to do this?

http://community.schemewiki.org/?composable-continuations-tutorial

from plt manual:

  (reset val) => val
  (reset E[(shift k expr)]) => (reset ((lambda (k) expr)
                                       (lambda (v) (reset E[v]))))
    ; where E has no reset

similar:

  (prompt val) => val
  (prompt E[(control k expr)]) => (prompt ((lambda (k) expr)
                                          (lambda (v) E[v])))
    ; where E has no prompt

applied to the monadic bladibla: suppose 'D' performs some mixing with
other state y, but all the small caps operate only on x.


(lambda (x y)
  (a (b (c (D (e (f x)))))))

=

(lambda (x y)
  (let-values
      (((y+
         x+)
        (reset
         (a
          (b
           (c
            (shift abc
                   (let ((x+ (e (f x))))
                     (values (merge2 x+ y)
                             (abc (merge x+ y)))))))))))))


this way, the extra data 'y+' can be passed sideways, not going
through the 'abc' chain.

so.. as long as an expression is wrapped in a reset, 'shift' can get
inbetween. how to use this for implementing lifting? one prompt tag
per lifted thing?



Entry: composable continuations
Date: Thu Apr 10 21:33:38 EDT 2008

some simpler example is needed. suppose we're doing one prompt tag per
threader.

(define (one stack) (cons 1 stack))
(define (drop stack) (cdr stack))

(define (word: . fns)
  (apply compose (reverse fns)))



(define (broem stack extra)
  (define (mix s)
    (shift post
           (values (+ extra 1)
                   (post (cons extra s)))))

  (reset ((word: one mix one one) stack)))


getting tired.. what i want to do is basically: create a mechanism to
chop the code up into chunks, compatible with continuations for
non-deterministic programming, so optimization can be implemented
using 'amb'.

what i sort of see is that it is possible to use shift and control to
chop up a program into different parts, and recompose them. in the
light of writing an RPN program (a b c) as (lambda (state) (c (b (a
state)))) this makes sense: shift can capture what happens AFTER a
certain point, up to where the result is needed.


again.. i think i get what reset/shift do, but can't make the
connection to sidestepping threading. maybe i should try to translate
it to RPN first?


Entry: hiding more stuff in 'data'
Date: Thu Apr 10 23:06:10 EDT 2008

1. there is no difference in trying to extend stack -> (stack,data) ->
   more, so it should work with just a number.

2. in a loop which has extra internal state, compute a composition of
   functions where one of the functions is special, in that it can
   refer to the enclosing state.

can't this be hidden in 'make-state'? let that function perform all
the necessary combinations of state.

what i wonder is how to relate this to real monads: the operation of
"flattening" 2 monadic layers?

  bind : (M t) -> (t -> M u) -> (M u)

  map  :  (t -> u) -> (M t -> M u)
  join :  M (M t) -> M t

map seems really trivial, but join?
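for the simplest monad, lists, both are easy to write down, and bind
is just map followed by join; a small python sketch:

```python
# list monad: 'map' lifts a plain function, 'join' flattens exactly
# one layer of nesting, and 'bind' composes the two.
def m_map(f, xs):          # (t -> u) -> (M t -> M u)
    return [f(x) for x in xs]

def m_join(xss):           # M (M t) -> M t
    return [x for xs in xss for x in xs]

def m_bind(xs, f):         # M t -> (t -> M u) -> M u
    return m_join(m_map(f, xs))

print(m_bind([1, 2, 3], lambda x: [x, x * 10]))  # [1, 10, 2, 20, 3, 30]
```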






Entry: lifting with shift
Date: Thu Apr 10 23:19:44 EDT 2008


 (a (b (C (d (e x)))))

 ab : x -> x
 de : x -> x
 C  : x.y -> x.y

now make   
 abCde : x.y -> x.y

let's try again:

add1 : x -> x
swap : (x . x) -> (x . x)

(define (swap x)
  (cons (cdr x) (car x)))

(define (lift fn)
  (shift post                        ;; capture postproc
         (lambda (xy)                ;; create new function
            (let ((xy+ (fn xy)))     ;; apply fn to its input
              (cons (post (car xy+)) ;; apply post to one of the components
                    (cdr xy+))))))   ;; .. and join again

   * capture the stuff that postprocesses the x component
   * apply the 
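without continuations, the same lifting can be written with explicit
state passing: lift turns a function on x into one on (x . y) that
leaves y untouched, so plain words and mixed words compose in one
pipeline. the compose helper and example words are illustrative:

```python
# lift: x -> x  becomes  (x, y) -> (x, y), threading y through
def lift(fn):
    return lambda xy: (fn(xy[0]), xy[1])

def compose(*fns):
    def composed(v):
        for fn in fns:
            v = fn(v)
        return v
    return composed

add1 = lambda x: x + 1

def mix(xy):               # the one word that touches both components
    x, y = xy
    return (x + y, y + 1)

abCde = compose(lift(add1), lift(add1), mix, lift(add1), lift(add1))
print(abCde((0, 100)))     # (104, 101)
```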


Entry: struct matcher
Date: Fri Apr 11 10:17:44 EDT 2008

in order to make the monad thing work, i'm going to use structure
types only, and write a special purpose matcher that handles nested
structure types with a simpler syntax.


Entry: broken functional compiler
Date: Fri Apr 11 15:03:07 EDT 2008

paste it here, so i can get the imperative back online:


;; To the macro layer, code and labels are distinct entities
;; represented by abstract target-word data type and reversed assembly
;; code lists respectively. After compilation, the code lists are
;; permanently attached to the word structs. During compilation, no
;; side effects are made, so continuations can be used for
;; optimizations.



(define (compile-word input-word)

  ;; Label generation is stateful, but that's ok since we don't care
  ;; much about the counter values. They are just for readability.
  (define next (make-counter 0))
  (define (label (name (format "_L~a" (next))))
    (make-simple-target-word
     (string->symbol
      (format "~a" name))))

  ;; Get the macro code, and create a start thunk and set up the
  ;; grabber parameter.
  (define macro (target-word-code input-word))
  (define name  (target-word-name input-word))
  (define grab-words (make-parameter #f))
  (define (go)
    ((grab-words)
     (macro->code macro name)))
    
  ;; Split needs to be purely functional so continuations can be used
  ;; freely when compiling code, discarding the split word state if
  ;; necessary.
  (define word/code
    (let ((tag (make-continuation-prompt-tag 'compile-word)))
      (set-target-word-code! input-word #f)
      (parameterize-words-ns!
       (macro) ((semicolon (ns (scat) postpone-exit)))
       (parameterize
           ((target-make-label label)
            (target-split #f)) ;; Ensure side-effects are local.
         
         ;; State updates directed by calls to 'split'.
         (let update ((words '())               ;; listof (word . code)
                      (current-word input-word)
                      (continue go))            ;; continuation thunk

           ;; Split needs to be purely functional so continuations can be used
           ;; freely when compiling code.
           (target-split
            (lambda (state new-word)
              (let ((code  (state-data state))
                    (stack (state-stack state)))
                (shift-at
                 tag
                 chunk
                 (update (cons ;; no assignments!
                          (list current-word chunk)
                          words)
                         new-word
                         (lambda () (k (make-state stack '())))))
                 tag))))
            
           ;; After 'macro->code' we end up here to record the last bit
           ;; of code, collect everything and exit from 'go' and thus
           ;; the 'update' loop.
           (grab-words
            (lambda (final-code)
              (cons (list current-word final-code) words)))
           
           ;; Continue computation
           (reset-at tag (continue)))))))
          

            

            

  ;; Link up structures, and return a list of words.
  (map* (lambda (word code)
          (set-target-word-code! word code)
          word)
        word/code))



Entry: next
Date: Sat Apr 12 00:30:14 EDT 2008

got a bit confused by the control operators yesterday. might look at
this link, and some more about cursors..

http://blog.plt-scheme.org/2007/07/callcc-and-self-modifying-code.html



Entry: monads
Date: Sat Apr 12 13:03:46 EDT 2008

so.. from the point of 'map' and 'join', which i think are easier to
understand.

  map:  take a function f:u->t, and turn it into Mu->Mt

  join: take MMt to Mt: undo a 'double wrap'

the key insight is that how many times f is used, and in what order,
is not specified. and for join, it doesn't matter what the wrapping
does, as long as it can be flattened: the wrapping can contain
multiple base type instances, in whatever structure.

maybe 'bind' isn't that hard to understand after all: it takes a monad
Mt and a function that produces a monad Mu from a value t, unwraps Mt,
applies t->Mu as many times as necessary, and combines all the Mu into
a single Mu.

in the map/join version, the ORDER of wrapping is very important. 

((map f) m) == (bind m (lambda (x) (return (f x))))
(join m)    == (bind m (lambda (x) x))
(bind m f)  == (join ((map f) m))

in order to understand this better, i'm trying to implement it
(without looking at other implementations.) see monad.ss
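a quick sanity check of these three equivalences, using the maybe
monad in python (my own scratch version, separate from monad.ss):

```python
# maybe monad: M t is either ('just', x) or NOTHING
NOTHING = ('nothing',)

def ret(x):
    return ('just', x)

def bind(m, f):
    return f(m[1]) if m[0] == 'just' else NOTHING

def m_map(f):
    return lambda m: bind(m, lambda x: ret(f(x)))

def join(mm):
    return bind(mm, lambda m: m)

inc = lambda x: x + 1
m = ret(41)
# ((map f) m) == (bind m (lambda (x) (return (f x))))
assert m_map(inc)(m) == bind(m, lambda x: ret(inc(x)))
# (join mm) == (bind mm (lambda (x) x))
assert join(ret(m)) == bind(ret(m), lambda x: x)
# (bind m f) == (join ((map f) m))
f = lambda x: ret(x * 2)
assert bind(m, f) == join(m_map(f)(m))
print('laws hold')
```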


i'd like to make 'map' and 'join' polymorphic, but that's not quite
possible because of absence of typing information. functions could be
annotated however (do contracts help here?)

(something in the back of my head: in haskell, one can dispatch on the
return type of a function. i'm not sure if that's going to be a
problem here.. EDIT: it's about the unit operation.)



Trying to implement a monad that carries around just an extra scheme
value. This is the simplest thing i can think of.

(define-struct extra-monad (value extra))

(define (extra-map t->u)
  (struct-match-lambda
   ((extra-monad value extra)
    (make-extra-monad (t->u value) extra))))
               
(define extra-join
  (struct-match-lambda
   ((extra-monad (extra-monad value extra-inner)
                 extra-outer)
    (make-extra-monad value ???))))


The problem seems to be in the join operator. Map is simple: just
pass it on. But what does the combination do? An option is to simply
pick one of the 2.
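one standard answer (this is essentially the writer monad, if i have
it right): require the 'extra' slot to be a monoid, and let join
combine the two layers with the monoid operation instead of picking
one. sketch in python with a log list as the extra:

```python
# writer-style monad: a value paired with an 'extra' log list

def unit(v):
    return (v, [])           # empty log = monoid identity

def w_map(f):
    return lambda m: (f(m[0]), m[1])

def w_join(mm):
    # mm = ((value, extra_inner), extra_outer)
    (value, extra_inner), extra_outer = mm
    # the combination: append the extras (the monoid operation)
    return (value, extra_outer + extra_inner)

print(w_join(((42, ['inner']), ['outer'])))  # (42, ['outer', 'inner'])
```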

http://groups.google.com/group/comp.lang.functional/msg/2fde5545c6657c81

     "You can also turn programs in continuation passing style into
     monadic form. In fact, it's a significant result (due to Andrzej
     Filinski) that all imperative programs using call/cc and state
     can be translated into a monadic style, and vice versa. So
     exceptions, nondeterminism, coroutines, and so on all have a
     monadic expression."

maybe time to formulate my question: since call/cc seems to be more
'native' to scheme, why don't i use that instead of monads?

ok.. am i allowed to try again with reset/shift ??


Entry: reset / shift
Date: Sat Apr 12 14:00:25 EDT 2008

trying to do this:

  (abcZdef)  -> (ABCZDEF)

need to do this dynamically, without changing the small caps.. wrt
state, the diagram should illustrate it:

  -a-b-c-Z-d-e-f-
  -------+-------

so i try to use 'shift' to collect the remaining computation, and turn
it into a lifted function.

what i want is this:


(lambda (_) (f (e (d (Z (c (b (a _)))))))) ->

(lambda (_)
 (cons  (cons (lambda (x) (f (e (d x)))) z)
        (c (b (a _)))))

the problem is really termination (the 'null' of the list if you want)


ok.. i get something, but not what i expect..
time to go to a simpler version.

hmm.. very confusing stuff: i understand what happens if there's one
shift, but every next one gives results i don't understand. probably
best to try to write out some examples using the reduction rules.


Entry: static
Date: Sat Apr 12 15:23:48 EDT 2008

if i can't get it to work dynamically, why not provide the information
statically? the only thing that matters is the binding for the
function 'split' in the compiler loop. isn't there a way to make the
forth macros accept this word?

the problem is: there are words defined on top of split, so i'd have
to make all those dependencies static too..



Entry: state.ss / 2stack
Date: Sat Apr 12 17:13:04 EDT 2008

main problem: the 'data' part in state is the thing that's passed
around by all control words. this cannot be the 2nd stack: data needs
to be a stack of 'wrapped things'.. i'm not sure what that means yet,
but it's 'stuff' that gets threaded through computations.

BROOD 6 is probably going to be about doing this with composable
continuations..

i'm going to try to shield access to this data atom. i think i still
don't understand why scat-control.ss needs to have this atom clearly
visible.

trying to define these stack update functions, i find a need to make
the WHOLE state representation explicit again.

ok.. it's the right path, but i'm using the wrong abstraction.

i need a mechanism to just stick something on state to be retrieved
later. in other words, the layers of wrapping need to be made
abstract.

i.e in:    a b c D e f g H i j

if D and H interact with the threaded state in some way, they need to
be able to do that without the lower case functions knowing about the
existence of these things.

different types of things need to work independently. or not? i.e. asm
access only makes sense if the type is actually extended with such
information.

i'm missing some crucial insight here..



Entry: automatic lifting
Date: Sat Apr 12 18:05:29 EDT 2008

i'm looking at this the wrong way..

this all makes so much more sense taking the stance of "automatically
lifting" a procedure whenever it is applied to a certain input
type. that's really the only thing necessary..

so what about turning this around and seeing 'state' as an object with
a method 'apply me'

imagine a conversation between STATE and FUNC. 

STATE: dude, i want to apply you. what's your game?
FUNC:  i take A to A
STATE: hey, i got some A here, i'm going to use you and move on.

so, 'state' should really be a function.

STATE: FN -> STATE


so.. it's the responsibility of the state to interpret the functions
applied to it, and the responsibility of functions to identify
themselves.

(define (make-stack data)
  (lambda (fn)
     (if (stack-proc? fn)
         (make-stack (fn data))
         (error 'type-error))))

this changes the representation from

  (lambda (x) (a (b (c x))))

to

  (lambda (x) (((x c) b) a))


which really looks like RPN code :)


the 'dumb' state would be

(define (dumb data)
  (lambda (fn)
    (dumb (fn data))))
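the same idea in a throwaway python sketch (illustration only, not the
scat code): state is a function that interprets whatever gets applied
to it, so application reads like RPN.

```python
# state as an object that interprets functions applied to it

def dumb(data):
    def apply_fn(fn):
        return dumb(fn(data))  # wrap the result again
    apply_fn.data = data       # peek at the wrapped value
    return apply_fn

# (((x c) b) a) reads like RPN: start from x, then apply c, b, a
s = dumb(0)
s = s(lambda d: d + 1)   # c
s = s(lambda d: d * 10)  # b
s = s(lambda d: d - 3)   # a
print(s.data)  # 7
```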


so.. is this an interpreter? looks like it.. note that in order to
optimize things, some could be unrolled:

(lambda (x) ((x c) (lambda (stack) (a (b stack)))))

this is basically an implementation of the 'map' function: the function
implementing the state object is the monad wrapper M which contains a
type t, and it maps an incoming t->u to Mt->Mu.


summarized:

  * all functions are typed, and do not need to be aware of state.
  * state is completely abstract


maybe this can do all kinds of lifting automatically? i.e. scheme
functions -> stack functions etc..


how hard is it to change this? where do control words fit in, since
state is no longer passed automatically. maybe control words are just
another type that take a continuation argument?


(define (stack lst)
  (lambda (fn)
    (cond
      ((stack/control? fn)
       (stack
         (call/cc
           (lambda (k) (fn k lst)))))
      ((stack/data? fn)
       (stack (fn lst))))))


well.. it's an interpreter for sure. can these conditionals be
eliminated? well yes, if at compile time the type can be
determined.. so is that possible? can functions be typed statically?

there's one problem though: composition: what type does this have?
state -> state, where state is a fn. so there's a difference between
'primitives' and 'composites'.

(lambda (x) ((((x) a) b) c))

EDIT: something's chicken and egg here tho: primitive types and
extensible types. looks like i slammed into the "expression problem",
since i want to extend both the type and the methods.




Entry: questions
Date: Tue Apr 15 19:37:42 EDT 2008

* extensible types: is the inverted approach of previous post a good
  idea?

* reset/shift : there has to be a way to 'split' functions at points
  where other data is injected.

Entry: shift/reset breakpoint draft
Date: Tue Apr 15 21:10:42 EDT 2008

(define tag (make-continuation-prompt-tag 'tag))

(define (make-split [more #t])
  (lambda (inner)
    (shift-at tag
              rest
              (values (and more rest) inner))))

(define x add1)
(define y (make-split))
(define stop (make-split #f))

(define (make-composition . fns)
  (apply compose (reverse fns)))

(define (test fn input)
  (let next ((thunk (lambda () (reset-at tag (fn input)))))
    (let-values (((k v) (thunk)))
      (printf "v = ~s\n" v)
      (if k
          (next (lambda () (k v)))
          v))))


box> (test (make-composition x x y x x x y x x x x stop) 0)
v = 2
v = 5
v = 9
9


EDIT: i get it.. nested shifts will always return the deepest
shift-free expression.


Entry: multiple compilation paths + memoization
Date: Wed Apr 16 09:31:56 EDT 2008

the reset/control is about implementing the forth compile loop without
side-effects; currently it uses a stack (push!). once that is done,
there should be a way to use extensions to compile some sequences
multiple times, and pick the best one.

one of those is for/next. however, with nested loops, care should be
taken not to make the algorithm quadratic. i'm not sure whether
memoization is necessary: explicitly using 2-path execution might be
more interesting. in the 'test' loop before, this amounts to running
one compilation multiple times, each with different code wrapped
around the loop, and picking the best one.


Entry: breakpoints
Date: Wed Apr 16 09:54:05 EDT 2008

the reset/shift approach has the semantics of breakpoints. let's just
call it that, and make the abstraction complete.

the players:
  * (make-breakpoint tag mix [more #t])
  * (with-breakpoint tag fn state0 value0) -> state,value
  * (mix state value) -> state,value

this seems to work well. my only worry is composition: what happens if
there is more than one tag involved? the way to look at this might be
from the outside: a tagged shift only makes sense if it's captured by
a tagged reset, so combinations of tags would be properly dynamically
nested. in that case, i see no problem.


Entry: compiler with breakpoints
Date: Wed Apr 16 14:19:48 EDT 2008

looks like it's working.

now: is this really necessary? it would be nice to understand if it
can be done using parameters and side-effects.

the true test here is of course to try something with continuations,
see where it goes. maybe have a go at for .. next?

Entry: postpone-exit
Date: Fri Apr 18 11:05:58 EDT 2008

hmm.. something wrong with -broem -bla tests, they seem to
hang. problem with mexit.

they were calling each other:

(compositions
 (macro) macro-prim:
 (exit   postpone-exit))


(define-ns (scat) postpone-exit (ns (macro) exit))

renamed to 'compile-exit' goes better with 'compile'


Entry: cps forth
Date: Fri Apr 18 11:31:43 EDT 2008

is there any meat in cps forth? or is this just a way of interpreting?
probably..

cps replaces "CALL" and "RETURN" with "GOTO with parameters". it does
need first class functions though.



Entry: parsing C
Date: Fri Apr 18 12:50:15 EDT 2008

http://eli.thegreenplace.net/2007/11/24/the-context-sensitivity-of-cs-grammar/

of things to do.. i need to have a look at piumarta's packrat
parser. that would be a very interesting addition to brood.



Entry: scat progress
Date: Fri Apr 18 14:07:35 EDT 2008

is going really well. i'm as good as done, except for the interactive
part which needs a bit of re-org. the name space management is a lot
better now. making things a bit more static didn't really hurt.




Entry: new name for purrr
Date: Fri Apr 18 14:24:01 EDT 2008

everybody keeps calling it picforth, but that's already used. what
about PRICFOTH? it already sounds obscene in dutch..


Entry: tethering
Date: Fri Apr 18 14:44:10 EDT 2008

 * compile the monitor
 * port interactive code

maybe it's possible to get rid of interpret/compile mode in console
interaction. maybe some 'auto tether' can be made: not running certain
optimizations so macros can be more easily simulated?

that's quite a challenge..

the problem at first hand seems to be the use of platform-dependent
constructs.. translating forth to pseudo code is trivial, but some of
the language is defined ONLY in terms of assembly code.

the reason to have an interpret mode is to not have to touch the flash
rom. ram-based forths should really just compile and execute, but for
rom-based forths there is room for a separate interpret mode
language. it's also the right spot to introduce tethered commands from
target's perspective.


Entry: compiling the monitor code
Date: Fri Apr 18 15:06:57 EDT 2008

things that are going to pop up:
 - handling the namespace
 - compilation, assembly + serialization of word struct.
 - org


Entry: for .. next
Date: Fri Apr 18 15:45:13 EDT 2008

maybe i need to test this first: compile 2 branches + save the best in
memoized form such that nested loops are computed inside out in linear
time.

           for         body         next
       dup for    drop body save    next drop

so, at the time 'for' executes, it needs to know which of the 2 is
shortest: (body) or (drop body save).

let's call the above: (for0 body next) and (for1 body next1) and
reserve (for) and (next) as the macros that setup the evaluation.

this leads to the following control logic:

  if 'for' can capture 'body', it can try several strategies and pick
  the best one.

can this be done using composable continuations? it would be the first
testing point to see if they mix well. if so, it can probably be
generalized to a lot more control structures.

i worry about nesting: for_o for_i .. next_i next_o would lead to
something like:

(lambda (state)
  (reset (next_o
          (reset (next_i
                  (body
                   ;; for_i
                   (shift i
                          ;; for_o
                          (shift o
                                 state))))))))

maybe they need different prompt tags? looks like it: the inner shift
won't see the outer reset. let's give it a try.

this needs to go deeper: since the rest of the code explicitly needs
to be called inside a dynamic extent.. confused now.


stack:

next_o
next_i
body
for_i
for_o




it's probably best to bring shift/reset to scat.

;; Installs a reset and saves the prompt tag on the stack.
(define-ns (macro) reset/tag
  (lambda (state)
    (let ((tag (make-continuation-prompt-tag 'reset)))
      (reset tag



Entry: composable continuations
Date: Fri Apr 18 17:16:26 EDT 2008

http://schemekeys.blogspot.com/2006/12/delimited-continuations-in-mzscheme.html

  ... four classes of delimited continuation operators ...  are
  referred to as -F-, -F+, +F- and +F+. Dybvig et al. describes them
  as "a classification of control operators in terms of four variants
  of F that differ according to whether the continuation-capture
  operator (a) leaves behind the prompt on the stack after capturing
  the continuation and (b) includes the prompt at the base of the
  captured subcontinuation."

that makes things a lot easier to understand.


Entry: tools + check
Date: Fri Apr 18 20:55:41 EDT 2008

moved code used from zwizwa-plt back to the tools/
directory. granularity is too fine. if i need it in other projects,
maybe best copy/paste.. most is too specific.

should also cleanup sweb to get rid of the stream stuff, and use
something standard.


Entry: serialization
Date: Fri Apr 18 21:44:26 EDT 2008

using scheme/serialize and define-serializable-struct should give
serializable object code if the target-value structs are
evaluated. maybe some annotation should be left instead of the value?


Entry: graphs and FP
Date: Sat Apr 19 10:28:53 EDT 2008

i never quite understood how to deal with graphs in FP. in EOPL
there's a point where circular reference is avoided by delaying
linkage. i think at the point where environments are implemented. at
the time it struck me as odd..

so, what is a graph? it's a map::  node -> (listof node)

whether children are ordered or not, listof can be setof

the problem with graphs is that nodes refer to one another. let's
first try to represent a graph as a tree.

see also zipper:
http://www.st.cs.uni-sb.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf

using lazy data structures, self-reference is easy, and can be
represented by lambda terms, which eventually boil down to the
Y-combinator.

EDIT: the idea seems to be to represent the graph as a lazy structure
that can generate a 'local tree expansion' or something.. a bit like
manifolds and R^n patches.

EDIT: what i'm looking for is called circular programming.
http://www.csse.monash.edu.au/~lloyd/tildeFP/1989SPE/
http://www.haskell.org/sitewiki/images/1/14/TMR-Issue6.pdf

basically: you need lazy evaluation to build graph structures: a pointer
to a structure can be available while the structure itself is as of
yet unevaluated, and as such can reference itself.


Entry: spread the word
Date: Mon Apr 21 19:48:29 EDT 2008

http://www.forthfreak.net/index.cgi?WikiNode

Purrr is mentioned there. Maybe i go around and edit some wikis?



Entry: string -> language
Date: Mon Apr 21 21:26:18 EDT 2008

How to create forth code from a string?

i forgot how the logic works.. 

pic18/lang/reader.ss:
(module reader scat/forth/module-reader
  scat/pic18/purrr18-module-language)

the generic forth reader uses #%plain-module-begin from the specified
module, to declare and instantiate a module body:

(module test "pic18/purrr18-module-language.ss" : abc 1 2 3)
(require 'test)
(print-all-code)
abc:
	[dup]
	[movlw 1]
	[dup]
	[movlw 2]
	[dup]
	[movlw 3]


now, from a string: open the reader module with a prefix:
(require (prefix-in 'forth- "pic18/lang/reader.ss"))


The answer seems to be: forth code lives in a namespace, so in order
to load a file, create a new namespace.

EDIT: got it to work by using:

    (parameterize
        ((current-namespace ns))
      (eval form)
      (eval `(require
              scat/macro/code
              ',name)))

now i can instantiate multiple namespaces, each with their own
language. one problem though: the word structures are not accessible,
because the instances are different.

anyways, this gives a nice border to create the "badnop interface".

now, this takes noticeable time with all modules compiled.

     (ns-print-code (purrr18->namespace ": abc 1 2 3"))

which means something is running during instantiation of the
modules.. maybe it's the tests? maybe it's possible to keep a
namespace around with an instantiated compiler, and re-evaluate forth
code? TODO: split instantiation of compiler, and compilation, to make
way for incremental compilation.

looks like this is the next step: make this easy to use.


Entry: repl during compilation
Date: Tue Apr 22 10:24:22 EDT 2008

in order to have the same debug 'compile' mode as in brood-4, some
access to the asm state is necessary. this needs to be implemented as
a breakpoint word, one which prints out the whole state in a
meaningful way.


Entry: flashforth
Date: Thu Apr 24 16:08:58 EDT 2008

going through the flashforth tutorial, and it seems mikael has been
busy. with some optimizations here and there. it's nice to have an
example like that.

this does bring me to the optimization vs. simplicity trade-off. it
seems difficult to stay at either extreme.


Entry: state extensions through shift/reset
Date: Fri Apr 25 10:22:36 EDT 2008

with this shift/reset thing working for augmenting the compile state
from one straight code list to an assoc list of such, i think it might
be better to do the same with the 'data' element in the core:
everything that uses the 2stack state should use this dynamic
extension mechanism.

to extend state:
   * define mixer words using 'make-breakpoint', referring to a prompt tag
   * wrap such code in 'with-breakpoints'

the place to start is the control words. this needs to be made
independent of state rep anyhow.

CONTROL INTERFACE:
  - state
  - state-cons
  - state-stack


the only one that's really problematic is 'apply' because it performs
both function application on an isolated stack + state merging.

maybe this should ignore state effects? solve later: might disappear
when state threading is done using composable continuations.

  NOTE: the business of 'merging state' in the monads/JOIN operator
  seems to be a generalization of assignment.

ok. so i got all explicit state reference removed. time to wrap the
whole thing in 'with-breakpoints', and turn the low level rep of scat
functions to stack only.


so.. made the change. the remaining question is "where to wrap"?
should all macros have SCAT prototype, or are they converted to a
2stack -> 2stack mapper at some point?



Entry: closed or open?
Date: Fri Apr 25 12:25:27 EDT 2008

the remaining question for the compiler is: how to represent
macros. are they open SCAT functions, or closed 2stack -> 2stack
functions. the problem with the latter is that it can't be composed in
the scat way.

so, let's do this: if composition in SCAT is necessary, prototypes
remain stack->stack + open dynamic refs, otherwise the expressions are
closed using 'scat->2stack'

now, this has implications for the pattern transformers:
pattern-tx->macro will represent a transformer as an open SCAT
function.

ok.. got that sorted out. see k/asm->scat in scat-2stack.ss

now it's time to get in trouble: make-split-update requires access to
the asm state, so it needs to operate on closed macros.

    open macro = scat function (stack -> stack)
    closed macro = 2stack -> 2stack

so, the solution is to not operate on open macros, but on closed ones:
that way the compile mixer has access to the asm buffer.

   (i need some terminology cleanup)

next problem: 'with-exit' requires access to asm buffer.

maybe time for a break.. can't wrap my head around this: m-exit uses a
parameter, so can i create a mix function that references this
parameter? probably not, because the dynamic context of the mix
function is likely outside the scope of the with-mexit.

maybe the macro exit status should be hidden in the overall compile
state. alternatively: m-exit could close over the asm-buffer? i don't
really understand yet..

summarized: how do parameters and delimited continuations interact?

the problem with 'with-exit' as it is now is that without turning it
into a 2stack mixer function, there's no way to access the state.

so, basically, i need a mixer that calls with-breakpoints. does that
work? it really should work.

what is this? local closure.

ok.. i'm going to try first to mix parameters with the breakpoint
control structure, then possibly make an abstraction for this.


got too much in my head again.. macro->postprocessor operates on
closed macros.

ok. i have the impression to be on the right way: this way of
composing does need a better abstracted api.

instead of using 'values' in the state update function, use a
structure type: the type has 2 values.


Entry: better abstraction
Date: Fri Apr 25 19:19:23 EDT 2008

1. the low level sequencer api:
     * make-breakpoint
     * with-breakpoints

2. high level lifting api: consists of adaptor functions constructed
   from state wrap/unwrap functions.


i'm REALLY close to figuring out the relation to monads, but can't
wrap my head around it yet. instead of applying the continuation in
'with-breakpoints', that operation should be abstracted.


Entry: weird bug
Date: Fri Apr 25 22:27:35 EDT 2008

ok. almost working, except that i get 'stack' instead of '2stack' in
the mexit-update function, while 2stack-mexit gets a '2stack'. i hope
this is not a conceptual error with order of prompt tags..

looks like it is.. i should check if it's possible to mix the
open/close constructs.

basically, what i'm doing is this: create a shift with a reset in it.

  (shift (E (reset ...)))

the normal operation is nested shifts:

  (reset (E1 (shift (E2 (shift ...)))))


it really should not be a problem: shift can only see the inner reset,
the only way a reset can disappear in a reduction is when it exits
(returns a value).

(reset val) => val

(reset E[(shift k expr)])
=> (reset ((lambda (k) expr)
           (lambda (v) (reset E[v]))))

;; where E has no reset
--

(reset val) => val

(reset E[(shift k expr)])
=> (reset
    (define (k v) (reset E[v]))
    expr)
                                

i can't juggle with it yet.. maybe make some test cases?

EDIT: i really don't see it.. put in some logging tags, and i don't
understand that order either.. looks like the 'nested closing' doesn't
work.


Entry: in words: mexit
Date: Fri Apr 25 22:28:15 EDT 2008

during the execution of a macro, the word ';' will compile a jump to
the end of the execution, except when it occurs at the end. the word
will be split if necessary.

=>

'except' : remove last jump.

the problem i'm facing is where to store the state necessary to
implement this. was implemented using parameters + side effects, but
want it to be in terms of shift/control to give a purely functional
implementation.


Entry: giving up?
Date: Sat Apr 26 09:15:22 EDT 2008

maybe time to see if this can be installed in the compile state. what
i need there is a stack to trace the dynamic context of macros. it
feels wrong though, to not have this coincide with the dynamic call
stack.. but maybe it is because i'm messing that up, that i need it
stored explicitly somewhere else?

the problem is: what if macros don't exit? am i allowed to jump
outside of context? does it need dynamic-wind? hmm.. if a macro
doesn't exit, it probably also doesn't produce code.

on the other hand: having the mdyn stack available might be interesting.


Entry: next try
Date: Mon Apr 28 14:08:47 EDT 2008

let's make a simpler example first: 2 level nesting of scat->2stack
and 2stack->scat.

works just fine..

(define-ns (macro) test-b
  (lambda (s)
    (printf "test-b ~a\n" s)
    s))

(define-ns (macro) test-a
  (2stack->scat
   (match-lambda
    ((struct 2stack (asm ctrl))

     (printf "test-a ~a\n" asm)
     (let ((out
            ((scat->2stack macro/test-b)
             (make-2stack asm ctrl))))
       out)))))

box> ,scat (require "macro.ss") (macro:: 1 test-b 2 test-a 3) 
toplevel in /home/tom/scat/
with-breakpoints:init #<continuation-prompt-tag:2stack> #<stack> #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
test-b #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
test-a ((qw 2) (qw 1))
with-breakpoints:init #<continuation-prompt-tag:2stack> #<stack> #<stack>
test-b #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
with-breakpoints:next #<continuation-prompt-tag:2stack> #<stack> #<stack>
(qw 1)
(qw 2)
(qw 3)
box> 


so it's somewhere else..
EDIT: i ran into a segfault somewhere..
EDIT: that bug is fixed, maybe try to see if this code now works?
EDIT: ok.. my bug is still there, i'm going to let it go.

Entry: destructive assignments
Date: Mon Apr 28 15:38:55 EDT 2008

so.. why am i doing this? suppose one takes a partial continuation
which has state, does it hang on to this?

(define ((integrate state) in)
  (set! state (+ state in))
  state)

(define k (reset (let ((x (integrate 0))) (x (shift k k)))))

box> (k 1)
1
box> (k 1)
2
box> (k 1)
3
box> (k 1)
4

yes: multiple executions of the partial continuation keep their
state. why would they do otherwise? that's the dragon i'm fighting: i
need an abstraction where the partial continuations are pure
functions.
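the same effect with a plain python closure, no continuations involved
(analogy only): captured mutable state survives across calls, which is
exactly the behaviour i need to get rid of.

```python
def make_integrate(state):
    # the captured 'state' is mutated on every call
    def step(inp):
        nonlocal state
        state += inp
        return state
    return step

k = make_integrate(0)
print(k(1))  # 1
print(k(1))  # 2
print(k(1))  # 3
```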

the important remark here, also related to finding a decent
abstraction instead of the breakpoint one: what i'm doing is
'splitting' a composition. maybe i should go back to that, instead of
the mixer/update abstraction?

basically, something like
  (a b c | d e f | h i j) with-split 

   -> (abc def hij)

the funny thing is, trying to just 

  split = (lambda (x) (shift k (cons k x)))

doesn't give a list, but a pair with value and continuation, which in
turn produces a pair with value and continuation.
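python generators give a rough analogy of what split should produce
(my own sketch, not real delimited continuations): each yield hands
out a value together with the suspended rest of the computation, which
is the 'pair with value and continuation' chain.

```python
# rough analogy: yield plays the role of split; the generator
# object plays the role of the captured continuation
def pipeline(x):
    x = x + 1
    x = yield x      # split: hand out value + suspended rest
    x = x * 2
    x = yield x      # another split
    yield x - 3

g = pipeline(0)
v = next(g)          # run to the first split
print(v)             # 1
v = g.send(v)        # resume the continuation with the value
print(v)             # 2
v = g.send(v)
print(v)             # -1
```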



Entry: simplified sequencer
Date: Mon Apr 28 18:16:47 EDT 2008

maybe an explicit sequencer isn't necessary..


a b c >asm d e f

what should '>asm' do? obtain the continuation (d e f) and obtain the
state from somewhere. so what it should pass to the driver is a
procedure that takes a state and a continuation and produces a state.

mix  (state  k:type->state) -> state 

this looks like 'bind'  (Ma a->Mb) -> Mb
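for comparison, bind for a bare-bones state monad in python (sketch,
names are mine): a monadic value is a function state -> (value, state),
and bind is exactly such a mix: it runs the first computation and
threads the resulting state into the continuation.

```python
# state monad sketch: a computation of type a is a function
# state -> (a, state); bind runs m and threads the state on

def unit(x):
    return lambda s: (x, s)

def bind(m, f):
    def run(s):
        v, s2 = m(s)      # run the first computation
        return f(v)(s2)   # feed value to f, thread the new state
    return run

# tick: return the current counter and increment it
tick = lambda s: (s, s + 1)

prog = bind(tick, lambda a:
       bind(tick, lambda b:
       unit((a, b))))

print(prog(0))  # ((0, 1), 2)
```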


Entry: is it possible to implement mexit as a parameter?
Date: Mon Apr 28 21:58:10 EDT 2008


as long as the parameters are retrieved inside the proper dynamic
context (not inside a mixer!) there should be no problem mixing
_immutable_ parameters with the stitch mechanism. 

_mutable_ parameters are a problem when multiple executions are
desired. for mexit, this includes the exit label reference count
(number of exit points in a macro).

i.e.: suppose some macro executes multiple times, and on each
execution it calls mexit: the reference counts will add up.

but, if this effect can be kept local, there is really no problem: if
macros are wrapped, the state is only visible during execution of that
macro. i can imagine cases where this is violated:

   : bla ...  ( ... ; ... ) ... ;

the code between parens might be grabbed to be compiled in multiple
variants as part of an optimization: here ';' really shouldn't have
any side effects except for the result produced by the variant used.

so, in order to keep the design of the compiler simple, the following
requirements for macros are a good idea:

  * side-effect free wrt. code produced.

  * read-only parameterization allowed: not necessary to be
    referentially transparent

as an exception, side effects ARE allowed if they do not influence the
compilation results (i.e. logging). the reference count tracking of
exit label references violates this.




Entry: practical mexit
Date: Mon Apr 28 22:14:36 EDT 2008

so, can i sidestep the issue and eliminate unnecessary splits at the
end of macros? probably not: will mess up optimizations. local exits
for macros are an exception. is it possible to somehow automatically
ignore the last one, or to scan the code for references? no: if a split
occurs during the execution of a macro by any other means, some jumps
to exit might not be visible.

a b c ; d e f ;

so, what are the tasks:
  * maintain an exit label (dynamic parameter)
  * figure out whether to split or not at the end of the macro

the problem is that ALWAYS splitting is bad, because it interferes
with optimization. checking if the label is reachable should not be
too difficult if the start of compilation can be marked somehow.

let's go back to what i'm really trying to do here: to _emulate_ a
return stack. why don't i just have such a thing instantiated
explicitly in the compiler state, so other return stack operations can
be emulated also?



Entry: struct macros
Date: Tue Apr 29 11:41:00 EDT 2008

it would be nice to abstract away the details for update pattern
matching. this requires some access to struct layout. basically,
generate this from struct names:

(define-sr (compile-update
            (icurrent iwordlist iasm ictrl)
            (ocurrent owordlist oasm octrl))
  (match-lambda*
   ((list
     (struct compile-state (icurrent iwordlist))
     (struct 2stack (iasm ictrl)))
    (values
     (make-compile-state ocurrent owordlist)
     (make-2stack oasm octrl)))))

actually, it's much more straightforward to do this with structure
type inheritance. this requires a deep change though.. first: switch
order of fields in 2stack


Entry: structure types and inheritance
Date: Tue Apr 29 14:35:02 EDT 2008

now i feel stupid

delimited control isn't necessary at all here.. simple inheritance
will do the trick just fine.

one thing i didn't get though is: inheritance works nice for read, but
what's needed is to construct the right output type, so the update
function needs to be abstracted somewhere..


-> all derived structs now have an 'update' function in the first
   field, and a direct constructor as in:

(define update-compilation-state
  (case-lambda
    ((state ctrl)
     (update-compilation-state state ctrl
                               (2stack-asm-list state)))
    ((state ctrl asm) 
     (update-compilation-state state ctrl asm
                               (compilation-state-current state)
                               (compilation-state-words state)))
    ((state ctrl asm current words)
     (driver-make-compilation-state ctrl asm current words))))

(define (driver-make-compilation-state ctrl asm current words)
  (make-compilation-state update-compilation-state
                          ctrl asm current words))



ok.. done feeling stupid. works, and is a lot easier to understand.

this can be implemented more efficiently using lists: less copying,
more sharing. not important atm.
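for reference, the inheritance + functional-update pattern in a Python sketch (assumed illustration; the Scheme code above is the real thing): the state type carries its own update constructor, so generic code can rebuild the concrete derived type without knowing it:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TwoStack:
    ctrl: tuple
    asm: tuple

    def update(self, **fields):
        # functional update: returns a fresh instance of the same
        # concrete type; the old state is left untouched.
        return replace(self, **fields)

@dataclass(frozen=True)
class CompilationState(TwoStack):
    current: str = ""
    words: tuple = ()

s0 = CompilationState(ctrl=(), asm=(), current="main")
s1 = s0.update(asm=("dup",))   # still a CompilationState
```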


this abstraction makes it a bit easier to use:

;; state matcher which introduces 'update'
(define-syntax (state-lambda stx)
  (syntax-case stx ()
    ((_ type (var ...) . expr)
     #`(lambda (state)
         (match state
                ((struct type (update var ...))
                 (let ((#,(datum->syntax #'type 'update)
                        (lambda args
                          (apply update state args))))
                   . expr)))))))

maybe use syntax parameters instead of introducing a symbol?


Entry: summary
Date: Tue Apr 29 19:19:57 EDT 2008

last couple of weeks were dark. what came out?

* using inheritance + abstract factory gives a much simpler solution
  to hidden state threading than composable continuations. inheritance
  solves state read, while abstract factory solves functional state
  update.

* read-only parameters are ok for macros, but mutable parameters or
  mutable closed variables used for determining code output are not:
  if continuations are to be used to perform certain optimizations by
  trial and error, it's best to stick to pure functions. (i don't care
  so much about referential transparency as i care for macro side
  effects.)

* i've got a little more intuitive understanding of monads, and am now
  of the opinion that they are too general for what i'm trying to
  do. also: absence of polymorphism makes them hard to use in
  scheme. and, what i'm trying here might fit better under the arrow
  abstraction, but i'm unfamiliar with that.


Entry: shift / reset and for .. next ?
Date: Tue Apr 29 20:36:56 EDT 2008

what is the reduction rule in rpn?

  shift ... reset  => ( ... ) reset

the problem here is twofold:
  - what prompt tag to use?
  - how to pass the continuation?

it's probably better to use 'call-with-continuation-prompt'



Entry: brood is
Date: Wed Apr 30 18:34:27 EDT 2008

entry://20080329-093950

functionality: RPN SCAT PAT+MACRO FORTH PURRR

types: SCAT:  stack 
       MACRO: + asm stack 
       FORTH: + dictionary / current word / macro rs

implemented using functional data structures.


Entry: next action
Date: Thu May  1 01:49:04 EDT 2008

time is running out, what needs to be done next?

  - fix the for .. next optimisation as a pilot for other partial
    continuation based optimizations. (-> delimited control + lazy code)

  - port the monitor code.
  - port the interaction code.
  - port catkit / sheepsint.

  - find an easy bootstrap for catkit from within brood.
  - serial port interface (PLANET PACKAGE)

  - simulator interface


Entry: simulator and partial evaluation
Date: Wed May  7 12:48:26 CEST 2008

* interaction should really be partial evaluation of machine instructions.

* the assembler should be specified as a simulator (functionality),
  and state dependency (for data flow analysis)

this should be implemented as a separate macro language.

(TODO: look at plane notes again)


Entry: wire protocol
Date: Wed May  7 13:02:14 CEST 2008

it's important to have a look at what exactly goes over the wire. the
minimalistic monitor there is now is nice for minimal complexity, but
something that can be inspected directly has its advantages. i'm
thinking about the prefix notation from pltix.


Entry: namespace stuff
Date: Wed May  7 13:26:58 CEST 2008

it might speed up compilation a bit to separate phase +1 code from the
core routines: right now, the whole scat module gets instantiated
during compilation.

this is back to how it was before moving all phase level 0 and 1 code
into one module for convenience. i've tagged (they print their name)
the modules that instantiate stuff so it is clear that compilation
does not spuriously instantiate any code that only makes sense at run
time, and that run-time instantiation happens only once per namespace.

so, that works pretty well: an instantiated compiler + code dictionary
is represented by a namespace. this is a "synchronous late-bound
object": it takes messages that can alter its state, and returns
values. what is missing is a way to serialize the state as object code
that can be imported again without compilation.


Entry: machine model / partial evaluation and state management
Date: Wed May  7 17:52:08 CEST 2008

the idea is to be able to evaluate (simulate) code off-target, as long
as it only depends on MACHINE state.

[movlw 123] can be translated into: read,modify,write with the update
happening off target. [movwf LATA] is a border case: it can be split
in read,modify,write but it also affects external physical state.

what is required is a clear definition of what simulation means: is it
completely isolated from the 'real' world, or does it just simulate
the computation part of the target? does [movwf LATA] alter the output
of pins, or does it modify some internal model?

it would be best to make this behaviour pluggable: the amount of
'realness' should be configurable.

the modes are:
                  |  STATE     COMPUTATION
 -----------------|------------------------
  (1) stand-alone |  real      real
  (2) tethered    |  real      emulated
  (3) simulator   |  emulated  emulated
  (4) test        |  emulated  real


and, really, you need only the first 3. does the 4th one make sense
during application development? actually not: the CPU is a functional
unit, and can be exactly emulated (in principle, might not always be
necessary: partial emulation can be good enough). this mode DOES make
sense during emulator testing though. (emulating STATE completely
might be impossible since it depends on the external world)

the place to introduce emulated state is in the partial evaluator of
machine code.

so.. what you want is to be able to modify meaning of code depending
on level of simulation. i.e. [movwf LATA] might mean:

   (1) execute the instruction on the target

   (2) simulate the instruction as passive (memory only) machine
       state update on the host + write the state to the target

   (3) simulate the state update as active machine state update, do
       not involve the target. (i.e. writing to the latch might set
       the state on input ports during next instructions.)

   (4) compare the state update simulated on host and executed on
       target

probably i should generalize brood as a framework for pluggable
simulation. this is more general than the previous emphasis on
tethered development, and potentially a _LOT_ more powerful.

it's probably best to focus on memory mapped i/o and synchronous
execution: get it to work for the PIC18 first, then generalize the
architecture. each functional unit can be implemented as a thread.

what you want basically is fine-grained control over what exactly is
executed on the target, and what is not. there is an order relation
hidden here: it's impossible to simulate state update when executing
code on the target. this means there's a directed graph of 'realness'
that can be used as a guide to building a code/data structure to
implement this.

given the program source, it can be compiled for:

     (1) running completely on the target

     (2) running partly on the host + target state update. the latter
         could be plain code execution.

     (3) complete simulation


some remarks here. 

* time-critical software needs to run on-target, so it is important to
  design programs such that they can be tested by virtualizing the
  stimulus (slowing down time): make everything synchronous, that way
  time is an integer and can be abstracted. simulate non-synchronicity
  on top of this.

* the application domain is massively parallel, so the basic unit of
  simulation is a task. PLT scheme has all the necessary tools to
  build this kind of thing. it would be interesting to equip purrr18
  with some libraries to implement state machines and tasks in a way
  that works well with the simulator.

* program compilation = partial evaluation of simulators. i.e. [movwf
  LATA] can be compiled to machine code and executed on the machine
  only if LATA is real. an application will compile to 2 things:

     1. supporting machine code to run on target (i.e. the monitor)
     2. host side entry point, which might sequence simulation

* not so much related, but can 'incremental dev' be used here? only
  recompile parts of target support code that is necessary? this is an
  optimization problem which only needs proper dependency
  management (memoization) and can probably be solved separately.



Entry: simulator problem definition: generalized interaction mode
Date: Wed May  7 18:43:54 CEST 2008

given
   1. (assembler) source code
   2. cpu (functional) and memory/port (state) model

generate:
   1. binary support code to upload, possibly incrementally
   2. a toplevel driver function that starts the simulation


work with assembler source code to keep the machine model simple:
source code simulation is never going to be accurate enough to be
generalized: you want the nitty gritty. this also enables the
decoupling of the compiler and the simulator: external compilers can
be used.

so, should the memory model be destructive or not? this boils down to
the question: what is more important: speed or the ability to have
non-straight line execution? what about going for the cleanest
solution, and have EACH memory location be a port, with memory being a
simple loopback port? so the only state related to memory model would
be the configuration "patch", which is static for a certain
simulation. all other state could be task-local.


Entry: base machine: real or virtual
Date: Wed May  7 20:20:36 CEST 2008

there's no need to do work twice, so what machine should be used for
the old interaction mode? the 3-instruction forth?

the problem is primitives: currently, the primitives are the machine
instructions. so accurate simulation means simulation of those. on the
other hand, it might be more flexible to allow a higher level
simulator to test algorithms.

the thing that gets in the way is premature optimization: part of the
problem to solve is manual machine mapping: currently, purrr18 is more
PIC18 than Forth. maybe something intermediate can be constructed: a
VM that implements a subset of instructions without optimizer?

so problem: find an intermediate language that is easier to simulate
than target (too complex / target specific) or language (too language
specific, underspecified)

EDIT:

this is a serious problem. 2 stances:

  * forget about target simulator: concentrate on programming
    language, and make it clean. that way simulation is easy because
    the primitives can be simple.

  * forget about the language simulator: what you want is a tiny layer
    on top of the real machine, and possibly use multiple languages
    and binary code.

if i have to choose, the 2nd one is really the only practical
solution. the only disadvantage is that it's target specific.. is
there some trick to be able to have both?



Entry: metaprogramming in the real world
Date: Thu May  8 12:35:56 CEST 2008

talking to axel yesterday, and he was saying that he's doing nothing
but writing scripts that write scripts. what does that really signify?
why is metaprogramming so effective?

there's a selling point hidden here..

i'd say: it's so incredibly difficult to build an interface to
extremely parametric code, that it's better to just turn it into a
proper language with its own composition mechanism, such that it is
complete. the metaprogramming then eliminates the tedious step of
making compositions that can't be composed in the base language. or,
it removes the necessity to extend the base language for one specific
problem.


Entry: simulator generator + specification
Date: Thu May  8 13:16:10 CEST 2008

another thing that popped up in the discussion: how fast is the
simulator? can it be specialized? maybe this is an essential point
also: concentrate on creating a simulator generator.

this means the simulator needs a specification language, so it can be
compiled to fast specialized code later.


Entry: loop bodies and delimited control
Date: Thu May  8 14:34:32 CEST 2008

maybe today is not the day.. tired and stupid, i'm not worth much. but
i run into very strange results when trying shift/reset.

why don't these work?

 (define-word shift stack (shift k (cons k stack)))
 (define-word reset stack (reset stack))

reset actually doesn't install a prompt around a computation, because
'stack' will be evaluated before it is reached. it expands to
(call-with-continuation-prompt (lambda () stack))

what i need is a macro that does this. that is unfortunate, because
all code that uses it will need to be macros too. is that really true?

yes, looks like it is: otherwise the code gets evaluated before it's
passed to reset. looks like i need to play with evaluation order a
bit: instead of using strict evaluation, it might be easier to use
lazy.

what about: every scat function takes a delayed computation?

it shouldn't be too difficult to change this in one place only:
wrapping each strict function so it becomes a lazy one.

(define ((lazy-apply fn) thunk)
  (lambda () (fn (thunk))))

or 

  (delay (fn (force thunk)))
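roughly, in a Python sketch (assumed, not project code): wrapping a strict unary function so it maps thunks to thunks keeps the whole composition lazy until the outer thunk is forced:

```python
# lift a strict unary function to one that maps thunks to thunks
def lazy_apply(fn):
    return lambda thunk: (lambda: fn(thunk()))

inc = lambda n: n + 1
dbl = lambda n: n * 2

# compose lazily: nothing is evaluated until the outer thunk is forced
lazy = lazy_apply(dbl)(lazy_apply(inc)(lambda: 20))
result = lazy()   # forces the chain: (20 + 1) * 2
```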

    
Entry: strict vs lazy
Date: Thu May  8 16:57:49 CEST 2008

so, i run into a point where evaluation order does matter: reset needs
to delay its argument.. is it worth it to modify the entire
representation to a lazy one?

the thing is: modifying evaluation order requires macros.. but macros
are viral: any composition of a macro is again a macro.

on the other hand, why isn't EVERY function a macro? with the bulk
using strict application. do i really need to access functions
directly, or is (scat: bla) enough?

or.. why can't composition automatically be macrofied, or, why can't i
have unbalanced parentheses?

it looks like there is really no way around this: in order to capture
dynamic compositions, code needs to be delayed so a prompt can be
inserted BEFORE evaluation.

EDIT: so.. lazy eval. does that give problems with sequenced code? no:
since a scat program is already sequenced (composition of unary
functions) there is no problem here.

when this is done with the datatypes themselves, it should be fairly
straightforward: states are thunks.



Entry: concatenative family (Cat language)
Date: Fri May  9 18:12:11 CEST 2008


it's time to dive into Joy, Factor and Cat again to see where things
are different. especially Cat, since Christopher and me are doing
similar things for 2 years now with little interaction, and with a
slightly different focus. in SCAT:

 * ties to scheme are important. my goal is not to write a stand-alone
   language. hence the choice of PLT Scheme, which is pretty big..

 * SCAT is dynamically typed.

 * SCAT is not linear.

 * MetaCat: i use term rewriting, but in a different part: i see no
   need for SCAT metaprogramming other than introducing
   non-concatenative language elements to support Forth. otoh,
   rewriting is _very_ important in the PIC18 code generator. however,
   the code that is rewritten is symbolic assembly code, not SCAT code.

 * SCAT is only used to support MACRO. it's probably not general
   enough as a full programming language. however, things are easily
   snarfed from Scheme. (with Dave's move to PLT Scheme for Fluxus,
   there is an interesting road to travel there though..)

TODO: relation to Factor and Joy.


Entry: peephole optimizer
Date: Fri May  9 18:25:40 CEST 2008

maybe it's better to separate the machine specific optimizer from the
code generation step? that way the peephole optimizer can be reused
with different languages, and probably be tested using a machine
model.


Entry: assembler expression language
Date: Fri May  9 18:35:17 CEST 2008

what currently is 'target:' might better be written in s-expr syntax,
so that it's easily converted to concatenative syntax. (the other way
is more difficult).

while now it's kind of cute to have this concatenative language map to
a concatenative assembler expression language, later when external
assemblers need to be supported, this might become a nuisance.

also, and probably more important: a distinction needs to be made
about data types and partial evaluation:

  * use scheme's infinite precision types
  * use only target types (i.e. accurately SIMULATE the computation)



Entry: documentation + presentation
Date: Sun May 11 11:42:37 CEST 2008

introduction:

  Brood is a metaprogramming environment for deeply embedded
  programming, starting with the idea: "How to modernize the tethered
  Forth approach?". Forth is appropriate for programming small
  computers, but too low-level for a host-side metaprogramming
  framework. Scheme is ideal for this.

  The second objective is to generalize this to special-purpose
  problem description languages.

what is metaprogramming?
  - use language A to generate code in language B
  - A = B possible, but more likely A > B  (more high level)
  - partial evaluation of the usual language tower, to limit
    complexity of on-target support code. (give up some generality)
  - overall idea: use high-level constructs where possible, but
    specialize to low-level where necessary.

why are macros important?
  - aren't functions enough?
  - partial evaluation: separate compile and run time. (get extra cake
    at compile time without giving up possibility to use highly specific
    code.)

why these weird languages?
  - scheme = clean lisp. lisp's strength:
      * metacircular interpretation (language defined in itself)
      * leads to easy metaprogramming (lisp macros)
      * scheme: based on untyped lambda calculus = functional
        programming with imperative extensions (environment model)
  - forth
      * due to the concatenative composition model, the language
        itself is quite powerful in itself, despite its
        simplicity. (even without dynamic memory management or garbage
        collection. related to innate 'linearity')
      * base language = static, suited for real-time applications
      * efficient: thin machine model for simple sequential chips
        (less efficient for pipelined number crunching processors: a
        dataflow language would be better suited there).
      * simple metaprogramming
      * it has a purely functional + purely concatenative subset

different brood layers:
  - PLT scheme module system + module languages
  - SCAT: purely functional intermediate language implemented as
    Scheme macros.
  - MACRO: purely functional metalanguage on top of SCAT. a MACRO
    program generates (symbolic assembly) code. it includes PAT which
    combines code generation and peephole optimization.
  - FORTH: syntax on top of SCAT or MACRO to provide the
    non-concatenative part of Forth (parsing words like ':').
  - ASM: 
      * target specific assembler generator
      * target address expression language (= SCAT)
      * standard n-pass branch instruction code relaxation
  - LIVE: live target interaction / simulation framework.

why from scratch?
  - to gain deeper understanding
  - to find a natural modularity without tool-specific idiosyncrasies

can it use external tools?
  - yes, but design is optimized for internal tools. (i.e. compiler
    -> assembler interface uses structured data instead of text)

can it use different languages?
  - interfacing on object level: no problem (not implemented yet though)
  - since custom languages are the core business, i see little
    advantage supporting standard languages (like "C") directly.
  - however, purrr is a core component.

why not OO?
  - FP is natural: compiler = a function. maps source code to object code.
  - stateless code generation makes different code generation paths
    easy to implement (output feedback without environment setup)
  - easier to do OO in FP than vice versa.

different implementation language?
  - Forth/C/C++: been there. too low-level while performance payoff
    not so important.
  - Perl: i tried before, but i prefer structured data to strings
  - Java: too clumsy.
  - Haskell: i'm tempted, but probably too little wiggle space to
    evolve a design. The final implementation however might work well
    in Haskell. Scheme's approach is conceptually closer to
    metaprogramming, also wrt. ML.
  - other dynamic OO languages (Python,Ruby,Smalltalk,...): i'm not
    particularly convinced they are better than a FP oriented
    language.


Entry: inheritance for state threading
Date: Sun May 11 14:14:23 CEST 2008

What is the practical reason for using threaded state instead of real
state? to simplify composition of code generators, primarily to allow
multiple applications without causing side-effects. It makes the life
of the optimization implementer easier.

This is probably a spot where multiple inheritance might be
appropriate: if there's no clear hierarchy to state extensions,
forcing one might not be a good idea.


Entry: target expression language (TEL)
Date: Sun May 11 14:46:50 CEST 2008

1. what is it?

The target expression language is the vehicle for expressions that
depend on target labels (static memory addresses), and are passed to
the assembler to be evaluated after static target memory allocation.

In the integrated compiler + assembler architecture in Brood, these
expressions are computations closed over (initially unresolved) target
word structures. For external assemblers, they need to be translated
into strings that represent target assembler expressions. Because of
the need to represent external tools, this language benefits from an
intermediate form. (currently, and only for illustration, this is
symbolic SCAT code, but will probably be replaced by s-expression code
later).

2. where do these expressions come from?

The expressions are generated by the peephole optimizing code
generator, mostly as partially evaluated target code. I.e. the Purrr
code
           ' main 1 +

compiles to the expression which adds 1 to the address of
the "main" procedure word. At compile time it can be determined that
the value can be obtained at assembly/link time, so literal
instructions can be generated. However, at compile time only the
computation can be stored, due to possible dependency on (as of yet
undefined) target label values.


Entry: purrr compile time expressions
Date: Sun May 11 15:01:22 CEST 2008

One of the cool core features of Purrr is the ability to inline
compile time computation without extra annotation. (In Forth
traditionally the words '[' and ']L' are used.) This allows for a very
flexible macro composition mechanism.

However, the computations are performed in infinite precision, and do
not necessarily give the same results as running the same code on the
target would. I.e. in Purrr18:

             1000 30 /

makes no sense on target due to the 8-bit limitation (and the possible
non-availability of the / operator), but makes sense in Purrr18
because the end result is truncated only at the end. What is
compiled is
             [DUP] [MOVLW (1000 30 /)]

because the target is only 8-bit, this sort of computation is very
useful, and really should be the default (over annotated
meta-computations).
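a quick numeric illustration in Python (assumed sketch; note the quotient 33 itself happens to fit in 8 bits, but the intermediate 1000 does not): compute exactly on the host, truncate only when emitting the literal:

```python
host_value = 1000 // 30        # exact host-side quotient: 33
literal = host_value & 0xFF    # truncated to the 8-bit cell only now

# doing every step in 8 bits would destroy the intermediate 1000:
naive = (1000 & 0xFF) // 30    # 232 // 30, not 33
```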

Now the question is: i'd like to build a simulator that can execute
the generated PIC18 code. is it possible to generate intermediate code
that's easier to simulate than PIC18 code, but produces the same
results?

This looks like a very hairy problem: PURRR18 is a 'dirty low level'
language where constructs are defined as-is, and you should be aware
of bit-depth limitations and flag effects etc.

The stay-out-of-trouble part of me says i should stick to a genuine
PIC18 simulator. It can simulate code generated by other means.
Following that approach probably also makes it easier to plug in
external simulators.


Entry: external tool interface
Date: Sun May 11 15:16:15 CEST 2008

To increase the commercial usefulness of Brood, external tool
interfaces are absolutely essential. As a test case, these things
should be present:

  * gpasm + its meta language
  * gpsim

Instead of writing a simulator, it would be a better exercise to skip
the interpreted part and create a simulator generator: this would
allow testing of the C-code generation facility for 2 reasons:

  * an external, possibly C-based interface will be necessary
  * simulators need to be as FAST as possible


Entry: gpasm / mpasm expression syntax
Date: Mon May 12 11:33:28 CEST 2008

(see MPASM user guide, chapter 8 for expression syntax)
http://gputils.sourceforge.net/33014g.pdf

something i didn't know: this language is apparently stateful. there
are accumulating expressions like '+='

i'm not sure whether state accumulation is really necessary though:
most of what it would be useful for can probably be captured by the
compiler, unless it depends on target code addresses. usage will
tell..


Entry: electronics engineers should learn scheme
Date: Tue May 13 00:13:36 CEST 2008

I don't think I know anyone who has written code in some language and
has not at some point realized that the language is not powerful
enough to express a certain pattern, then moved on to writing some
script that actually generates code for that particular language from
a more high-level description (or simply a set of parameters).

The idea is: it's difficult to create a language that will allow one
to describe all possible applications. However, it's not so incredibly
difficult to create a SIMPLE language that's aimed at being easily
EXTENSIBLE using a MACRO language.

So, if you know it's going to happen at some point, why not embrace it
from the start and call yourself a language designer instead of an
application programmer.

This happens especially in domains where hand-assembly is still
important: deeply embedded software.



Entry: documentation
Date: Tue May 13 12:56:32 CEST 2008

Time to start documenting. Let's make it a literate program with proper
online cross-ref + a way to reference the ramblings. Let's make this
into a tool to structure code for refactoring purposes.

Starting with scat.ss


2 kinds of comments:
   * paragraph:              ;; blabla
   * column                  (+ 1 2) ;; add it

comments are attached to an expression.

maybe it's best to avoid column comments entirely?


http://groups.google.com/group/plt-scheme/browse_thread/thread/1e2cae24ec84b70a/b59b55e3990da368?lnk=gst&q=scribble#b59b55e3990da368

That thread has an interesting comment on source code documentation:
you need BOTH reference (per function doc) and general overview /
meaning of a bunch of functions.


Entry: load vs. require
Date: Tue May 13 19:06:25 CEST 2008

the next problem is load vs req.. should i keep load? in the current
.f files a lot is done using late binding: require the includer to
specify some words.. this goes against the bottom-up module
approach. how to solve it?

finding a decent solution for this late binding is quite important:
code generation is heavily parameterized: it's inconvenient to have to
specify.. i'm already using this trick in the core, so why not in the
libs.. the only problem is: it can generate run-time errors.


Entry: org (non-declarative code)
Date: Tue May 13 20:33:58 CEST 2008

boot.f contains non-declarative code that calls 'org'.

so.. how to fix org?

the result should be a word struct that has an assigned address. the
problem however is that the assembler forces addresses. so this needs
a fix in the assembler and the compiler.

the problem with org is that a word can start at a certain location,
but be split after that. the previous mechanism might not be so bad
actually.. 

switching back to pointers = collection of shallow binding stacks.

ok. now how to solve this in forth?
there needs to be room for assembler directives OUTSIDE of code
definitions.

nope.. this violates some entry point stuff..

the org shouldn't be an assembler directive, but some command attached
to the list of words passed to the assembler.

ok, implemented in the compiler: per word, there's one instruction
that can be passed to the assembler about where to assemble the code.

how to specify this in the language? it's really like ':', but
different.. it would benefit from some kind of parameterization.. this
is a tough one: jeopardizing the clean per-word forth defs..


Entry: org
Date: Thu May 15 12:02:03 CEST 2008

so..

: bla 1 2 3 ;

but what about

: 123 1 2 3 ;

where '123' is the address?

the problem is, code outside of a definition is no longer
allowed. the only way parameters can be passed to the assembler is
through instructions inside a target-word instance. maybe the same
route should be followed as for variables? add some pseudo asm..

let's go to the root problem:

  * allow creation of words/macros from within macros
  * allow setting of address of these words

fuck i'm doing language design again..

actually, it's not so bad.. the trick is to make 'org' operate on the
current label. the code to compile a jump at a certain location then
goes:

macro
: install-vec

    ` VEC label   \ create new label
    #x200 >org    \ set current label's org
    do-vec exit   \ compile its code
    org> drop     \ restore org
;


looks like it's working.. the idea: to allow creation of WORDS within
MACROS. note that to create macros within macros a different mechanism
is necessary: introduction of names needs to be done on the Scheme
macro level, so words created as such are not accessible to the bulk
of the code by name.

it's still not optimal.. it gets in the way of straight-line
code.. maybe i should add this concept: 

  code that comes from the compiler needs to be assembled in straight
  line, but the compiler can ask to dump some code somewhere else too.

this should make anonymous code possible too.. argh

maybe this is good enough: the only place where it will get in the way
is re-arranging of code locations by the assembler (or intermediate
step). i.e. the connect-words! function in target-compile.ss won't work.

the real problem is: whenever an org-pop happens, compilation can
continue at the word where the corresponding org-push happened. this
might be a clue about how to implement. the compiler doesn't need to
provide a list of words, but a list of list of words, where the
inner lists have fall-through, and the outer ones are independent.


Entry: org and fallthrough
Date: Thu May 15 14:15:51 CEST 2008

Another example where a collision of two or more seemingly trivial but
annoying problems that resist elegant solution in the current paradigm
leads to a better paradigm.

Because of fallthrough, which is a low-level property of assembly code
i don't like to give up in the PURRR language, the order of words is
important.

However, words that ORG at a different address are independent of
those that came before, and words that EXIT are independent of those
that come after. This can easily be reflected by adding a 2nd level of
nesting in the representation of a target word collection:

        (deque-of (stack-of target-word?))

The operations are:

    * EXIT/ORG: create a new current-fallthrough list (with possible associated address)
    * QUEUE:    move current fallthrough list to the end

What data structure is this?

  * access top element :: (stack-of target-word?)
  * add new element to the top
  * move top element to bottom

So, it's a combination of a stack and a set, implemented using an
asymmetric deque. Stuff popped off the stack is recorded in the set
(and loses its order).

Actually, it might be implemented as 2 stacks directly.
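
That two-stack idea can be sketched in Python as a sanity check. This
is a hypothetical model, not Staapl's actual API: the current
fallthrough chain is one stack, finished chains accumulate in a second
one.

```python
# Hypothetical sketch of the chain collector described above: a deque of
# fallthrough chains, implemented as two stacks.  All names are
# illustrative, not Staapl's.

class ChainCollector:
    def __init__(self):
        self.current = []   # top chain: words in the current fallthrough run
        self.done = []      # finished chains, in order of completion

    def add_word(self, word):
        """Add a word to the current fallthrough chain."""
        self.current.append(word)

    def split(self):
        """EXIT/ORG: close the current chain and start a new, independent one."""
        if self.current:
            self.done.append(self.current)
        self.current = []

    def chains(self):
        """QUEUE: collect all chains; the current (unfinished) one goes last."""
        return self.done + ([self.current] if self.current else [])

c = ChainCollector()
c.add_word('default-bla'); c.add_word('bla-it')  # bla-it is fallen into
c.split()                                        # ';' ends the chain
c.add_word('main')
print(c.chains())   # [['default-bla', 'bla-it'], ['main']]
```

The point of the two-stack shape: only the top chain is ever mutated,
so everything in `done` is immutable from the compiler's point of view.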


Now, there's a deeper problem: this accumulation needs to span across
words, so the point where words are already packaged needs to be
modified to allow accumulation.


Entry: labels
Date: Thu May 15 18:44:27 CEST 2008

Every name that occurs in a source file corresponds to a "define" in
the scheme expansion, and is associated to a macro: a function that
generates code for the named construct. For target words, this creates
a reference to a word structure.

The problem with target words is that they have fallthrough, or
multiple entry points.


Entry: the Forth parser
Date: Fri May 16 10:32:27 CEST 2008

Turns out that collecting Forth into separate definitions isn't a good
idea, because there is no 1-1 correspondence between names that start
with ':' and eventual word structures:

  * there are multiple entry / exit points: words can fallthrough and
    thus are connected

  * it's possible to generate words on the fly: each _label_ should be
    captured into a corresponding scheme definition for the macro that
    generates it, but code in between can be accumulated.

Maybe it's easier to just accumulate everything into one giant
function? Simply using 'compose' on the current structure is probably
enough.

So.. instead of

(define (wrap-macro/postponed-word name loc macro)
  (let ((w (new-target-word #:name name
                            #:realm 'code
                            #:code macro
                            #:srcloc loc)))
    (values
     (macro-prim: ',w compile)
     (lambda ()
       (compile-word w)))))

we can have

(define (wrap-macro/postponed-word name loc macro)
  (let ((w (new-target-word #:name name
                            #:realm 'code
                            #:srcloc loc)))
   (values
     (macro-prim: ',w compile)
     (compose macro (make-target-split w)))))

where everything is dumped into a single macro that generates the
code, to be executed later by 'compile-word'.

this seems to work in first try..
let's clean it up a bit.


Entry: labels and multiple entry points
Date: Fri May 16 15:48:58 CEST 2008

code with multiple entry points like this

  : default-bla 123
  : bla-it      1 + ;

is useful to have, but it's difficult to handle. the problems happen
when default-bla is accessed in isolation, i.e. when moving around
code.

this needs a proper data structure, or at least assembler support..

* split the code into fallthrough chunks = a list of words where only
  the last one is terminated.

* add assembler support for a 'fallthrough' opcode.

the point is: make the data structure such that optimization and code
migration becomes easier to do. fallthrough code should be treated as
a single entity, even if it contains multiple entry points.

operations on word w:
  (A) does w fall into some w' ?
  (B) does some w' fall into w?
this is an extra level of linking between words. (A) is essential
knowledge, but (B) doesn't matter much for the word w itself..

so, each target word has a possible fallthrough word.


what about this: pass the assembler a list of words, where each word
is the head of a fallthrough word chain. the extra compiler state this
requires is a set of independent chains.
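
That linking scheme can be sketched as follows: each word optionally
records the word it falls into (relation A), and the chain heads are
exactly the words that nothing falls into (relation B). Word names and
the dict-based representation are assumed for illustration.

```python
# A sketch of fallthrough linking: map each word to the word it falls
# into, or None if it ends with an exit.  Names are illustrative.

words = {
    'default-bla': 'bla-it',   # default-bla falls into bla-it
    'bla-it': None,            # ends with an exit
    'main': None,
}

# A head is a word that no other word falls into; the heads are what
# would be passed to the assembler.
fallen_into = {w for w in words.values() if w is not None}
heads = [w for w in words if w not in fallen_into]

def chain(head):
    """Follow the fallthrough links starting from a head word."""
    out = []
    while head is not None:
        out.append(head)
        head = words[head]
    return out

print([chain(h) for h in heads])   # [['default-bla', 'bla-it'], ['main']]
```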


Entry: compiler state update
Date: Fri May 16 16:18:48 CEST 2008

state is growing larger so pattern matching isn't the best way of
updating it. maybe best to use local mutation: copy the state on
entry + perform imperative update.

hmm.. that looks even more ugly due to long names and explicit get/set

this just needs to be factored. 

ok, with some factoring (struct dict) it's all a bit more readable:
compilation generates a list of list of (word code) inside a dict
struct, which will be collected into a list of head words that have
the code fallthrough structure recorded.

next: the 'org' operations + fallthrough disconnect on jumps.


Entry: redefining words + compiler build log.
Date: Sat May 17 11:39:19 CEST 2008

I need a proper explanation about why it's good or bad to redefine
words. This is about installing 'hooks'.

The problem with hooks is that they can get difficult to
understand. Let's add some warning to this redefine process.


Entry: implementing 'exit' chunk splitting
Date: Sat May 17 13:23:52 CEST 2008

basically, this needs to:
   * split with new label = dead code
   * collect current chunk

ok.. this, together with factoring out the target-post code and dead
code elimination, which is now as good as free, seem to work fine.

next: org

small fix: instead of eliminating dead code, it's better not to
generate it in the first place: the compiler will drop code that is
not associated to a label.
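
A minimal sketch of that fix, with assumed instruction tags: code is
only collected while a label is "open", and an unconditional exit
closes it, so dead code never gets generated in the first place.

```python
# Hypothetical sketch: collect instructions into labeled chunks, dropping
# code that has no label to attach to.  Tags ('label', 'qw', 'exit') are
# illustrative, not Staapl's actual opcodes.

def compile_chunks(instructions):
    chunks, label = [], None
    for ins in instructions:
        if ins[0] == 'label':
            label = (ins[1], [])
            chunks.append(label)
        elif label is None:
            continue                 # dead code: no label to attach it to
        else:
            label[1].append(ins)
            if ins[0] == 'exit':
                label = None         # anything after an exit is unreachable

    return chunks

code = [('label', 'f'), ('qw', 1), ('exit',), ('qw', 2),   # (qw 2) is dead
        ('label', 'g'), ('qw', 3), ('exit',)]
print(compile_chunks(code))
```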


Entry: conditional assembly
Date: Sat May 17 17:15:31 CEST 2008

Something that's not implemented yet: elimination of "<literal> if",
which reduces to an elimination of an "or-jump". This requires some
thought, but should not be too difficult to do.


Entry: Jump chaining
Date: Sat May 17 17:18:16 CEST 2008

This needs to be performed at assembly time due to delayed
computations. It's straightforward though: examine the opcode at the
start of the target word, and check if it's an unconditional jump.

This optimization might introduce new dead code.. Is it possible to
move it somewhere else?
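
The chaining itself could look roughly like this: follow words whose
first opcode is an unconditional jump until a non-jump is reached,
guarding against cycles. The 'jw' mnemonic and the lookup-table shape
are assumptions for the sketch.

```python
# Hypothetical sketch of jump chaining: redirect a reference through
# chains of unconditional jumps ('jw' here) to the final target.

def chase(word, first_opcode):
    """first_opcode maps a word to the (opcode, operand) it starts with,
    or is missing for words whose head isn't known yet."""
    seen = set()
    while word not in seen:          # cycle guard: a jump loop stays put
        seen.add(word)
        head = first_opcode.get(word)
        if head is None or head[0] != 'jw':
            break                    # not an unconditional jump: done
        word = head[1]               # follow the jump
    return word

first_opcode = {'a': ('jw', 'b'), 'b': ('jw', 'c'), 'c': ('qw', 42)}
print(chase('a', first_opcode))   # 'c'
```

A word whose only content is a jump becomes unreferenced after this
pass, which is exactly the new dead code the entry worries about.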


Entry: implementing org using new datastructures
Date: Sat May 17 17:25:38 CEST 2008

org is a property that can be attached to a word chain. what i'd like
is a way to "inline" code that creates different code chains, without
affecting any optimizations. this would also be handy for anonymous
words.


Entry: org again
Date: Sun May 18 12:23:03 CEST 2008

It's not so simple, because there is no way to guarantee that only a
single chunk is going to be compiled: the problem is 'org-pop', which
needs to restore the current/code/chunk state.

The right way to solve this is to dump the whole structure on the
control stack, and restore it on org-pop.

OK: push-chain and pop-chain work. it's now possible to create new
word chains while compiling another one.

It's funny how Forth syntax's inherent lack of nested structures makes
you appreciate the simplicity of s-expressions. However, it's easily
solved by introducing balanced tokens.

So, what about this: if a label's name is anything else than a symbol,
it will be evaluated and used as code location.

Some more changes: forth/forth-tx.ss will now save the prelude code
under a #f label. scat/ns-tx.ss is changed so #f labels do not have an
associated define, but DO evaluate their expressions for side effect
(which will define a target #f word)

Changed wrap-macro/postponed-word to not create a target word struct
if there's no word name.


Entry: delimited control again
Date: Sun May 18 17:11:49 CEST 2008

Now.. hold that thought. Maybe it is better to introduce a proper
nesting structure to get at macro code..

observe:
  * 'reset' needs to be a macro
  * 'shift' can be plain code

what about making reset ']' and put the logic of shift in the
balancing word? i.e. for[ 1 2 ] 

Maybe it's best to return to shift/reset from semantics point, and not
from the particular implementation scat->scheme i'm using.


Entry: Labels and code
Date: Sun May 18 18:48:18 CEST 2008

so.. maybe it's time to forget about splitting the target code into
words? is it a false abstraction?

not really.. but the current fallthrough mechanism does look a bit
clumsy. the problem wrap-macro/postponed-word solves is the creation
of wrapper macros, which is quite essential as it allows ALL names to
be handled by the PLT module + lexical scoping..

whatever representation is used for the code that generates the
assembly input is moot. currently it's this:

          (#f . prelude-macro)
          (word0 . macro0)
          (word1 . macro1) ...

the macro<x> are then wrapped with a split (label) and concatenated
again.

a different, possibly simpler implementation would be to collect all
names separately, and define a single big macro that generates the
module's code. the problem here is that inline macros need to be
handled differently...

so let's stick to the current implementation.

  * macro definitions are clearly delimited: one name for one macro,
    no multiple entry points.

  * forth definitions have multiple entry points + there's an unnamed
    prelude at the beginning of the file. (names merely interleave one
    big macro that generates the code body)


Entry: org again
Date: Sun May 18 19:57:01 CEST 2008

ok.. org-push and org-pop now work: they will compile a single chain
of words.. however it's not what it should be!

  * it's still impossible to SET org permanently.
  * multiple chains will 'org-pop' by themselves..

why is this so difficult to get right? probably because i'm trying to
keep the effect local, while org is really a global effect on the state
of the assembler.

so... can org-pop somehow guarantee there's only a single chunk
compiled? no..

we'll get there eventually.. just need to find the right abstraction.

another problem: compiling a jump table will look like a bunch of
unreachable code.. (a jump table is a bit of a hack)

the jump table is easily solved by using a different word to separate
entries, which could enable some extra checking..

so, to look at this from the bright side: requiring a restricted
bondage-style structure for the compiler exposes a lot of corner cases
that exploit side-effects of low-level constructs. such side effects
need to be eliminated: the core needs semantic simplicity, where
semantics is close to machine semantics for data operations, but
closer to abstract semantics for control structures.


Entry: jump tables
Date: Sun May 18 20:40:00 CEST 2008

Until now these have been an abuse of ';', which breaks with the new
dead code eliminator. So, how to fix that?

     : dispatch route
          read ; write ; help ; reboot ;

the last 3 jumps will be eliminated.

so a different macro is necessary. something like


     : dispatch route
          read , write , help , reboot ;

with a bit of abuse of notation, comma is polymorphic and operates
both on CW and on QW. on CW it compiles a jump without exit.
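
A sketch of that polymorphic comma, using assumed tags: 'cw' for a
quoted code word and 'qw' for a quoted literal, matching the QW/[dw]
behaviour described in the next entry.

```python
# Hypothetical sketch of a polymorphic comma: on a quoted code word (CW)
# it compiles a jump without a following exit, so the next jump-table
# entry stays reachable; on a quoted literal (QW) it compiles a data word.

def comma(entry):
    tag, value = entry
    if tag == 'cw':
        return ('jw', value)    # jump, no exit: no dead code after it
    elif tag == 'qw':
        return ('dw', value)    # literal compiles to a data word
    raise ValueError(tag)

table = [('cw', 'read'), ('cw', 'write'), ('cw', 'help'), ('cw', 'reboot')]
print([comma(e) for e in table])
```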


Entry: quoted macros (the 'address' word)
Date: Sun May 18 22:40:40 CEST 2008

There's a problem with 

  ' abc

This really should produce [qw #<word>], where #<word> is the SCAT
word representing the macro that postpones the compilation of the
word, such that (' abc compile) == (abc)

Note: choose 'run' instead of 'compile'.

But there's one problem. What is this?

   ' abc ,

On QW values, comma will always produce a [dw], so this should compile
the address of a function, and fail if it's a generic macro.

Ok, i solved it before, it's "address".

So, i wonder.. Can't this be done automatically, as part of a
postprocessing step before things are handed to the assembler? or as
part of target evaluation code?

What's the idea here:

    postpone the conversion from macro -> target-value as long as
    possible, because the former is more general, but cannot survive
    the assembler. the problem is the inclusion of such values in
    assembler expressions: in that case the expression evaluator needs
    to be aware of them.


target-rep can't know about macros (to simplify design), so either/or:

  * catch all macro instances before they go into a (target: ...)
    expression or end up as a plain macro in the assembly code.

  * use explicit 'address' after using the tick operator.

  * give target-rep a means to evaluate macros.


the middle one might be best.. that way representations of words
(quoted macros) are different from addresses. essentially, they are..


Entry: i need this to be done
Date: Sun May 18 23:27:34 CEST 2008

I'm a bit fed up with mucking about in the low level architecture.
Apparently, a sane combination between high level constructs
(i.e. code graphs) and low level features such as fallthrough make
things complicated, and lead to some tough choices. Anyway, it does
look like I'm at some kind of end point with this. It's still quite
elegant and powerful.

One point needs some more exercise: construction of anonymous
macros. This probably needs a move to a lazy architecture for macro
representation.

Maybe instead of concentrating on for .. next and dynamic macro
creation, i should really concentrate on static anonymous macro defs
first..

The words [ and ] are not used yet. Let's turn them into static
anonymous macro creators.

EDIT: see, it's getting big before it's documented. this facility is
already there, but using the s-expression syntax:

box> (macro:: 1 2 (3 4 5) run 6)
(qw 1)
(qw 2)
(qw 3)
(qw 4)
(qw 5)
(qw 6)

ok, that was straightforward: (forth/forth-tx.ss)

(define (open-paren-tx code exp)
  (let-values (((code+ rep)
                ((rpn-represent) (stx-cdr code))))
    ((rpn-next) code+
     ((rpn-immediate) rep exp))))

(define (close-paren-tx code exp)
  (values (stx-cdr code) exp))


NOTE: this opens the road for a lot of functions expressed as hof,
i.e. ifte.

Entry: code annotations
Date: Mon May 19 11:54:18 CEST 2008

it's really not working well to derive a symbolic representation from
the code: sometimes there just isn't any, due to the effect of macro
transformation. it's probably best to just store the source location,
since it's only for documentation purposes.


Entry: purrr
Date: Mon May 19 12:08:46 CEST 2008

so what's special about purrr? if i'm to explain what this is about,
the purrr language itself is rather central to the idea.

 -> partial evaluation
 -> extensive use of macros
 -> functional metalanguage


Entry: instantiate left-over macros
Date: Mon May 19 12:38:22 CEST 2008

Maybe it's possible to leave quoted macros in the code and instantiate
them? this would be a really powerful extension. Can be combined with
turning local exit points back into return/jump ops.


Entry: assembler directives
Date: Mon May 19 13:29:15 CEST 2008

The brood assembler has relatively few assembler directives. This is
intentional: the assembler performs ONLY linking and relaxation (and
in the future possibly related operations that optimize these
processes, such as code reordering.)

However, in the PURRR language, some control over code location is
desired. How to satisfy

  - control over address location
  - chained code to facilitate re-ordering

Yes, it's 'org' again..

Maybe it's best to let 'org-push' save the chain list too, that way
'org-pop' can ensure there is only one chain, which is what we want..

(maybe this should just push everything, making the internal compiler
state accessible to some macros?)

Trying: pop-chain will save the recorded chains as only one
chain. This works: it at least ensures compilation at the correct address.

So, this fixes the chain bug for org-push/pop, but still doesn't
provide an 'org'. Maybe this needs to be specified somewhere else?


Entry: next
Date: Mon May 19 21:09:55 CEST 2008

look at plane notes, and entry://20080501-014904

two deep problems remaining:

  - how to solve 'org'  (or, is assembler state access allowed?)
  - dynamically decompose macros (loop optimizations, lazy code)

the rest should be straightforward. i do not have the energy to tackle
any of them atm.. can they be ignored, and postponed until after
porting of the interaction code?


Entry: strict/lazy and macros
Date: Tue May 20 02:31:56 CEST 2008

so.. in a lazy language, fewer macros are necessary because evaluation
order isn't much of an issue. in a strict language, the existence of
'if' and 'lambda' as special forms infects certain constructs (which
then have to be special forms also)

now, where would a lazy language need macros? there is template
Haskell, so i guess there is some need for metaprogramming..


Entry: use of monads in dsl implementation
Date: Tue May 20 11:24:27 CEST 2008

http://www.cs.yale.edu/homes/hudak-paul/hudak-dir/ACM-WS/position.html


Entry: constants and the 'parameter' word
Date: Tue May 20 13:39:41 CEST 2008

Constants are a sort of typed macro: they represent literal values,
but are not necessarily completely defined in the core compiler:
(re)definition of constants is necessary to obtain a specialized
compiler that can generate code.

The problem i run into is where to generate the error for undefined
constants. Currently, target-value evaluation uses target-value-abort
to signal a value is not available. However, this should really be
reserved for labels only.

-> fixed: partial evaluation of target-values NEVER calls the actual
   evaluation: this means make-constant can pass code straight to the
   assembler (currently: just the constant name)

fixed another problem: wrap-macro/mexit requires the state to be
compilation-state, while macro->code from 2stack.ss works only on
2stack. added a mechanism to temporarily wrap the 2stack state in a
compilation-state object.

so.. this can be moved to forth. simply add a word 'parameter' which
will create a constant that's later to be redefined. a parameter is
something more than a mere macro: it has a guarantee to produce only a
single value. (i'm thinking about things like 'fosc' and 'baud')

  parameter baud
  parameter fosc

i can't call it 'constant' because of confusion with the way that word
works in standard forth. so:

  A parameter is a stub macro that produces a single literal
  value. These serve to parameterize low-level code without resorting
  to more explicit parameterization. (i.e. 'fosc' might influence a
  lot of timing related constants)

  A parameter that is actually used to generate code needs to be
  (re)defined as a macro that produces a single value. Otherwise an
  undefined-parameter exception will be thrown at assembly time.

  Parameters are thus a somewhat controlled violation of the overall
  bottom-up structure of Purrr code.
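
The parameter contract above can be modeled in a few lines. This is a
hypothetical Python sketch, not Staapl code: a stub that can be
referenced freely during code generation, but raises an
undefined-parameter error at "assembly time" if it was never
(re)defined.

```python
# Hypothetical sketch of the 'parameter' word: a named stub that must be
# (re)defined with a single literal value before it is evaluated.

class Parameter:
    def __init__(self, name):
        self.name, self.value = name, None

    def define(self, value):
        self.value = value        # (re)definition: a single literal value

    def evaluate(self):
        # Evaluation happens late, at assembly time in the text's terms.
        if self.value is None:
            raise NameError("undefined parameter: " + self.name)
        return self.value

fosc = Parameter('fosc')
baud = Parameter('baud')
fosc.define(40_000_000)
print(fosc.evaluate())            # 40000000
# baud.evaluate() would raise NameError, the undefined-parameter case
```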


small prob: the 2stack / compile-state code gave problems again. what
i'm doing now is to avoid that problem and require parameters to be
defined in forth code with mexit support.

EDIT: another problem with parameters: redefining them with another
parameter is not a good idea: basically, there's a sequential element
here (load).


Entry: bored
Date: Tue May 20 19:53:21 CEST 2008

i'm getting thoroughly bored with this.. need to find some new tricks
or get something going, because i'm losing motivation.


Entry: hooks / late binding and kernel modularity
Date: Wed May 21 15:06:03 CEST 2008

this problem is more serious than i thought. it's not only used for
constants, but also for generic macros. maybe add something like a
'macro-hook' which is like a parameter, but doesn't guarantee anything
about code generation.

i need to think a bit deeper about linking and modularity.. the pure
bottom-up approach won't work well. maybe the 'unit' approach is
really better?

is it possible to import a module called 'link.f' that implements the
cyclic name resolution?

am i going to be anal about names? i'm already going quite far with
early binding.. consistency counts.. the 2 uses are:

  * compiler extension by redefining some core macros (i.e. 'dup')
  * code parameterization (both constants and generic macros)


wait a minute..

if code generation can be postponed until all the macros have been
loaded, then simply adding stub macros that are redefined later would
work just fine.

maybe best to take the inelegance and get the damn monitor to run.. in
essence, it's a problem with the .f source code. any abstraction
necessary to make that code more modular can be added later.

EDIT: it really gets in the way.. 'route' gives problems. but that can
be imported? this problem is solvable, but requires some thought..


Entry: another layer?
Date: Thu May 22 11:24:05 CEST 2008

I was thinking about putting target-compile.ss in forth/ because it's
mostly about extending the macro/ stuff with features necessary for
code instantiation and target label management. Or, it should be
placed in compile/

Can macro-lang.ss be made independent of target-compile.ss ?
Yes, when macro-lang.ss splits off a target-label specific part. This
code should move to compiler/  (which is badnop/ now -> get rid of
that name)


Entry: units
Date: Thu May 22 11:38:11 CEST 2008

I need separate compilation with clearly defined interfaces for some
components. One would be the logger: since it cuts through everything,
changing its code requires recompilation of the whole codebase.

A nice excuse to try to understand units, then to move on to using
this for .f files too.


Entry: another bug in redefine
Date: Thu May 22 14:22:19 CEST 2008

whenever a word is created, it creates a replacement macro. this macro
should have redefine enabled also.

( i enable mutation and things start going wrong.. )


Entry: no more juice
Date: Thu May 22 21:13:30 CEST 2008

Looks like i need to take some time off of the project, do some other
things. Looking at what i did in the last 2 weeks:

   * delimited continuations for loop body optimization: strict vs. lazy

   * trying to fix org, it's still not fixed (language design issue: i
     have no way to annotate this in the current structured
     representation)

   * struggling with specialization (redefine + super) and plugin
     behaviour.

   * trying to write documentation for the project

   * thinking about simulators, and simulator generators

   
The good things that happened: cleaned up compiler data structures +
separated postprocessing optimizations. Those look nice now. The rest
was a random walk, however, the EXTEND and LINK problems are quite
important, and as far as i can see the only real hurdle.



Entry: more juice
Date: Fri May 23 09:30:49 CEST 2008

got a good night's sleep + some ideas about writing documentation

today: 

- write more docs: create a reference doc extracter
- separate some code and change names to make the module hierarchy more clear
- write something about forth and closures


Entry: introduction documentation
Date: Fri May 23 09:31:17 CEST 2008

- It is about language:

     * Lisp (more specifically PLT Scheme)
     * Forth (the Purrr dialect)

- It is about Meta-language: Macros

     * S-expressions (Lisp) and concatenative syntax (Forth) are easy
       to process. It's possible to make an all syrup Squishee.

     * Forth, viewed as a functional language, has an arbitrary
       evaluation order. This presents an opportunity for generating
       static, specialized low-level code from high-level templates by
       employing implicit compile-time evaluation. The Purrr
       experiment is about making Forth more declarative.
       
- Design is accessible.

     * Unit of composition = Scheme module.
     * Forth source files are Scheme modules
     * Design is layered: Scat, Macro, Compiler, Forth syntax, ...

- Goal
     * Small business and enthusiasts first
     * Test in industry setting (needs a specific problem to solve)


Now, this should be elaborated in a couple of chapters, with lots of
examples.

Entry: Community bootstrap
Date: Fri May 23 11:23:18 CEST 2008

A plan to attract developers. What is necessary?
 - it should work relatively flawlessly for 1 target
 - it should be well-documented
 - extensions should have a clear API

The first two are mostly perspiration. The real challenge is to
standardize some APIs. I am hesitant though about standardizing too
much: the aim of the project remains the construction of a tool for
'from scratch' development.


- Purrr language extensions

    These are at library level, and can be developed separately from
    the core project. Purrr standardization is mostly ad-hoc, but due
    to PLT's module system, glue layers are fairly straightforward to
    maintain, and standardization can be made the responsibility of
    the system designer.

- Processor extensions:

    Target-specific extensions can be separated entirely from the
    Staapl core. The PIC18 architecture specification is an example of
    this. It boils down to:

    * Creating an assembler. This can be a layer on top of an external
      assembler, or a specification in the assembler generator
      language. (see pic18/asm.ss)

    * Creating a set of macros (compiler extension). Map the Purrr
      language to the machine structure (data stack, return stack) and
      implement the primitives. (see pic18/macro.ss)

    The interfaces for these operations are now fairly standard.

- Internal compiler extensions:

    Changes to Scat, Macro or the compiler would best be incorporated
    as core changes. Such changes can be temporarily forked, and
    later merged into the main distribution. However, I expect this
    part of the program to be relatively stable.

- Simulator extensions:

    As an addon to the interactive part of Staapl, a simulator
    (generator) would be nice. The interface for this still needs to
    be developed. This could be written as a processor extension.

- Extra languages:

    A future goal is to construct a static dataflow language to
    supplement Forth code for building DSP applications. How to
    structure this isn't clear yet.
    
- Application spin-off communities

    I'm thinking about Sheep, CATkit and KRIkit: the applications that
    have been used to battle-test the Purrr language. This would
    involve some kind of Purrr library standard. I'm thinking about
    writing something a bit closer to ANS Forth, but there is
    insufficient project pull to do it atm.
        

Entry: redefining names
Date: Fri May 23 12:37:48 CEST 2008

Let's re-iterate the design choices:

CASE A: compiler specialization
CASE B: parameterized kernel code

for CASE A the choices are:

   explicit renaming vs. implicit redefinition.

for CASE B:

   linking (using the mechanism from CASE A) vs. explicit
   instantiation.

EDIT: it might be best not to lose too much time on trying to fix this
now, and 'emulate' the classic load-style redefine behaviour until a
better abstraction mechanism pops up.



Entry: namespaces and parameterization
Date: Fri May 23 13:11:54 CEST 2008

Instead of trying to use late binding, it might be more interesting to
use explicit parameterization where possible. This is one of the
problems with Brood 4 which caused a lot of pain. Let's fix that
first.

Taking the interpreter code as an example. What i'd like to do is to
make instantiation explicit:

: interpreter 
    ' io make-interpreter compile ;


There's an interesting problem here with choice of words. Maybe here
'compile' should be used to indicate that there's an instantiation
going on. It's not always clear whether something is dynamic or
static. There's a difference between:


    : interpreter int-body ;

and

    ` interpreter create-interpreter

The latter is preferred. It is more general, but it is not possible in
the current implementation. Labels can be defined, but they are only
for annotation, and visible AFTER compilation.

Can this be simplified?
What is desired is something like

    ` bla

Where at that moment the current context has a macro called bla, which
is not yet associated to any code. So how to associate a macro to
code?

Maybe instead of mapping Forth code directly to 'define' it should be
mapped to a 2-step procedure of undefined macro creation + single
assignment.
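
That 2-step procedure amounts to a single-assignment cell. A
hypothetical Python sketch (all names invented for illustration):
'declare' creates an undefined macro cell that code can refer to by
name, and a later single assignment fills in its body exactly once.

```python
# Hypothetical sketch of declare + single assignment for macro cells.

class MacroCell:
    def __init__(self, name):
        self.name, self.body, self.assigned = name, None, False

    def assign(self, body):
        """Single assignment: defining a cell twice is an error."""
        if self.assigned:
            raise ValueError(self.name + " already defined")
        self.body, self.assigned = body, True

    def expand(self):
        """Using a declared-but-undefined macro is an error."""
        if not self.assigned:
            raise NameError("macro " + self.name + " used before definition")
        return self.body

read = MacroCell('read')     # declare read
write = MacroCell('write')   # declare write
read.assign(['io', '2nd'])   # the make-io! step, done as single assignment
write.assign(['io', '1st'])
print(read.expand())         # ['io', '2nd']
```

The single-assignment restriction is what keeps this from collapsing
back into the old imperative load-order style the next paragraph warns
about.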

I'm touching some core of reflection here.. Something that doesn't
really work well together with the way the Purrr language is laid
out. There's no problem with doing this in s-expression syntax, but
expressing this in Forth syntax is not so easy..

Let's see.. Quick hacking it:

      \ create the macros
      declare read
      declare write

      \ define them
      ' read ' write make-io!

But, this requires make-io! to have side-effects + read and write not
to appear in any code before make-io! is executed. That is definitely
not desirable: it would be back to the old imperative style.

So, what about using this as the primary syntax:

    [ 2 3 4 ] define bla

And writing ':' as a substitution macro?


Continuing the random walk. It is possible to do something like:

    : read   io 2nd ;
    : write  io 1st ;

But that's exactly the thing that's not flexible enough when a lot of
macros need to be created. Is it really a good idea to try to solve
this in Forth syntax, since it obviously has some shortcomings..

It looks like the only way to solve this is to use 'parsing word'
preprocessors. Maybe the goal is here to figure out a way to create
such substitution macros in .f files? It's certainly possible to
create scheme files which have the 3 levels:

       - parsing words (bound to scheme level 1)
       - macros (scheme level 0)
       - forth words: consequence of hitting 'compile'

The problem really is how to map the scheme s-expression flexibility
to Forth code. Can't be done?

This is a dead end..

Solve it with substitutions and be done with it?

((io read write) (:macro read io drop compile
                  :macro write swap drop compile))
  

Entry: Explicit instantiation and macro assignment.
Date: Fri May 23 14:04:35 CEST 2008

EDIT: comes from the blog, now demoted to a rambling because it's
confusing given that I'm embracing 3-level code now (word, macro, parser)

There is one pattern that is cumbersome to express at this moment:
   * Write logic as macros without specifying concrete names.
   * Provide names during instantiation.

If this is about creating a single function or macro, it's straightforward:

  : interpreter 
      ' receive 
      ' transmit 
     compile-interpreter ;

Because the .f syntax at this moment assumes that all macros occur in
isolated (macro . code) pairs it is not possible to define a
collection of functions/macros.

What is desired is something like:
 
   ` read ` write create-io

Which requires some form of mutation, even if it's single
assignment. Alternatively, parsing-word preprocessors could be used:

   create-io read write

To expand to

   :macro read make-io-read ;
   :macro write make-io-write ;

This requires a special purpose syntax, i.e. something like

   parser create-io read write ==
      :macro read make-io-read ;
      :macro write make-io-write ;
   end

The problem with this is that parser-generating parsers are hard to
express. Is it possible to create a single abstraction that does not
limit the number of orders? It looks like programmable parsers are
necessary anyway when macros can't deal with identifiers.

The good thing about using the previous approach is that it maps very
well to Scheme's define and define-syntax.


TODO: place this in the framework of compilation phases and figure out
a better syntax for defining parser macros.





Entry: substitutions
Date: Fri May 23 16:11:41 CEST 2008

Why are parsing words necessary? Because modified semantics are
allowed for symbol definition (':')

What about a Forth syntax for substitution macros?

  parser variable name == create name 1 allot ;

This would solve virtually all problems, since it gives access to
named substitutions, but it adds a level of inelegance to the
language. Well, sort of: they are already there, so why not make them
available.. These would make sense in interactive commands too.

  (What about ditching all this crap and creating a flexible
   alternative s-expression based syntax i think..)

So, considering that I don't want to lose the good things about Forth
syntax (prefix syntax to eliminate parentheses) I guess I have to
learn to live with the bad things about Forth syntax (the necessity
for an extra composition mechanism due to prefix syntax). It's not all
bad, just some trade-offs..

So, prefix substitutions. They are not like parsing words, but can be
used to emulate them. They are the 'last resort' composition
mechanism which are used to capture prefix patterns.
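
To make the idea concrete, here is a toy model of such a prefix
substitution mechanism, sketched in Python rather than Scheme or
Forth. The rule format and all names are invented for illustration;
this is not Staapl code:

```python
# Toy model of prefix substitution macros over a linear word stream.
# A rule is triggered by a prefix word, consumes the next tokens as
# parameters, and splices them into a replacement template.

def make_rule(params, template):
    """Bind `params` to the tokens following the trigger word and
    substitute them into `template`."""
    def rule(tokens):
        args = dict(zip(params, (next(tokens) for _ in params)))
        return [args.get(t, t) for t in template]
    return rule

def preprocess(source, rules):
    """Single substitution pass: expand each trigger word in place."""
    tokens = iter(source.split())
    out = []
    for tok in tokens:
        out.extend(rules[tok](tokens) if tok in rules else [tok])
    return " ".join(out)

# Modeling 'parser variable name == create name 1 allot ;' as a rule:
RULES = {"variable": make_rule(["name"],
                               ["create", "name", "1", "allot"])}
```

With this, `preprocess("variable x 123", RULES)` yields
`"create x 1 allot 123"`. Note that this is a single pass: a rule
whose output contains another trigger word is not re-expanded, which
is exactly where the parser-generating-parser question shows up.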

If I'm going to embrace them as one of the features of Forth syntax,
it might be wise to make ':' not a primitive. More general, it might
be wise to have this as a layer on top of a simpler, single assignment
preprocessor.


Entry: Single assignment base language
Date: Fri May 23 19:23:32 CEST 2008

So, I'm going full circle. Back to postfix notation with a (set! name
value) component? Is it actually possible to do so? Prefix primitives:

   [ ]       macro composition
   ' <name>  macro variable dereference (+ creation?)
   macro!    single assignment definition
   forth!
   variable!

This would require a separate state machine (that resembles a forth)
to run during the preprocessing step.

This looks like a nice solution, but I can't help thinking: why is
this DIFFERENT from the Scat state machine? Can this be written in
Scat? And don't I introduce composition problems again?

A scat machine with state: (stack, in, out) should be able to parse
this without problems.
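
A rough sketch of such a machine in Python, with the prefix
primitives from the list above; the token names and data
representation are invented, and real quotations would of course nest:

```python
# Toy single-assignment preprocessor: a state machine with a stack,
# an input token stream, and a dictionary of definitions.
#   ' <name>   pushes a name
#   [ ... ]    pushes a (flat) quotation, a list of tokens
#   macro!     pops a name and a quotation, binds them exactly once

def run(source):
    tokens = iter(source.split())
    stack, defs = [], {}
    for tok in tokens:
        if tok == "'":
            stack.append(next(tokens))     # name dereference/creation
        elif tok == "[":
            quot = []
            for t in tokens:
                if t == "]":
                    break
                quot.append(t)
            stack.append(quot)
        elif tok == "macro!":
            name, body = stack.pop(), stack.pop()
            if name in defs:
                raise ValueError(name + " already defined")
            defs[name] = body              # single assignment
        else:
            stack.append(tok)
    return defs
```

For example, `run("[ 2 3 + ] ' bla macro!")` produces one binding of
`bla` to the quotation `["2", "3", "+"]`, and a second `macro!` on the
same name is an error.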

Now I'm confused.. caffeine poisoning..


Entry: More standard forth syntax
Date: Sat May 24 11:49:11 CEST 2008

Maybe it's a good idea to have an interpret mode anyway, at least
during the parsing phase. This would make it a lot easier to deal with
standard forth syntax like 'constant'.

Looking at what the parser does now, it is already the case. Only
there's just a single mode: compile.

Adding an interpret mode, the language that's active could be plain scat.


Ha, I'm using the [ and ] words for code quotation. Looks like that's
exactly the opposite of how they would be used in Forth.

Questions:
  * is it possible to solve this without interpret mode?
  * using substitutions only, does the hygienic system provide enough
    freedom?

By the latter I mean:
  (io read write) == 
  (:macro io    1 2 3 make-io-object ; 
   :macro read  ` read io ;
   :macro write ` write io ;)

The 'io' name is not visible in code since it's introduced by the
macro.

What can be seen here is that pattern replacement macros need a
special terminator.

parser io read write ==
   m: io     1 2 3 make-io-object ; 
   m: read   ` read io ;
   m: write  ` write io ;
parser bla ... ==
  ...
end

Going that route, why wouldn't you write everything as parser
instead of macros?

The problem is, parsers allow you to deal with NAMES, while macros
allow you to deal with CODE. The fact that macros are associated to
names is a practical matter, but they are not allowed to modify or
create names. Parser words are necessary because prefix syntax is the
ONLY way to modify semantics of names, other than to reference the
macro that is bound to it in the current environment.

This might be a bit confusing. There are 3 things that can be bound to
names:

        * parser extensions  -> non-concatenative source code preprocessor
        * compiler macros    -> concatenative (compositional) code generation
        * forth words        -> macro instantiation

It's probably possible to design a language that doesn't need the
first step, but technically even scheme has this: the reader, which
has special features. Having to do it this way is a FUNDAMENTAL
limitation/feature of Forth.

This is DIFFERENT from Forth because it is pure input word stream
substitution. Forth's parsing words operate on the input stream
directly. This is one of the reflective properties of Forth that is
eliminated in Purrr.

EDIT: But, adding this extra level, does it stop there? How to create
parser macro creating parser macros?

The problem is easily solved with s-expressions, but hard to do with
the current implementation: what is necessary is to add semantics
AFTER collecting tree-structured code. This is what s-expressions can
do: parse step follows read step. Is it possible to bring this to
Forth?

TODO: write something intelligent about this problem. it's a deep one,
related to syntax and reflection: the difficulty of unrolling Forth.


Entry: parser cleanup
Date: Tue May 27 11:06:38 CEST 2008

EDIT: this is a mess.. i've reached the conclusion that the 3 layers
are necessary, so it might be best to think a bit about the best way
to represent the highest layer.. 

gut feeling says the current problems with the parser (the
non-concatenative part of Forth syntax) are rooted in the way it's
implemented.

phase 1: what is ':' ?

this, in addition to creating a new label, will terminate the
previous definition. following the colorForth model, there is no
interpret mode, so definitions run up to the next word. those 2
behaviours need to be split up.

( simplifying, why not have macros with multiple entry points? this
would not be too hard to solve really. )

so, let's have a look at the single-assignment language for Purrr
parsing. the problem to solve is:

   How to implement a Forth-style syntax (without interpret mode) on
   top of the purely concatenative Macro syntax used by Purrr.

To re-iterate why this is a problem: 

  * I'm convinced purely functional and purely compositional macros
    are a good idea: they behave well as a data structure for program
    representation and code generation (compiler structure).

  * Forth is only a thin layer on top of this, mostly to solve
    definition of names in a source file using a familiar
    syntax. Forth syntax in itself is a good user interface. However,
    the design of the Forth language is firmly rooted in reflectivity,
    employing an image based incrementally extensible word dictionary,
    something which i'm trying to unroll into a language tower.

  * There is already a base syntax using s-expressions which
    translates directly to Scheme. The problem now is to find a way to
    translate (linear) forth into tree-structured Macro expressions.

One solution is to 

  * create an intermediate language syntax that has all the features
    needed in Forth, but uses s-expressions and single assignment.

  * map this language to an explicit list of definitions

  * find a mapper from linear forth -> s-expressions

This should work for the 3 levels of code:

    - macros
    - words (instantiated macros)
    - parser (Forth source transform patterns)


First attempt in forth/single.ss

Maybe the real focus should be to embed s-expressions in the Forth
language. This gives parse-time lists.

Care should be taken not to create yet another level that's hard to
metaprogram. This needs to sink in a bit.


Entry: parser idea
Date: Tue May 27 14:21:56 CEST 2008

To further distill the actual idea. There are 3 levels of
concatenative code, with each its own interpreter. The real question
is: can't they be unified into one?

Macros

  These are Scat words that create target (assembly) code. They are
  the simplest kind, being built directly on top of Scheme (module)
  name spaces and an RPN parser. This parser reads from an input
  stream, and accumulates an output Scheme syntax.

  IN = stream of identifiers
  OUT = accumulation of nested Scheme expression

Target words

  Obtained as instantiated Macros. In macro/target-compile.ss

  IN = stream of macros
  OUT = accumulation of target code words


Forth preprocessor

  This associates names to Macro or Target code.

  IN = stream of Forth syntax
  OUT = list of (name . value) pairs.
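
As a sanity check on that last level, here is a toy Python version of
the preprocessor step alone: it assigns no macro semantics, it just
turns a linear token stream into (name . body) pairs. Following the
colorForth model above there is no interpret mode and no ';'; a
definition runs up to the next ':'. This assumes the source starts
with a ':':

```python
# Toy Forth preprocessor level: IN = stream of Forth syntax,
# OUT = list of (name, body) pairs.  ':' starts a definition that
# runs up to the next ':'.

def definitions(source):
    pairs, name, body = [], None, []
    for tok in source.split():
        if tok == ":":
            if name is not None:
                pairs.append((name, body))
            name, body = None, []
        elif name is None:
            name = tok          # first token after ':' is the name
        else:
            body.append(tok)
    if name is not None:
        pairs.append((name, body))
    return pairs
```

So `definitions(": read io 2nd : write io 1st")` yields the pairs
`("read", ["io", "2nd"])` and `("write", ["io", "1st"])`.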


Are they all really necessary? Macros are the core programming
construct and serve to describe programs. Target words determine the
instantiation level of code. Forth preprocessor allows naming of
macros and target words.

The target words layer is a consequence of manual code
instantiation. This is a feature: i want to have this level of
control. In principle this could be eliminated when the language is
made a bit more high-level (i.e. elimination of return stack access).

The preprocessor layer is necessary because of limited reflection: if
macros cannot create named code, constructs that abstract this cannot
be macros, so need to be preprocessors.

Whether all 3 layers can be implemented in more of the same way is an
interesting but not so urgent question. Whether they have to remain in
existence is easy to answer: yes. They are a consequence of two
important design choices:

  * explicit instantiation (macro vs forth)
  * non-reflectivity to simplify code processing and namespace handling



Entry: macro instantiation is memoization
Date: Tue May 27 14:51:34 CEST 2008

What about looking at the code instantiation problem as a form of
memoization? A macro that is inlined twice can be replaced by a single
instantiation and an indirection.

Doing this automatically could lead to a simpler (beginner) language
that does not need a programmer-specified distinction between Macro
and Forth modes.
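
A minimal sketch of that idea in Python, with all word and macro names
invented: a macro used once is inlined as before, while a macro used
more than once gets a single shared instantiation plus call
indirections:

```python
from collections import Counter

# Toy model of instantiation-as-memoization: inline single-use
# macros, instantiate multiply-used macros once and call them.

def instantiate(program, macros):
    uses = Counter(w for w in program if w in macros)
    words = {}   # name -> body: the shared instantiations
    code = []
    for w in program:
        if w not in macros:
            code.append(w)
        elif uses[w] == 1:
            code.extend(macros[w])        # inline the single use
        else:
            words.setdefault(w, list(macros[w]))
            code.append(("call", w))      # indirect to shared copy
    return code, words
```

With `macros = {"sq": ["dup", "*"]}`, the program
`["2", "sq", "3", "sq"]` compiles to two calls and one shared `sq`
word, while `["2", "sq"]` just inlines.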


Entry: next
Date: Tue May 27 16:14:53 CEST 2008

  - fix the parser = a base language + an extension syntax.

  - add a temporary 'load' function to supplement 'require' for
    old-style Forth: it might be better to keep the mechanism in
    there, instead of forcing the use of bottom-up modules.


Entry: embedding s-expressions in Forth code
Date: Tue May 27 16:27:19 CEST 2008

In order for the parser code to work properly, new parsers should be
able to work from within a file. This creates a problem because not
all forms can be identified before parsers are created.

This problem can be bypassed by allowing genuine s-expression syntax
for the parsers. That would also allow parsers to be nested. 

{ parsers
  { { variable name }  { create name 1 allot } }
  { { 2variable name } { create name 2 allot } }
}
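
The read step itself is simple; here is a toy Python reader for such a
brace syntax, assuming whitespace-separated tokens, with semantics
assigned only later:

```python
# Read step before parse step: collect brace-delimited nested lists
# from a linear token stream, assigning no meaning to the words yet.

def read(tokens):
    out = []
    for tok in tokens:
        if tok == "{":
            out.append(read(tokens))   # recurse on a nested list
        elif tok == "}":
            return out                 # close the current list
        else:
            out.append(tok)
    return out
```

For example, reading `{ variable name } { create name 1 allot }`
yields the two nested lists, ready for a separate parse step.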

Why not make this go a bit further and allow generic scheme code? This
would solve basically all other syntax problems with forth
files. Maybe this is also the right way to map Forth syntax to
scheme..

Quick hack works fine, except for one small problem: it doesn't get
evaluated at the right time. Maybe moving everything to toplevel by
using this as primitive syntax can help?



Entry: rethinking forth parsing
Date: Wed May 28 12:43:27 CEST 2008

The problem to solve is to identify parser commands before code is
parsed. This is only possible when all s-expressions are collected
before linear source code is parsed. This means that it's not possible
to define parser words that expand to s-expressions.

There's a chicken and egg problem there that deserves some
attention. The problem is that identifiers can't really be identified
as in scheme. -> look into how this expansion stuff works.

This works fine in scheme:

     #lang scheme/base

     (define broem (bla))

     (define-syntax make-bla
       (syntax-rules ()
         ((_ bla)
          (define-syntax bla
            (syntax-rules ()
              ((_) (+ 1 2)))))))

     (make-bla bla)


So, just expand everything to this, should work fine.

Maybe the RPN compiler should be simplified a bit, using syntax
parameters instead of compile time parameters.. Makes sense?



Entry: today
Date: Fri May 30 00:42:34 CEST 2008

was a day of writing 6 paragraphs of introduction. i think i sort of
got it going: the reasons for brood:

   * lisp is cool, especially for metaprogramming
   * create a small language to metaprogram from within lisp

the problem is that to really leverage this, some knowledge about
languages and implementations is necessary. i'm still not sure how to
sell this to electronics engineers who don't see the point of lisp..

about the code: the parser patterns seem to be important as a final
'highest level of metaprogramming'. i'm still not convinced it is a
good idea, but it seems to be a consequence of wanting to keep old
forth syntax instead of s-expressions. i need to spend a day writing
and thinking about this.


Entry: the parser again
Date: Fri May 30 10:11:18 CEST 2008

the problem is that parser macros defined in an .f file should have
immediate effect. can this be local-expanded or something?

i ran into this deeper problem: parsing isn't really factored when
being inside a single rpn macro: big reliance on dynamic
variables. can this be replaced by something else? probably not so
easy..

let's re-iterate over the expansion algorithm in PLT
Scheme. done.. basically, it can fish out 'define-syntax' before
expanding the value expressions in 'define'.

so, maybe it should be a true preprocessing step like it was
before. one that converts the forth language to 'compositions' forms.

the thing is, inside a 'macro:' you really don't want any forth prefix
syntax. this should be s-expression syntax only. the only exception is
locals. -> probably best to separate the changes into those that
introduce new global names, and those that don't.

EDIT: not necessary. both data and code quotations are expressible in
s-expressions and part of the RPN syntax. local variables can be added
by using definitions like ((name param ...) body ...) instead of (name
body ...)

(locals are really special: they make sense only when parameterizing
bigger code chunks, with lots of parameter reuse..)


hmm... some things are not really very well disentangled yet.. maybe
i should accept that 1. macro is clean == enough, 2. the forth on top
of that has some hacks to support standard forth syntax on a system
that doesn't have forth's kind of reflection..

damn, this is complicated..

one remark about my method though: forth-tx is too complicated. i find
it difficult to make modifications because it is a hack on top of rpn
body creation. at this moment i don't really see how to change that
without introducing more than one abstraction layer..

maybe it needs a couple of days rest, i'm not coming up with anything
exciting here..


Entry: syntax parameters
Date: Fri May 30 11:17:57 CEST 2008

is this true? using syntax parameters, i can get rid of the hack that
calls the transformers directly.

need to be careful there: every time the expander is called again,
names are marked. because i'm using this mechanism to build a single
lambda expression, names might make more sense unmarked..

Entry: Forth syntax, philosophical approach
Date: Fri May 30 15:35:18 CEST 2008

meaning, through natural language.. i do this too little with the
tough problems that turn out to be huge time sinks..

The problem: 

  It should be possible to _define_ new Forth substitution words,
  which is implemented by define-syntax, _before_ the expansion of
  body code.

  In Scheme, due to the use of s-expressions, this is easy. In Forth
  however, the names are buried inside a muck of words: expanding all
  substitution words to expose those words that might yield the
  definition of _new_ substitution words is (probably) not possible.

  Question 1: is it at all possible to fish out these macro
  definitions? If so, how?

  Question 2: if it's not possible, can we formally acknowledge it as
  a shortcoming of Forth syntax and work around it?


Entry: fix it later?
Date: Sat May 31 10:11:37 CEST 2008

since this is more a pride issue than anything else, can it be fixed
later? probably.. it's just about

   * syntax for parsing words
   * allowing 'load'

'load' can be implemented using include/reader : specifying the reader
is essential since it needs to expand to some form that can be
included in a file.

this means i have to construct
  1. a scheme syntax to define forth files
  2. a reader that gives this scheme syntax
  3. module-reader in terms of those 2

the point to start is purrr/forth.ss

This file contains all logic necessary to expand module syntax.


Entry: apologies
Date: Sun Jun  1 00:14:32 CEST 2008

explain: 

  why (forth) macros are actually (scheme) functions, and code needs
  to be compiled at (scheme) runtime (i.e. why is there one level
  (actually 2 if you count the assembler) that has manual
  compilation?)

-> derive from this a proper instantiation syntax for forth code.

EDIT: the explanation is simple: a significant part of the program is
a long-lived target code juggler. that part cannot be just syntax.

the proper instantiation for forth code is of course 'define-ns' in
the (target) namespace. then all words are accessible through
reflective operations.


Entry: module system
Date: Sun Jun  1 00:56:05 CEST 2008

http://calculist.blogspot.com/2008/04/dynamic-languages-need-modules.html

StoneCypher said...

    It is important that you learn a well established language that
    has already successfully grappled with these problems before
    deciding on your own mechanism.

very true..


Entry: rethinking code instantiation
Date: Sun Jun  1 10:54:22 CEST 2008

it's frankly too complicated and ad-hoc atm.. i lost oversight. these
features/choices make it complicated:

  * forth words have associated macros
  * multiple entry and exit points
  * forth parser is a single macro, but uses factored macros

macros are really simple (declarative), but forth syntax +
instantiation makes it a lot more difficult.

code instantiation produces a single macro that runs with
compilation-state to produce a collection of fallthrough words. This
is the remainder after lifting out all macro and parser definitions.

what about making instantiation an operation on macros? i.e. replace a
macro with a wrapper, and collect the body instantiation
somewhere. this doesn't work for multiple entry points though..

so: multiple exit is easy: it's simple to fake in macros using a
'macro return stack' in compilation-state. multiple entry points,
however, are quite difficult.

is it possible to do this?
   1. bring the representation back to single entrypoint
   2. write multi-entry point code / fallthrough as an optimization

i don't think so.. there is too much code that relies on multiple
entry points. i need a simpler way to represent it.


Entry: again
Date: Sun Jun  1 11:13:26 CEST 2008

try a meta level here: i'm losing oversight because it's not working:
i can't make small changes to see how they propagate through a working
system. the real problem that started this is the inability to define
parsing macros, which led to the realization that these need to be
instantiated before the rest of the code is parsed (chicken / egg
problem), which led me to think that this is impossible unless some
form of partial expansion is used, which is then made difficult by the
way parsers are implemented: by directly calling them.

in short: 

problem A:

 it's difficult in the previous setting to get to a structure
 where the 'define-syntax' occurrences can be isolated before they
 are used.

 it's easy to solve this by requiring them to occur in different files.

problem B:

 it's currently not possible to include a forth file because the
 module expander is not factored properly.

let's tackle B  first.

what's wrong with current forth-module-begin-tx macro? the whole
register-code! business is not so good. a forth: macro should take an
extra argument to represent the instantiated macro.

'register' is used in the macro expansion to store the word struct
produced by the wrap operation. 'compile' produces the code graph. the
problem is: i'd like these to be composable from different sources, so
a chunk of forth code needs to produce something that can be
accumulated later into something else.

the idea is correct, only the implementation is clumsy: instantiation
needs to be solved at a single place, then forth syntax needs to be
built on top of it.


Entry: nobody uses frameworks
Date: Sun Jun  1 12:49:28 CEST 2008

what are the entry points? the ui? brood needs to be api'd as a
library. it needs to be a straightforward set of macros on top of
scheme, no callback nonsense.


Entry: forth syntax / code instantiation
Date: Sun Jun  1 14:12:34 CEST 2008

split the forth macros in 2 parts: those that create new names, and
those that do not. (the latter contains locals, quote and code
quotation). note that quoting doesn't need forth syntax: there is a
corresponding s-exp syntax. if the same is done for locals, then forth
syntax can be exclusively used in .f files or a forth: macro where it
introduces names.

so: the '(forth-toplevel form compile (defs ...))' does:
  * expands to a special form (i.e. begin or #%module-begin)
  * binds 'compile' to a function that generates words
  * preferably has no side-effects

the idea is that those forms can be composed (i.e. using 'load') and
that the toplevel module namespace initiates all code compilation.

the problem is not the expansion to definitions (either 'define' for
macros or 'define-syntax' for parsers): it uses the toplevel 'begin'
form. 

the problem is the registration of the forth words / instantiated
macro. binding it to a given name is not necessarily a good thing. i
really need to think a bit about how this is used in toplevel project
namespaces, both for one-shot and incremental code compilation.

maybe it is best to put all word instances in the toplevel namespace
AND allow for a mechanism to collect them (maybe from the namespace
using reflective operations?)  this could be an algorithm akin to
garbage collection: only compile the code reachable from the roots =
exported (forth) namespace names. 
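
The reachability idea can be sketched like a mark phase. A toy Python
version, with all word names hypothetical:

```python
# Toy mark phase: compile only the words reachable from the exported
# roots, like tracing live objects in a garbage collector.

def reachable(calls, roots):
    """calls: word name -> list of names it references.
    Return the set of words that actually need compilation."""
    seen, todo = set(), list(roots)
    while todo:
        w = todo.pop()
        if w in seen or w not in calls:
            continue             # already traced, or external name
        seen.add(w)
        todo.extend(calls[w])    # trace everything it references
    return seen
```

With `main` as the only root, a word like `unused` below is simply
never compiled, even though it references live code.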

it only needs a way to take care of the recursive definitions: the
word instances are defined in terms of macro names, and can only be
evaluated AFTER all macro bindings are evaluated. because of the level
split (forth macros are scheme functions) this has to be done
manually. it's easiest by just making all forth code into promises
though.

the only problem remaining is fallthrough: how to guarantee the
correct order of compilation? this is one of the main reasons why
simple mapping from name -> datum isn't really possible: the order is
important.

so, the problems:

  * forth code is ordered (supports multiple entry points)
  * evaluation of forth code requires all macros to be bound, so has
    to be done after evaluation of macro body expressions.

this is solved atm once per module, but due to 'load' this operation
needs to be composable:

  -> compose the definition of macros
  -> compose the forth instantiation macros

it's probably ok to bind the labels to names so they can be accessed
through reflective operations later, but it's essential to also
somehow orchestrate the compilation of the code.

the remaining question: when does the code need to be compiled? also,
answer this in light of incremental compilation (keep the namespace
active, just add in more code/macros). is it ok to assume some
context?


Entry: partial evaluation literature
Date: Sun Jun  1 14:49:19 CEST 2008

enough dabbling, i think i'm ready for reading:
http://www.dina.kvl.dk/~sestoft/pebook/

* pe = an operation on program text


Entry: namespace woes
Date: Mon Jun  2 19:35:21 CEST 2008

file:///plt/doc/guide/reflection.html 

  Modules not attached to a new namespace will be loaded and
  instantiated afresh if they are demanded by evaluation. For example,
  scheme/base does not include scheme/class, and loading scheme/class
  again will create a distinct class datatype:

This is important for the global registry used for recording Forth
instantiations, and datatypes that are shared by the host meta system
(i.e. target and macro word structs).


Entry: fixing forth instantiation
Date: Tue Jun  3 11:14:34 CEST 2008

made the first changes to purrr/forth.ss so it generates just
syntax. need to change:

   - handling of toplevel require forms using parameter
   - compilation

maybe i should just give up the production of forth dictionary
'records' and just use the environment to dump stuff, making
compilation of a .f file a side-effectful operation.

SOLUTION:

  move everything that's not part of the instantiation macro to
  toplevel using the forth-toplevel-forms parameter. this includes
  macros (and maybe variable definitions?) and definition of the word
  structs.

  the remaining running state is just a single big macro which inlines
  the word structs in the macro code using the 'label' forth
  word. i.e:

     (define-ns (target) bla (make-target-word #:name bla))

     ... ,(ns (target) bla) label ...


  this way, forth compilation is what it is: construction of a macro
  that after instantiation gives a code graph. all OTHER stuff that
  happens in a .f file (definition of macros, variables, imports, ...)
  have a straight meaning as scheme module components and can be
  recorded in a side channel, implemented by a parameter.


  ( i can't help but think about writing the parser words as scat
  words.. this is yet another threaded state problem.. )


stubbornness: i'm just going to keep things as they were. cleaned up
purrr/forth.ss a bit + finally understood how namespaces can share
code, and it looks like this is enough to build the necessary
abstractions.

NEXT: load


Entry: load
Date: Thu Jun  5 10:58:26 CEST 2008

got it working, at least with absolute paths.  the trick is to call
the forth syntax reader ('forth') directly inside parser-tx.ss, and
to combine source location info with the proper lexical information.

next: fix path + convert kernel's 'require' stuff back to 'load' so it
can be modularized later, and so that most hacks around late binding
can be simply replaced by loading stuff into the same namespace.

Entry: simulator
Date: Thu Jun  5 14:17:16 CEST 2008

http://citeseer.ist.psu.edu/119550.html

see blog: instead of writing a simulator, building a partially
evaluated simulator might be a better idea, since speed is very
important for simulation.

what is an instruction? it's a state update. state is memory. memory
is a number of registers, with variable bit size. so an instruction is
something with the following properties:

  * an endomap for (a subset of) the machine state
  * timing information
  * encoding (for instruction interpreter)

maybe i should take a step back towards pure s-expressions for the
instruction set spec, since these are a bit hard to compose (write
macros that expand to them). composition would help to define some
instruction classes.

 (addwf   (f d a) "0010 01da ffff ffff")

 ((addwf f d a) ((#b001001 6) (d 1) (a 1) (f 8)))
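
The second form is easy to give semantics directly. A toy Python
assembler for it, handling only the field packing (no timing or state
semantics); the field-list layout follows the example above:

```python
# Pack one instruction from a field list such as
#   ((#b001001 6) (d 1) (a 1) (f 8))
# Constants are literal values, strings name operands, widths are in
# bits, and fields are packed MSB-first.

def encode(fields, **operands):
    word = 0
    for val, width in fields:
        if isinstance(val, str):
            val = operands[val]          # look up a named operand
        assert 0 <= val < (1 << width), "operand out of range"
        word = (word << width) | val     # shift in MSB-first
    return word

# The addwf spec from above, "0010 01da ffff ffff":
ADDWF = [(0b001001, 6), ("d", 1), ("a", 1), ("f", 8)]
```

For example, `encode(ADDWF, d=1, a=0, f=0x5A)` gives `0x265A`,
matching the bit pattern "0010 0110 0101 1010".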

actually, the simulator descriptor language is as good as the same as
the dsp dataflow language. maybe i should do the latter first, then
generalize.


Entry: dataflow language
Date: Thu Jun  5 17:51:36 CEST 2008

see entry://20071211-093307

for dsp, dataflow makes sense from an execution point of
view. 

compared to forth, explicit named parameters might be interesting
because of the graph-like structure of dsp algorithms.

the main advantage of a forth syntax is that it gives a natural way
to deal with multiple I/O without naming intermediates. the main
disadvantage is its serial nature: the order of operations is
necessarily specified up front.

however, Forth might still be interesting as a SPECIFICATION
language. use Forth notation to build a DAG dataflow graph, and then
synthesize it in whatever form makes most sense.



Entry: Array Processing
Date: Fri Jun  6 13:25:05 CEST 2008

Programming in an array processing language can be factored into two
steps:

  * construction of primitive, pure many -> many functions
  * mapping these over tensors


Entry: next
Date: Fri Jun  6 16:10:22 CEST 2008

paths in load.
  * find a file in path.

now, allowing files with undefined symbols might be a convenient
notational device, but it makes it hard to test them individually
because they need context.



Entry: command line
Date: Fri Jun  6 16:56:16 CEST 2008

it's time to start using a forth command line + code store.

then, there is only boring stuff left: 

  * fix 'org', or think about how such a direct assembler-state
    control statement can be allowed in the language.

  * fix the undefined symbol problem introduced by the switch to
    module languages -> maybe add some toplevel undefined symbol
    handler in the badnop namespace management code. make sure some
    toplevel equivalent of 'load' works. (why can't load be used for
    import actually?)



Entry: namespaces
Date: Sat Jun  7 11:17:15 CEST 2008

trying to figure out exactly where to put things. 

  1. support system = toplevel application namespace
  2. one namespace per compiler / project.

the parser and lexer for the REPL obviously should be in 2. so it
needs an interface. also, it's probably best to make the interface to
the repl a macro.

until now, there were only modules. each module brings its own lexer
and parser. the result is Scheme definitions. to add repls, each repl
needs to be attached to a lexer and a parser.

the problem i run into now is that purrr/repl.ss pulls in too many
dependencies, mostly because of purrr/forth.ss; the latter should be
factored a bit more.. fixed..

ok.. think i got it working: purrr.ss imports the whole purrr base
layer with forth syntax (parser words) AND a repl macro.

next: fix a problem with module loading.. infinite loop when requiring
an .f file

hmm... looks like i have a problem. increased the limit to 100mb and
now it works.. it runs in 26 mb too..


Entry: redefine
Date: Sat Jun  7 18:40:52 CEST 2008

so.. toplevel stuff is working now. so why can't these toplevel
definitions be used to change implementation of core macros like
'dup'? the idea is that yes, i'd like to keep the current module
system for managing names, but no i don't want to prevent modification
of macros. basically, they are used to change aspects.

merely putting them in toplevel to be able to upgrade them would get
rid of other advantages of the module system.


Entry: require + toplevel
Date: Sat Jun  7 19:03:25 CEST 2008

box> ,purrr
toplevel in /home/tom/scat/
;; scat
extend: (macro) jump
;; macro
;; forth
;; purrr
box> (repl "require test/purrr18-test.f")
;; scat
extend: (macro) jump
;; macro
;; asm
extend: (macro) +
extend: (macro) dup
extend: (macro) drop
extend: (macro) swap
extend: (macro) or-jump
extend: (macro) not
extend: (macro) then
;; forth
;; purrr
;; pic18
;; dead: ((jw #<target-word>))
;; dead: ((exit))
;; dead: ((exit))
;; dead: ((exit) (qw 6) (qw 5) (qw 4))
box> 


Same for (require "test/purrr18-test.f"). This isn't right: it should
reuse the scat stuff.. How to do that? Does the namespace need to be a
module namespace?

Maybe it is.. require cannot depend on context, so when a module
requires another one, that last one needs to be re-instantiated.

I guess this is a chance to finally figure out what i'm doing with
this namespace / compiler instance business ;)

EDIT:

* there is one instance of the compiler for interactive use.

* each module that is required into the toplevel instantiates a
  compiler.

the latter makes sense: loading an application without a toplevel is
possible as long as it has its own compiler associated.

one question though: isn't using a toplevel terribly inefficient then?
probably, for things not used after instantiation, garbage collection
kicks in? The compiler is simply discarded?

ok, i think i got it fleshed out now.. the only remaining things to
figure out is to remove the dependencies of the data structures on the
scat code (make dependency on the badnop side optional) and figure out
where to put the assembler (probably best in the target namespace)


Entry: compiling purrr to C
Date: Sat Jun  7 19:39:42 CEST 2008

it shouldn't be too difficult to add a C frontend for purrr. basically,
every word instance is a function + the stack pointer is passed as a
parameter + tail calls are forced.
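a python sketch of that model (python just because it's easy to run; the word names are invented, this is not actual staapl code): each word becomes a function taking the stack, a colon definition is plain composition, and in the real generated C every call in the composition would be a forced tail call.

```python
# hypothetical sketch: each Forth word compiles to a function taking
# the stack (a list here, a passed stack pointer in the generated C).

def w_dup(s):
    s.append(s[-1])          # duplicate top of stack
    return s

def w_plus(s):
    b, a = s.pop(), s.pop()
    s.append(a + b)          # add the top two elements
    return s

def compose(*words):
    # a colon definition is left-to-right composition of words
    def composed(s):
        for w in words:
            s = w(s)         # in the C output this is a forced tail call
        return s
    return composed

w_double = compose(w_dup, w_plus)    # : double dup + ;
```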


Entry: basic
Date: Sat Jun  7 23:16:41 CEST 2008

i was wondering how difficult it would be to compile one of the BASIC
dialects for PIC or AVR to purrr.


Entry: new names
Date: Sun Jun  8 17:39:24 CEST 2008

It's difficult to pick good names. The current ones: brood, purrr and
scat are a bit difficult to google because they are all common terms.

I was thinking about STAAPL, which is a creative spelling of stapel,
the dutch word for stack. Maybe retrogrammed as STAck and Array
Programming Language. Another one is Staprola, stack programming
language. No google hits on that.

Or something completely meaningless? Wurzon/Kamizi? What about calling
the whole system Staapl, calling the pure language Wurzon and the
Forth layer on top Kamizi? Hmm.. the most important thing is the name
of the project. Let's try staapl for a while.

EDIT: main project is now called staapl.


Entry: about that stack
Date: Sun Jun  8 18:00:38 CEST 2008

so.. are we going to stick with stacks or not? i'd like to give the
concatenative language as specification for a dataflow language some
thought. in that case, the system is as good as complete.



Entry: factoring
Date: Mon Jun  9 10:50:22 CEST 2008

I'd like to factor macro and target:

macro: just the functional macro metalanguage, no instantiation
target: only instantiation (compilation) and optimization

Is this a good use of time? Probably not.. The macro language is never
useful without instantiation.. it really is just composition of unary
functions which after all isn't terribly interesting if you never
evaluate them. So ditch this..

What does need to happen is to trim dependencies of the target
representation structure. Currently, it needs 'scat' for some
evaluation stuff. Fixed.


Entry: concatenative dataflow language
Date: Mon Jun  9 11:15:29 CEST 2008

it compiles to a dataflow graph. in order to do this incrementally, i
need to think about state. state, as represented in purrr, is the
current output of the network, so compilation is just adding a node.

(... [N1] [N2] +) -> (... [N3])

where

   [N1] [N2]
      | |
      (+)
       |
     [N3]

next decision: does this need a functional state, or can mutation be
used directly to build the graph? the tricky point is multiple
outputs:

(... [N1] [N2] div/mod) -> (... [N3] [N4])


   [N1] [N2]
      | |
   (div/mod)
      | |
   [N3] [N4]


this could be represented as: 

  [N3] = (div/mod [N1] [N2])
  [N4] = (shift (div/mod [N1] [N2]))

with data structure sharing. this completely avoids the problem of
having to name intermediates.  

a more symmetric rep would be

  [N3] = ((div/mod [N1] [N2]) 0)
  [N4] = ((div/mod [N1] [N2]) 1)

this even has a representation in straight scheme in the form of
memoized procedures.
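a quick sketch of that last representation, in python instead of scheme (names invented): the node body is a memoized procedure returning the full output tuple, and each output is a selection by index, so the two selections share a single evaluation.

```python
from functools import lru_cache

calls = []                       # record evaluations to demonstrate sharing

@lru_cache(maxsize=None)
def div_mod(n1, n2):
    calls.append((n1, n2))
    return (n1 // n2, n1 % n2)   # a many->many primitive: two outputs

def output(node, index, *inputs):
    # select one output of a multi-output node; memoization means the
    # second selection reuses the first evaluation
    return node(*inputs)[index]

n3 = output(div_mod, 0, 17, 5)   # quotient
n4 = output(div_mod, 1, 17, 5)   # remainder
```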

let's try to build one on top of the pattern matcher.

the first notational problem i run into is specification of primitives
with multiple outputs, which is the problem i'm trying to solve!

so.. let's stop going in circles. 

;;   Some important points:

;;   * Dataflow macros have a different representation. They have an
;;     entirely different compilation mechanism: one which involves
;;     register allocation and instruction scheduling. This
;;     representation should be made solid.

;;   * Given the dataflow macro rep, writing an automatic converter to
;;     concatenative syntax is trivial.

;;   As a result, the macro/pattern.ss mechanism is only needed as a
;;   building block, not as a ui front end.

the composition mechanism should just build the graph, but represented
in such a way that 'executing' it is simplified.. this boils down to
how to do the binding, whether to use 2-way links, whether to represent
subgraph inputs by 2 nodes etc..

it looks like going for an explicit data structure that is later
interpreted or compiled might be the best approach. it's easiest to
understand. (the other way is to map it directly to scheme code, which
is also a DAG).

in hardware, all functions are many to one.. the only place where many
-> many functions come from is abstraction. can this fact be used to
simplify the problem? a subgraph is basically a list of (named)
expressions expressed in terms of (named) inputs.
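that last observation sketched directly (illustrative python, invented names): a subgraph is a dictionary of named expressions over named inputs, and evaluation is recursion with a cache so shared nodes are computed once.

```python
import operator

# name -> (operator, argument-names); leaves are looked up in `inputs`
graph = {
    "sum":  (operator.add, ("x", "y")),
    "prod": (operator.mul, ("sum", "x")),
}

def evaluate(graph, inputs, name, cache=None):
    cache = {} if cache is None else cache
    if name in inputs:
        return inputs[name]
    if name not in cache:
        op, args = graph[name]
        cache[name] = op(*(evaluate(graph, inputs, a, cache) for a in args))
    return cache[name]
```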




Entry: monads and computation
Date: Mon Jun  9 12:28:05 CEST 2008

the philosophical idea behind monads starts to dawn on me.. in any
programming language, there are 2 things to consider:

  * a composition mechanism, which takes multiple language elements
    and turns them into one (or more) composite elements.

  * primitive elements.

this is 'bind' and 'return'.
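a minimal illustration of those two pieces, using the Maybe monad (in python, names invented): 'return' wraps a primitive element, 'bind' is the composition mechanism that threads failure through.

```python
def unit(x):                  # 'return': wrap a primitive element
    return ("just", x)

def bind(m, f):               # composition: thread a possible failure
    return f(m[1]) if m[0] == "just" else ("nothing",)

def safe_div(a, b):
    return ("nothing",) if b == 0 else unit(a // b)

ok  = bind(unit(10), lambda x: safe_div(x, 2))   # succeeds
bad = bind(ok,       lambda x: safe_div(x, 0))   # failure propagates
```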


Entry: strategic overview
Date: Mon Jun  9 15:04:48 CEST 2008

About how i'm going to tackle the simulator generator
problem. Hardware is best modeled starting from a description of its
interior, which is registers + logic. Functional/dataflow descriptions
are thus a good base language. Using the simulator as a pull for
implementing the core dataflow representation seems like a good idea.


Entry: representing DAGs
Date: Mon Jun  9 15:46:14 CEST 2008

It's better to separate the 2 concepts of many->one functions and
grouping, than to work with many->many functions and permute/connect
their outputs.

What this does for representing graphs is the ability to use simple
nested (scheme) expressions.

In this view, mapping a concatenative language to an expression based
syntax is completely trivial. Representing one is completely trivial
also. So what problem am I solving?

( Some idea is itching in the back of my head telling me that partial
evaluation for functional dataflow analysis is really trivial as long
as there is only a single type: compilation is nothing more than
evaluating the graph while adding postponed semantics to the
code. What makes it hard is presence of higher order constructs. I'd
like to get a handle on this.. move it from the philo level to
concrete code.. Is it all the same thing? Is partial evaluation REALLY
better viewed from a compositional pov, as an intermediate form to get
the evaluation right, and then to transform it back to a graph for
optimizing the register allocation / sequencing? That can't be the
case really, since both are easily related to each other. Probably i'm
forgetting about associativity here.. )



Entry: base semantics
Date: Mon Jun  9 16:06:41 CEST 2008

Looks like i need a representation of base semantics of stack
manipulation operators. This can then be used to generate substitution
rules for the pattern language, and perform dataflow analysis. I've
added a file stackop/stackop.ss that's not used yet to put this info.


Entry: coming out
Date: Mon Jun  9 16:11:08 CEST 2008

when i start combining data flow analysis with a concatenative
specification syntax, it's time to admit that yes, this is about
syntax! so, rationalization:

  * for target implementation, a stack based language is nice

  * anything that can be analyzed before it's placed on the target
    might benefit from being transformed into a data flow graph, to
    get rid of the explicit serialization in concatenative code.


Entry: base language for simulator description
Date: Mon Jun  9 17:07:06 CEST 2008

Let's see if this route makes sense: create a scheme language level
for an expression serializer. In = an expression graph, out =
serialized graph. This is a pure scheduling compiler, mainly serving
as a front-end to a C-code generator.

First: what about names. If there are no scoping issues, it's best to
work on symbols instead of identifiers. This seems to be the case.


Entry: enough dabbling
Date: Mon Jun  9 17:27:19 CEST 2008

next: 

  * fix 'load' to perform source relative include and figure out a way
    to perform temporary code generation with undefined symbols
    (i.e. assuming they are constants or something).

  * fix 'org'

  * port the target interaction code


Entry: fixing load
Date: Mon Jun  9 17:55:53 CEST 2008

This is not entirely trivial: the environment in which the code is
expanded needs to be modified so the load statements inside the code
know where to get the code. Currently, it's simply inlined so context
can't be tracked.

OK. with the control flow out of the way, it's probably easier to
override current-load than to try to re-implement that part..

Q: is it possible to use require in a loaded file? if yes, is then a
problem to replace the load handler also for requires? 

hmm.. the parser atm is really confusing.. too much juggling with
return values and continuation thunks.. this needs to be solved
without a driver routine.. maybe a single dynamic variable to
accumulate code is better.. there is already one for toplevel defs..



Entry: new parser driver
Date: Mon Jun  9 21:22:40 CEST 2008

It's a mess. There have been several occasions where i tried to
understand it but couldn't. So, how to fix this. There are 2 things to
arrange:

* whenever a definition starts, the name, srcloc, and mode need to be
  recorded. -> implement as thunk.

* whenever a definition ends, the current expression needs to be
  combined with the stored header information and collected as a record.

the basic driver seems to work. it's a lot simpler to understand now:

(define (definer mode)
  (lambda (code expr)
    ;; a new definition starts: finalize the previous one by combining
    ;; the accumulated expression with the stored header information
    ((finalize-current) expr)
    (syntax-case code ()
      ((_ name . code+)
       ;; record the header for the new definition: name, mode, srcloc
       (new-record #'name
                   (mode)
                   (stx-srcloc #'name))
       ;; continue parsing the remaining input
       (collect-next #'code+)))))


now need to adapt all the other macros to this new way of doing
things.. should be straightforward. nesting can be implemented with
dynamic scope and an exit continuation like for load.

ok, with some minor shuffling in what to return to the continuation
(which is now implemented as a prompt) it seems to work.

now locals: maybe i can get back to using (rpn-represent)? this
requires the function to return.. is this possible?

ok: rpn-next can _only_ return when all input is parsed. this way
parsing can still be nested locally without the driver loop needing to
restart parsing.

ok, got some generic nesting working, now do the same for load so all
nestings can compose.

something seems to be wrong with 'load' though: probably a
continuation barrier.. nope: the procedure embedded in the syntax was
of course wrapped as a syntax object, and my printer routine
automatically unwraps it..

locals: the problem with not allowing rpn-next to return is of course
that now it is no longer possible to modify the closing expression
(the lambda wrapper): this used to happen by returning. the solution
is to add yet another parameter that represents expression closure.

it's actually already there in the form of 'rpn-lambda'
but this makes it a bit complicated..

the following modification should do it: allow 'locals-tx' to modify
rpn-lambda, and reset rpn-lambda in the forth parser so every
definition can start from a clear wrapper.

ok.. the thing is this: building an expression one wants to be able to
insert nesting expressions above and below: passing on just the inner
expression is a bad idea. maybe this needs to change? 3-value parser
state?

hmm.. alternatively, write the parser in terms of scat threaded state
updates, but that might go too far and lead to bootstrap problems.

the problem is now to make sure that wrappings are only used
once. this needs an interface:

pfff... i'm getting myself into lowlevel mess again because one
feature doesn't fit into the simple abstraction. what about making
'expr' an expression generator: a list of functions that can be
composed and evaluated.

OR:

expr = a cursor in between:

(outer . inner)

this should really solve all parsing needs.

ok. got it working with a bit of juggling with rpn-lambda:

  * at every rpn-compile, the current expression wrapper is set from rpn-lambda.
  * expressions are allowed to override the current expression wrapper
  * if entry is not through rpn-compile, you need to initialize the wrapper!
    
pff..
next problem: the locals macro seems to have a problem with non-2stack
states. solved: wasn't fixed after abstract state update was changed.

Entry: compiling monitor
Date: Tue Jun 10 17:32:22 CEST 2008

without 'org' and some things disabled here and there. but it does
seem to compile, at least in toplevel namespace with 'load'.

it compiles, but doesn't assemble. some 2stack problem in
target-value->number



Entry: local variables
Date: Tue Jun 10 19:13:29 CEST 2008

A side-effect of the way locals are implemented, is that they can
occur anywhere in a macro definition or code word, and will bind
literals.

box> (repl ": foo 1 2 | a b | a b a | d e f | e ")
box> (print-all-code)
foo:
	[dup]
	[movlw 2]
box> 


Entry: next
Date: Tue Jun 10 19:56:35 CEST 2008

org:

let's see.. the real problem with org is that it permanently changes
assembly state. currently it's possible to set the org of a chain of
code, which is a local effect only.


asm:

monitor doesn't assemble.. problem with 2stack <-> compilation-state
confusion in evaluation of target values.  the error happens in 
/home/tom/staapl/macro/instantiate.ss:80:3 wrap-macro/mexit which
means a .f generated macro is evaluated. ok, it's a constant that's
evaluated with macro->data.

it should be possible to convert macro->data so it runs on a dummy
compiler state, but the real problem here seems to be: is it somehow
possible to not wrap macros with local exit if they don't need it?

well.. i can always run it once, then decide on how to encode it. a
macro can throw an error, but it is not allowed to have other side
effects.

problem:

    the m-exit mechanism makes it impossible to evaluate macros on a
    2stack state, which is possible for most 'clean' macros.

    should the concept of 2stack macro be discarded? or is the concept
    of clean macro important? (don't you just love dynamic typing!)

intuition goes towards: keep 2 classes because the compilation-state
class deals with 'lowlevel' features like multiple entry and exit
points.

is it possible to more clearly separate these 2 classes?


Entry: meta
Date: Tue Jun 10 23:44:47 CEST 2008

last two months have been, well, long..
i didn't get so much done really. mostly reorganizing, fixing bugs and
thinking about the new features.. some topics:

  * load vs. require, module expander and redefining words
  * org and labels + multiple entry/exit points
  * namespace juggling (badnop)
  * parser cleanups + new syntax for scheme expr + code quotations
  * documentation
  * simulator ideas + dataflow language  
  
so i did get things done, they were just more difficult than
anticipated.. all of them involved significant choices and
backtracking on dead ends, not much straightforward coding as i
expected. maybe that's a good thing in the end.. it's just that now
i'm a bit drained on the creative front.

once compilation works (maybe tomorrow?) the road onward should really
be straightforward: port the interaction code, and solve bugs in the
compiler that are exposed. the goal should be to move to a working
'ping' beginning next week.


Entry: serialization for incremental dev
Date: Wed Jun 11 00:03:33 CEST 2008

it does look like the serialization problem for incremental
development is relatively easy to solve: save the forth words from the
namespace, and rebuild it later by loading macros from source code,
moving them to a new namespace, and augmenting them with compiler
macros for the serialized words.


Entry: m-exit
Date: Wed Jun 11 09:49:43 CEST 2008

ok, another choice to be made.
does 'macro->data' use 2stack or compilation-state or something else?

maybe it can be parameterized?

since it's more about a configuration issue than anything else: when
not using mexit and in-line word creation, use 2stack, otherwise use
the extended compilation state.

ok, something else to clean up:

    state:2stack : create a state object with update function
    make-2stack  : create the raw struct

now, the solution i have is to use a parameter called
macro-eval-init-state, but isn't it better to store this kind of type
information in the macro itself? in fact, this is a universal
property: each scat/rep.ss word has a native type on which it
operates, so let's make that mandatory. the type class can be
represented by a constructor for an empty state.

plan changed:

   * add a new record to words to indicate type. the type is actually
     a state constructor.

   * new-state:2stack -> parameterized constructor
   * state:2stack     -> type value

ha, this doesn't work for compositions! there it needs to be inferred
at compile time, but that's not possible. anyway, i'm going to keep it
to see where it ends up. maybe an order relation for state types can
be defined, so at least this type analysis can be performed at
run-time.

no.. it's too flakey, let's get rid of it.

Entry: error reporting
Date: Wed Jun 11 11:27:43 CEST 2008

box> (assemble! (all-code))
asm-overflow: (bpz (p R) ((112 . 7) (p . 1) (R . 8))) (bpz 126 1) (-131 8 -1)

that's nice, but where does it come from? what i want to know here is:

       * where in the assembly code does this happen
       * what is the corresponding source location

the latter might be difficult, but at least it should be possible to
find out in which word this is.

ok, got better error reporting now: it tells you which word chain it's
in, and dumps out the assembler code before it re-raises the
exception.

/home/tom/staapl/pic18/interpreter.f:59:2: n!f+
n!f+:
	[jsr 0 async.rx>]
	[movwf 4068 0]
	[drop]
_L72:
	[jsr 0 async.rx>]
	[movwf 4085 0]
	[tblwt*+]
	[drop]
_L75:
	[decf 4071 1 0]
	[bpz _L72 1]
	[movf POSTDEC1 1 0]
	[jsr 1 ack]
asm-overflow: (bpz (p R) ((112 . 7) (p . 1) (R . 8))) (bpz 129 1) (-134 8 -1)


the problem is quite clear now: relative instructions need to be
initialized differently so they don't overflow in the first pass.

no.. problem is something else:

 (bpz      (p R)     "1110 000p RRRR RRRR")


p is the first argument.

where did that come from?

i think i remember: all jump instructions were changed such that the
target address is the first argument. however, apparently the
assembler hasn't changed accordingly. this looks like a relic. let's
change it back..

ok, seems to work now.

next problem: all jumps are long. (typo)
next problem: dead code elimination for jump tables (fixed)
next problem: org


Entry: comma
Date: Wed Jun 11 16:42:06 CEST 2008

the problem with comma is that it is one of the reflective
words. (like 'constant' it accesses the run-time stack and produces
code).  i use it in purrr to change postponed literal semantics to
inlined raw words / bytes.

since this has no standard semantics i took the freedom to use it as a
replacement for ';' for jump-tables, one that doesn't terminate the
code.

is there a better way to do this actually? can't the current code word
be marked that it contains a run-time jump and thus needs to have
chain splitting disabled?

let's keep it manual: write a macro on top of the low-level
dispatcher later.


Entry: some slogans..
Date: Wed Jun 11 17:37:45 CEST 2008

.. to later remember why some decisions are taken

* it's important to have purely functional macros + some abstract way
  of handling the threaded state.

* for the parser i've opted not to, because this purely functional
  infrastructure is not necessary: it's merely a frontend for forth
  code, and has a composition mechanism in the form of substitution
  macros.  internally it uses an explicit serial interpreter ('next'
  routine).

* i'm trying to find a good trade-off between low-level control
  (i.e. raw jump tables, where the language is basically an assembler)
  and high-level code analysis and manipulation, which serve as the
  basis for high-level metaprogramming constructs.

* yes, i like to split the code in chunks of about 300 lines


Entry: org
Date: Wed Jun 11 20:38:24 CEST 2008

so, maybe just hack it. whenever a name is a number, it's a permanent
org change. the obvious requirement is that it's an expression that
can be evaluated at compile time.

this is already used: if target name is a number, the current chain
will be assembled inside a code pointer push/pop.

in macro/instantiate.ss the function 'combine-if-org' is used to
combine multiple chains if the current store has an org specified, to
make sure it stays bundled.

so, what needs to be done is a simple marker in the assembly code that
sets the code pointer. that's all really.. why is this so difficult?
because it's a dirty operation, and there's no clean way to do it. it
probably needs some management at some point, i.e. to disallow it for
certain code contexts..

to summarize:


   ORG FOR CHAINS:
       * compiler uses state push/pop to control stack
       * on org change: all chains are combined
       * assembler recognizes org chains

   PERMANENT ORG:
       * some 'magic packet' in the chain stream.

permanent org needs to terminate the current chain, but doesn't need
to link up chains.

let's just make names (org <number>) and (org! <number>)

Entry: compiler state operations
Date: Thu Jun 12 12:19:39 CEST 2008

These need to be factored out a bit.

I do need to be careful about providing primitives that are
non-destructive, and leave destruction to simple destructors.

org-pop is:   terminate-chain
              combine-chains
              pop-chain

I'm going to leave this as is until I need a different mechanism that
needs to split/merge the compiler state.

AHA: split and merge. 

pop-chain =  terminate-chain
             combine-chains    ;; only for org
             merge-state

push-chain = split-state

split-state: save current asm, rs and dict on the control stack, and
             start with a clean slate.

merge-state: merge current asm, rs and dict with the one on the
             control stack.
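a sketch of split/merge in python (the merge policy here is a guess, not the actual staapl behaviour): split-state saves asm, rs and dict on the control stack and starts with a clean slate; merge-state combines the clean slate back into what was saved.

```python
class CompState:
    def __init__(self):
        self.asm, self.rs, self.dict = [], [], {}
        self.control = []            # control stack of saved states

    def split_state(self):
        # save current asm, rs and dict; start with a clean slate
        self.control.append((self.asm, self.rs, self.dict))
        self.asm, self.rs, self.dict = [], [], {}

    def merge_state(self):
        # merge current asm, rs and dict with the saved ones
        asm, rs, d = self.control.pop()
        self.asm  = asm + self.asm
        self.rs   = rs + self.rs
        self.dict = {**d, **self.dict}
```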


Entry: Compiler Code Hierarchy
Date: Thu Jun 12 13:06:34 CEST 2008

During compilation the assembly code (the result of instantiating
macros) is organized in the following hierarchy:

  * A word is a single entry point, represented by a target-word
    structure associated to a chunk, which is a list of consecutive
    assembly code instructions. Code inside a word can only be
    reached through a jump to its label, and is thus not observable
    to the world. Words serve as the unit of code generation (and
    recombination). Any operation on code that doesn't alter
    semantics is legal within a chunk.

  * A chain is a list of words (chunks) with implicit
    fallthrough. Each word indicates a single entry point. Chains
    are terminated by exit points. Chains are the unit of target
    address allocation: each chain can be associated to an address
    independent of other chains. Some chains have fixed addresses
    (org).

  * The store is a stack of recently constructed chains.
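the hierarchy could be mirrored in a data structure roughly like this (illustrative python, not the actual staapl representation):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Word:                  # single entry point + its chunk
    label: str
    chunk: List[str] = field(default_factory=list)   # assembly instructions

@dataclass
class Chain:                 # words with implicit fallthrough
    words: List[Word] = field(default_factory=list)
    org: Optional[int] = None    # fixed address, if any

store: List[Chain] = []      # stack of recently constructed chains
```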

Entry: next
Date: Thu Jun 12 14:08:10 CEST 2008

fix some bugs.. 

 * "if ; then" doesn't work  -> plug-in library support for macros

 * labels are registered when compilation fails. is this ok?

 * looks like macro delegation in 'pattern' is not a tail-call, is
   this a problem?

 * begin doesn't work (typo)

 * comma: byte or word?

 * org is sometimes dead code, sometimes not? silly: gets redefined. +
   another problem.. some org labels seemingly get dropped.


Entry: library fallback
Date: Thu Jun 12 15:01:40 CEST 2008

It would be really nice to be able to automatically link in
functionality. It's actually not too difficult to do so, but it
needs access to the filesystem, which currently is only possible in
the forth syntax layer (macros are pure).

So, where to put it?

This was solved using macro redefinition, without automatic inclusion
of library functions.

The convention is to prefix library fallback functions with a tilde
'~' character.

Entry: comma
Date: Thu Jun 12 17:22:57 CEST 2008

In case of pic18, should comma compile bytes or words? Byte tables are
useful, but the native code word size is 2 bytes, and all code words
use word addresses. Since comma is mostly for data tables it's
probably best to let it compile data word size instead of code word
size when they are different.



Entry: next
Date: Fri Jun 13 17:51:16 CEST 2008

i think i got the most important bugs nailed down. time to go from
code -> ihex and upload something.

(save-ihex (all-binary-code) "/tmp/broem.hex")


Entry: words and chains
Date: Fri Jun 13 22:02:10 CEST 2008

I'm already regretting the internal linking of word structures. It
feels unnatural to have to convert things to a list, and have to
remember if something is a sequence or not. On the other hand, it's a
clear sign of fallthrough: a word can never be mistaken to be
standalone..

Maybe i should define iterators / comprehensions?

It's probably better to define an explicit chain type, instead of
using words for that.. A chain is a list of words. The address of a
chain is the address of the first word. 

Basic idea: directly encoding fallthrough from a given word instance
is less important than having a proper data structure that
distinguishes entry points from code grouping due to fallthrough.



Entry: Toplevel vs. module namespaces
Date: Fri Jun 13 19:35:37 CEST 2008

When you talk to somebody, you'd like them to remember what you were
talking about before. Conversation needs context. 

When you read something, you hope for all context to be explained in
the text you are reading. Exposition needs completeness.

Same goes with interaction to a machine.  This is the image based
accumulative repl vs. the transparent repl debate.

I find it quite interesting to give it some thought, as both are
valuable for some uses. This is about load vs. require: Interactive
incremental development vs. the 'run' button.


(explain)



Entry: machine code org
Date: Sat Jun 14 10:05:19 CEST 2008

* instruction -> list of binary machine code
* word
* chain
* store

    In the code base, whenever assembly code, binary code, or word
    structures appear in a list, it is REVERSE SORTED. This is easier
    from the point of view of compilation.

    Code ANALYSIS needs the reverse of that: code is linked in the
    direction of instruction flow. This is how words are internally
    linked: given a target word (entry point) its inline fallthrough
    can be easily obtained using 'target-chain->list', which again
    returns reversed list.
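in other words (a python sketch of the convention, names invented): compilation conses each new instruction onto the front, so stored code is reverse sorted, and analysis reverses it back into instruction-flow order.

```python
def emit(code, ins):
    return [ins] + code          # cons: newest instruction in front

code = []
for ins in ["dup", "movlw 2", "return"]:
    code = emit(code, ins)
# code is now reverse sorted

def chain_to_list(code):
    # analysis wants the direction of instruction flow
    return list(reversed(code))
```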


Entry: addresses
Date: Sat Jun 14 10:26:20 CEST 2008

The Forth uses byte addresses, because it might access data bytes in
flash memory.

The Assembler uses word addresses, because this is the basic unit for
flash memory.



Entry: Don't step on composition.
Date: Sat Jun 14 20:45:55 CEST 2008

Whenever you write a program, do not EVER limit the way in which
people can combine your primitives. This is actually quite hard:
limitations tend to creep in whenever a minilanguage arises. E.g. in
Brood-4 there was a problem with interaction macros: they did not have
a composition method that was easily accessible from Forth files.

The catch-all solution in Brood-5 is to provide access to the
underlying scheme core through Forth syntax.

(explain)


Entry: The compiler dictionary structure.
Date: Sun Jun 15 09:10:46 CEST 2008

There's a problem right now which loses labels, probably because the
state reaping in the compiler is too complicated:

 asm code collects with current word -> current chain
 chains collect to current store
 this state can be pushed/popped.

Drops should be made explicit. Trouble is I see no drops.

Problem is probably that 'terminate-chain' for 'org' also needs to
terminate the current word. The bug occurs after 'comma' of the config
values, which are not terminated.. So, let's make sure that
terminate-chain only works for #f labels.

No, that's not it.. weird one.. needs proper machinery to track down.


Entry: dataflow language
Date: Sun Jun 15 09:39:48 CEST 2008

Compiling an expression language to C is fairly trivial, since C has
an expression language built in. GCC also has SSA (static single
assignment) form, so presenting C code that uses single assignment
should be ok. Expression evaluation is straightforward, so trusting
GCC to handle this properly should be no problem. GCC also has a
mechanism for proper tail calls:

http://community.schemewiki.org/?gcc-does-no-flow-analysis

So, as long as there are no first class functions or comprehensions,
compilation is really easy.


Entry: array comprehensions
Date: Sun Jun 15 10:09:14 CEST 2008


The trouble is then, the problem I want to solve is to generate
efficient code for a dataflow graph + array comprehension
combination. 

The real problems for comprehensions (translation to nested loops)
are performance (P) and correctness (C).

  * Inner loop generation (P)
  * Cache memory optimization (P)
  * Handle border conditions. (C)

Here the (P) items need to be fast, while the (C) item, where
applicable (i.e. in convolution), might be only approximately correct.

There is a discussion on the concatenative list about Backus' FP and
APL not having first class functions, but comprehensions. Is it fair
to say that this is the thing to do for numerical code? First class
functions are overkill, but anything you would want to solve with them
can be solved with comprehensions: you turn higher order operations
into syntax, which converts them to easily inlined loops.
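a small illustration of that point (hypothetical python generating C-like text): the higher order operation 'map' implemented as syntax expansion at compile time, so no function value exists at run time.

```python
def expand_map(op_template, src, dst, n):
    # op_template is applied syntactically, e.g. "2 * {x}"; the result
    # is an unrolled loop body, trivially inlinable
    return "\n".join(
        f"{dst}[{i}] = {op_template.format(x=f'{src}[{i}]')};"
        for i in range(n))

body = expand_map("2 * {x}", "a", "b", 2)
```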

There's a thread on concatenative about this:
http://www.nabble.com/Joy%27s-relationship-to-FP-%2B-a-Joy-variant-with-combining-forms-td17576284.html

My answer, without thinking too much, would be: macros. Languages
based on composition can have partial evaluation. Higher order macros
can expand to combination forms: Have higher order macros, but don't
allow such functions at run time. Is this cheating, or completely
beside the point?

   "We owe a great debt to Kenneth Iverson for showing us that there
   are programs that are neither word-at-a-time nor dependent on
   lambda expressions, and for introducing us to the use of new
   functional forms." - John Backus, 'Can Programming Be Liberated
   from the von Neumann Style?'

The basic idea: use concatenative syntax for specification of:

* pure functions, which are translated to a dataflow representation.
* higher order macros which serve as combining forms.


Maybe I should try to answer this, and relate it to unrolled
reflection and partial evaluation:

http://www.nabble.com/Re%3A-Joy%27s-relationship-to-FP-%2B-a-Joy-variant-with-combining-forms-p17802001.html

I don't know how though.. Maybe best to try the combinator route first.


   The idea here is that "combining forms" as found in FP and APL are
   related to the macros in Purrr that are not compilable. My stress
   on macros is really about giving up first class functions at
   runtime, but not at compile time.



Entry: data flow + aspects
Date: Sun Jun 15 10:09:26 CEST 2008

So, for simple scalar DSP code (i.e. no FIR filters), it should be
possible to define such a language fairly easily, and concentrate on
'hints' for compilation.

What I mean is this: make the description highlevel, and add hints to
influence compilation. These hints would choose number systems,
scalings and bit widths for fixed point, etc..


Entry: for .. next
Date: Sun Jun 15 11:36:02 CEST 2008

The (hypothetical) higher order macro:

   [ ... ] for-next

and the Forth equivalent

   for ... next

do the same thing, but in current RPN representation, 'for-next' does
have access to the macro quotation to perform optimizations.

It would be nice to make the first one primitive. In the current
implementation however, this is difficult to do because the ... in the
2nd form is evaluated before 'next' is evaluated. This approach would
need a change in representation.

It does seem that using quotations directly is a far better approach:
it doesn't need any forensics to recover quotations from flat Forth
syntax. A limited translation of Forth to this form as a syntactic
operation might be feasible, but it's not possible to take it all..

This is at the core of the conflict of 2 forces in Purrr:

  (L) The lowlevel compiler which gives explicit control to the
      programmer about how to use nesting.

  (H) The highlevel approach based on explicit quotations and
      s-expression syntax.

Unifying both is difficult, but they can probably be built on common
ground: providing a macro language with simple (conditional) jump
primitives. This is (virtual) machine design: the Purrr control
primitives.


Entry: Purrr control primitives
Date: Sun Jun 15 12:07:38 CEST 2008

conditional jumps : VM primitives



Entry: partial evaluation of higher order functions
Date: Sun Jun 15 23:46:16 CEST 2008

  * higher order macros (HOM)
  * list comprehensions
  * combining forms + 1st order functions (FP)
  * forth -> pure macros. what is this level shift?
  * deforestation

EDIT: STA(ck)APL

According to the wikipedia FP entry, limiting a language to 1st order
functions and a limited number of (non-composable) 2nd order functions
creates a simple algebraic structure. Combining forms are quite simply
defined as 2nd order functions.

List comprehensions are similar to limited order combining forms: they
avoid the use of higher order functions to perform
iteration/folding. However, they do not generate first order functions
as objects: they are merely syntax.

I found a definition for HOMs here:
http://foldoc.org/?higher-order+macro

   "A means of expressing certain higher-order functions in a first
    order language."

P.L. Wadler "Deforestation: Transforming programs to eliminate trees"
http://www.springerlink.com/content/7217v376n7388582/

original paper:
http://homepages.inf.ed.ac.uk/wadler/papers/deforest/deforest.ps

Where i'd like to end up is to find the relationship between my forth
macro approach and fixed / composable 2nd order functions. I need a
theoretical framework, probably some type system restriction, to get
out of the anything->anything lisp world.

What I use this for is not necessarily to define a clean language
specification, but to see if it can help choose between higher order
functions and inlined expansion: I have higher order functions, but
would like to have inlining as an optimization.

Somewhat related, i can make things such that each word is linked to
its originating macro. This might lead to functions that are both
instantiated, and available as a macro whenever they are used in a
combining form that cannot accept a function at run time.



Entry: more input
Date: Mon Jun 16 16:18:41 CEST 2008

Shopping around for input:
http://citeseer.ist.psu.edu/440438.html

 "Macros as Multi-Stage Computations: Type-Safe, Generative, Binding
 Macros in MacroML (2001)"

This is actually an interesting paper which deals with a lot of stuff
that's on my mind now. Might be interesting to take my implementation
directed approach a bit further. Macros in statically typed languages:
difficult? Is syntax tree manipulation in dynamic languages less
ad-hoc?

This is an interesting hub-paper.


Entry: preparing for state shuffling
Date: Mon Jun 16 16:29:14 CEST 2008

the current syntax in instantiate.ss and 'state-lambda' is not very
good. i already ran into trouble mixing out variables into something
returning a next state object + data.

it's also not abstract enough to be able to replace structure type
based encoding with list (stack of stacks?) encoding.

-> use syntax parameters

Oh, they are not so scary :)
Maybe it's best to make the rpn transformer like that too, instead of
all these compile time parameters. That's a major overhaul that could
be used to write it in cps actually.


Entry: wandering into confusion
Date: Tue Jun 17 19:13:25 CEST 2008

In very general terms, what I want to do is to see if I can do DSP in
a concatenative language with the proper higher order combinators for
list/array processing, and find a way to optimize all combinators away
to produce optimal loop code. That's all really: design a language
that feels high-level and lispy, but is guaranteed to compile to
efficient constructs.

My hunch is that this is actually not so very difficult to do with a
concatenative language. What I don't see is how to perform
instantiation automatically. How to move from high -> low, treating
higher order functions as macros. And how much of this is just
re-inventing APL?

As far as I read, Wadler's deforestation paper deals with this kind of
higher order macros in section 5. Do I need to understand
deforestation before being able to use these macros? I don't really
need arbitrary intermediate tree data structures: i'm really just
using arrays. Looks like deforestation is necessarily expressed in a
first-order language, and higher order macros are a way to get some of
the effects of higher order functions in a first order language.

Thinking about this, it probably needs automatic instantiation: no
manual macro/forth defs.

Now this:
http://www.cs.ucsd.edu/~goguen/ps/will.ps.gz

looks like it is pretty close to what i'm doing. I was wondering about
how to relate C++ templates to purrr composable macros, but looks like
the root of this idea is in the OBJ language.


Entry: apology
Date: Tue Jun 17 23:39:50 CEST 2008

Why am I doing this? It is really about the language, about its
algebraic feel. Maybe I should be honest and keep that as the only
real reason. It's like Legos. It clicks.
 
Then there are explanations of why I might like it:

  * Concatenative languages span a wide spectrum in a useful way. This
    allows me to use similar paradigms from the very low to the very
    high. 

  * One can get far without closures (which take the form of curried
    quotations created at run-time).

  * Partial evaluation is simple for functional concatenative
    languages: scopes don't get in your way. 

  * Imperative concatenative language can have a large functional
    subset.

  * Linear memory management becomes non-intrusive.


Some fragments of correspondence. Question about breaking new ground
with the Staapl approach:

Well, I know for a while now I need some form of compile-time program
specialization that can turn higher order functions into specialized first
order loops. The real question is how to simplify the programming
language such that the problem of writing the compiler can be solved
by me in a limited time. Doing it only with an interpreter + specialized
manually crafted C core routines (like PF, Pure Data, Supercollider,
Matlab, ...) is not powerful enough.

Untyped lambda calculus is too general to solve the problem with a simple
compiler. Typed lambda calculus works better but such a language is not so
straightforward to implement. So I'm looking at something first-order with
higher order macros, closer in spirit to APL, Backus' FP and C++
templates than lisp.

The only ground I broke is that I ended up with a non-intrusive way to
combine compile time operations and run time operations in one language
without semantic problems, simply by taking a functional programming view
where evaluation time might be thought of as unspecified.

Concatenative macros are a very natural way to do template programming,
because name bindings don't get in the way. Concatenative form can also be
easily transformed to nested expression form so when I need data flow
analysis I can do it, but for some program transformations it's really
easier to keep it concatenative. Code is more 'algebraic' and less 'logic'
in that form if that makes sense at all.. Lists instead of trees.

What I have now is still manual: there is no automatic loop inlining
happening. I'd like to figure out if this is necessarily a part of
the language (1st order language + some 2nd order functionals) or if i
can automate it, so it becomes a language with higher order semantics
preserved in case the opti doesn't apply.

So, while I find it interesting, I am getting into territory where I
should be careful not to be too general, and try to stick to the problem
of making a language which is very close to machine language, but has
access to higher order constructs. I'm already there: the macro
assembler on steroids idea: for PIC18 the bottom layer concatenative
language almost maps 1-1 to assembler. It's the automatic code juggling
part on top of it that is giving me headaches.


About writing beginner languages:

I did some workshops with forth now, and i find people pick it up pretty
fast. The real problem is not language though. Some languages go smoother
in the beginning than others, but I found the real problem to be the point
where you leave simple scripting (filling in parameter values) and code
composition enters the picture: how to divide and conquer. I think a
beginner language should stretch the scripting part as long as possible,
but i sort of gave up on that idea. It only makes hitting the abstraction
wall more painful. What i got a bit discouraged about is that more often
than not, no matter what you try people like to stay in that scripting
area. I don't know if there's a way to trick people into crossing that
barrier unknowingly. Did you run into something like this with scheme?



Entry: about deforestation
Date: Wed Jun 18 12:39:51 CEST 2008

Maybe his earlier work on lists is better suited to translate to my
problem: folding combinators for array processing. 

I'm facing a gap between my understanding of theory, and a particular
practical problem. I should just try to solve one such problem with
higher order combinators to get a better view about concrete problems,
to get out of the muck of abstract confusion..

The problem is really one of datastructures and their iterators. In
FP, there are only nested arrays and some map and shift operators.

So.. Deforestation is about eliminating intermediate nested data
structures in a first order language with recursive pattern matching
for tree deconstruction.

Wadler defines the property 'treeless', which is used to construct
transformation rules to transform a composition of treeless functions
into a treeless function.

Definition: a term is treeless with respect to a set F of function
names if:

  * it is linear (each variable is used only once. this is to make
    sure transformations don't introduce redundant computations, and
    can be relaxed for integers)

  * it only contains functions in F (these are the 'exceptions' that
    will be expanded)

  * every argument of a function application and every selector in a
    case term is a variable. (obviously, otherwise there would be an
    intermediate tree result)

The algorithm that performs deforestation maps a linear term which
contains variables and functions with treeless definitions to a
treeless term and a possibly empty set of treeless definitions.

The core of the algorithm is the standard function 'inlining': replace
each function application with an inlined definition.
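To make the effect concrete, here is the standard sum-of-squares illustration (my example, not the paper's notation): the composed version builds an intermediate list, the deforested version runs as one loop.

```racket
#lang racket/base

;; Naive version: builds an intermediate list of squares.
(define (sum-of-squares n)
  (foldl + 0 (map (lambda (i) (* i i)) (build-list n values))))

;; Deforested version: the same function as a single loop,
;; with no intermediate list.
(define (sum-of-squares/fused n)
  (for/fold ((acc 0)) ((i (in-range n)))
    (+ acc (* i i))))

;; (sum-of-squares 4) = (sum-of-squares/fused 4) = 0+1+4+9 = 14
```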

If i start to think from that point, shouldn't i get to something
simple? After all, i have no variable names to worry about, and no
destructuring or creation of run-time data structures. That is all
quite different.

So, question: what is the equivalent of deforestation in a
concatenative language with a simply managed array data structure and
a 'map' operator?

Let's write down some things that need to 'actually happen'

  * loop transformation to eliminate intermediate buffers
  * array memory management and reuse (linearity?)
  * dereferencing indirect addressing (on PIC18)

Dereferencing indirect addressing is a particular problem i ran into
writing highly specialized DSP code for microcontroller. It's a fairly
extreme level of templating which makes sense due to very limited
indirect addressing and multiplication on PIC18.

There is a difference between translating 'map f l' to (cons (f l1)
...) and making sure the same operation happens in place, or with a
fast reuse array.

There's something in Wadler's paper about
instantiate/unfold/simplify/fold, found in Burstall and Darlington: "A
transformation system for developing recursive programs."


Entry: eliminating intermediates in a concatenative language
Date: Wed Jun 18 15:14:52 CEST 2008

Suppose [ f map ] applies an operation to a number of data structures
and produces a number of data structures, according to the arity [ f
].

The transformation that eliminates intermediate storage is:

   [ f map g map ] -> [ [ f g ] map ]

In opti talk this is called loop fusion. For simple video processing,
this is about the single most important optimization: it eliminates
storage of intermediate frames, which take up a large part of cache
memory. This optimization is in practice bounded by:

   - dependency depth for deep pipelines
   - instruction cache size

Naively, to take care of those issues it can be beneficial to limit
loop fusion and allow for a limited-size intermediate buffer. These
2nd order problems can be ignored for now. The most important life
saver is loop fusion, which, if not for speed, can save a lot of
memory.
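A minimal sketch of the fusion rule as a rewrite on quoted code, assuming a representation where quotations are nested lists. The name 'fuse' and the representation are invented for this example; this is not the Staapl pattern language.

```racket
#lang racket/base
(require racket/match)

;; Programs are lists of words; quotations are nested lists,
;; e.g. '((f) map (g) map). Two adjacent maps fuse into one map
;; over the composed quotation.
(define (fuse prog)
  (match prog
    ((list-rest (? list? f) 'map (? list? g) 'map rest)
     (fuse (list* (append f g) 'map rest)))   ;; apply the rule, retry
    ((cons w rest) (cons w (fuse rest)))      ;; keep other words
    ('() '())))

;; (fuse '((f) map (g) map (h) map))  =>  '((f g h) map)
```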

Translating this optimization to current concatenative macro
architecture, it requires access to all functions in macro form. Final
instantiation of 'map' can be the generation of a for loop and buffer
allocation, but the important step is the fusion. How to use current
code substitution rules to implement this?

For one, it requires a 2-pass algorithm in the current form. The map
macro cannot be instantiated until all fusion has happened. Postponed
partial evaluation is solved in the pattern language using pseudo
assembler instructions.

EDIT:
http://www.randomhacks.net/articles/2007/02/10/map-fusion-and-haskell-performance
apparently, in haskell this kind of optimization is pluggable

There are a couple of interesting links in that article.

It's all starting to become a bit more clear: concentrate on
properties of higher order combinators. One of the links seems to be
about automatically deriving these kinds of rules (Wadler's Theorems
For Free).
 

Entry: code transformations
Date: Wed Jun 18 17:22:49 CEST 2008

The multi-pass optimization algorithm has an ad-hoc form: the first
pass instantiates macros, while subsequent passes perform specific
substitutions only. I can't express it properly, but shouldn't code
originally have the form ([qw <macro1>] [qw <macro2>] ...) with a
'run' application?

An important question: is there a way to get rid of this 2-pass
mechanism? Is it necessary or just more elegant to use a demand driven
pipeline?

To implement  [ f map g map ] -> [ [ f g ] map ] we need to represent
'map' as a pseudo op:

  (([map f] [qw g] map) ([map (macro: f g)]))
  (([qw f] map)         ([map f]))

And the map can be eliminated in a 2nd pass by instantiating it:

  (([map f] map-pass) (macro: f do-map))

This is entirely too trivial, so all the beef is hidden in the
instantiation of 'map'. Which makes me think, what is 'map'? An
implementation of a data structure iterator + a specification of its
abstract properties used for source manipulation.

next: Think about 'fold' over arbitrary data structures and related
things like loops, iterators, comprehensions, compile time folding,
...

There's some rewriting in Cat that does this
http://www.cdiggins.com/cat/cat.pdf


Entry: doubts about compilation
Date: Thu Jun 19 09:12:11 CEST 2008

I'm wondering whether it isn't better to keep the macro code
representation in list form. The assembler output is again a
concatenative language in source form, so why doesn't it have the same
form as the input?

Also, maybe the eager algorithm is too simple?

The problem is that given a composition [a b c] it could be split as
[ab c] or [a bc]. One could be significantly simpler than the other
syntactically, while semantically they denote the same function. An
eager algorithm is always going to pick the first option. This is the
reason why some rewriting operations are postponed to a next pass.

  More specifically: the multipass algorithm is now a simplification
  of a general non-deterministic algorithm that searches for the ideal
  combination of terms.

Maybe multiple passes need to be defined more abstractly?


Entry: back to fixing bugs
Date: Thu Jun 19 11:39:57 CEST 2008

since i'm just getting confused over ideas that need fermentation and
more reading, maybe best to start fixing bugs.

the alleged problem with 'org' seems to be a problem with forth/macro
mode switching.

OK:

macro
\ : config #x10 ;
forth  

#x20 org : bla

WRONG:

macro
: config #x10 ;
forth  

#x20 org : bla


probably 'forth' needs to terminate previous macro defs, otherwise
non-labeled code will be concatenated to the last macro def.

looks like that was indeed the problem.

that was the last known bug in the way of uploading code. time for
hands-on!


Entry: interaction code
Date: Thu Jun 19 13:25:59 CEST 2008

might be better to try to get some communication going with the
previous monitor, before uploading the freshly compiled one.

start with target.ss

first part is upgrade to plt 4.0
for loops: comprehensions.


Entry: lazy-connect: book vs. conversation
Date: Thu Jun 19 15:06:04 CEST 2008

about the 'current state' issue for interactive development. i guess
it's ok to have state. the previous approach of making everything
temporary is maybe a bit too brutal.

i.e. a current connection is really ok. use custodians to manage that
kind of stuff, not parameters.

On the other hand, for lowlevel interaction it might be a good idea to
flush buffers on every message exchange, since things tend to go
wrong.

basic interaction works:
box> (with-io-device '("/dev/ttyUSB0" 9600) (lambda () (scat> ping)))
CATkit 
<0>
box> 

ordinary target access seems to work without trouble. the thing that
needs to change is interaction with the target dictionary, which is
now a scheme namespace + serialization, incremental compilation and
code upload.


Entry: dictionary / namespace
Date: Thu Jun 19 20:51:48 CEST 2008

Should the interaction code live in the same namespace as the
compiler? It would be nice to be able to specify interaction code in
source files, so maybe it's best to do that.

I need to be careful here not to fall into the same pit: host side
code should be composable, and interaction templates too. Prefix
syntax is ideal to override the default target semantics, so I need to
have it, but it should be composable.



Entry: data types + HOF
Date: Thu Jun 19 22:08:06 CEST 2008

So, along the lines of
http://www.randomhacks.net/articles/2007/02/10/map-fusion-and-haskell-performance

The basic idea is: whenever you define a data type and a
map/fold/... HOF you need to somehow obtain transformation rules to
simplify compositions, by moving operations inside loops.

The problem is, mapping this to Purrr, it's quite easy to add these
transformation rules (manually), but the problem is really the
representation of the data type. How to add the runtime support for
say a video frame?

It is also pretty clear that typing is essential here: i'd really like
'+' to be polymorphic, so i can make every occurrence of 'map'
implicit. Functions should be upgraded (coerced?) automatically.

I'm getting confused now.. Polymorphic macros. Is assembly pattern
matching a genuine type system?


Just had a look at the wikipedia C++ template page, and it says: 

   a feature of the C++ programming language that allow code to be
   written without consideration of the data type with which it will
   eventually be used

It looks like what i'm doing is more general than that (compile time
decisions based on literal values), but on the other hand, you can
probably hide everything that can be done with values in Purrr, in
classes in C++.

Maybe this is an essential difference really: value based
metaprogramming instead of type based? Does this make sense at all?
( Objects instead of classes? Prototype templates? )



Entry: Faust
Date: Fri Jun 20 00:28:30 CEST 2008

http://faust.grame.fr/

basically what i want to do, but i don't like the syntax. makes me think
that i'm on the right track with a concatenative specification
frontend to solve the 'bussing' problem: connecting multi in/out
things..

another thing to solve when doing a dsp language like that is
block-based algorithms.

so, am i on the right way with programmable macro semantics? say i can
use a pic18 program that imports a module with a different macro
semantics that produces a static dataflow network + buffer management?

maybe i need to just write another synth to pull this thing
through. something more classic dataflow + feedback, with emphasis on
compilation to pic18 architecture.

NOTE: translating concatenative code to expressions has one advantage:
it makes usage explicit, so allocation might be simpler?


Entry: rewriting
Date: Fri Jun 20 00:51:18 CEST 2008

should i give up eager pattern matching, and move to a different
rewriting system, or is the current one good enough when it's equipped
with an easier to use multipass architecture?

this is interesting:
http://lambda-the-ultimate.org/node/1658#comment-20313


Entry: live parsing words
Date: Fri Jun 20 09:54:32 CEST 2008

problem is that this is a map from (live) -> (scat) while the other
substitutions macro defines endomaps.

it's straightforward to map to a different namespace:

  (unquote (ns (scat) id))

but doing this one loses nested macros, which i just conveniently
used for defining substitution-types. (using a primitive language, one
needs 2 composition methods)

wait.. the live->live substitutions for 2sim can remain as is. just
need to define the primitive properly. it's probably easiest to just
use quoted code + run.

next: tfind



Entry: 3 different languages
Date: Fri Jun 20 11:01:48 CEST 2008

compositions:          (name w1 w2 w3)
postfix asm patterns:  ((a1 a2 name) (b1 b2 ...))
prefix substitutions:  ((name a1 a2) (w1 w2 w3 ...))

Compositions are the core of the functional language. Postfix asm
patterns are used to implement eager rewrite rules during translation
and prefix substitutions are used for changing semantics of symbolic
names and numbers.


Entry: the meta namespace
Date: Fri Jun 20 12:06:22 CEST 2008

i run into a problem with different instances of the target-word
structure. maybe the solution is to make sure badnop runs in a module
namespace?

looks like the problem is with using the namespace anchor attached to
the module namespace.. maybe best to attach it to the repl's toplevel
namespace.

There's still some confusion: if a module A is in namespace NS, and
module B is required, but B requires A, then A will be
re-instantiated, right? Unless NS is a module namespace.

Let's see, what's the difference between:



> (define ns (make-base-namespace))
> (dynamic-require "target/rep.ss" #f)
instantiating target/rep
> (namespace-require "target/rep.ss")
> (namespace-attach-module (current-namespace) "target/rep.ss" ns)
> (parameterize ((current-namespace ns)) (namespace-require "target/rep.ss"))   

and doing this from within a module doesn't work...

An explanation: when a require form is evaluated inside a module, the
module registry of the required module is not the same as that of the
namespace in which it is required.

A toy example:

(module A scheme/base (printf "instantiating A\n"))
(module B scheme/base (require 'A) (printf "instantiating B\n"))

box> (require 'A)
instantiating A
box> (require 'B)
instantiating B
box>

again:

box> (require 'B)
instantiating A
instantiating B
box> 

so... A is not re-instantiated.. what am i doing wrong?

ok.. the namespace.ss code works just fine:


;; Create a namespace with shared and private module instances.
(define (shared/initial-namespace src-ns shared private)
  (let ((dst-ns (make-base-namespace)))

    ;; See PLT 4.0 guide, section 16.3
    ;; Reflection and Dynamic Evaluation -> Sharing Data and Code Across Namespaces
    (define (load-shared mod)
      (parameterize ((current-namespace src-ns))  ;; make sure it's there
        (dynamic-require mod #f)
        (namespace-require mod))

      (namespace-attach-module src-ns mod dst-ns) ;; get instance from here
      (parameterize ((current-namespace dst-ns))  ;; create bindings
        (namespace-require mod))
      )
    
    (define (load-private mod)
      (parameterize ((current-namespace dst-ns))
        (dynamic-require mod #f)
        (namespace-require mod)))
    
    (for-each load-shared shared)
    (for-each load-private private)

    dst-ns))

The problem seems to be about other modules that are loaded into that
namespace. They seem to re-instantiate the target/rep module.

Ok, it was really really stupid:  

(namespace-require "pic18.ss")) ->  (namespace-require 'staapl/pic18))

Just a module name issue.


Entry: prj
Date: Fri Jun 20 13:54:13 CEST 2008

Prj is a small bit of glue to enable loading of different specialized
compilers in their own namespace. All namespaces share some code for
space efficiency and some datastructures so they can communicate
through the badnop layer.

Now, what about 'find'?

This is a reflective operation: it looks in the current toplevel
namespace to map a symbol to a value.

Ok, looks like everything is there now, just needs to be patched
together.


Entry: live interaction language
Date: Fri Jun 20 20:06:01 CEST 2008

the current interface doesn't seem to support what i want to do.

- to invoke macros properly i need to set the ide map to (live) prefix

- however, this map leads to undefined words for default semantics,
  which would take items from the (target) prefix.

however, it is possible to put the macros in the 'target' namespace.

there's a simple workaround: make sure the target word is executable,
or add an interpretation step. the latter is probably going to be
simplest, since performance is not really an issue here :)

used the interpretation step. interaction seems to work fine now. got
both a prj> and a live> language.

what is missing is incremental compilation with suspended state +
inspection.



Entry: terminology cleanup
Date: Sat Jun 21 10:21:53 CEST 2008

chunk:    list of binary/asm code associated to a word (entry point)
chain:    list of chunks with fallthrough

for manipulating blobs of code for ihex + upload, a different name is
necessary. let's call it blob then instead of chunk.

bin:      (number (listof number)) binary code list that happens to be
          consecutive

line:     upload unit (8 bytes on PIC18)
block:    erase unit (64 bytes on PIC18)


to ease uploading, the chunk/chain subdivision is converted to blobs,
which are then concatenated into bigger blobs and split into lines.

Entry: binary data objects
Date: Sat Jun 21 10:51:04 CEST 2008

Handling binary data involves a lot of fixed size tables. Instead of
using index lists, it might be more elegant to use 'for/list'
comprehensions and sequences.

This deserves a bit of attention. Let's define the type better.

  - bin = (listof binchunk)
  - binchunk = (list number (listof number))    : address + codelist

Operations:
  - splitting/joining bytes/words
  - line splitting
  - alignment
  - binchunk combinations


Entry: avoiding O(N^2)
Date: Sat Jun 21 12:10:04 CEST 2008

The problem of combining binchunks when they are consecutive is
interesting: I can't find a way to use my usual collection of higher
order functions without running into O(N^2) complexity due to iterated
append.

The problem: given a list of address/code pairs, create a new list
which combines them if they are consecutive.

((a c) (a c) ...)

To avoid O(N^2) and multiple traversals, the easiest way to do this is
using a state machine with an accumulator. But is it possible to use
HOFs for this, other than fold (which just factors out the explicit
recursion/looping), with the choice of using constant space (left
fold) or minimal hassle (right fold)?

Maybe i need to look into parser combinators, or try to write one
figuring out the core routines.

On the other hand, the right abstraction for this might be stream
processing: convert binchunks to a stream of word/address pairs, and
recombine them.

Let's try that first. Looks like i'm looking for 'for/fold'
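A for/fold sketch of the recombination: a single left-to-right pass with no iterated append, tracking the next expected address. The representation and names here are guesses for illustration, not the actual Staapl code.

```racket
#lang racket/base

;; binchunk = (list address (listof byte)).  Chunks whose address is
;; consecutive with the end of the chunk under construction are merged.
;; Code is accumulated in reverse and fixed up once per output chunk,
;; avoiding O(N^2) append.
(define (join-chunks chunks)
  (define (close start rcode done)   ;; finish the chunk in progress
    (if start (cons (list start (reverse rcode)) done) done))
  (define-values (done start next rcode)
    (for/fold ((done '()) (start #f) (next #f) (rcode '()))
              ((c (in-list chunks)))
      (define a (car c))
      (define code (cadr c))
      (if (and start (= a next))
          ;; consecutive: extend the current chunk
          (values done start (+ a (length code))
                  (append (reverse code) rcode))
          ;; gap: close the current chunk, start a new one
          (values (close start rcode done) a (+ a (length code))
                  (reverse code)))))
  (reverse (close start rcode done)))

;; (join-chunks '((0 (1 2)) (2 (3 4)) (10 (5))))
;; => '((0 (1 2 3 4)) (10 (5)))
```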

This is actually an interesting point where first order functions are
syntactically more convenient than higher order ones ('for' is a
form!). The difference really seems to be syntax: it's probably
straightforward to convert between HOF and comprehension form.

In the guide: 11.8 Iteration Performance. Looks like they've been
thinking about optimization too :)

This actually looks like a good candidate for a ``pre-scheme'' style
first order language that compiles to straight machine code without
runtime support.

I don't understand why there is no 'in-append'. This would be a nice
exercise for sequence combinators. 


Entry: Parsing combinators.
Date: Sat Jun 21 14:53:40 CEST 2008

A lot of code in Staapl is about converting one datastructure into
another one. Serializing one is simple, but collecting into another
seems more difficult.

Up to now I've been using manual stack manipulation to collect data
structures. Is there a better way to tackle this?

Almost always this is insertion into trees + postprocessing
(reversing).

Let's see..

stack levels
1 -> push            (a b c) -> (x a b c)
2 -> push + push'    ((a b c) (d e f)) -> ((x a b c) (d e f))
                                       -> ((x) (a b c) (d e f))

Let's start with a simple list-of-list parser with the operations:

->0
0->1
1->2


I tried it with vectors:

;; ----

#lang scheme/base
(require "list.ss")


(define (llp-push-level! v n x)
  (let ((stack (vector-ref v n)))
    (vector-set! v n (cons x stack))))


;; (llp-move! v n x)  push x to stack n
;; (llp-move! v n)    push stack n-1 to stack n
(define (llp-move! v n
             [x (let* ((n- (- n 1))
                       (x- (vector-ref v n-)))
                  (vector-set! v n- '()) ;; move to x
                  x-)])
  (llp-push-level! v n x))

(define (llp-push! v x)
  (llp-move! v 0 x))

(define (llp-compact! v n)
  (for ((i (in-range n)))
       (llp-move! v (add1 i))))

(define (make-llp n)
  (make-vector n '()))

(define v (make-llp 3))


;; ----


But it's probably better to just use lists, since the operations        
themselves are simple tree operations.

push:    (a b ...) -> ((x . a) b ...)
compact: (a b ...) -> (() (a . b) ...)

this becomes:

;; ----

;; Stack of stacks.
(define (make-sos n)
  (for/list ((i (in-range n))) '()))

;; Convert a collapsed sos to a list of lists, applying an operation
;; to each list level.
(define (sos->lol sos [op reverse])
  (let ((dim-1 (- (length sos) 1)))
    (let down ((l (list-ref sos dim-1))
               (n dim-1))
      (if (zero? n)
          (op l)
          (op (map (lambda (le) (down le (- n 1)))
                   l))))))
    
(define (sos-push sos x)
  (cons (cons x (car sos))
        (cdr sos)))

(define (sos-collapse sos n)
  (if (zero? n)
      sos
      (cons '()
            (sos-collapse
             (sos-push (cdr sos)
                       (car sos))
             (- n 1)))))
  

;; ----
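
A quick sanity check of the list version, using the definitions above
(hand-traced sketch): pushing two elements at level 0 and collapsing
once yields a one-element list of lists.

```scheme
;; build: (() ()) -> ((a) ()) -> ((b a) ()) -> collapse -> (() ((b a)))
(define s
  (sos-collapse
   (sos-push (sos-push (make-sos 2) 'a) 'b)
   1))

;; sos->lol reverses each level:
(sos->lol s)  ;; => ((a b))
```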


Entry: comprehensions + delimited control
Date: Mon Jun 23 08:14:22 CEST 2008

Looks like a nice alternative to lazy lists. Screams for coroutines,
so might want to add some abstraction to it.

http://groups.google.com/group/plt-scheme/browse_thread/thread/d0ff99391f9ac53f/f5de867c296afcbe?lnk=gst&q=yield#f5de867c296afcbe

;; 'iter' comes from the linked thread: presumably a delimited-control
;; based generator wrapper that returns the 'end' sentinel on exhaustion.
(define (in-yielder f)
  (define end (list 1))
  (define i (iter f end))
  (make-do-sequence (lambda ()
                      (values (lambda (_) (i))
                              void void void
                              (lambda (e) (not (eq? end e)))
                              void))))
(for/list ([x (in-yielder (lambda (yield)
                            (for-each yield '(1 2 3))))])
          x) 


Entry: enumerators vs. cursors
Date: Mon Jun 23 11:08:51 CEST 2008

http://okmij.org/ftp/Computation/Continuations.html#enumerator-stream
http://okmij.org/ftp/papers/LL3-collections-talk.pdf
http://okmij.org/ftp/Scheme/enumerators-callcc.html
http://lambda-the-ultimate.org/node/1882

Oleg makes a case for providing enumerators natively, and deriving
cursors from them if necessary, since they are the less useful
variant.

stream:     encapsulated iteration state
enumerator: collection fold

Comprehensions are similar to enumerators, but they do not iterate
over an abstracted data structure; they iterate over a concrete
sum/product of (possibly abstract) data structures. They are trivially
translated to a map/fold HOF + fold function + a data constructor.
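
For instance (a sketch), a 'for/list' comprehension translates
directly to a fold, with cons as the fold function and a final
reverse as the data constructor:

```scheme
#lang scheme/base

;; comprehension
(define squares-1
  (for/list ((x '(1 2 3))) (* x x)))

;; the same as an explicit fold + constructor (reverse)
(define squares-2
  (reverse
   (foldl (lambda (x acc) (cons (* x x) acc))
          '() '(1 2 3))))

;; both are (1 4 9)
```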

http://srfi.schemers.org/srfi-42/srfi-42.html
According to Sebastian Egner the main reason for this srfi is a simple
form for the naturals.

I find the middle way: using 'in-generator' from tools/seq.ss to
convert generators based on delimited control to sequences usable in
comprehensions quite convenient.


Entry: got ihex
Date: Mon Jun 23 11:48:23 CEST 2008

With the comprehension-based binary code formatters we're now at the
point where we can upload code and generate the monitor code in the
proper format:

:020000040030CA
:0E00000000C80F000080800003C003A0034072
:020000040000FA
:020000001FD00F
:020000040000FA
:02000800FFD027
:020000040000FA
:02001800FFD017
:020000040000FA
:10004000C8D09EBA01D0FDD7ABA203D0AB98AB8885
:10005000F8D7EC6EAE50ABA402D0ED50F2D7120040
:10006000ACB201D0FDD7AD6EED5012000500E9DF56
:10007000FD6EED50E6DFFE6EED501200EC6E000EF0
:10008000EFD7A68EA69C88D8F9D7A68EA69C87D82F
:10009000F5D70F0B8DD812000ED0E2D70ED041D07D
:1000A00047D0ECD7FF0019D00DD02CD020D047D0AE
:1000B00033D0E7D7EAD7C5DFE1D7D8DFDFD7C1DF55
:1000C000E8DFFDD7BEDFE46EED50EC6E0900F550C1
:1000D000C7DFE706FAE1E5521200B3DFE46EED5048
:1000E000EC6EDE50BDDFE706FBE1E5521200A9DF52
:1000F000E46EED50A6DFF56E0D00ED50E706FAE177
:10010000E552BCD79EDFE46EED509BDFDE6EED5016
:10011000E706FBE1E552B2D75BD8EC6E0800F528A4
:10012000D2D78FDF8EDFDA6EED50D96EED50A6D7C5
:1001300088DF87DFF76EED50F66EED509FD7EC6EDF
:10014000400EE46EFF0EEC6E0900F550ED14E7066C
:10015000FAE1E55285D7EC6EFD50F66EFE50F76E73
:10016000ED5006001200D96E800AE834DA50DA5AEF
:10017000ED501200F8DFEC6EDE501200F4DFDE6EA0
:10018000ED501200EC6EF29E550EA76EAA0EA76EF1
:10019000A682F28EED501200A684A688F3D7A6841C
:1001A000A6980A00EFDF09001200EF60EF6EED5035
:1001B000E844FD26010BFE22ED50120000EE7FF018
:1001C00010EE8FF08A68EC6E700ED36EED50120058
:1001D0001200000EFC6EF2DFC18AC18C93889382FC
:1001E000EC6E330EAF6E240EAC6E900EAB6EED5017
:0801F00081A801D064D704D0FE
:00000001FF


Entry: ssa
Date: Mon Jun 23 13:07:38 CEST 2008

http://www.cs.princeton.edu/~appel/papers/ssafun.ps
http://lambda-the-ultimate.org/node/2860



Entry: DSP language
Date: Mon Jun 23 22:12:09 CEST 2008

let's write a simple synth that runs on PIC18, but uses a nontrivial
hardware mapping.


Entry: booting the monitor
Date: Tue Jun 24 10:05:51 CEST 2008

and it's not working :)

good. time to get some debugging tools online. that's what we're here
for, right?

first: slurp + printing a hexdump from a sequence.

got hexdump + easier io stuff working, problem is not the serial line
but something else (debug-transmit and debug-loopback work)

box> (io> (target> 1 2 3 ts))
<3> 1 2 3

the problem seems to be just with 'ping'
ok. i remember: there's no 'hello' string defined.

fixed.
looks like it's running now, but i can't seem to execute words.


Entry: playing with generators
Date: Tue Jun 24 10:30:59 CEST 2008

A problem that I've run into is 'wrapping' a sequence around a loop to
build a 2D view. It pops up more than once (i.e. list->table), so let's
make an abstraction for it.

This is trivial to solve with a generator + comprehension:

     (for ((row  (in-naturals)))
        (printf "~a \n" row)
        (for ((columns (in-range 8)))
           (printf "~a " (generate)))
        (printf "\n"))

But the problem here is the termination condition. Can this be turned
into a comprehension that abstracts termination?

A simple way is to turn the printing loop itself into a generator:

(define printer
(in-generator
 (lambda (yield)
   (for ((row (in-naturals)))
      (yield (lambda (x) (printf "~a ~a" (* row 8) x)))
      (for ((i (in-range 6))) (yield (lambda (x) (printf " ~a" x))))
      (yield (lambda (x) (printf "~a\n" x)))))))

which can then be easily combined with the sequence to be printed

(for ((p printer)
      (i sequence))
  (p i))

The only thing that's awkward here is the missing newline in case the
sequence terminates in the middle of a line. This could be solved by
using some 'strings with backspace'.

This is the cleaned-up hex printer sequence:



(define (in-hex-printer [start 0]
                        [data-nibbles 2]
                        [address-nibbles 4])
  (in-generator
   (lambda (yield)
     
     (define-syntax-rule (lp formals . args)
       (yield (lambda formals (printf . args))))
     (define (addr x) (hex->string address-nibbles x))
     (define (data x) (hex->string data-nibbles x))

     (for ((row (in-naturals start)))
       (lp (x) "~a  ~a" (addr (* row 8)) (data x))
       (for ((i (in-range 6))) (lp (x) " ~a" (data x)))
       (lp (x)  " ~a\n" (data x))))))

Which can be used as in:

(define (slurp)
  (for ((i (in-thunk in))
        (p (in-hex-printer))) (p i)))



Now, turn it into an enumerator + a sequence derivative. Does that
make sense? What does a fold look like if the print output is seen as
a data structure? It actually makes more sense as an unfold operation.

So, really, print is a consumer, but I turned it into a
consumer-producer.

Moral of the story?

  * comprehensions abstract termination conditions which makes them
    easier to use than generators (eof?/read)

  * in cases where generators are more convenient than nested/parallel
    loops (parsing/printing-style data representation conversion
    problems), the consumer can be turned into a sequence of consumer
    procedures, which can then be linked to a producer sequence in a
    simple for loop.


Entry: target mode / simulator
Date: Tue Jun 24 15:43:04 CEST 2008

in target mode, it would be interesting to allow all kinds of macros,
and try to simulate them, at first only the 'qw' and 'cw'
instructions.


Entry: interactive compilation
Date: Tue Jun 24 16:33:25 CEST 2008

2 things:

  * what to do with the global code accumulator?

  * no more 'compile' mode for quick & dirty interactive compilation:
    use files/buffers instead.

let's have a look first at serialization. it's not possible to
serialize macros, but it might be possible to serialize constants and
aliases.


ok. got a function to read/write the namespace. on write, the macros
get recreated.

(define (target-words [words #f])
  (if words
      (for-each*
       (lambda (name realm address)
         (let ((word
                (eval
                  `(new-target-word #:name ',name
                                    #:realm ',realm
                                    #:address ,address))))
           (eval
            `(begin
               (define-ns (target) ,name ,word)
               (define-ns (macro)  ,name
                 ,(case realm
                    ((code) `(scat: ',word compile))
                    ((data) `(scat: ',word literal))))))))
       words)
      (for/list ((name (ns-mapped-symbols '(target))))
        (let ((word (find-target-word/false name)))     
          (list name
                (target-word-realm word)
                (target-word-address word))))))


the global accumulator could be replaced by a parameter, so
file->words conversion is possible locally. this deserves some
cleanup.
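
A sketch of what that could look like (names hypothetical, not the
actual Staapl API): a parameter holding the accumulator, with a
wrapper that installs a fresh registry for the duration of a
compilation, so registration nests naturally.

```scheme
#lang scheme/base

;; hypothetical replacement for the global code accumulator:
;; each compilation installs its own registry via parameterize.
(define current-code (make-parameter '()))

(define (register-word! w)
  (current-code (cons w (current-code))))

(define (with-code-registry thunk)
  (parameterize ((current-code '()))
    (thunk)
    (reverse (current-code))))  ;; words in load order

;; (with-code-registry
;;   (lambda () (register-word! 'foo) (register-word! 'bar)))
;; => (foo bar)
```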

Entry: multi-stage programming
Date: Tue Jun 24 19:18:23 CEST 2008

http://www.cs.rice.edu/~taha/teaching/03F/511/

talks about metaprogramming (manual staging) in a type-safe manner.


Entry: questions
Date: Wed Jun 25 02:25:19 CEST 2008

The problem is not answers; it's asking the right questions. An
attempt:

 Q: Why are multiple passes for the rewriter so essential? Given a
    satisfactory answer to this, is it better to rewrite first to a
    simple qw,cw language, or are per-target patterns better?

 Q: Is it possible to see non-compilable pseudo assembler results
    somehow as type errors or contract violations, and associate
    blame?

Following John Nowak's advice, let's have a look again at the Joy page
about rewriting, and Backus' Turing Award lecture.

http://www.latrobe.edu.au/philosophy/phimvt/joy/j07rrs.html
http://www.stanford.edu/class/cs242/readings/backus.pdf


Entry: Joy and rewriting
Date: Wed Jun 25 10:46:45 CEST 2008

The setting: 

  Any programming language can be given a rewriting system, but for
  Joy it's particularly simple.

The idea is thus to put it on top: a rewriting system as a
metaprogramming system: a source code transformation.
( Actually, no. It seems to be about giving a full semantics to a Joy
program using JUST rewrite rules. )


 Q: Does it make a difference if the rewriting system works on the
    source language (as in Joy) or the target language (as in Purrr)?

It becomes interesting at the point where "The role of the stack" is
discussed: reduction strategies.

There's this 'duality' between programs and stacks that's right in the
middle of my representation. Looks like Manfred was there first:

   This is the key for a semantics without a stack: Joy programs
   denote unary functions taking one program as arguments and giving
   one program as value. The literals denote append operations; the
   program returned as value is like the program given as argument,
   except that it has the literal appended to it. The operators denote
   replacement operations, the last few items in the argument program
   have to be replaced by the result of applying the
   operator. Similarly the combinators also denote (higher order)
   functions from programs to programs, the result program depends on
   the combinator and the last few quotations of the argument program.

   It is clear that such a semantics without a stack is possible and
   that it is merely a rephrasing of the semantics with a
   stack. Purists would probably prefer a system with such a lean
   ontology in which there are essentially just programs operating on
   other programs. But most programmers are so familiar with stacks
   that it seems more helpful to give a semantics with a stack.

This is exactly what the Purrr primitive macros do: they take programs
to programs. Essentials:

   PRIMITIVES: rewrite rules as endomaps of target machine
   code. semantics of concatenative program expressed in terms of
   these primitive machine rules is the composition of these rules
   applied to the empty target program.

   COMPOSITION: already hinted above: ordinary composition serves as
   the main abstraction mechanism to construct new endomaps of target
   machine code.

   PARTIAL EVALUATION: The 'stack' shows up here as the local view of
   target machine code. If the target language has a notion of a run
   time parameter stack, there is a possibility for staging: moving
   computations to compile time while preserving semantics.


In "Quotation revisited" Manfred talks about the "draconian" measure
of not equating lists and quoted programs. In Scheme terms, this is
about constructing lambda expressions vs. quasiquotation + eval. Using
the solution of only allowing the construction of quotations, but not
the destruction (intensional definition, defined by its properties),
isn't really that bad. (It's how Scat does it: all quotations are
opaque, no reflection.)

 Q: For Purrr it's possible to talk a whole lot about the semantics
    of macros without even mentioning target semantics. Does it make
    sense to see target semantics as an extension of the semantics
    introduced by the rewriting rules, to capture the cases that the
    rules don't handle: those that are somewhat general? Or is it
    better to see the macro semantics as the extension of limited
    target runtime semantics?

So, to relate Purrr and Joy a bit more: using rewrite primitives and
function composition, Purrr will reduce a program to a value whenever
it is a pure program. However, the target semantics isn't pure, so not
all programs can be completely reduced.


Entry: code registration
Date: Wed Jun 25 12:12:03 CEST 2008

The point is to record all word structs as they appear in code, in the
proper load order. This operation should be nestable. The problem I
run into is that 'define' needs toplevel/module context, and making
code registration nestable seems to conflict with this: trying a
parameter gives problems, because the defines will be expanded in an
expression context.

If it can't be made nestable, let's make the code storage write-once.

Maybe it's better to define a 'compilation unit' (one invocation of
'register-code' per 'forth-begin' = module or load).

The problem to solve is to figure out which code was already
uploaded. Maybe it should just be marked? Done.

The remaining problem is: how to handle errors during upload? It might
be wiser to only mark code as synced AFTER upload was successful. Let's
provide an enumerator interface instead.


Entry: Swapping the two stacks : using just rewrite primitives?
Date: Wed Jun 25 18:32:16 CEST 2008

In fact, since the 'assembly stack' is of such paramount importance
for giving a semantics to the macro language:

 Q: why not use it as the primary stack, and define Forth primitives
    that manipulate program entry points (conditional jumps) as an
    extension to that? (sticking with pure rewrite rules at first?)

 Q: if so, can a concatenative eager rewriting macro language like
    Purrr be equated with a purely functional typed concatenative
    stack language without full reduction?

To answer the first question: if code quotations are allowed without
higher order functions then my gut feeling is that this should work
pretty well. This brings the metalanguage VERY close to Scat: simply
extending Scat with assembly code data types already does the trick.

It looks like this is the way to find a better link between target
semantics and macro semantics.


Entry: Automatic instantiation
Date: Wed Jun 25 19:34:45 CEST 2008

from the blog post, which will probably be edited once i find a way to
express this properly (and solve the problem maybe..)

  Now, that's a nice story, but that's not how it happened :) The
  specification of rewriting rules came naturally as a syntactic
  abstraction for some previously manually coded peephole
  optimizations. This rewrite system however seemed way more useful
  than just for performing simple optimizations, and by introducing
  pseudo target assembly instructions has been transformed into a
  different kind of two-level language semantics: powerful compile
  time type system with an analogous run time type system with
  'projected semantics' derived from eager macro rewrite rules.

  The downside of this is that in order to write programs, one has to
  make decisions about what to instantiate, and what not. It might be
  interesting to try how far this can be automated: don't instantiate
  what cannot be instantiated, but try to isolate subprograms that
  have real semantics.

Two things:

  Q: macro semantics and target semantics are not the same for some
     words like '+'. is this good or bad? it's useful for computing
     constants, but dangerous for overflows. is it better to
     completely embed the target semantics, and use different symbols
     for the metaprogramming operations?

  Q: is automatic instantiation really that difficult?


compile time + might be seen as a different type.. (like + and +. in
ocaml)

except for optimality (inlining might sometimes be better), solving
the instantiation problem based purely on semantics (inline when a
composition is 'real' + doesn't mess with the target interpreter)
might not be so difficult.


Entry: and?
Date: Wed Jun 25 20:07:57 CEST 2008

did we learn anything?

 - to really know what i'm talking about, i need to concentrate on a
   simpler concatenative macro language without higher order functions
   using a single stack and automatic instantiation of real macros.

 - the relation between the semantics introduced by the rewrite rules
   and the partially/fully postponed operations needs to be clarified
   a bit.


Entry: them stacks
Date: Wed Jun 25 20:29:13 CEST 2008

In the current compiler there are 4 stacks. Following the previous
remarks about concentrating on rewriting first, the order will
probably change to this:


1. Assembly stack, contains target assembly code and is used for
   target code rewriting. (in Forth compilers this is the allot
   buffer. in non-rewriting compilers it grows monotonically.)


2. The Forth Control stack, used for recording jump labels to
   implement looping and conditional structured programming constructs

3. The dictionary stack: (actually a set of stacks, supporting
   multiple entry points / fallthrough and control flow analysis)

4. The macro exit stack: for supporting multiple exit points in forth
   style macros. (an emulated run time stack).



Only the first one is essential for Pure Purrr (Puurpr, Purrepr, ...
Paars?). The other ones are there to support Forth's state in a
functional macro system + give some freedom to exchange macros and
instantiated words and perform control flow analysis.

The problem I mentioned before is that the Forth approach with
separate control stack is a bit of a dead end, since it's not a very
structured way of dealing with code. Macro quotations are probably a
lot better. Unfortunately, there is no simple way to convert forth
syntax to quoted macros, without getting rid of the control stack.


5. Probably, in a language based on 1. with automatic instantiation
   (otherwise there would probably be some kind of code explosion), a
   stack of instantiated words might need to be added. However, this
   is just a write-only registry (log).


Entry: 2 stage semantics
Date: Thu Jun 26 10:15:58 CEST 2008

correspondence:

It might be helpful to give some background first: I'm trying to write
a system for parameterized programming of tiny computers (currently
Microchip PIC18 microcontrollers) based on concatenative and
functional languages. I'm interested in limited order semantics mostly
from a perspective of optimal implementations: how simple can the
eventual semantics be made without having to sacrifice space/time
efficiency. Currently I'm leaning towards a full macro system with
first class macros, but I'm interested in this limited order
semantics, and like to see if it can somehow be embedded in my
approach.

What I'm trying to figure out is how I can use 'higher order macros' in
my system to allow for limited order semantics as you are suggesting.

The approach i'm taking is:

   * Use 2 stages: concentrate on the first stage which consists of 
     a joy-like language that operates on a stack of machine code 
     instructions (stage 1) and a stage that executes the resulting
     machine code (stage 2).

   * Start building stage 1 semantics from rewrite rules that operate
     on programs built from a single stage 2 instruction QW (quote
     word), which loads a number onto the run time stack. For example,
     the stage 1 function '+' performs the following program
     transformation:

            ...  [QW 1] [QW 2]   ->  ... [QW 3]

A complete Joy-like semantics can be built from this, if the fact that
QW can only accept numeric arguments is ignored. 

At this point, some operations might not be defined for all input
programs. For example '+' applied to the empty program is not
defined. What can be done here is to start building target semantics
based on the program rewrite rules: use a couple of instructions that
manipulate the run time stack to make sure '+' can be defined on all
input programs:

            ... [QW 1] -> ... [ADDLW 1]
            ...        -> ... [ADDWF POSTDEC0 0 0]

Doing this for the whole set of primitives gives a language with 2
stage semantics:

   stage 1: program text represents machine code rewrite functions
   stage 2: rewriting of the empty program results in 'real' programs
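
The three cases of '+' above can be sketched in Scheme (the 'assembly
stack' as a list with the most recent instruction first; instruction
names and representation hypothetical, not Staapl's actual pattern
specification syntax):

```scheme
#lang scheme/base
(require scheme/match)

;; stage 1 '+': rewrite the top of the code list.
(define (plus code)
  (match code
    (`((qw ,b) (qw ,a) . ,rest)     ;; two literals: compute at compile time
     (cons `(qw ,(+ a b)) rest))
    (`((qw ,a) . ,rest)             ;; one literal: fold into an add-literal
     (cons `(addlw ,a) rest))
    (_                              ;; none: real runtime stack addition
     (cons '(addwf postdec0 0 0) code))))

;; (plus '((qw 2) (qw 1)))  ;; => ((qw 3))
;; (plus '((qw 1)))         ;; => ((addlw 1))
;; (plus '())               ;; => ((addwf postdec0 0 0))
```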

The remaining problem is that some values used as arguments to machine
code instructions might not be numbers: the Joy like language is
higher order, so quoted programs are an example of such values. (We
can create another problem by introducing intermediate instructions,
which are stage 1 data objects that represent neither target values
nor instructions. However, this is beneficial for the eventual goal of
parameterized programming.)

As a result, not all (macro) programs that have a stage 1 semantics
can be attributed a stage 2 semantics, because applying them to the
empty program does not yield a program that lies in the target program
space due to the use of non-numeric values, or the use of pseudo
machine instructions.

What I'm already convinced about is that this approach works pretty
well for manual metaprogramming: by requiring the programmer to
instantiate the 'real' programs as parameterized general macros,
programs can be built in a Forth style language. (Think Forth run-time
and immediate words). Allowing for compile-time data types that do not
translate to the (necessarily) limited target machine semantics gives
access to a very powerful way to factor/modularize parameterized
programs (specialized code generators).

What I'm interested in is to figure out how to perform automatic
instantiation which gets rid of the 2-mode word/macro Forth-style
semantics, how to turn non-specialized program generation into type
errors where the source can somehow be blamed, and how to embed
limited order operators in a sound way.


Entry: rewrite semantics
Date: Thu Jun 26 12:18:20 CEST 2008

a remaining problem in my reasoning about rewrite semantics is this:

  * Manfred talks about giving Joy a semantics through rewrite
    rules. This REPLACES the stack semantics, but stack semantics is
    later re-introduced as a STRATEGY for implementing the rewrite
    rules.

  * I talk about rewriting target programs. The definition of the
    function + can be written as

      [qw 1] [qw 2] +  -> [qw 3]

    This syntax represents the definition of a function which maps a
    target program of 2 instructions to one of 1 instruction. (Let's
    not use parameterized numeric values for simplicity.)

    But this is trivially changed into a system of purely syntactic
    rewrite rules like:

      [qw 1] [qw 2] [qw +] -> [qw 3]


What's the difference? They are really the same, no?

Extending the function system with functions that are
'self-compiling', i.e. :

       123 -> [qw 123]

and extending the rewriting system with a preprocessing step that maps
all syntax elements X to [qw X], we have two ways of interpreting:

      1 2 +

As functions, and inside the rewrite system.
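
The preprocessing step itself is a one-liner (sketch): map numbers to
[qw n] and leave operator symbols alone, after which both views work
on the same rewritten token stream.

```scheme
;; self-compiling preprocessing: 1 2 + -> [qw 1] [qw 2] +
(define (compile-token t)
  (if (number? t) `(qw ,t) t))

;; (map compile-token '(1 2 +))  ;; => ((qw 1) (qw 2) +)
```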



Entry: chip bootstrap and monitor protocol
Date: Thu Jun 26 14:58:17 CEST 2008

The problem with CATkit is that it needs a bootstrapped chip that
listens for commands on the serial port. The problem with this is that
there is a threshold for people to start using Purrr for PIC18 without
buying a programmer: they have to build one. It would be a lot more
convenient to do all the communication through the ICD port.

In theory this isn't so difficult, but does require some juggling to
get going. The Purrr console is an RPC protocol: host sends command to
target and waits for reply. The Microchip debugger protocol is a
master-slave protocol. After bootstrapping using the one-way Microchip
programming protocol the PIC can be made to do anything, but requiring
the host to wait for asynchronous replies isn't so easy to do with
custom hardware.

I was thinking about a serial port based interface in which the target
-> host protocol is simple RS232, but host->target can be synchronous
(for initial programming) or asynchronous.



Entry: Backus Turing Award Lecture
Date: Thu Jun 26 20:39:54 CEST 2008

Interesting points about FP:

  * all functions are unary. (this fits in a concatenative, but not
    necessarily stack-based, approach.)

  * primitive combining forms are chosen not only for their
    computational properties, but also for how they behave in the
    algebra of programs.

It has crossed my mind to use objects other than stacks to perform
the chaining. Looks like that is what FP is: lists of lists.

( PF is FP backwards. very funny. giving FP a postfix syntax wouldn't
be completely insane, really.. with an embedded array processing
language in a concatenative one, you could get away with the strange
semantics of 'map' in a concatenative language: turning a value from a
list into a stack, processing it, and turning the result back into a
value. )

  * functional forms and parameterized programming are quite related.

  * to apply a defined symbol, replace it by the RHS of its
    definition.

so, if you change the syntax around such that forms are implemented by
postfix macros that expect quoted programs, but do NOT allow quoted
programs to survive the compilation, you're done right? a
concatenative functional macro language.


Entry: point-free style and monads
Date: Fri Jun 27 11:24:25 CEST 2008

Type constructor: M t
Unit: t -> M t
Bind: (M t) -> (t -> M u) -> (M u)

A monad M is a way to organize a collection of t, together with a way
to sequence computations over such collections. The 'bind' operation
takes values t from M t, produces a collection of monadic values M u,
and combines those into a single collection M u.

The thing with '>>=' and 'do' is that they introduce new names.
Instead of mapping one monad to another one directly, this is
unwrapped to a 'do' comprehension that is then 'iterated' by the
implementation of '>>='.

It's probably better to look at the alternative formulation, replacing
Bind by:

Map:  (t -> u) -> (M t -> M u)
Join: M (M t) -> M t

which can be used in point-free style, and is closer to the spirit of
FP and stack languages.
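
In the list monad this point-free decomposition is tiny (sketch; the
'monad-' prefixes avoid clashing with the built-in 'map' and 'join'):

```scheme
#lang scheme/base

;; list monad: Bind factored as Join o Map
(define (unit x) (list x))
(define (monad-join mm) (apply append mm))     ;; M (M t) -> M t
(define (monad-map f) (lambda (m) (map f m)))  ;; (t -> u) -> (M t -> M u)
(define (bind m f) (monad-join ((monad-map f) m)))

;; (bind '(1 2 3) (lambda (x) (list x (- x))))  ;; => (1 -1 2 -2 3 -3)
```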






Entry: lambda: why names?
Date: Sun Jun 29 10:25:11 CEST 2008

http://www.latrobe.edu.au/philosophy/phimvt/joy/j08cnt.html

Lambda names are a user interface: lexical locality works well for
human brains. However, manipulating lambda expressions is
tedious. (DeBruijn indices fix this problem, but are not so
readable.. Maybe it's best to regard names as syntactic sugar?).


What I don't understand in a lot of texts about Joy is the emphasis on
composition instead of application. I understand that the _structure_
of a program is better seen in terms of composition alone, but the
eventual use of a program is really application. Let uppercase denote
data items and lowercase denote functions:

           S a b

The first space between S and a is an application, while the second
one is a composition. Eventually, you're interested in the
value. Maybe the nuance is to really get rid of the value altogether
and see the semantics of [a b] as the 'output' of a program? Feels
wrong..

    syntax:     concatenation
    semantics:  composition of functions
    execution:  application

The only real down-to-earth use I see is syntactic manipulation
leaving semantics invariant: optimizing compilation.

Really, Joy in its 'composition only' lore is about compilation,
about 'relative semantics' which stops short of the actual
application. Maybe my intuition is too much attracted to operational
semantics (the 'real' world?).

After all, there is something to say for "Application has not a single
property. Function composition is associative and has an identity
element" -Meertens. In S a b, dragging the S on the left along is
rather pointless, even if the 'real' thing that happens is ((S a) b),
semantically all that matters is the composition of a and b, because
the application can be associated out: (S (a b))

Anyways.. Onward: Backus' FP: all functions are unary, but functional
forms can take multiple parameters. Embedded in Purrr this means that
at runtime there is a single 'token' going around, but at compile
time, there might be a stack of functions and forms combining
them. Actually, this seems like an interesting embedding! Purrr as a
metalanguage for a non-stack language, based on the observation that
both languages are concatenative, but having a different threaded
state: stack vs. list of lists.

Then about Category Theory and CAM. Too much for now..


Entry: is interpretation really different?
Date: Sun Jun 29 12:37:10 CEST 2008

This popped up before, but I'm not sure if it's an arbitrary
re-arrangement. Consider the expression from last post:

                S a b

Where 'S' is a state, and a and b are functions. Turning the data/code
roles around, one could interpret S as a function and a,b as data,
where application of S yields a new function:

                ((S a) b)

This has the semantics of an interpreter: 'S' is an interpreter state
that takes the input code sequence (a b) to produce a new state.
Compare this to the state monad in Haskell.

Somehow it feels as if (S a) or (a S) are really only two sides of the
same coin: producing a new state interpreter S from the message a, or
computing a new state S to be interpreted by function a. Is this
related to different order of evaluation/currying of the same
function?




Entry: oleg metaprogramming
Date: Sun Jun 29 23:29:32 CEST 2008

http://okmij.org/ftp/Computation/Generative.html#framework


Entry: gnuplot
Date: Tue Jul  1 14:37:34 CEST 2008

First: There's a small inconvenience in sandbox that shuts down ports
created inside a sandbox whenever there are eval-limits. Setting space
and time limits to #f fixes this.

For the rest, gnuplot works nicely, but I get zombie processes.

FIXME: closing the port doesn't stop the process. add a custom port
wrapper.


Entry: custodian + custom port
Date: Tue Jul  1 21:06:13 CEST 2008

Trying to avoid the creation of zombie processes when custodian shuts
down the gnuplot pipe. This works well with 'close-output-port but
doesn't work with custodians, presumably because the inner port gets
shut down first?

(define (open-gnuplot)
  (let ((co (current-output-port)))
    (match
     (process/ports co #f co "gnuplot")
     ((list stdout
            stdin
            pid
            stderr
            control)
      (make-output-port
       'gnuplot
       stdin
       (lambda (bytes start endx _ __)
         (write-bytes bytes stdin start endx))
       (lambda ()
         (printf "closing gnuplot\n")
         (close-output-port stdin)
         (control 'wait)))))))

(define p #f)
(define c (make-custodian))
(parameterize ((current-custodian c))
   (set! p (open-gnuplot)))
(custodian-shutdown-all c)


 Q: does the custodian shut down custom ports?

(define (make-dummy-port)
  (make-output-port
   #f
   (current-output-port)
   void
   (lambda ()
     (printf "closing\n"))))

(define p #f)
(parameterize ((current-custodian c)) (set! p (make-dummy-port)))
(custodian-shutdown-all c)

-> nothing happens..

(close-output-port p)
-> prints "closing\n"

EDIT: solved by orphaning the process with an external utility: the
parent exits right after fork, so the exec'd child is reparented to
init and never becomes a zombie of the Scheme process.

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv) {
    int pid;
    if (argc == 1) {
        fprintf(stderr, "usage: %s <prog> <arg> ...\n", argv[0]);
        return -1;
    }
    pid = fork();
    if (!pid) {
        char *a[argc];
        int i;
        for (i=0; i<(argc-1); i++){
            a[i] = argv[i+1];
        }
        a[i] = 0;
        execvp(a[0], a);
        fprintf(stderr, "%s: can't execute %s\n", argv[0], a[0]);
        exit(1);  /* exec failed: don't fall through to the parent's return */
    }
    return 0;
}


Entry: matlab-like behaviour
Date: Wed Jul  2 00:59:08 CEST 2008

Except for heavy-duty floating point, and a ton of library code, most
of what is in the matlab language is ease of working with vectors and
matrices on the syntax level. What would it take to have a scheme-like
clone? Are there any already to take some inspiration from?


Entry: chaos
Date: Wed Jul  2 01:01:45 CEST 2008

Got lost in chaotic patterns again.. What I've been doing the last
couple of days:

 - LFSR + Hough
 - Reading the FP paper + joy rewriting (algebra of programs)
 - misc stuff about point-free languages
 -                  metaprogramming
 - simplified core semantics: 2stack -> 1stack + sexp (code quotations)

 


EDIT: using dynamic-wind

(define c (make-custodian))
(parameterize ((current-custodian c))
  (thread (lambda ()
            (dynamic-wind
                void
                (lambda () (let l () (sleep 1) (l)))
                (lambda () (printf "shutting down\n"))))))

nope, doesn't work either


Custodians don't manage processes, and it doesn't look like there is a
way around that.. Maybe ignore the problem for now? Or try to get
something similar working with subprocess?

Maybe the "double fork" trick works here?





Entry: Multi Stage Programming: Its Theory and Applications
Date: Wed Jul  2 10:26:40 CEST 2008

PhD Thesis of Walid Taha:
http://www.cs.rice.edu/~taha/publications/thesis/thesis.pdf

About typed metaprogramming (MetaML).


Entry: datatypes and iterators
Date: Wed Jul  2 17:14:37 CEST 2008

Starting from the ideas in FP, how can we build a minimalistic algebra
of programs specific for image processing? This means:

   * keep the data type simple (tiled images)

   * functionals are special cases of map/fold/shift for image ops

Starting from this, building a framework for effective loop fusion
should be doable. The problem is composition of shift operators:
combining two convolution maps gives a bigger convolution map. (Is it
possible to work only in terms of the 4 direction unit shifts?)

This can be tested in 1D first, i.e. for block based audio processing.

Algebra of programs. Ingredients: 

  * binary functions +,-,*
  * scalars + vectors

Is it possible to make something smaller than FP?


Entry: lab/image-io.ss
Date: Wed Jul  2 17:27:50 CEST 2008

PGM input and YUV4MPEG output seem to work, but it's quite
slow. Working towards the algebra of image processors, it might be
interesting to start with the same basic structure in scheme:
represent images as 1D vectors, and generate 'fast iterators' for it.



Entry: that DSP language
Date: Wed Jul  2 21:47:25 CEST 2008

Buzzword time. Or, what are the different ideas I'm trying to solve at
the same time by being confused for months in a row.. Basically, I
can't take the time to read all research on this topic, and find it
hard to follow such without proper hands-on experience. So how far can
I actually get with common sense alone?

metaprogramming:
  * dynamically typed metaprogramming (Purrr)
  * concatenative composition-based languages + evaluation time

dsp:
  * an algebra of programming languages / rewrite rules
  * real-time memory allocation + organization: maximise locality
  * combining prototyping + implementation (meet-in-the-middle language)
  * solve the tiling + shifting problem

Currently I have 2 projects to finish (snowcrash + staapl PIC18), but
after that, I need some free hacking time to tackle the next problem,
or some study time. Before I do anything else, I need to do:

   * Pierce TAPL
   * Muchnick ACDI


Entry: tile problem
Date: Wed Jul  2 21:58:43 CEST 2008

Suppose it is possible to obtain a data-flow graph which maps inputs
to outputs. Use this to:

   * create a core loop for infinite data

   * optimize core loop to introduce software pipelining and eliminate
     multiple reads.

   * solve boundary conditions

Maybe it's time to start reading Muchnick ACDI, and combine it with
information from Pierce TAPL and the vague idea of Algebra of
Programs + how to use it to perform loop fusion for DSP stream and
image processing.


Entry: image iterators + dont-care regions
Date: Thu Jul  3 20:34:44 CEST 2008

Algorithms simplify a whole lot if dont-care regions can be
constructed: no need to handle border conditions, except for the
initial tiling step (duplication). The duplication this gives is
probably not problematic since it's 2nd order.

The players:

semantics:
  * unary operators + 1 image mapper
  * binary operators + 3 image mappers:
      - 2 images
      - 1 image with X or Y shift

implementation:
  * image accumulators
  * coordinate iterators


Maybe I'm just getting tired, but it's really hard to chop this into
primitives. One of the things that keeps getting in the way is to
create the correct type of result accumulator from the input type. The
problem is: the right model is not (+ a b) but (set! r (+ a b)) which
can then be used to create a (+ a b).

Maybe build it around this:

(define (inner-loop! i i->j fn r a b)
  (vector-set! r i
               (fn (vector-ref a i)
                   (vector-ref b (i->j i)))))

r:    result vector
i:    main index
i->j: main index to secondary index map
fn:   binary function
a:    first input vector
b:    second input vector (can be same as first)
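A direct transliteration of the inner-loop sketch above, in Python for quick testing (names follow the parameter list; the first-difference example is mine):

```python
# r[i] = fn(a[i], b[j]), where the secondary index j is derived from i.
def inner_loop(i, i_to_j, fn, r, a, b):
    r[i] = fn(a[i], b[i_to_j(i)])

# Example: b is the same vector as a, shifted by one sample,
# giving a first difference r[i] = a[i] - a[i+1].
a = [5, 3, 2, 7]
r = [0, 0, 0]
for i in range(3):
    inner_loop(i, lambda i: i + 1, lambda x, y: x - y, r, a, a)

assert r == [2, 1, -5]
```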


Funny, what I'm re-inventing here are the x and y operators in the
generating function / Z transform (xy transform?) representation of 2D
sequences, but bound to a function. I.e.
   
  lifters : function => sequence operator  

   lift   : +    ->   +
   lift-x : +    ->   1 + x
   lift-y : +    ->   1 + y

Maybe the base language should just be ordinary math functions and
2-variate polynomials? This should be more than enough to generate
tiling + appropriate iterators.

I've come full circle: starting with algebra -> DSP -> implementation
of algorithms -> generalization in language -> algebra.

Anyway, for the 2 shifts in a framework of 2^n tiles, it is possible
to use modulo addressing, which simplifies code a lot.

Sobel:   (1 - y)^2 + (1 - x)^2

Ok, got it to work:

(define (sobel i)
  (define ^2 (U (lambda (x) (* x x))))
  ((B +)
   (^2 ((X -) i))
   (^2 ((Y -) i))))

U unary lift
B binary lift
X binary x-shift lift
Y binary y-shift lift
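The four lifters can be sketched on plain nested lists. This is a Python mock-up for quick testing, not Staapl code; shifted results simply drop the border row/column, i.e. the dont-care-region approach from the previous entry, and the Sobel definition trims both terms to the common region before adding.

```python
def U(f):   # unary scalar op -> pointwise image op
    return lambda i: [[f(x) for x in row] for row in i]

def B(f):   # binary scalar op -> pointwise image op on two images
    return lambda i, j: [[f(x, y) for x, y in zip(r1, r2)]
                         for r1, r2 in zip(i, j)]

def X(f):   # f combined with a unit x-shift: f(a[y][x], a[y][x+1])
    return lambda i: [[f(row[x], row[x + 1]) for x in range(len(row) - 1)]
                      for row in i]

def Y(f):   # f combined with a unit y-shift: f(a[y][x], a[y+1][x])
    return lambda i: [[f(i[y][x], i[y + 1][x]) for x in range(len(i[y]))]
                      for y in range(len(i) - 1)]

def sobel(i):
    sq = U(lambda x: x * x)
    dx = sq(X(lambda a, b: a - b)(i))
    dy = sq(Y(lambda a, b: a - b)(i))
    # trim both to the common (n-1) x (n-1) region before adding
    dx = [row[:] for row in dx[:-1]]
    dy = [row[:-1] for row in dy]
    return B(lambda a, b: a + b)(dx, dy)

img = [[0, 0, 1],
       [0, 0, 1],
       [1, 1, 1]]
assert sobel(img) == [[0, 1], [1, 2]]
```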


So.. These are 2 different views: the operator view (where X and Y
denote the shift operators) and the higher order function view, where
unary and binary scalar operations are mapped to unary and binary
image operations. The latter language seems more general: easier to
work with multiple arguments.



Entry: automatic lifting
Date: Sat Jul  5 10:41:02 CEST 2008

Some of the lifting operations can be automated: U and B can be
inferred from the arity of the operation. X and Y need to be
specified.



Entry: scheme vs. purrr PE macros
Date: Sat Jul  5 12:25:00 CEST 2008

It's about macro arguments. The fundamental idea is that the expansion
depends on the input _values_ not just the input structure. In
ordinary day-to-day Scheme macros this is seldom the case.

What I'd like to find is a way to explain the essential difference
between Scheme's macros system and Purrr's macro system, which is a
polymorphic concatenative language where values represent postponed
operations.

Building such a Scheme partial evaluator by transforming all functions
into macros shouldn't be too difficult. This is called "introducing
staging". The analogous intelligent Scheme macro:

(define-syntax-ns (pesel) +
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (let ((da (syntax->datum #'a))
             (db (syntax->datum #'b)))
         (let ((na (number? da))
               (nb (number? db)))
           (if
            (and na nb)
            (datum->syntax stx (+ da db))
            #`(+ a b))))))))

So, is there a real difference between the concatenative (string)
rewriter and the tree rewriter? Not really. The only problem is that
for a tree rewriter which optimizes applications, the appropriate
rules for lambda rewriting need to be implemented. The only difference
is thus convenience: this kind of stuff is easier to do in
concatenative languages due to absence of names.

  * Concatenative languages: non-primitives can be expanded as a
    concatenation of primitives, which are simply applied in order.

  * Lambda languages: non-primitives need to implement the usual
    lambda reduction mechanics.

So, partial evaluation of pure lambda expressions is actually not so
difficult: if you start from normal order reduction, just reduce
things that can be reduced for a certain expression.
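The "reduce what can be reduced" idea in miniature, as a Python sketch (the tuple encoding of expressions is mine): constant-fold a pure expression tree, residualizing wherever a variable blocks evaluation.

```python
import operator

ops = {'+': operator.add, '*': operator.mul}

def pe(e):
    if isinstance(e, tuple):              # application: (op, arg1, arg2)
        op, a, b = e[0], pe(e[1]), pe(e[2])
        if isinstance(a, int) and isinstance(b, int):
            return ops[op](a, b)          # both arguments static: evaluate now
        return (op, a, b)                 # otherwise residualize
    return e                              # variable or literal

# (+ (* 2 3) x) specializes to (+ 6 x); fully static expressions evaluate away.
assert pe(('+', ('*', 2, 3), 'x')) == ('+', 6, 'x')
assert pe(('+', 1, 2)) == 3
```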

So.. Applied to the case of image processors. If they can be written
as pure expressions, making evaluation order irrelevant, a program is
easily specialized:

       1. library of primitives + combinator HOFs
       2. specialized expressions

   -> eliminate all applications of HOFs to yield a single expression

So, really, this seems quite straightforward. Am I missing something?
Yes. This does not include deforestation (or, the simplified version
for image data structures).

Roadmap:
  * write a lambda expression reducer
  * obtain rewrite rules for image HOFs
----
  * alternatively, formulate it in a concatenative language to avoid
    the lambda reducer.



Entry: order of parameters
Date: Sat Jul  5 14:07:59 CEST 2008

 Q: For highly parameterized code, the order of arguments in a higher
    order function decomposition is a bit ad-hoc. Is there a way to
    make this less so?


Entry: split coma/macro
Date: Sat Jul  5 18:02:40 CEST 2008

Merged the split-off staapl-coma project: swaps the order of the two
stacks, such that there is a 1-stack metalanguage that doesn't use
Forth style control words. The log entries are inlined below.

What this does is give a clear separation between the languages:

  * COMA

     An s-expression based COmpositional MAcro language of which the
     values represent atomic target programs. Using pattern matching,
     program rewrite rules are implemented that perform partial
     evaluation and program parameterization.

  * MACRO

     On top of COMA, a Forth macro language with Forth control words,
     labels, code fallthrough and local exit macros. 


-----

_Entry: swap the 2 stacks
_Date: Wed Jul  2 23:38:47 CEST 2008

I'd like to move to a single stack model for a clean Macro language,
all the other stacks are for Forth style control words.

This is the prototypical "deep change" that's hard to make in a
dynamic language. Is there a way to make this easier? Maybe separating
out part of the macro language (mos) which will implement the core
compiler + pattern matcher.

It involves changing all primitives, since they no longer move stuff
from the Scat stack to the asm stack, but transform data in-place.


Got pretty far already: got basic coma macro language to run + simple
macro> command line.

OK. got a bit further. stuck at:

box> (require "pic18.ss")
...
box> (repl ": asdf 123 23")
;; (macro) asdf
STATE:#<compiler>
non-null-compilation-stack:  ((23) qw (qw 123))

 === context ===
/home/tom/darcs/staapl-coma/macro/postprocess.ss:36:0: empty-ctrl->asm
/home/tom/darcs/staapl-coma/macro/postprocess.ss:44:0: assert-empty-ctrl
/home/tom/darcs/staapl-coma/macro/instantiate.ss:218:0: compile-forth
/home/tom/darcs/staapl-coma/macro/instantiate.ss:384:0: target-compile-1
/home/tom/darcs/staapl-coma/macro.ss:35:0: target-compile
/usr/local/plt-3.99.0.26/collects/scheme/sandbox.ss:459:4: loop


while (macro> 123 23) works fine..
time to go to bed..


_Entry: bug fixes
_Date: Sat Jul  5 17:54:14 CEST 2008

Nothing serious, just some missing dependencies due to file splits,
and the expected ctrl/asm confusion here and there.

What I did note: pic18/test.f doesn't 'require' but it does 'load'.

Looks like we're done. Time to merge.



Entry: rewrite rules for HOFs
Date: Sat Jul  5 21:25:02 CEST 2008

 Q: What is the essence of the 7 deforestation rules in the Wadler
    paper? (page 8)

(1) variables are left alone
(2) distribute over type constructors
(3) function application: substitute terms in parameterized body, and
    recurse transformation
(4) distribute over case (variable)
(5) given constructor, pick one branch and substitute terms
(6) case of function application: substitute argument
(7) case of case: push inner case through to the branches

The case statements are there to handle pattern matching for union
types. You need those to be able to stop recursion! The rest is really
just term substitution and elimination of constructors through rule
(5).
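Rule (5) is the workhorse, so a toy version may help fix ideas. This Python sketch (encoding and names are mine, not Wadler's notation): a case expression whose scrutinee is a known constructor reduces by picking the matching branch and substituting the constructor's arguments, which is exactly the step that eliminates intermediate structure.

```python
# terms: constructors like ('Cons', h, t) / ('Nil',); variables are strings.

def subst(t, env):
    if isinstance(t, str):
        return env.get(t, t)
    if isinstance(t, tuple):
        return tuple(subst(x, env) for x in t)
    return t

def reduce_case(scrutinee, branches):
    tag, *args = scrutinee               # constructor is known: pick a branch
    vars_, body = branches[tag]
    return subst(body, dict(zip(vars_, args)))

# case Cons(1, Nil) of { Cons(h, t) -> (pair h t); Nil -> empty }
r = reduce_case(('Cons', 1, ('Nil',)),
                {'Cons': (['h', 't'], ('pair', 'h', 't')),
                 'Nil':  ([], 'empty')})
assert r == ('pair', 1, ('Nil',))
```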

Translating this to what I want to build: either I find a way to use
this representation together with a final step that optimizes data
recursive constructors to use arrays, or I use a special set of data
types..

The higher order macros are interesting. (Wadler mentions OBJ btw. I
probably got it from there.) 'where' terms are introduced, a kind of
'let' for local function definitions. Macros are then like functions
whose variables can reference function names, but cannot be
recursive. Lack of recursion guarantees they can be expanded
out. First order recursion is still allowed using 'where' clauses.


 Q: Make a summary of what the OBJ system is about.

Rewriting + first order equational logic + ordered sorts (types).
Quite elegant, but I'm not sure whether I can use any of this in my
untyped ad-hoc approach.. Theories are quite surprising: an ability to
define properties of operations.

 Q: Algebra of programs? 

The Backus paper really seems to be seminal for all this work in
functional programming about program transformation.. Nobody seems to
call it algebra of programs though..

 Q: "Functional Programming with Bananas, Lenses, Envelopes and Barbed
    Wire"

    "We develop a calculus for lazy functional programming based on
    recursion operators associated with data type definitions. For
    these operators we derive various algebraic laws that are useful
    in deriving and manipulating programs."

Seems to be about moving from basic, low-level recursion to
transformation on a higher level: using combinators.



Entry: optimizing lists to arrays
Date: Sun Jul  6 00:35:12 CEST 2008

This deforestation business seems doable. Remaining problem is to map
recursive list processing algorithms to vector algorithms by somehow
faking 'cons'. Or, it's really 'cons' with cdr coding.

On the other hand, HOFs could be used for this: operations that lift
scalar ops to container ops.


Entry: not writing a single line of C code..
Date: Mon Jul  7 18:08:52 CEST 2008

Attempt to generate C code for Sobel + Hough transform, based on a
higher order macro specification. Basicly the same as hough.ss but
with partial evaluation of some functions.

Roadmap:

 * start from a purely functional description in HOF combinator form
 * prove some transformation laws for the combinators
 * use these to transform the algorithm

The real problem is making the X and Y combinators combine into
something that can be easily compiled out into n x m rectangular
region combinators. Basically, start with the loop you want to end up
with, and factor it into separate parameterizable pieces.

It seems like 'adding parameters' to an inner loop is what makes this
difficult. This can be solved with HOFs and partial application. When
going that way (fixed arity) maybe using a stack approach can be done
immediately?

EDIT: see onward entry://20080710-121719


Entry: about recent changes and insights
Date: Wed Jul  9 02:06:36 CEST 2008

Moving from 2stack->1stack for the core macro language (Coma) seems
like a pretty significant step. This will allow most Fortisms
(semantics) to be concentrated in a single implementation, next to its
syntax.

For the array/dsp language design: moving toward algebra of
combinators seems to be the right approach. The problem is how
exactly. I'm trying to align with these ideas in finished research to
see where I can add a bit of originality without re-inventing
everything.


Entry: That DSP language: fanout
Date: Wed Jul  9 14:02:19 CEST 2008

Thinking about the particular problem of writing the sobel algorithm
in combinator form, doing this in stack notation makes me miss a
``parallel'' function mapper. Basically, a 'distribute' operation.

The essential problem is that a lot of DSP algorithms are many -> many
maps: they are far from linear (single use of variables), and are
quite parallel. To solve this, combinators need to do the duplication
and parallel application. (anamorphism followed by catamorphism).

It might be interesting to allow for code quotations to be
parallel. I.e. a list of functions can be interpreted as a
composition, or as a parallel application: It's useful to have a
vector of functions/closures.
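The two interpretations of a quotation can be sketched in a few lines of Python (for quick testing; combinator names are mine): the same list of functions read as a composition, or fanned out as a parallel "distribute" application.

```python
def compose(fns):
    def c(x):
        for f in fns:
            x = f(x)
        return x
    return c

def parallel(fns):            # one input fanned out to every function
    return lambda x: [f(x) for f in fns]

fns = [lambda x: x + 1, lambda x: x * 2]
assert compose(fns)(3) == 8           # (3 + 1) * 2
assert parallel(fns)(3) == [4, 6]     # fanout: both applied to 3
```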


Entry: porting old ip.ss
Date: Thu Jul 10 12:17:19 CEST 2008

Let's start from the previous approach from entry://20070330-160157
and see if there's something to do with the new insights. This code is
mainly about generating C code from a grid-function specification:

(fn (gain x)
  (* (gain)
     (+ (x 0)
        (x -1)
        (x +1))))

This has the advantage that it translates straight to array accesses:
arrays are finite functions mapping coordinates to value.
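The "arrays as finite functions" reading can be mocked up directly. A Python sketch (names and the averaging example are mine, mirroring the grid function above): wrap a vector at a current index so the loop body reads like (x 0), (x -1), (x +1).

```python
def grid(vec, i):
    # a finite function of a relative offset, centered at index i
    return lambda offset=0: vec[i + offset]

def fn(gain, x):              # the loop body from the specification above
    return gain() * (x(0) + x(-1) + x(+1))

samples = [1, 2, 3, 4, 5]
g = grid([0.5], 0)            # 'gain' as a zero-offset finite function
out = [fn(g, grid(samples, i)) for i in range(1, 4)]
assert out == [3.0, 4.5, 6.0]
```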

Porting to the new cgen.ss involves some switch from
symbols/lists->syntax


Got syntax port working, now need to change the 'grid' and 'loop'
macros + check if they worked before.. Looks like there's some code
missing.

Hmm.. Should i go back to working with symbols, or get the syntax
working? The problem is 'substitute'. This should be replaced by a
variant of 'syntax-case' that recurses over expression arguments.

Ok. using 'syntax-case/r' this becomes quite simple. Got it ported.

NEXT: this implements 'map', now find a way to express fold/integral
style functions like the Hough transform, on top of this mechanism. No
DSP without inner product.. Maybe focus on the Hough-like accumulator
style first: that requires random access, which is genuinely
different. It seems like a lot of generality is necessary to express
such a specific data flow.


Entry: functional forms (FP in Coma)
Date: Thu Jul 10 16:21:51 CEST 2008

So, embedding functional forms in Coma. 

 Q: What is a functional form? 

It is a macro which takes multiple macros as arguments. Elaborating on
the first example in Backus' FP paper: the inner product, this becomes

  inner = trans (*) map (+) insert

This is as much about datatypes as it is about functional
forms. 'trans takes a 2D vector and transposes it, 'map takes a 2D
vector and applies a function to each inner vector, returning a vector
of results. 'insert takes a vector and folds it with a binary
operator.

The thing which feels a bit strange here is to have unary
functions. All stack operations are automatically mapped to vector ->
scalar functions. We could use a vector -> vector lift too, for macros
with multiple return values.

Embedding FP in a concatenative macro language:

   * all functions are unary (arity 1 -> 1)
   * pure stack macros can be lifted to FP functions in 2 ways:
        vector -> scalar  (one result)
        vector -> vector  (one or multiple results)
   * functional forms have arity n -> 1 and operate on unary functions.

So, the trick is to somehow hide the lifting of unary stack ops ->
unary vector ops. The easiest way is to see the stack as the outer
vector wrapper. This however doesn't 'unpack things by default.

I.e., in FP, the '+ function behaves like:

      +:<1,2> = 3

and not

      +:<1,2> = <3>

Using only the top element seems a bit dirty (for the same reason that
'map feels dirty), but it does seem to be the most convenient
approach.
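The two lifts can be made concrete with a Python sketch (for testing only; a stack op is modeled as a list-in, list-out function, which is my encoding, not Coma's):

```python
def lift_scalar(stack_op):            # vector -> scalar (one result)
    return lambda v: stack_op(list(v))[-1]

def lift_vector(stack_op):            # vector -> vector (one or more results)
    return lambda v: stack_op(list(v))

def add(stack):                       # a binary stack op: pops two, pushes one
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return stack

plus = lift_scalar(add)
assert plus([1, 2]) == 3              # +:<1,2> = 3, not <3>
assert lift_vector(add)([1, 2]) == [3]
```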
      

 Q: Does everything look like a nail?

Is this just a small gimmick, to have an embedded FP macrolanguage for
creating (inlined) expression evaluators, or is it genuinely useful to
construct (static) DSP structures in control applications where most
glue logic is Forth? I guess if I can re-implement Krikit I have a
proof of concept.


Entry: fold
Date: Fri Jul 11 11:26:04 CEST 2008

Given a way to transform loop body specification (- (x 0) (x 1))
into an expanded loop, how to perform folds?

The Hough transform: for each (x,y) accumulate 

   r = x cos t + y sin t

into an (r,t) plane.

Maybe it's best not to write this as a fold. The problem is that this
is not a fold of a simple symmetric binary operator, but of a pixel
and the state (r,t) accumulator. In pseudo:

    (fold (lambda (pixel accu x y)
            (+ accu (sino x y))) 
          accu0
          image)

How to add this kind of operation?

Maybe formalize the grid version first: Sobel is

(fn (a)
  (let ((dx (- (a 0 0) (a 0 1)))
        (dy (- (a 0 0) (a 1 0))))
    (+ (* dx dx)
       (* dy dy))))
        
Aha! This is why sets are necessary for local binding of loop
pointers. It eliminates common subexpressions!

The 'let' form seems to work, however, current parsing messes up the
syntax by mis-identifying expressions as grids. This needs some kind
of parameter substitution.

'let' doesn't seem to work for expression statements. The code below
expands fine, but cgen.ss expands the 'let' incorrectly.


(src->code
 '(loop (= (grid result 0 0)
           (let ((dx (- (grid a 0 0) (grid a 0 1)))
                 (dy (- (grid a 0 0) (grid a 1 0))))
             (+ (* dx dx)
                (* dy dy))))))
=>

(statements
 (block
  (var int i)
  (for-head (= i 0) (< i (* 400 300)) (+= i 300))
  (block
   (vars
    (float* a_p0 (+ a (+ i (* 0 300))))
    (float* result_p0 (+ result (+ i (* 0 300))))
    (float* a_p1 (+ a (+ i (* 1 300)))))
   (statements
    (block
     (var int j)
     (for-head (= j 0) (< j 300) (+= j 1))
     (block
      (vars
       (float* a_p0_p1 (+ a_p0 (+ j 1)))
       (float* a_p0_p0 (+ a_p0 (+ j 0)))
       (float* result_p0_p0 (+ result_p0 (+ j 0)))
       (float* a_p1_p0 (+ a_p1 (+ j 0))))
      (loop
       (=
        (grid result_p0_p0)
        (let ((dx (- (grid a_p0_p0) (grid a_p0_p1)))
              (dy (- (grid a_p0_p0) (grid a_p1_p0))))
          (+ (* dx dx) (* dy dy)))))))))))

Seems like this is called let*
-> some small bug still. Ok, fixed.

It's working now:

(p '(loop (= (grid result 0 0)
             (let ((float dx (- (grid a 0 0) (grid a 0 1)))
                   (float dy (- (grid a 0 0) (grid a 1 0))))
               (+ (* dx dx)
                  (* dy dy))))))

=>

{
  int i;
  for (i = 0; i < (400 * 300); i += 300)
  {
    float* a_p0 = a + (i + (0 * 300));
    float* a_p1 = a + (i + (1 * 300));
    float* result_p0 = result + (i + (0 * 300));
    {
      int j;
      for (j = 0; j < 300; j += 1)
      {
        float* a_p1_p0 = a_p1 + (j + 0);
        float* a_p0_p1 = a_p0 + (j + 1);
        float* a_p0_p0 = a_p0 + (j + 0);
        float* result_p0_p0 = result_p0 + (j + 0);
        *(result_p0_p0) = ({
          float dx = (*(a_p0_p0) - *(a_p0_p1));
          float dy = (*(a_p0_p0) - *(a_p1_p0));
          ((dx * dx) + (dy * dy));
        });
      }
    }
  }
}


Now, how should i see this? It is an image comprehension with built-in
shift operators. What about the following specification syntax:


(for/grid (a result)  ;; grids
          ((i 100)    ;; dimensions, possibly inferred
           (j 100))
   (= (result 0 0)
      (let ((dx (- (a 0 0) (a 0 1)))
            (dy (- (a 0 0) (a 1 0))))
        (+ (* dx dx)
           (* dy dy))))))

where 'let' uses type inference from the values.

I'm not sure whether it's a good idea to have a grid iterator. Maybe
factoring in smaller for-loop like comprehensions is a better idea?

Sufficiently confused again..

The Hough loop + edge detection should look like this:

(for/grid (a)  ;; grids
  (let ((dx (- (a 0 0) (a 0 1)))
        (dy (- (a 0 0) (a 1 0))))
    (let ((sobel
           (+ (* dx dx)
              (* dy dy))))
          (if (> sobel 600)
              (accu! x y)))))

It doesn't look like this is going to work. Too experimental
still. Let's move to straight C.


Entry: writing C
Date: Sat Jul 12 12:24:01 CEST 2008

Funny, how I've been disgusted by C to then move on to a higher level
of abstraction, only to find that I'm actually enjoying writing C
quite a bit because I'm getting better at writing properly factored
code.

Maybe the trick is really to define an s-expression based language
that can do anything C can do, so the compilation becomes incremental
rewriting?

The approach in ip.ss should maybe be a bit more factored + expose
lowlevel constructs?

How to make a better C?

    -> type inference + polymorphy
    -> local functions (macros): maybe like purrr: manual inlining?
    -> 2 stacks? would make downward-only local functions easier    


My intuition says that it really can't be too hard to do this in a
proper, not too ad-hoc way, but any time I dive into it, it seems as
if I don't understand the problem fully. It does look like types are
the essential part. Maybe it's best to look at C without the
polymorphism of the math operators?

Another thing that might help is to simplify the control flow
constructs. Maybe only 'if and 'goto should be kept? Or 'if and
'while(1)? The 'for loop is at least better replaced with a 'while
loop.

With SSA and CPS being equivalent, how to generate C code such that
the SSA form that the compiler sees is actually the one we intend?



Entry: typed vs. untyped
Date: Mon Jul 14 10:59:55 CEST 2008

Been browsing through Oleg Kiselyov's papers on code generation. Most
of it seems to be based on MetaOcaml and a translation to C. It might
be interesting to try to summarize the difference between MetaOcaml's
approach to staging, and the hygienic 'syntax->datum / 'datum->syntax.


Entry: multi-stack Forth support
Date: Mon Jul 14 11:24:01 CEST 2008

Currently I'm using structure type inheritance together with pattern
matching to be able to perform base type operations on derived types:
they simply leave alone the extended state.

This is essentially the same as operating on a stack: each type
extension adds one stack element. Does this have implications on the
implementation level? Is it better to represent state as a stack of
states right from the beginning? This would make the update method
trivial. Current conclusion: Maybe it's best to keep that method
abstract.


Entry: rewriting
Date: Mon Jul 14 11:34:12 CEST 2008

What I call 'eager' rewriting probably has a better accepted name in
the literature. 


Entry: Generating optimal code with confidence
Date: Mon Jul 14 14:55:20 CEST 2008

http://okmij.org/ftp/Computation/Generative.html

 "A Methodology for Generating Verified Combinatorial Circuits", Joint
 work with Kedar N. Swadi and Walid Taha. Proc. of EMSOFT'04, the
 Fourth ACM International Conference on Embedded Software, September
 27-29, 2004, Pisa, Italy. ACM Press, pp. 249 - 258. 
 http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf

This paper and related work seems to be a ticket into the field of
Resource Aware Programming (RAP), to find a way to place Staapl's
dynamic type approach, and see how static type systems can be of
benefit. Reference number [3] talks about a linear functional
language, which is pretty close to where i'm going.

The roadmap seems to be something like:
 * get educated about type systems (TAPL)
 * see what there is to learn about Cat's type system
 * translate this to a type system for Coma

References:

[3]  related to http://www.sac-home.org/
[30] "Generating heap-bounded programs in a functional setting", TAHA
     Walid, ELLNER Stephan, HONGWEI XI.

Resource aware programming:
  * highly expressive untyped substrate
  * stage distinction
  * static type systems

The latter is about typing code/circuit generators so they can be
composed. I don't know what the untyped substrate is about.



Entry: binary operations
Date: Mon Jul 14 15:23:22 CEST 2008

Compositions of binary operations have the following structures:

non-struct: tree
assoc:      list
comm+assoc: set
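One payoff of knowing these properties is normalization, sketched here in Python (tuple encoding is mine): an associative operator lets nested applications flatten into a list, and commutativity additionally allows sorting, so structurally different but equal programs become identical terms.

```python
def flatten(op, t):                   # assoc: tree -> argument list
    if isinstance(t, tuple) and t[0] == op:
        return [x for s in t[1:] for x in flatten(op, s)]
    return [t]

def normalize(op, t, commutative=False):
    args = flatten(op, t)
    if commutative:                   # comm+assoc: order doesn't matter
        args = sorted(args, key=repr)
    return (op, *args)

# (+ a (+ b c)) and (+ (+ a b) c) normalize to the same term
t1 = normalize('+', ('+', 'a', ('+', 'b', 'c')))
t2 = normalize('+', ('+', ('+', 'a', 'b'), 'c'))
assert t1 == t2 == ('+', 'a', 'b', 'c')
```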

Read the OBJ paper yesterday, and I'm thinking whether the 'theories'
approach might be usable in Coma: expressing properties of operators,
i.e. associativity, commutativity,...



Entry: MetaOCaml / MetaScheme
Date: Tue Jul 15 12:18:50 CEST 2008

( Edit from: Fri Jun 27 13:02:41 CEST 2008 )

http://okmij.org/ftp/Computation/Generative.html#meta-scheme

4 special forms:

bracket
escape
lift (cross-stage-persistence)
run

  "Scheme's quasiquotation, being a general form for constructing
  arbitrary S-expressions (not necessarily representing any code), is
  oblivious to the binding structure."

But quote-syntax and unsyntax do this correctly, right? Hmm.. I don't
see it without thinking..

EDIT:

  "... uses a complex macro ALPHA that is aware of the binding
  structure. ALPHA traverses its argument, presumed code expression,
  and alpha-converts all manifestly bound variables to be unique
  symbols."

This I can understand: alpha-renaming to make sure names are unique
before splicing in code.
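ALPHA's job in miniature, sketched in Python (my encoding; `itertools.count` plays the role of gensym): walk the term, and rename every manifestly bound variable to a fresh symbol so the fragment can be spliced without capture.

```python
from itertools import count

_gensym = count()

def alpha(term, env=None):
    env = env or {}
    if isinstance(term, str):                        # variable reference
        return env.get(term, term)
    if isinstance(term, tuple) and term[0] == 'lambda':
        _, var, body = term
        fresh = f"{var}_{next(_gensym)}"             # globally-unique symbol
        return ('lambda', fresh, alpha(body, {**env, var: fresh}))
    if isinstance(term, tuple):
        return tuple(alpha(t, env) for t in term)
    return term

t = alpha(('lambda', 'x', ('lambda', 'x', 'x')))
# the inner x shadows the outer one; both get distinct fresh names
assert t == ('lambda', 'x_0', ('lambda', 'x_1', 'x_1'))
```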

  "Since syntax-rules can only produce globally-unique
  identifiers but not globally-unique symbols, we must use syntax-case
  or a low-level macro-facility."

OK, if the goal is to create code that has a symbolic representation,
this is clear. The syntax-case case uses generate-temporaries for the
unique names.

But does it need to be like that? If we're generating
code that is eventually to be compiled, why not generate a graph
structure directly?

  "The macro ALPHA is implemented as a CEK machine with the
  defunctionalized continuation."

Ok, so it's basically an interpreter. CEK is the machine underlying
Scheme, as opposed to e.g. SECD for Lisp. The CEK is implemented in
syntax-rules.

It might be interesting to see how alpha renaming and cross-stage
persistence are problematic or avoided in Staapl/Coma. Alpha renaming
is avoided by using a point-free target language. CSP is qw. It
supports numbers, target-address dependent expressions which can be
reduced during the assembly stage, and macros which need to be
eliminated during postprocessing.

Ok. Staapl is quite a bit simpler, because I'm doing metaprogramming
and code generation in the same spot. It's only because point-free
code is linear + that all code generators need a finalization step
that this trick works.



Entry: cleanup
Date: Tue Jul 15 12:33:47 CEST 2008

Since the roadmap for further work is pretty clear (TAPL, MetaOcaml,
typed stack languages) it's time to finalize Coma so it can be
released.

 * Fix undefined symbol bugs for monitor=module (OK)
 * Load monitor as a module (OK)
 * Upload monitor from within Staapl (OK)
 * Load/save addresses (OK)
 * Write documentation (OK)
 * Incremental upload.
 * Make Snot repl
 * Get the synth going


Entry: control.ss and label.ss
Date: Tue Jul 15 13:53:51 CEST 2008

About the space between state:2stack and state:compiler. It is
possible to define the control primitives in 2stack using the 'label'
pseudo-op as it used to be. Later replacing 'label' with the
intelligent construct in instantiate.ss gives the possibility to build
structured code graphs. Maybe it's worth separating the two?

This works well. It leaves the words

     exit or-jump sym label:
     
as hooks that can be used to plug in the control flow analysis code
from instantiate.ss

The code then uses the pseudo ops:

     label jw/if

To represent labels and conditional jumps.

Finalized this: added separate control/ project directory and renamed
the remaining macro/ to comp/ to indicate it's purely about
compilation (code tree generation) and postprocessing, not about
language definition.


Entry: name troubles
Date: Tue Jul 15 16:01:16 CEST 2008

Loading the monitor code into the target namespace works
fine. However, requiring it gives trouble. (undefined macro/f->)

This is probably due to the use of macro/f-> in the parsing words,
while that word is later defined in a source file: at the time of
expansion, the macro isn't yet defined. So it needs a stub.

This can be solved by adding pic18 specific parsing words that are
required into the file.

OK, that works. Added a stub of 'f->' in pic18/macro.ss and created
pic18/parsing-words.ss to add the word 'fstring:'.


Entry: documentation -> parser itches
Date: Wed Jul 16 13:00:14 CEST 2008

I feel that the most important layer Scat -> Coma + Control is
ready. It's simple enough now to be documented. However, the Forth +
parser part isn't very well written.. Does it make sense to spend some
time on cleaning it up? The main questions are:

 Q: Is it possible to write the compiler more as a substitution system
    instead of a CPS parser?

 Q: Is that desirable?

Sticking with the CPS approach, the state that's passed might need
some simplification. The mistake I made is to pass an expression that
will only be added to on the outside, but what is really necessary is
to be able to insert into expressions. This requires a cursor into a
tree. The problem is that I don't know that data structure well enough
to get to this implementation from an intuitive approach.

 Q: What is a zipper?

http://okmij.org/ftp/Scheme/misc.html#zipper

There are two views, one is a fairly straightforward 'reversal of
pointers' where each node encodes a path through to the top node using
the following data structures:

(define-struct path (left path right))
(define-struct cursor (current path))

A path contains a list of left sibling trees, a list of right sibling
trees and a path. If path is #f the current node is the top node.
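
A hedged Python transliteration of those two structs (illustrative
only, not the parser's actual code) shows how the cursor moves; the
field names mirror (define-struct path (left path right)).

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Path:
    left: list                 # left sibling trees
    path: Optional['Path']     # path to the parent; None at the root
    right: list                # right sibling trees

@dataclass
class Cursor:
    current: Any
    path: Optional[Path]

def down(c: Cursor) -> Cursor:
    """Move to the first child of the current node."""
    first, *rest = c.current
    return Cursor(first, Path([], c.path, rest))

def up(c: Cursor) -> Cursor:
    """Rebuild the parent node from the siblings and move up."""
    p = c.path
    return Cursor(p.left + [c.current] + p.right, p.path)

def right(c: Cursor) -> Cursor:
    """Move to the next sibling on the right."""
    p = c.path
    first, *rest = p.right
    return Cursor(first, Path(p.left + [c.current], p.path, rest))

def replace(c: Cursor, node) -> Cursor:
    """Functional update: swap out the focused subtree."""
    return Cursor(node, c.path)
```

E.g. descend into ['a', ['b', 'c'], 'd'], step right, replace the
focus and zip all the way up: the change is visible in the rebuilt
tree without any mutation.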

According to this:
http://okmij.org/ftp/Scheme/zipper-in-scheme.txt

Zipper can be represented as a delimited continuation of a depth first
traversal. It seems this works only for updating nodes, not for adding
them. (or not?)

Anyways, simple straightforward manipulation might be enough to
transform the tree used in the expression accumulator of the Scat
parser into a zipper structure, to get rid of the 'wrapping'
problem.

Zipper as continuation is actually not so hard to understand: the
'reversed pointers' are really nothing more than stack frame links in
the recursive descent of the structure.


Entry: incremental dev
Date: Thu Jul 17 10:51:19 CEST 2008

I think all is in place, I just need to flesh out the normal flow of
operations.

How to create a project?

- make sure Staapl is installed in plt/collects/staapl (sudo make install)

- (require (lib "pic18/monitor-p18f1220.f" "staapl"))

  This file is an example of a self-contained Purrr project for
  PIC18. Loading it like this will compile the file and import all its
  macros and target words into the current namespace. In the following
  we'll take the interactive approach, but remember that it is
  possible to automate all of this: .f files are really PLT Scheme
  modules and can be composed as such.

  In order to handle the code interactively, it's more convenient to
  use a prj environment. This is a scheme namespace object into which
  all support code can be loaded.

- (require staapl/prj/pic18)

  Once the environment is loaded, a Forth repl is available using
  (repl <string>). This will provide the string to the reader present
  in the prj namespace. The following command will load a file into
  the current namespace. Note that this is different from require.

- (repl "load staapl/pic18/monitor-p18f1220.f")

  This loads, compiles and assembles the code. Use (print-all-code) to
  view it. Use (ihex) to view intel hex dump, or (save-ihex
  <filename>) to save it. For convenience, it's possible to call
  (piklab-prog <filename>) to program it.

  Set the port the target is connected to, e.g.:
  (current-console '("/dev/ttyUSB0" 9600))

  Use (prj> . code) to execute prj Scat code, and (target> . code) to
  execute possibly simulated interaction code. E.g. (prj> ping)


- re-establishing contact. this requires target word addresses.   

  (save-target-words <filename>)
  (load <filename>)



Entry: Walid Taha RAP video
Date: Thu Jul 17 13:25:40 CEST 2008

Jan 22 2007 @ google:
http://video.google.com/videoplay?docid=915594482273345538&q=type%3Agoogle+engEDU

Research question: What are the high level abstractions that can be
used to keep control over resource use?

goal:
  - support expressive abstractions
  - ensure safety by static analysis
  - don't let this get in the way

means:
  - multi-stage programming (MSP)
  - reactivity (I/O events)
  - advanced type systems

( Staapl does MSP in a traditional untyped / partly dynamically typed
  way, without special tools for reactivity and static type
  analysis. Contrasting principle: get MSP to work first in a simple
  paradigm, add static tools later using DSLs. )

Ideas behind MSP are old. The new approach is to combine this with
static tools. Reactivity is combined with MSP by creating program
generators for reactive programs with static guarantees.

Essence behind typed MSP: extend with types that are
  - delayed value
  - annotated what kind of delayed value

( An essential concern that seems to be solved by MetaOcaml is
  variable capture. The lucky thing about Staapl is that this problem
  is avoided: there is never any confusion about binding of values: in
  the pattern definition language standard Scheme lexical binding is
  used, while in composition, there are no bound names: all is
  point-free. The big disadvantage of course is that a stack language
  has no parameters. For multi->multi DSP code this is a
  problem. However, FP like array processing languages can be embedded
  in a similar point-free style, replacing the stack with array
  structures that also can be re-used in every step. The essential
  insight is that it's not stack computing that's important, but
  point-free threaded state that gets discarded after each function
  application. )
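
That insight fits in a few lines of Python (a conceptual sketch under
my own naming, not Coma's implementation): words are functions from
stack to stack, composition threads the single piece of state, and no
word ever names its arguments.

```python
def compose(*words):
    """Compose point-free 'words': each maps a stack to a new stack.

    The stack is the single threaded state that gets discarded after
    each function application; since words never bind names, there is
    nothing to alpha-rename."""
    def program(stack):
        for w in words:
            stack = w(stack)
        return stack
    return program

# a few words, with the top of stack at the end of the list
lit = lambda n: lambda s: s + [n]        # push a literal
dup = lambda s: s + [s[-1]]              # duplicate top
mul = lambda s: s[:-2] + [s[-2] * s[-1]] # multiply top two

square = compose(dup, mul)               # a point-free definition
```

Swapping the list for an array structure gives the FP-style array
language alluded to above, with the same composition mechanism.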

What pops out of the FFT example in the talk is the use of
mathematical properties in the generation of the code: algebra of
programs, where the program in this case is ordinary algebra :)

Reactive programming: E-FRP, a scaled down version of FRP from Haskell
which compiles to event-loop C-code.

Linear types: values can be used only once (consumed). Hoffmann
(LFPL). The idea is that in pattern matching, one deconstructs a cons
cell, which is then passed to the RHS where it can be reused:

 cons(x, xs) at d -> cons(1, xs) at d
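
A hedged Python rendering of that LFPL rule (mutation models the reuse
of the cell's storage 'd'; illustrative only, not how a linear type
system would actually be implemented):

```python
class Cons:
    """A mutable cons cell; the cell object itself plays the role of 'd'."""
    __slots__ = ('car', 'cdr')
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr

def map_ones(cell):
    """Linear use: each cell is consumed exactly once and its storage
    is reused in place for the result cell, as in
    cons(x, xs) at d -> cons(1, xs) at d."""
    if cell is None:
        return None
    cell.car = 1                      # rebuild into the same storage
    cell.cdr = map_ones(cell.cdr)
    return cell
```

The result list is the input list: no new cells are allocated, which
is the whole point of the linearity restriction.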

Indexed types: a bit like polymorphic types, but with parameters that
have values, e.g. lists of a certain size. This goes pretty far: you
provide proofs that the type checker checks. This can be very
specific: basically, the types could be made to perform the whole
computation completely in the type system.

EDIT: look at dependent types, Pierce p. 462

Onwards, maybe this is interesting: gradual typing:
http://lambda-the-ultimate.org/node/1707

( About domain specific knowledge and rewrites: basically, these are
  theorems about the data types and operators. This might be
  abstracted by generating parameterized theorems, but then those can
  probably be specified as more complicated rewrite rules. One
  possible step is to start from equations, and distill directed
  rewrite rules. )


Entry: The Expression Lemma
Date: Thu Jul 17 16:20:18 CEST 2008

http://blogs.msdn.com/ralflammel/archive/2008/07/16/the-expression-lemma-explained.aspx

Is this right in the middle of point-free code, where imperative and
functional meet? I.e. the application of the composition (f g h) to
the state x in

    x @ (f g h)

can also be seen as the interpretation of the sequence of messages f g
h by the object x:

    [x f]
    [x g]
    [x h]
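
The duality can be demonstrated in a short Python sketch (hypothetical
f/g/h on numbers, nothing Staapl-specific): the same word sequence is
either a composition of functions threaded through the state, or a
sequence of messages interpreted by an object.

```python
from functools import reduce

def apply_words(state, *words):
    """Functional view: x @ (f g h) threads the state through the
    composition of the words."""
    return reduce(lambda s, w: w(s), words, state)

# the functional reading
f = lambda n: n + 1
g = lambda n: n * 2
h = lambda n: n - 3

class Obj:
    """Object view: the same words as messages sent to x, one by one."""
    def __init__(self, n): self.n = n
    def f(self): self.n += 1; return self
    def g(self): self.n *= 2; return self
    def h(self): self.n -= 3; return self
```

Both readings compute the same thing from the same word sequence.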


Entry: Algebraic types
Date: Fri Jul 18 10:11:48 CEST 2008

Roadmap for today: find out exactly how 'pattern is based on algebraic
types or not. Explain this in the scribble doc, upload doc to server
and send an email to Walid Taha.

http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/4/doc.txt

http://en.wikipedia.org/wiki/Algebraic_data_types

  An algebraic data type is a datatype each of whose values is data
  from other datatypes wrapped in one of the constructors of the
  datatype. Any wrapped datum is an argument to the constructor. In
  contrast to other datatypes, the constructor is not executed and the
  only way to operate on the data is to unwrap the constructor using
  pattern matching.

Let's stick to duck typing. I don't see any essential differences
between the way the instruction mapping works and a re-implementation
in terms of algebraic types. The pattern matcher solves the basic
organization of the data stack. On top of that (data within
instructions) any scheme data type and plt-match syntax can be used.


Entry: Generating typed programs
Date: Fri Jul 18 22:39:03 CEST 2008

Actually, generating a statically typed system in a dynamically typed
language isn't such a crazy idea. The type checking of the generated
program can happen at generator run time, at which also the generator
dynamic types are available.

However, what I am doing in Staapl is to generate the target program,
but to also compile it immediately. There is not really an
intermediate generated program representation other than the data
types exchanged between code processors. This data however could
really be representing code bodies for embedded languages.

In any case, static guarantees about the target program are really
dynamic checks of the data passed between code processors /
generators. At generator compile time, I can't check anything. But is
this really necessary? What is lost is the ability to typecheck
generator components. Only when they are instantiated, they can be
tested. Adding an explicit test suite for generators solves that
problem.


Entry: The API
Date: Fri Jul 18 22:48:13 CEST 2008

I'm worrying about publishing the API to the Staapl system. The
important observation however is that the 'patterns 'compositions and
'scat: and 'macro: forms are really enough to start building
abstractions on top of if necessary.

Maybe I should really take the slow approach: document only those
functions that are necessary, and take a slow start.


Entry: Zipper for the parser
Date: Sat Jul 19 12:28:27 CEST 2008

The idea is this: currently the parser api uses 2 syntax elements to
pass around, and some context. It would be better to dump as much of
the context into the passed state, and also encode that state into a
single object that acts as normal syntax transformer.

The standard tree to accumulate is this:

  (lambda (state) x)
                  |
                state

where the x position is where new code gets added. The 'locals' parser
takes the current lambda expression, assigns it to a variable in a let
expression, and builds a new lambda expression.

How to represent zipper in syntax objects? If we're only representing
trees where the cursor is at the rightmost slot in a node, the data
structure becomes simpler:

  (state ((lambda (state) #f))) :: <node <<siblings> <root-path>>>

(define (stx-zip-up stx)
  (syntax-case stx ()
    ((node ((siblings ...) parent))
     #`((siblings ... node) parent))))

(define (stx-zip-down stx)
  (syntax-case stx ()
    (( (siblings ... node) parent)
     #`(node ((siblings ...) parent)))))



Entry: incremental upload
Date: Sat Jul 19 14:07:20 CEST 2008

The idea is that code goes 'somewhere'. It is associated to a
resource. All code compiled inside a compiler namespace is registered
to a central code registry. Every time code gets TRANSFERRED somewhere
else, the corresponding code is marked as old. Transfer means to
either write out a hex file or similar, or to upload it directly to a
target.

The simplest interface does seem to be the mapper: it ensures the
operation has completed before state is changed.

Onward: using map/mark-target-code, create a function that uploads the
binary code using code defined in tethered.ss

OK, I remember: the last thing I did here was use the comprehensions
to build a formatter for upload-bin. That code now needs to be tied to
getting the last binary code. Simple match? No: I got annoyed by the
absence of a Scheme interface to the interaction code in tethered, one
that automatically connects to the console. Let's write that first.

Maybe the simpler solution is to add a default somewhere?

I moved some of the tools/io.ss code to live/console.ss since it's
quite specific. Can probably merge together.

Added the 'with-console function to run arbitrary code in connection
with the console. This is still not good enough. Need a real
connection. But let's postpone this decision until after the highlevel
part of the code is done.

Verified that the programming worked using a read 'fbytes>list

The 'ihex function in pic18.ss uses 'auto-bin to produce binary
code. Maybe upload should work similarly?

( This level is really a bit of a mess.. Internally code is organized
well, but on the outer levels it's a patchwork.. probably because it's
a state machine written around code state, and there are several
format conversions going on.. )

The interactive state consists of:

   * code compiled up to now, possibly marked as old
   * core + project macros (the concatenative language)
   * the current upload point for interactive dev

The last one is still missing: assembly doesn't save the memory
pointers. Let's place them in target/code.ss


Entry: functional code graphs
Date: Sat Jul 19 20:02:24 CEST 2008

I'd like to move back to functional data structures for the code
graphs. There is simply too much fuss with bookkeeping, so let's move
to functional types.

* Graphs: In order to make graphs in a functional language, one needs
  to see the graph as an infinite tree. Such structures can only be
  defined in a lazy manner. In scheme, this requires explicit use of
  delay/force.

* Unroll the updates: It is necessary to write updates as different
  data structures that refer to their parents. This involves:

    - code compilation + linking (target-word-code target-word-next)
    - assembly (target-word-address + target-word-bin)
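
A sketch of the delay/force idea in Python (thunks standing in for
Scheme promises; the Node class is my own illustration, not the actual
target-word struct):

```python
class Node:
    """A code-graph node whose successor is computed on demand
    (delay); the first access forces and caches it."""
    def __init__(self, label, next_thunk=None):
        self.label = label
        self._next_thunk = next_thunk
        self._next = None

    @property
    def next(self):
        if self._next is None and self._next_thunk is not None:
            self._next = self._next_thunk()    # force
            self._next_thunk = None
        return self._next

def make_loop():
    """A two-node cycle, built without observable mutation: each
    node's successor is a thunk closing over the other node."""
    a = Node('a', lambda: b)
    b = Node('b', lambda: a)
    return a, b
```

This is how a graph can be seen as an infinite tree: following .next
forever just keeps unrolling the cycle.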

Of course, thinking about it now, the reason this all is imperative is
that linking is simplified using the code -> word patch after
instantiation, and assembly is easy because the address and bin slots
can be updated multiple times.

Maybe this is a more sane approach: once code is marked 'old' it is
effectively frozen, and can never change (be re-compiled or
re-assembled). It is also completely concrete at that point, and
should be serializable to disk.

FIXED: made it a bit simpler, using separate *chain* and *bin* stacks
in pic18.ss

NEXT: 'upload-bin' seems to perform a binchunk-split, while this is
also done in pic18.ss : where is best?


Entry: Reachable vs. Incremental
Date: Sat Jul 19 20:40:51 CEST 2008

There are 2 models of developing code:

  * Standard incremental Forth: assemble everything that's generated,
    in the order in which it appears in source code. For subsequent
    code, just append to the already defined code.

  * Reachability: define some entry points into the code, and assemble
    the serialized reachable graph. This allows for more elaborate
    dead-code elimination.
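
The reachability model is plain graph traversal from the entry points;
a hedged Python sketch with hypothetical word names:

```python
def reachable(entry_points, calls):
    """Return the set of words reachable from the entry points.

    `calls` maps each word to the words it references; anything not in
    the result is dead code that need not be assembled."""
    seen = set()
    todo = list(entry_points)
    while todo:
        w = todo.pop()
        if w in seen:
            continue
        seen.add(w)
        todo.extend(calls.get(w, ()))
    return seen
```

With entry point 'main', a word only referenced by other dead words
drops out automatically.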

To make this clear, I'm renaming code.ss to incremental.ss since that
part is only necessary for incremental dev.

Also, it seems there was a confusion between 2 parts of state that
need to be maintained:

 - compilation: symbolic assembly code
 - assembly: binary code + allot pointers

Maybe it's best to provide a simplified interface that performs all of
this at once? What is necessary is some kind of transaction model.

Instead of having 'repl' perform just compilation, it needs to do
assembly too. The result of 'repl' is an updated target/incremental.ss
state and a list of to-be uploaded code.

 Q: Is separate compilation/assembly necessary?

Probably only when debugging macros, and then it is probably easier to
use the 'macro> interface, so the control flow analysis and
optimization doesn't get in the way. We'll see later on if a finer
granularity in the api is necessary.

Let's replace 'register-code' with a hook, so the behaviour of what
exactly happens when a file is loaded/required is pluggable.

Hook works fine: this makes things a lot easier. For example now
prj/pic18.ss has full control over what happens when code gets loaded:
it defines two modes: one that accumulates binary code and increments
code addresses, and one debugging mode that simply prints out symbolic
asm.


Entry: hygiene
Date: Sun Jul 20 03:28:30 CEST 2008

Looks like variable capture in macros is quite a bit more complicated
than I thought. Reading the MacroML paper:

http://portal.acm.org/citation.cfm?id=507635.507646

Basically, introducing binding forms during a source code
transformation requires the assurance that no free variables are
captured (= hygiene). A sure way of doing this is to use generated
names that come from a namespace exclusively allocated to that
particular source code transformation.

The other way around (referential transparency) the names introduced
should be related to those visible at syntax transformer definition
time, not those visible in the lexical expansion context.

As far as I understand, in MetaML and MacroML renaming is used also.

 Q: what are freshness conditions?

No idea. Maybe this has to do with generated names?

The paper contains an interesting section about recursive macros and
'early' parameters: those necessarily evaluated to make sure expansion
is finite.

Hmm.. So MetaML is really about evaluation order: making sure some
evaluations happen before other ones, independent of the language's
default normal/applicative order.

However, the point about substitution at the end of section 3 I don't
really understand. Why is there never any variable capture?


Entry: Staapl pillars
Date: Sun Jul 20 03:51:45 CEST 2008

STACK/POINTFREE:
  - stack machines have efficient VMs / hardware implementation
  - maps to clean functional semantics
  - imperative code looks functional (stack gives referential transparency)
  - easy to express partial evaluation / rewrite rules
  - metaprogramming simplified: no hygiene or reftrans problems

DYNAMIC TYPING:
  - simple dynamic type system: easy to understand. basis = pattern
    matching transformation.

INTERACTION:
  - incremental development
  - target-view console


OPEN QUESTIONS:
  - type systems: how to add more static analysis
  - embed array processing languages

As a summary, I think over the course of a couple of years I've found
the proper factorization of the program, and as good as optimal
syntactic constructs for extending it.

Disadvantages? Mostly that the base language is a stack language (a
matter of taste). And dynamic code generation can produce obscure
errors and won't catch type errors.

Bottom line: simple highly extensible metaprogramming system for tiny
controllers without arbitrary abstraction walls + a practical
interactive framework.


Entry: the target: language
Date: Sun Jul 20 12:59:14 CEST 2008

somehow '(target> ts) doesn't work any more:
reference to an identifier before its definition: scat/ts in module: "/home/tom/staapl/live/target-lang.ss"

but it is defined in the namespace.
FIXED: didn't include target.ss in parsing-words.ss so the
substitution macros didn't see those words.

NEXT: full target console + synth


Entry: the synth
Date: Sun Jul 20 14:37:39 CEST 2008

I don't have control stack juggling words defined, so I'm using the
opportunity to use some macros + locals (very useful for 1 -> many
maps, like accessing 2-byte variables).


Entry: ,,geo-seq test case
Date: Sun Jul 20 15:06:38 CEST 2008

An opportunity to test table generation and recursion in macros.

geo-seq ( start endx length -- )

This brings up an important issue: availability of target values. In
the MacroML paper these are called 'early' parameters. Let's define
them in Coma to mean values that do not depend on target word
addresses, and as such can be evaluated at compile time.

The generator works fine, had to change some things due to the new
tscat: macro. But.. there's something wrong with phases: requiring the
code doesn't really seem to work! Identifiers don't get required in
time..

This looks like trouble...

Let's avoid this for now.


Entry: asm overflow errors
Date: Sun Jul 20 20:17:37 CEST 2008

Forward jumps cause problems due to target addresses being aligned at
zero. The easiest way around this is probably to ignore these errors
in the first phase?

Done. Got it to compile now.

Entry: pattern matching guards
Date: Sun Jul 20 21:23:28 CEST 2008

next problem: 

bang:
	0401 6EEC [dup]
	0402 52EF [movf INDF0 1 0]
	0403 52EF [movf INDF0 1 0]
	0404 52EF [movf INDF0 1 0]
	0405 52EF [movf INDF0 1 0]
	0406 52EF [movf INDF0 1 0]
	0407 52EF [movf INDF0 1 0]
	0408 52EF [movf INDF0 1 0]
	0409 50E9 [movf 4073 0 0]
	040A 6E18 [movwf other-task 0]
	040B 0E10 [movlw 16]
	040C 6EFC [movwf 4092 0]
	040D 0EF0 [movlw 240]
	040E 6EE1 [movwf 4065 0]
	040F 0EE0 [movlw 224]
	0410 6EE9 [movwf 4073 0]
	0411 52EF [movf INDF0 1 0]
	0412 501A [movf (sound 1 +) 0 0]
	0413 D50E [jsr 1 execute/b]

the first part comes from suspend, which properly expands using
'macro>

box> (macro> suspend)
[save]
[movf 4085 0 0]
[save]
[movf 4086 0 0]
[save]
[movf 4087 0 0]
[save]
[movf 4057 0 0]
[save]
[movf 4058 0 0]
[save]
[movf 4092 0 0]
[save]
[movf 4065 0 0]
[save]
[movf 4073 0 0]

It's in the binary .hex code too. Maybe a bug in postprocessing?

It's this one:

   (([,op POSTDEC0 0 0] [save] opti-save)  ([,op INDF0 1 0]))  ;; NEED SYNTAX

hmm.. how to match to the value of a parameter.

ok, fixed by using a general curried function creator


Entry: compile/execute vs. run
Date: Mon Jul 21 11:23:30 CEST 2008

Due to multi-stage semantics, the meaning of these 3 words requires a
little thought. There are several cases of quoted data to be handled:

macro
label
symbol

Currently, 'compile can handle it all, 'run handles macros and
delegates to ~run, while 'execute handles labels and delegates to
~run.

What about providing a basic ~run, and wrappers around it?

(Note that this is probably a symptom of ill-typed code: macros cannot
be target values.. why is that?)

Conclusion:

  They really do different things. 'run is the clean Coma version
  (Coma doesn't have labels), 'compile won't delegate to the runtime
  ~run and 'execute is a possibly optimized lowlevel execute which
  delegates to ~run.


Entry: Higher order macros
Date: Mon Jul 21 12:38:23 CEST 2008

It seems pretty clear now that higher order macros should be built on
top of the Forth control primitives.

 * Forth code is not structured on the syntactic level: all control
   structures are a consequence of semantics of control macros. Now,
   this is a powerful mechanism in itself, but it really is more
   concrete/lowlevel than quoted code fragments: I don't see a simple
   way to extract structured data from this.

 * Otoh, all functionality to implement higher order macros is defined
   in the Forth control language.

So, to add control structures to Coma (anything that involves
branching), it is better to build those on top of control.ss and
shield that namespace using the module system.

Because higher order Coma has loop bodies in a clean rep, it can
perform more optimizations.

Conclusion:

        - Forth Control depends on pure core Coma
        - Coma Control depends on Forth Control.


Entry: snot repls
Date: Mon Jul 21 12:46:48 CEST 2008

Roadmap:
 - compilation repl OK
 - parser + interaction repl OK
 - polish commands.ss

It's probably more useful to only keep track of assembly code that's
not been uploaded or saved yet. So I'm changing pic18.ss, moving
kill-bin! to kill-code!

Upload is working from console now.

Next: load .f files into the namespace using something a little less
raw than "load <filename>". This requires moving a piece of code from
forth/parser-tx.ss to forth/lexer.ss ... In order to get the relative
loading to work properly, forth-load/compile just expands to the
'load' word + filename inlined as string.

One more thing: in order to be able to use 'load' in the interactive
console, one needs to have access to reflective operations. So this
should work:

(define (forth-load filename)
  (eval `(forth-load/compile ,filename)))

This seems to work. I put it in live/reflection.ss

Next: mark (hmm.. lot of this convenience stuff needs to be
re-implemented..)

Entry: mark & empty
Date: Mon Jul 21 17:28:29 CEST 2008

Mark probably won't work like it used to: it needs a stack of current
words.. Maybe the run-time state in pic18.ss needs to be implemented
as a stack?

 Q: Can this be implemented properly instead of hacked together? This
    means: perfect restoration of a namespace. Can the namespace
    itself be dupped?

 Q: Can we somehow serialize the namespace? 

 Q: A procedure can be serialized, but a closure can't. Is this true?


Let's hack it together first.

Hmm.. All this depends much on what I want to accomplish. Simply put,
the only operation I'm interested in is to REPLACE some interactively
loaded code. In all cases I've been working on so far, the application
consists of:

  (C) a fixed core
  (I) incremental replacements

Here (C) is completely source-defined, while (I) is the incremental
part. Maybe this is a better model to work with than setting (C) to be
only the monitor code. Maybe 'empty' should always go right upto (C),
and not use a stack of restore points. It sounds as if it is cleaner,
but I've never used it effectively because it requires some mental
tracking, while mostly you just want to start from a clean sheet.

So, let's pick the best of both worlds: no mark/empty. If you want
empty, recompile and reflash the app. This also reflects a need that
occured in the previous approach: sometimes things go bad, and what
you want is to go back to a working point fairly quickly.

Eventually, this will require custom programmers. But let's do it with
the ICD2 first.

'mark and 'empty are currently implemented in the simplest way
possible: just tracking the words. Some extra safety can be built on
top of this, but essentially, once you use 'empty the namespace and
target are out of sync.


Entry: substitutions
Date: Mon Jul 21 19:44:50 CEST 2008

Something I don't really get is why substitutions don't get
name-checked before being used.. They are macros, maybe that's why?

The problem is that some definitions might not work. Is there a way
around this? Maybe evaluate the code somewhere? No.. the identifiers
are only interpreted when the macro is invoked. Before that no checks
can be made.


Entry: project reload
Date: Mon Jul 21 21:48:52 CEST 2008

Can't install new namespace from within the namespace, so need to work
around this by throwing some exception/abort.


Entry: done?
Date: Mon Jul 21 22:07:27 CEST 2008

Need to check the synth code if it's still working, but as far as I
can see I'm done. Some minor toplevel organization things + ICD2
programmer interface.


Entry: problems
Date: Tue Jul 22 00:08:20 CEST 2008

So, what didn't work out? I'm getting a bit hyped up with a nearing
release, maybe time to list the things that I've been stressing about:

 * catching loop bodies into functional representations
 * the simulator: over-the-top staging challenge
 * dsp language: probably will become AP language
 * C code excursion: need firm ground to work on the grid iterators


Entry: disassembler
Date: Tue Jul 22 00:58:05 CEST 2008

Forgot about that.. Maybe try to get it working first.


Entry: Graph structured lambda calculus, SECD, ...
Date: Tue Jul 22 01:29:08 CEST 2008

I'm tired so this might be nonsense..

Something I never understood is the obsession with keeping lambda
representations flat. For source transformations it makes a lot more
sense to represent lambda terms as a graph instead of a tree:
explicitly connecting reference sites with binding variables.

EDIT: this is actually what de Bruijn indices do: they point upwards
in the graph structure, counting abstractions. Writing this as a graph
gives a directed acyclic graph which is (related to?) the dataflow
graph of the computation.
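
A small Python sketch of that view (standard de Bruijn conversion,
nothing Staapl-specific): each variable becomes the number of
abstractions crossed on the way up to its binder, i.e. the length of
the upward pointer.

```python
def to_de_bruijn(term, env=()):
    """Replace named variables by de Bruijn indices: the count of
    lambdas crossed to reach the binder.

    Terms: a variable is a string, ('lam', var, body) is abstraction,
    ('app', f, x) is application."""
    if isinstance(term, str):
        return env.index(term)        # raises for free variables
    tag = term[0]
    if tag == 'lam':
        _, var, body = term
        return ('lam', to_de_bruijn(body, (var,) + env))
    if tag == 'app':
        _, f, x = term
        return ('app', to_de_bruijn(f, env), to_de_bruijn(x, env))
    raise ValueError(term)
```

E.g. \x.\y.(x y) becomes \.\.(1 0): the indices are exactly the
reference-to-binder edges of the graph, written as numbers.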

Anyways: SECD and Forth

S = param stack
E = allot stack
C = instruction pointer
D = return stack
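
The S/C/D part of that correspondence can be sketched as a toy machine
in Python (illustrative only; E, the environment/allot stack, is left
out, and the instruction format is my own invention):

```python
def run(code, S=None, D=None):
    """Tiny Forth-ish machine: S is the parameter stack, D the return
    stack, and (prog, pc) together play the role of C."""
    S = [] if S is None else S
    D = [] if D is None else D
    prog, pc = code, 0
    while True:
        if pc >= len(prog):
            if not D:
                return S
            prog, pc = D.pop()            # 'return': pop return stack
            continue
        op = prog[pc]; pc += 1
        if isinstance(op, int):
            S.append(op)                  # literal: push on S
        elif op == 'add':
            b, a = S.pop(), S.pop()
            S.append(a + b)
        elif isinstance(op, list):        # 'call': save frame on D
            D.append((prog, pc))
            prog, pc = op, 0
```

E.g. calling a nested word pushes the continuation on D, just like a
Forth call pushes the return address.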

http://www.cs.utah.edu/classes/cs6520-mflatt/s00/secd.ps

SECD is lispey while CEK is schemey

http://planet.plt-scheme.org/package-source/robby/redex.plt/1/0/doc.txt


Entry: dasm
Date: Tue Jul 22 12:41:44 CEST 2008

The assembler has an ad-hoc type system, where operand names determine
the type. This is used for checking overflows of jumps and
implementing absolute/relative addressing.

Anyways, I'd like to use the disassembler to build target code chains,
so they might be used later to be re-generated. The question is, where
should the labels refer to?

Maybe solve that problem later, and first get a bin->chain converter
working.

Ok. minimal dasm working. Needs some tuning + more configurable behaviour
(symbol resolve + word / address size etc)


Entry: the synth
Date: Tue Jul 22 21:05:59 CEST 2008

almost there: things to do:
  * boot + isr vectors
  * whole app build script
  * piklab-prog
  * project reload (scratch)


Entry: piklab
Date: Wed Jul 23 14:01:16 CEST 2008

Synth doesn't work. Time to get piklab-prog to work without having to
re-plug the board: using run etc.. OK

The problem seems to be in the binary code chunking: the first chunk
it produces is correct, but the remaining ones are not:

(map car (car *bin*))
(576 0 688 48 50 2142)

The problem is data chunks. How do they end up in the code? The
problem is conversion of words to binary: this should take only code
words. The error is in 'target-chain->bin: added a realm filter.

Looks like there's still a problem: there are 3 code chunks remaining
now:

box> (map car (car (bin)))
(576 688 2142)

The problem could be that data chunks do not get disconnected. Looks
like that was it: added 'terminate-chain after variable macro.

Ok, booted the synth, but it doesn't work properly. This means I get
a chance to test some of the debug features.

There's something wrong with the 2nd instruction 089A E1B3 [bpz _L717
1]. The address is way off. 3 is correct, but where does the 'B' come
from? (It should be E103).

compile> : boo 0 xor z? if -1 else 0 then ;
command> print-code
boo:
	0898 0A00 [xorlw 0]
	089A E1B3 [bpz _L717 1]
	089C 6EEC [dup]
	089E 0EFF [movlw -1]
	08A0 D002 [jsr 1 _L718]

_L717:
	08A2 6EEC [dup]
	08A4 0E00 [movlw 0]
_L718:
	08A6 0012 [return 0]

The 3 above looks accidental.

compile> : boo z? if 123 then 
command> print-code
boo:
	08A8 E1AC [bpz _L719 1]
	08AA 6EEC [dup]
	08AC 0E7B [movlw 123]
_L719:

This should be E102

Let's go back to only the monitor.


This is a problem with this:
 (bpc      (p R)     "1110 001p RRRR RRRR")
 (bpn      (p R)     "1110 011p RRRR RRRR")
 (bpov     (p R)     "1110 010p RRRR RRRR")
 (bpz      (p R)     "1110 000p RRRR RRRR")

which I thought was fixed. This line was wrong:
   (([flag? opc p] [qw l] or-jump)     ([,opc (flip p) l]))

compile> : foo z? if 123 then
command> print-code
foo:
	0240 E102 [bpz 1 _L184]
	0242 6EEC [dup]
	0244 0E7B [movlw 123]
_L184:

It works!
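For reference, the correct encoding follows directly from the declared
bit pattern 1110 000p RRRR RRRR; a minimal sketch (the helper name is
mine, and the polarity/offset interpretation is my reading of the
table above):

```python
def encode_bpz(p, rel):
    """Assemble a PIC18 'bpz' relative branch from the bit pattern
    1110 000p RRRR RRRR: p is the polarity bit, rel the 8-bit relative
    word offset."""
    return 0xE000 | ((p & 1) << 8) | (rel & 0xFF)

# The fixed output above: 0240 E102 [bpz 1 _L184]
print(hex(encode_bpz(1, 2)))  # → 0xe102
```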


Entry: packaging + prepare release
Date: Thu Jul 24 10:01:05 CEST 2008

http://docs.plt-scheme.org/mzc/plt.html

- clean up darcs project init (local collects?) OK
- build plt package
- clean up forth.pdf




Entry: old web site
Date: Thu Jul 24 12:04:55 CEST 2008

<p>
  To understand the development approach and the current form of the
  source code, it might be necessary to see it in the right context. I
  am an electrical engineer working mostly on embedded control and
  signal processing projects. I seek to optimize the development
  process of highly specialized software for embedded systems by small
  groups of 1 to 3 people. I got fed up with ad-hoc methods of
  metaprogramming and code generation that I see used in this
  engineering subculture, and decided to build a clean system on a
  solid base that can be understood and used by a single electrical
  engineer with an open mind towards modern programming language
  technology. I am not a programming language theorist, and if you
  want to use Staapl, you don't need to be either.

<p>
  The current emphasis is on work towards Purrr, a stand-alone
  standard Forth layer for generic microcontroller architectures, and
  Purrr18, an interactive tethered cross-compiled Forth dialect
  designed for the 8-bit Microchip PIC18 Microcontroller. Future goals
  include the design of a linear concatenative language as a successor
  or drop-in replacement for
  the <a href="http://zwizwa.be/packetforth">Packet Forth</a>
  interpreter, and the design of a declarative Scheme derived
  data-flow language to implement DSP functionality on a
  microcontroller or DSP processor. Eventually I want to cover the
  whole spectrum of tiny 8-bit microcontrollers to 32-bit machines
  that can run unix with an integrated language tree based on Forth
  and Scheme dialects, and an interaction system that can handle live
  software updates and debugging for distributed embedded
  applications.


Entry: staapl home
Date: Fri Jul 25 14:19:48 CEST 2008

In the reflection code, there are hard links to the location of the
staapl tree. Maybe these should be made soft such that staapl can be
installed anywhere: trying to host it on Planet gives some trouble..

Maybe I should provide a 'staapl-install function that will install
wrapper modules around the planet modules.

OK, I found a solution.
The preferred module language is

   #lang planet zwizwa/staapl/pic18

This will allow easy install of staapl through planet. I've removed
all #lang references from the .f files though: going to use 'load' for
most things, and only wrap toplevel code in module languages.


Entry: cleanup dist + docs
Date: Sat Jul 26 09:26:22 CEST 2008

- make sure examples are in planet dist
- clean docs + add to planet dist

Examples: there are 3:

  * compile + burn monitor
  * compile only synth
  * start only repl

REPL is moved to core and renamed to staapl/prj/pic18-repl while the
examples are now written as modules and accessible through

  mzscheme -p zwizwa/staapl/examples/upload-monitor
  mzscheme -p zwizwa/staapl/examples/build-synth



Next is to clean up the docs. Maybe the Purrr manual should be moved to
scribble too? This would allow some testing + documentation of live
interaction.


For the Forth doc, it would be nice to write a small macro for
evaluating chunks of literal forth code.


Entry: offline compilation example
Date: Sun Jul 27 12:07:27 CEST 2008

What is necessary is a script that compiles a PIC18F1220 application
from an input forth file, including a monitor and a proper boot
sequence.

This means:
  * read input arguments
  * create namespace with instantiated monitor
  * instantiate the script
  * add a simplified boot mechanism
  * dump out .hex and .dict

Added staapl/prj/pic18f1220-serial which loads the 18f1220 defs and the
monitor code.

NEXT: need to solve the path issues with 'load'. But where? For
convenience I'm going to put it in the lexer module, but it should
really be somewhere else..

Yes. This is not trivial, since the path needs to be available at
compile time.

Currently:
 - lexer is free of paths
 - relative paths come from rpn-search-path parameter in parser-tx
 - pic18 path is encoded in parser-tx
 - that can move to forth-begin, where the param can be set

Problem: how to make the load path available at compile-time? More
specifically, how to set the parameter rpn-search-path? Simply setting
it at runtime doesn't work since it's a different instance. Maybe
using 'eval' helps?

Hmm... this is a can of worms. Maybe a way out is to add a form that
sets the load path, just like before. This works.

Remaining question: should this be a permanent state? Also, what
about modules? Requiring a module will already use load-relative, I
think. YES.

So, the remaining problems are purely about interactive
'include'. This means it can probably be best solved there.

The remaining question is whether in parser-tx.ss, the search path
should be reset on each compilation.



Entry: the state file
Date: Tue Jul 29 10:03:44 CEST 2008

So, now that i'm converging to a certain workflow (fixed core
application + incremental dev on top of that), it's possible to define
a state file which contains:

  - a reference to code to be loaded for macros FIXME
  - a dictionary with target words OK
  - the pointers OK
  - console OK


FIXME: Make setting up the console the responsibility of prj,
so it becomes easier to metaprogram from Scheme. In the end, Scheme is
the main composition mechanism, not Scat..
Oops. That sounds good but it messes up the unquoting in the macros.


Entry: associativity after instantiation
Date: Wed Jul 30 10:30:07 BST 2008

Working on the staapl/pic18 documentation... There's one thing I'm
noticing now: the highlevel semantics talks about associativity. This
is true for composition of macros, due to associativity of function
composition. However, this property is NOT conserved through
instantiation!

    I(C(x,y)) != C(I(x),I(y))

C = concatenation
I = instantiation

Actually, this property is essential for some optimizations that
expose 'observable' code: jump targets. This is why the chain
splitting is so important in the instantiation step.


Entry: editing the forth paper
Date: Wed Jul 30 12:01:32 BST 2008

Took this out because it might confuse people:

\footnote{Being aware of patterns is what programming is all about. It
  is important to see patterns in your problem, so you can compress
  the problem down into a feasible solution. However, it might be
  \emph{more} important to close the loop and see the patterns in your
  \emph{solution}, so you can bring your understanding of the problem
  to a higher level. For significantly complicated solutions buried
  in code, the code can really talk back by throwing residual patterns
  at its creator.}

Entry: time for play?
Date: Thu Jul 31 02:23:33 CEST 2008

I'd like to use the dsPIC (PIC30) to make some sound. Was thinking
about targeting the gpasm assembler for it, instead of writing one
from scratch, since the architecture is significantly different from
the 8-bit ones. It would be an interesting test case for serializing
target label computations.

Hmm.. Looks like gpasm doesn't support PIC30. Ok, from scratch
then. Will start tagging posts.



Entry: dsPIC30
Date: Thu Jul 31 12:54:50 CEST 2008

I'd like to generate an assembler from an instruction set table for
dsPIC30, however, I can't seem to find one.

Just sent a request for information about the dsPIC30/33 to
Microchip. Also added the 12 and 14 bit core instruction sets. Maybe
try to get to a blink-a-led app for the 14 bit arch? It would be
interesting to figure out some pic code sharing, to test the
flexibility of the Staapl design.

EDIT: got an unhelpful reply from microchip, trying again.
In the mean time i found this:
http://ww1.microchip.com/downloads/en/DeviceDoc/mplabalc30v3_00.tgz

Which contains a file src/c30/c30_device.info and some C routines to
manipulate it.


Entry: 14 bit arch
Date: Thu Jul 31 16:58:15 CEST 2008

How to add a new architecture?

ASM
 - create an assembler description
COMP
 - add some pseudo ops for stack manipulation
 - write metapatterns for the arithmetic and logic operations
COMBINED
 - connect code to purrr like purrr.ss (so 'forth-compile can be used)
 - connect the assembler


Works pretty well. Now trying to restructure it a bit..
After fixing bug in meta-pattern which prevented the use of (macro:
...) and some minor cleanup, the 14bit core seems to work.

box> (forth-compile ": foo 123 + ;") (print-code)
foo:
	0000 307B [movlw 123]
	0001 0780 [addwf INDF 1]
	0002 0008 [return]

I'm taking a different approach for MC14: no intermediate instructions
except for the ones used in purrr. See where I get..

Entry: meta-pattern
Date: Thu Jul 31 18:09:24 CEST 2008

This is a classical evaluation order manipulation. I know what I want
to do from a high level, but I somehow don't understand the
particularities of it. Quasiquotation is really confusing.. Factoring
it in simple steps might help..

meta-pattern (M0) is a macro that generates a macro M1
M1 expands to a number of applications of a template defined in M0

Let's try to construct a toy example first.

What I don't understand is nesting of syntax-case, and nesting of
quasisyntax and unsyntax.

The rule: an unsyntax corresponds to the toplevel quasisyntax, just
like quasiquote, and nesting of syntax-case just binds new variables,
but the toplevel ones are still visible. Nothing special..

   Basically, syntax-case is a binding mechanism that allows you to
   avoid unsyntax. Nested quote/unquote is a mess, so solving this
   with merging of namespaces (variables and metavariables) is more
   convenient.

   For higher order macros it's best to stick with syntax-case and
   syntax, and leave nested ellipsis and quasisyntax/unsyntax alone.


Looks like it's working now. It's quite readable.

I've also added a macro 'patterns-class that combines 'meta-pattern
and its invocation. This gives a pretty compact representation:

(patterns-class
 (macro)
 ;;---------------------------------
 (op    pe/op  opcode  w/op   ~op)
 ;;---------------------------------
 ((+    +      addwf   w/+    ~+) 
  (-    -      subwf   w/-    ~-)
  (and  and    andwf   w/and  ~and)
  (or   or     iorwf   w/or   ~or)
  (xor  xor    xorwf   w/xor  ~xor))
 ;;--------------------------------- 
 ((w/op)              ([opcode INDF 1]))
 (([qw a] [qw b] op)  ([qw (tscat: a b pe/op)]))
 (([qw a] op)         (macro: ',a movlw w/op))
 ((op)                (macro: ~op))
 ((~op)               (macro: w<-top drop w/op)))
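The gist of 'patterns-class, sketched with plain strings instead of
syntax objects (hypothetical; the real macro of course works on Scheme
syntax with hygiene, not text): each table row instantiates every
template by substituting row entries for column names.

```python
# Toy model of a meta-pattern: a template mentions column names, and
# each table row produces one concrete rule by substitution.

def expand(columns, rows, template):
    """Instantiate the template once per row.  Substitute longest
    names first so 'op' does not match inside 'opcode'."""
    out = []
    for row in rows:
        rule = template
        for name, value in sorted(zip(columns, row),
                                  key=lambda nv: -len(nv[0])):
            rule = rule.replace(name, value)
        out.append(rule)
    return out

columns = ["op", "opcode"]
rows = [["+", "addwf"], ["-", "subwf"]]
rules = expand(columns, rows, "((w/op) ([opcode INDF 1]))")
print(rules[0])  # → ((w/+) ([addwf INDF 1]))
```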


Entry: multiple targets
Date: Fri Aug  1 00:04:29 CEST 2008

Time to start factoring and parameterizing code. First thing to tackle
is the chain/bin state management: everything that goes into mc14.ss
should be factored out.

created live/state.ss


Entry: ANS Forth
Date: Fri Aug  1 01:17:06 CEST 2008

So.. It seems like a good idea to pick up the ANS Forth again. The
only freedom in implementing one is what kind of threading model to
use, and where to put stuff.

Some requirements:

  * subroutine threaded
  * call/jump/or-jump
  * data word doubler

This can probably be implemented as a small layer on top of standard
forth as done before.

num -> 'num dup upper

Entry: control flow analysis
Date: Fri Aug  1 10:35:10 CEST 2008

Chunks of code in target-word structs are basic blocks in muchnick's
terminology. (Function calls don't count here, because they don't
actually change the flow of control arbitrarily: they are equivalent
to inlined instructions).

One thing I probably need to change is to separate basic blocks after
conditional branches.

These are the basic building blocks for control structures:

      * unconditional jump
      * conditional jump
      * conditions

It seems essential to be able to represent the condition generators in
an abstract form, so they can be easily inverted. (The code that
generates the flag can be inverted, instead of the flag being inverted
after it's generated).

Otoh. It should be possible to build a control flow graph with
non-instantiated code. This is the problem I tried to solve with
delimited control, but there have to be better ways.. Some form of
reflectivity on the macro end might be necessary: representing
non-primitive macros as lists?


Entry: tool integration
Date: Sat Aug  2 10:41:35 CEST 2008

Preparing for professional usage, this project needs:
 * better integration with MPLAB (linker)
 * interface with C-based development.



Entry: Factor
Date: Mon Aug  4 18:43:59 CEST 2008

Let's take another look at Factor. What I'm interested to find out is
how the compiler is structured. Let's see if there are any documents
on the blog describing it.

These seem to be interesting links:

http://factor-language.blogspot.com/2007/09/two-tier-compilation-comes-to-factor.html
http://factor-language.blogspot.com/2008/01/compiler-overhaul.html

Hmmm.. They are more about the dynamic vs. static debate. I think I've
converged on that: both are nice, but static declarative modules
win. Toplevels can be built on top of that. For PIC, everything is
static, and redefinitions need to be reloaded, but it does allow for
an 'allot-stack' like development which allows separation of kernel
and application.


Entry: ANS Forth
Date: Tue Aug  5 09:32:01 CEST 2008

It seems that standardizing is an essential part to get to some
adoption. Basically, nobody cares about nonstandard Forths: people write
their own. Makes sense really.

So, let's bring the PIC18 Forth to standard. The goal is to do it as
conveniently as possible, without losing too much time on
optimization. Some ideas I gathered before:

  * Data doubler: Add a layer that performs just primitive data size
    doubling.

  * Unified address space. Map part of RAM into the namespace.

  * Interpreter. It seems a good idea to stick with a native Forth,
    and write a dispatching interpreter on top of this. That way all
    primitives can be re-used.

It's simplest to first make the data doubler, so words written in the
doubled language are usable in the unit language. On top of this
memory access words can be written, which then can support a trivial
interpreter loop.

Alternatively, I can implement Taygeta's primitives, and optimize them.

Entry: Data doubling
Date: Tue Aug  5 09:52:59 CEST 2008

PRIMITIVES:
* math primitives: coded manually (DONE)
* macro mapping: coded manually

COMPOSITION:
* parser map:
    - num   -> num hilo
    - word  -> _word

This should be written as a pure parser.
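The parser map above can be sketched like this (illustrative Python,
not the actual implementation; the hi/lo byte order is an assumption):

```python
def hilo(n):
    """Split a 16-bit number into (high, low) bytes, so a double-cell
    literal becomes two single-cell literals on an 8-bit stack."""
    return ((n >> 8) & 0xFF, n & 0xFF)

def double_parse(tokens):
    """Pure parser pass for the doubled language: numbers get a 'hilo'
    suffix, words are mapped to their doubled '_word' variant."""
    out = []
    for t in tokens:
        if isinstance(t, int):
            out.extend([t, "hilo"])   # num  -> num hilo
        else:
            out.append("_" + t)       # word -> _word
    return out

print(double_parse([123, "+"]))  # → [123, 'hilo', '_+']
```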

I think I'm running into a composition problem here: I can't find a
straightforward way to plug in the 'derived: word to 'forth-begin.

Let's go to the definition of 'forth-compile and work from there.

(define-syntax (forth-compile stx)
  (syntax-case stx ()
    ((_ str)
     #`(forth-begin
          #,@(string->forth-syntax #'str)))))

This inserts lexed syntax into forth-begin. Maybe 'forth-begin needs a
namespace argument? Let's see where '(macro) is hardcoded. That's in
'forth-begin-tx, where the records returned by the parser are assigned
to a namespace. 

I thought there's one more hardcoding reference, which is in rpn-tx.ss
where '(macro) namespace is used to check if a particular identifier
is a parser, but this is actually parameterized by
'rpn-map-identifier.

What about using the latter in 'forth-begin-tx too?

Next: should the '(target) namespace be remapped too?

Ok, parameterized a bit, but that leads to other spaghetti being
exposed. Atm, the thorn is the fact that instantiate.ss references the
'(macro) namespace. Shouldn't this be generic? I just moved these
references out of 'forth-begin, but the only thing that macro does is
to bind instantiate.ss to the forth parser-tx.ss

Trying to fix something else first: parser-tx.ss doesn't need to be
aware of the wrapping words, it should just provide an abstract data
structure with 'forth 'variable and 'macro tags. OK. The code classes
are interpreted in 'forth-begin-tx and not passed to 'forth->records.

Maybe the next step is to also generate the toplevel scheme forms as
part of the records structure to avoid awkward passing of out-of-band
data through mutating parameters? Done.


Alright, it's a bit cleaner now. Maybe this is enough to build a
front-end that takes macros from a different namespace? Still there's
the problem of how to link target functions back to the original
target namespace. Maybe this is more of a nested namespace problem
actually? Using '(macro derived) and '(target derived) does make
symbols accessible as derived/+ derived/- etc... in the core space.

This requires a decision: to make name space mapping standard.


Entry: Derived Forth
Date: Tue Aug  5 13:10:00 CEST 2008

Let's concentrate the ideas from the previous post. To create a
derived Forth, create a separate namespace that is a child of the one
we build on top of. The language is then defined through a
parameterized forth-begin-tx such that:

 * 'derived-forth-begin uses only the (macro _) and (target _)
   namespaces for direct reference and definition.

 * the corresponding prefix macros in (macro _) map to (macro) for
   implementing functionality.

 * all (macro _) forms are accessible in the (macro) namespace through
   their direct mapping xxx -> _/xxx, but the (macro _) namespace is
   completely isolated from (macro).

Once this works, maybe the current 'live mode can be written as a
compile mode? But it's not really necessary, since it probably doesn't
require toplevel forms. Unless we allow definitions in the live
mode.. Maybe it is a cleaner model..


Entry: return stack
Date: Tue Aug  5 15:19:12 CEST 2008

Thomas Pornin:

  "ANS doesn't require the return stack to consist of stackable
   elements...  What ANS specifies is that, for each activation
   context, there is a stack-like storage area in which you may write
   cell values with >R, and get them back with R>. But these values
   are accessible only from the word itself, not from the caller and
   neither from the callees. Moreover, you are supposed to clean that
   stack before exiting from the word."

Elizabeth D Rather:	

  "Exactly.  A standard system must have a Return Stack whose entries
   are the same size as cells and data stack items.  And it must
   respond to >R, R@, and R>. What the standard *doesn't* require is
   that the system must use it for return addresses."


This is interesting. I didn't know that. This means it's probably best
to rename my 'x' stack to the 'r' stack. Anyway, I've removed all
references to 'r' so it can be added easily. Also, x will be renamed
to x@.



Entry: ANS Forth frontend
Date: Tue Aug  5 18:43:04 CEST 2008

The non-reflective words are going to be straightforward, but the
reflective ones are problematic.  The lexer, prefix processor and
macros are DIFFERENT entities in the unrolled structure, while in a
reflective Forth, they are all just Forth words.

I don't see a solution for this, other than completely replacing the
lexer and preprocessor parser with something more akin to an interpret
mode simulator.

http://www.ultratechnology.com/meta.html



Entry: documenting a port
Date: Wed Aug  6 12:24:47 BST 2008

Maybe it's a good idea to publicly port to 14 bit architecture, so
that process can be documented?




Entry: Formalizing Coma
Date: Wed Aug  6 20:26:42 CEST 2008

Scat is a concatenative language modeled after Joy.  Syntactically, a
program p is a concatenation of programs p_i or a primitive program
word p'

    p = (p_0 p_1 ...) | p'

This is isomorphic with the semantics, where each program word can be
associated with a function, and syntactic concatenation maps to
function composition.

For Scat, the operational semantics (the implementation in terms of a
primitive machine) is given by primitive Scheme functions closed over
a state space represented by a Scheme data type. In the case of Scat
and Coma, this is a stack, in case of Coma+Control, this a pair of
stacks.

For most practical use, though not necessary in theory, the state
contains at least a stack. Its reason for being is to introduce
locality in the effect of functions. This is useful for creating a
practical programming language, and for deriving simple local
syntactic rewrite rules from local stack operations.

Reduction (evaluation) of Scat expressions is eager, and happens from
left to right, where each primitive function part of a larger
composition is applied to the state, which is threaded through the
computation. This is the same as a sequential machine with global
state.

Note that, because function application is compatible with the
associativity of function composition, the order of evaluation is
arbitrary:

    (S_0 a) b = S_0 (a b)

The function a applied to the initial state S_0 returns a value that
when passed to the function b yields the same result as evaluating the
composition (a b) 

Now, this is only useful if you can prove that there is some c with
nice properties such that

    (a b) = c

which allows the application to be written as 

    S_0 c

The ``nice properties'' can be simplified to mean ``shorter code''.


For Coma, the eventual goal is to generate machine code (syntax) from
a concatenative source program (syntax). So instead of looking at
associativity of composition, we should look at associativity of
concatenation (syntax). More specifically, at rules that allow the
substitution of a concatenation of program words i.e. (x y z) by
another concatenation of words (s t).

     a b x y z c  =  a b (x y z) c  =  a b (s t) c  =  a b s t c

This uses the rule

     x y z = s t

Now, where do these rules come from? In Staapl, they are syntactic
transforms that preserve the associated semantics. These semantics are
operational semantics derived from a stack machine's instructions.


TODO: Implementation.

First: translate to QW, CW language.
Then implement rewrite rules.
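A toy model of that plan (assuming none of Staapl's actual names):
macros are functions from target code to target code, and '+'
implements the rewrite rule [qw a] [qw b] + -> [qw a+b], falling back
to an opaque runtime instruction when no literals are available.

```python
# Target code is a list of pseudo instructions: ('qw', n) quotes a
# number, ('cw', word) is an opaque runtime call.  Compilation is
# left-to-right, eagerly applying rewrite rules: partial evaluation.

def plus(code):
    if len(code) >= 2 and code[-1][0] == "qw" and code[-2][0] == "qw":
        a, b = code[-2][1], code[-1][1]
        return code[:-2] + [("qw", a + b)]   # compile-time evaluation
    return code + [("cw", "+")]              # opaque runtime add

def qw(n):
    return lambda code: code + [("qw", n)]

def run(program, code=()):
    """Thread the code state through each macro, left to right."""
    code = list(code)
    for macro in program:
        code = macro(code)
    return code

print(run([qw(1), qw(2), plus]))  # → [('qw', 3)]
print(run([qw(1), plus]))         # → [('qw', 1), ('cw', '+')]
```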


Todo: 

  * explain where the asymmetry comes from: why does a rewrite rule
    operate on code from the right only?

  * explain in a simple way that all semantics comes from target
    idealized (un-projected) machine operations.

  * explain how you go from arbitrary substitutions to a greedy
    left->right substitution scheme.


EDIT: This needs some cleanup. It deserves a separate paper.

What I'm trying to explain:

  * Coma's primitives are intermediate language transformers. The
    intermediate language has essentially 2 instructions: execution
    and quotation. This is then extended with termination, jumps and
    conditional jumps.

  * Some of the intermediate language is real target language. Open
    question: good or bad? Should this be separated out? Due to the
    pattern matching, it behaves as opaque black-box code + it allows
    the implementation of simple peephole optimizations. (This is ok:
    it's a natural extension of the opaque CW data vs. QW that can be
    partially evaluated. actually, it's a mix of the 2.)

  * It also allows arbitrary data to be passed from macro to macro,
    which is a vehicle for arbitrary incremental code generation: this
    is the presence of QW in the language: it embeds a dynamic stack
    language.

  * This can be viewed as eager partial evaluation. In a concatenative
    language, PE is rather trivial (no variable substitutions). But,
    in most cases it is not complete: not all primitives exist at run
    time: they need to be specialized / combined.

  * It is not PE. Actually, it is, on the lowest level (math + stack
    shuffling).

--


Essentially:

This contains quite a few posts about the semantics of Coma. It is
work in progress. Essentially, the semantics is defined by a set of
rewrite rules that implement the CONCATENATE operation as a binary
function taking a program in intermediate form and a program word.

This operation performs semantics preserving transformations.



Entry: evaluation
Date: Sat Aug  9 09:03:24 BST 2008

It's been a busy time lately.  What needs to happen next?  One of the
priorities is to get back into industry as soon as possible.  You
there, hire me!

Possible next steps:

  * reference documentation + API fixing

  * CSP

  * TAPL

  * other Microchip targets

  * zero-cost viral platform

  * better boot monitor protocol

  * array processing language

  * a 24-bit virtual dsp for PIC18

  * finding collaboration

  * standard Forth frontend



Entry: occam
Date: Sat Aug  9 09:18:21 BST 2008

see http://en.wikipedia.org/wiki/Occam_%28programming_language%29

- Communication between processes works through named channels.  One
  process outputs data to a channel via ! while another one inputs
  data with ?.  Input and output will block until the other end is
  ready to accept or offer data.

- SEQ, PAR and ALT for sequential, parallel and alternative (guarded
  choice) execution.

The difference between Concurrent ML (on which PLT Scheme's
concurrency is based) and Occam, is the way in which channels are
treated.  In CML they are first class (dynamic), while in Occam they
are static entities.


Entry: monitor rewrite
Date: Sat Aug  9 10:01:47 BST 2008

This involves 2 main parts: an asynchronous message-passing mechanism
over an abstract channel, and the definition of a low-level protocol
for different transports.

The current problem with the monitor protocol is that it is RPC
based. This is fine for 1-1 communication, but won't work well over a
many->many network.


Entry: Microchip programming protocol
Date: Sat Aug  9 10:07:51 BST 2008

Towards a zero-cost standard platform for Microchip PIC.  All PICs
support the microchip programming protocol, consisting of:

1 /MCLR Vpp
2 Vdd
3 GND
4 PGD
5 PGC
6 PGM

( see http://www.prc68.com/I/ICD2.shtml )

PICs without charge pump need 13V Vpp and can't program themselves.


It is necessary to split the development workflow into two parts:

  * Single chip applications: this can use a combined full programmer
    + debug monitor to also support chips that need Vpp. This is the
    one useful for teaching, so it makes sense to make the tethering
    hardware as simple as possible, i.e. to not depend on a Microchip
    programmer.

  * Networked applications: here interconnect needs to be as simple as
    possible, so the 5-wire programmer protocol is impractical.  Cost
    of tethering hardware isn't critical, so can have more complexity.
    It's best to stick with something standard here, i.e. I2C or CAN
    bus.  The topology can be symmetric: programmer host interface can
    be just one of the nodes.


Entry: ICD2 + serial?
Date: Sat Aug  9 13:46:38 BST 2008

Is it possible to combine the programmer port and the serial port into
a single connector?  If the hardware flow control pins can be used,
this could work.  I believe the FTDI chip has a bitbang mode too,
which could be useful.

serial   ICD2
GND      GND
/CTS
VCC      VCC
TXD
RXD
/RTS

DTE = master (Data terminal equipment)
DCE = slave (Data circuit-terminating equipment)

RTS = request to send
CTS = clear to send

Using standard serial, there are only 2 output lines: TXD and RTS.
The microchip protocol needs at least 2, clock and data, assuming the
MCLR and PGM pins are set correctly.

Maybe the best approach is really to stick with the ICD2 connector,
and device a protocol on top of that.


Entry: Staapler connector
Date: Sat Aug  9 14:31:37 BST 2008

Before the low-level bootstrap can be solved, I need an adapter that
converts serial to ICD2 protocol.  Let's combine the Olimex 6-pin
header with the FTDI 6-pin header into a 12 pin double header.  This
allows the use of 2x3 female headers for the Staapler.

How to arrange it? Daisy-chaining Staaplers should work without trouble
(using one Staapler to program another one). The other alignments can
be taken to simplify GND and VDD connections.


Staapler connector: target board male 2x6 header with ICD2 and TTL232R
serial connector.  This is placed at board edge.  Staapler (outline =
dotted lines) fits on top of this, sticking out (downward) over the
target bord edge (dotted edge).  The serial connector is optional.

  . . . . . . . . . . . . .
  . +-------------------+ .
  . |  1  2  3  4  5  6 | .  ICD2
  . |  7  8  9 10 11 12 | .  Serial (optional)
  . +-------------------+ .
  .                       .
  .  target board edge    .
-----------------------------
  .                       .
  .                       .

ICD2
 1 /MCLR  white
 2 VDD    red
 3 GND    black
 4 PGD    blue
 5 PGC    green
 6 PGM    yellow

SERIAL
 7 GND    black
 8 /CTS
 9 VDD    red
10 TXD    orange
11 RXD    yellow
12 /RTS


Entry: Staapler programmer
Date: Sat Aug  9 16:33:00 BST 2008

I started building a 18F1320 prototype which has this male connector
as its programming connector.  Next is to find out where to connect
the outputs and inputs of the female connector.

INPUTS:

A single input is necessary for the bit-banged serial receive.
Probably another one, or this one shared, for some client->host signal
when using the ICD2 port alone for monitor operation.

Either interrupt on change, or the INT0-INT2 pins can be used.


OUTPUTS:

TXD         bit banged serial transmit
/MCLR O     target reset
VDD   O     optional target power
PGD   I/O   data
PGC   O     clock
PGM   O     low voltage programming

Later this could be extended with a charge pump for generating the
programming voltage.


Other constraints:

  * RA4 = open drain
  * general purpose ports (not using analog): RA0-3
  * more (not using oscillator): RA6-7

Probably best to use RA0-3 for the 4 outputs-only ports /MCLR VDD PGC
PGM.  Let's use INT0/RB0 for PGD I/O.

So, what should I do first?
  * Get the monitor to work with the ICD2 connector only.
  * Build a programmer.

The programmer seems rather trivial.  The most difficult problem to
solve now is to get a bidirectional communication going over the ICD2
connector.

Some things to think about.  The only benefit of this device is to be
able to use both from-scratch programming + staapl console.  It is
beneficial for smaller targets when there is actually a charge pump
available.  Maybe it is better to just modify the firmware of an
already existing programmer?  The ICD2 would be a good target.


Entry: Staapler roadmap
Date: Sat Aug  9 17:54:17 BST 2008

Eventual Staapler goals are:

  * standardize on ICD2 connector for interactive debugging and get
    rid of serial connector.
  * add both LVP and HVP through ICD2 connector
  * create Staapler bootstrap method using parport prog

To bootstrap Staapler itself, this approach can be used:

 (1) Standardize all PIC development on ICD2 connector only.  Create a
     protocol for bi-directional serial communication on top of
     master-slave Microchip programmer protocol.

 (2) Build two Staapler boards A and B with ICD2+serial input (or
     Staapler + one other board with just ICD2+serial input)

 (3) Connect A's serial output to B's serial input and devise
     patch-through code.  This process then gives a workflow for
     working with multiple projects (namespace) at the same time.

 (4) Build the ICD2 comm master on A and ICD2 slave on B, using the
     serial->serial patchthrough for B development.

Independently after (3)

 (5) Connect A's ICD2 output to B's ICD2 input and write PIC LV
     programmer code. 

 (6) Use programmer code to emulate a minimalistic parport programmer.

 (7) Write host-side bootstrap code for PC parport programmer.

 (8) Staapler v2: add support for charge pump + USB, write HV
     programmer code, add support for different busses (for networked
     debugging).


Entry: Staapler protocol
Date: Sat Aug  9 19:10:37 BST 2008

1. Document the ICD2 master-slave protocol
2. Add to this slave->master messaging (maybe just polling?).
3. Allocate pins on master + slave side


1. From the programming manual for 18F1220 (DS39592B)

   Commands and data are transmitted on the rising edge of PGC,
   latched on the falling edge of PGC, and are sent Least Significant
   bit (LSb) first.

   All instructions are 20 bits long, consisting of a leading 4-bit
   command followed by a 16-bit operand.  Depending on the 4-bit
   command, the 16-bit operand represents 16 bits or 8 bits of data.

   COMMANDS FOR PROGRAMMING           4-Bit Command
                                                                                          
   Core Instruction                   0000 (Shift in 16-bit instruction)
   Shift out TABLAT register          0010
   Table Read                         1000
   Table Read, post-increment         1001
   Table Read, post-decrement         1010
   Table Read, pre-increment          1011
   Table Write                        1100
   Table Write, post-increment by 2   1101
   Table Write, post-decrement by 2   1110
   Table Write, start programming     1111
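
A minimal sketch of how one such instruction would be serialized on
PGD, LSb first (Python, with invented helper names; not Microchip
code):

```python
# Sketch: serialize one 20-bit ICSP instruction as it appears on PGD.
# Per DS39592B: 4-bit command first, then 16-bit operand, LSb first.
TABLE_WRITE_POSTINC2 = 0b1101   # from the command table above

def icsp_bits(command, operand):
    """Yield the 20 bits of one instruction, LSb first."""
    word = (operand << 4) | (command & 0xF)   # command occupies the low bits
    for i in range(20):
        yield (word >> i) & 1

# First 4 bits on the wire are the command, LSb first: 1,0,1,1 for 0b1101.
bits = list(icsp_bits(TABLE_WRITE_POSTINC2, 0xABCD))
assert bits[:4] == [1, 0, 1, 1]
```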


2. The hardware protocol used is really enough to provide
   bidirectional communication if it is extended with a simple 'data
   ready' signal from the slave.  This could either be an
   asynchronous slave signal (pull a line low/high) or an answer to a
   poll.

   A direct slave signal is probably easiest to implement.  Then it
   should be the data line since that is already a multiplexed port at
   the master, leaving the clock to remain output-only.

3. On the slave side it's easy: use data and clock from the
   programmer protocol.  On the master side, the clock will always be
   an output, but the data line receives an asynchronous signal.

   In case of asynchronous signalling, the data line is probably best
   implemented as a wire-or bus.

   Slave side data line for 18F1220 is RB7.  It has a weak pullup;
   maybe this can be used instead of on the master side, i.e. thinking
   about driving with PC parport, which has open collector output?
   This http://www.beyondlogic.org/spp/parallel.htm suggests using a
   4k7 pullup.  

   target        host
   ---------------------
   PGC           RA0
   PGD           RB0
   PGM           RA1
   VDD           RA2
   /MCLR         RA3
   GND           GND
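
A master-side bit-bang loop over these pins could be sketched as
follows; the pin callbacks are hypothetical and real ICSP setup/hold
timing is ignored:

```python
# Sketch of master-side bit-banging over PGC/PGD.  The pin operations
# are callbacks so the same code can drive real GPIO or a simulation.

class IcspMaster:
    def __init__(self, set_clock, set_data, get_data):
        self.set_clock = set_clock   # drive PGC
        self.set_data = set_data     # drive PGD
        self.get_data = get_data     # sample PGD (slave may drive it)

    def write_bit(self, bit):
        self.set_data(bit)
        self.set_clock(1)
        self.set_clock(0)            # slave latches on the falling edge

    def read_bit(self):
        self.set_clock(1)
        bit = self.get_data()
        self.set_clock(0)
        return bit

# Trivial loopback "wire" standing in for the shared PGD line:
wire = {'data': 0}
m = IcspMaster(set_clock=lambda v: None,
               set_data=lambda v: wire.__setitem__('data', v),
               get_data=lambda: wire['data'])
m.write_bit(1)
assert m.read_bit() == 1
```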


Notes.

It's probably best to start with reading the device using the ICD2
protocol.  That way core routines for write and read can be created.  

The return protocol is self-delimited: each return message is
prepended with the size of the message.  Probably the master->slave
protocol should do the same so it can be routed.

RB7 is slave data line; sending is a simple shift.  Same for RB0
master sending.

What about Microchip's in-circuit debugging protocol?  Is that
specified somewhere?


Entry: Revising boot monitor
Date: Sun Aug 10 09:43:44 BST 2008

To prepare for proper routing of the monitor protocol, all commands
should be self-delimiting.  Note that the protocol remains RPC: each
request receives a reply.

 Q: Should the interpreter ignore messages it doesn't understand?  If
    so, what should be the reply?

Probably not.  This is a debugging protocol where the slave gives full
access to the host.  Limiting access by checking whether some messages
are legal makes no sense in this setting: it is the host's
responsibility to properly drive the target in this mode.

 Q: Should the monitor protocol be explicitly specified?

No.  It is the responsibility of the application developer to use the
proper protocol, since it might be extended for specific applications.

 Q: What about PING?  I'm starting to run out of boot monitor space.

Maybe it's best to take this out. It's not essential. And
identification data can always be added to block 0, which is
essentially unused: the host knows what kind of target it is, and for
each target type, a storage area can be assigned.

Next: clean up live/tethered.ss so there is a clear delimited message
send/receive part in the protocol, instead of the current "send header
then send body" approach.

 Q: Should we send "write at address" or "set address pointer" +
    "write" ?

Opting for the latter. It seems to be easiest when sending multiple
chunks.

So, rewrite is done.  All messages in 2 directions are now
length-prefixed byte strings that do not require interpretation to be
repeated or routed.  The tethered.ss code is refactored into async
send/receive for messages and rpc functions.  'ping is removed and
replaced with a simple target-sync ack = OK mechanism, which enabled
the monitor code to fit into the 256 words again, after interpreter
changes for delimited messages.
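
The framing itself is simple; a sketch (illustrative Python, not the
actual tethered.ss code):

```python
# Sketch of length-prefixed message framing: each message is a size
# byte followed by the payload, so a router can repeat or forward it
# without interpreting the contents.

def frame(payload: bytes) -> bytes:
    assert len(payload) < 256      # single size byte in this sketch
    return bytes([len(payload)]) + payload

def read_message(stream) -> bytes:
    """Read one self-delimited message from a byte iterator."""
    it = iter(stream)
    size = next(it)
    return bytes(next(it) for _ in range(size))

msg = frame(b'\x01\x02\x03')
assert msg == b'\x03\x01\x02\x03'
assert read_message(msg) == b'\x01\x02\x03'
```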


Entry: popularity
Date: Mon Aug 11 08:01:12 BST 2008

To Arduino or not?  It would sure help popularity, but I'm afraid it
will shift focus too much toward AVR, and leave PIC in the shadow.
I've invested quite some time in getting familiar with Microchip's
architectures, so for the tool as a whole it might be better to stick
to that single architecture until most of the high-level workflow and
interoperation design reflects my knowledge there.  This is
nontrivial: it includes the whole monitor.f + tethered.ss chain.

Standard Forth frontend or not?  It might help to get more people
interested, but would distract from the original idea.  If I find a
proper way to combine standard Forth with the current approach so they
can interoperate, and provide metaprogramming support only for the
standard one, it might work though.

Usage statistics.  I have no control over the PLaneT version.  How to
find out usage stats?  Maybe the PLaneT version should download
updates?  Or, I could put the installer in PLaneT only?



Entry: Staapler
Date: Mon Aug 11 11:30:15 CEST 2008

( It looks like Staapler is redundant since PicKit2 provides all the
necessary functionality.  It can program .HEX files and act as serial
passthrough. )

I started building 2 prototypes for the first iteration of the
Staapler based on a 18F1320.  Currently limited to programming /
debugging of Staapl based projects for PICs that support LVP.

It uses the Microchip 6-pin ICP/ICD interface, using the pinout from
the Olimex ICD2 clone (RJ jacks are too cumbersome).

http://www.olimex.com/dev/images/PIC/PIC-ICSP.gif

In addition, the connector has an optional second row of 6 pins with a
FTDI serial TTL header in case an additional serial port is desired.

The hardware interface is a male 2x6 header with ICD2 and TTL232R
serial connector.  This is placed at the target board edge.  The
Staapler is plugged on top of this (board outline = dotted lines),
sticking out (downward) over the target board edge (dotted edge).

  . . . . . . . . . . . . .
  . +-------------------+ .
  . |  1  2  3  4  5  6 | .  ICD2
  . |  7  8  9 10 11 12 | .  TTY Serial (optional)
  . +-------------------+ .
  .                       .
  .  target board edge    .
-----------------------------
  .                       .
  .                       .

ICD2
 1 /MCLR  white
 2 VDD    red
 3 GND    black
 4 PGD    blue
 5 PGC    green
 6 PGM    yellow

SERIAL
 7 GND    black
 8 /CDS
 9 VDD    red
10 TXD    orange
11 RXD    yellow
12 /RTS


Next to the female connector for programming a target board, the
Staapler has a male Staapler-compatible connector.  This is used to
bootstrap the Staapler boot monitor using an ICD2 and connect to the
host using the serial interface.  It contains the following
connections for the female header:

   Target        Staapler 18F1320
   ------------------------------
   PGC           RA0
   PGD           RB0
   PGM           RA1
   VDD           RA2
   /MCLR         RA3
   GND           GND

This has PGD wired to RB0(INT0) so the Microchip protocol can be
easily extended with a target -> host ``terminal ready'' signal,
enabling the host to wait for replies without the need for polling.

The bootstrap plan is documented here:
http://zwizwa.be/ramblings/staapl/20080809-175417






Entry: ANS : Forth in Forth + ???
Date: Mon Aug 11 13:11:57 CEST 2008

What are the necessary primitives to implement a Forth in Forth?  The
problem I'm trying to solve is to simulate an on-target Forth between
full simulation and full stand-alone.

The only real primitives are 

    @ ! execute

Which means: the memory and execution model are abstract.  This works
for the standard PIC18 boot monitor, which is really nothing more than
the 3 instruction Forth [1] together with some added primitives for
programming and for block transfer with less overhead.
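
A toy model of such a fetch/store/execute monitor (command names and
memory model invented for illustration; the real monitor's byte-level
protocol differs):

```python
# Toy model of a 3-instruction-Forth style monitor: the host drives
# the target with just fetch (@), store (!) and execute requests.

class Monitor:
    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.words = {0: lambda: 'ok'}   # callable "code addresses"

    def handle(self, cmd, addr, val=0):
        if cmd == 'fetch':               # @
            return self.mem[addr]
        elif cmd == 'store':             # !
            self.mem[addr] = val
        elif cmd == 'execute':           # run code at addr
            return self.words[addr]()

mon = Monitor()
mon.handle('store', 10, 42)
assert mon.handle('fetch', 10) == 42
assert mon.handle('execute', 0) == 'ok'
```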

So, let's first build a complete abstract forth machine.

What does a completely abstract ANS Forth machine look like? 

  * two stacks of cells (the parameter stack and R stack: >R R> R@)
  * a cell array allocation mechanism (ALLOT)
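
The state of such an abstract machine can be sketched as a toy Python
model (not Staapl code):

```python
# Toy model of the abstract machine state: parameter stack, return
# stack (>R R> R@) and a data space grown with ALLOT.

class ForthMachine:
    CELL = 2                          # cell size in bytes, arbitrary here

    def __init__(self):
        self.ds = []                  # parameter (data) stack
        self.rs = []                  # return stack
        self.data = bytearray()       # data space

    def to_r(self):    self.rs.append(self.ds.pop())   # >R
    def r_from(self):  self.ds.append(self.rs.pop())   # R>
    def r_fetch(self): self.ds.append(self.rs[-1])     # R@

    def allot(self, n):               # ALLOT: reserve n bytes, return HERE
        here = len(self.data)
        self.data.extend(b'\x00' * n)
        return here

fm = ForthMachine()
fm.ds.append(5)
fm.to_r(); fm.r_fetch()
assert fm.ds == [5] and fm.rs == [5]
assert fm.allot(2 * ForthMachine.CELL) == 0 and len(fm.data) == 4
```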


From the standard document[2].

  3.3 The Forth dictionary

  Forth words are organized into a structure called the
  dictionary. While the form of this structure is not specified by the
  Standard, it can be described as consisting of three logical parts:
  a name space, a code space, and a data space. The logical separation
  of these parts does not require their physical separation.

  A program shall not fetch from or store into locations outside data
  space. An ambiguous condition exists if a program addresses name
  space or code space.

  3.3.3 Data space

  Data space is the only logical area of the dictionary for which
  standard words are provided to allocate and access regions of
  memory. These regions are: contiguous regions, variables,
  text-literal regions, input buffers, and other transient regions,
  each of which is described in the following sections. A program may
  read from or write into these regions unless otherwise specified.

So, '@' and '!' can _only_ access data space.  


 Q: The important next question is: when defining reflective words
    (macros), do they have access to data space at all?

No. Data space does not exist during compilation, which means that all
words that are accessible at run-time should also be simulated.  This
is the only proper way to unroll the behaviour completely, and have
simulated reflection that can be TRANSPARENTLY moved to real reflection.

Roadmap:

  - write a reflective ANS Forth that can generate simulated programs
    using some access to the target memory (for I/O)

  - from the representation of this, extract a kernel using dependency
    analysis of words.  draw primitives from a library.


 Q: How to represent the reflective Forth?

I'm not sure if it's useful to write this on top of Coma.  The result
of reading a Forth file is a structured code graph that can be
processed to generate a Forth kernel in terms of Coma and some
primitives.

 Q: Where to start?

Let's port JONESFORTH[3].  In fact, it might be a good exercise to
stick with Richard Jones's literate file, and replace the x86 assembly
with Scat code.

Actually, it can be ported to plain Scheme code.  Let's write it in
PLT's R5RS language.

Hmm.. it's got me completely confused again.  There are some problems.
I'd like to write this on top of STC primitives, which are not
compatible with direct threaded code.  Also, the dictionary model
needs to be worked out a bit.  So I need a standard model where
primitives can be plugged on top of some execution/dictionary model I
can live with.  Maybe the best way is to implement one myself after
all, or figure out how to modify one of the portable ones.

It doesn't look like JONESFORTH is a good starting point.  Going to
remove it from the darcs archive.

I need a different set of primitives.. Maybe eForth[4] is the way to
go after all?

I found a link on comp.lang.forth[5] about this.  This brings me back
to taygeta MAF[6] which is what I was looking for actually.

EDIT: One night of sleep later, I think effort is best spent
elsewhere.  The essential problem is that dictionary layout and
threading model need to be abstracted.  If the Forth has to run on a
Flash controller, Flash programming needs to be in there too..  This
is already a large part of the interpreter.


References:

[1] http://pygmy.utoh.org/3ins4th.html
[2] http://lars.nocrew.org/dpans/dpans.htm
[3] http://www.annexia.org/_file/jonesforth.s.txt
[4] http://www.baymoon.com/~bimu/forth/
[5] http://groups.google.com/group/comp.lang.forth/browse_thread/thread/287c36f0f2995d49/10872cb68edcb526?#10872cb68edcb526
[6] ftp://ftp.taygeta.com/pub/Forth/Applications/ANS/maf1v02.zip

Entry: Minimal bootstrap
Date: Mon Aug 11 15:46:05 CEST 2008

From http://groups.google.com/group/comp.lang.forth/browse_thread/thread/287c36f0f2995d49/10872cb68edcb526?#10872cb68edcb526

---
 FORTH Primitives Comparison (use a fixed width font)
---
3     primitives - Frank Sargent's "3 Instruction Forth"
9     primitives - Mark Hayes theoretical minimal Forth bootstrap
9,11  primitives - Mikael Patel's Minimal Forth Machine (9 minimum, 11 full)
13    primitives - theoretical minimum for a complete FORTH (Brad Rodriguez)
16,29 primitives - C. Moore's word set for the F21 CPU (16 minimum, 29 full)
20    primitives - Philip Koopman's "dynamic instruction frequencies"
23    primitives - Mark Hayes MRForth
25    primitives - C. Moore's instruction set for MuP21 CPU
36    primitives - Dr. C.H. Ting's eForth, a highly portable forth
46    primitives - GNU's GFORTH for 8086
58-255 functions - FORTH-83 Standard (255 defined, 132 required, 58 nucleus)
60-63 primitives - considered the essence of FORTH by C. Moore (unknown)
72    primitives - Brad Rodriguez's 6809 CamelForth
74-236 functions - FORTH-79 Standard (236 defined, 147 required, 74 nucleus)
94-229 functions - fig-FORTH Std. (229 defined, 117 required, 94 level zero)
133-?  functions - ANS-FORTH Standard (? defined, 133 required, 133 core)
200    functions - FORTH 1970, the original Forth by C. Moore
240    functions - MVP-FORTH (FORTH-79)
~1000  functions - F83 FORTH
~2500  functions - F-PC FORTH

FIXME   27 ?     - C. Moore's MachineForth

For comparison:
---
8       commands - BrainFuck (small,Turing complete language)
8     primitives - Stutter LISP
8     primitives - LISP generic
11     functions - OS functions Ritchie & Thompson PDP-7 and/or PDP-11 Unix
14    primitives - LISP McCarthy based
18     functions - OS functions required by P.J. Plauger's Standard C Library
19     functions - OS functions required by Redhat's newlib C library
28       opcodes - LLVA - Low Level Virtual instruction set Architecture
51-56  functions - CP/M 1.3 (36-41 BDOS, 15 BIOS)
56     functions - CP/M 2.2 (39 BDOS, 17 BIOS)
40      syscalls - Linux v0.01 (67 total, 13 unused, 14 minimal, 40 complete)
71       opcodes - LLVM - Low Level Virtual Machine instructions
92+    functions - MP/M 2.1 (92 BDOS, ? BIOS)
102    functions - CP/M 3.0 (69 BDOS, 33 BIOS)
~120   functions - OpenWATCOM v1.3, calls - DOS, BIOS, DPMI for PM DOS apps.
150     syscalls - GNU HURD kernel
170    functions - DJGPP v2.03, calls - DOS, BIOS, DPMI for PM DOS apps.
206    bytecodes - Java Virtual Machine bytecodes
290     syscalls - Linux Kernel 2.6.17 (POSIX.1)

eForth primitives (9 optional)
----
doLIT doLIST BYE EXECUTE EXIT next ?branch branch ! @ C! C@ RP@ RP! R> R@ >R
SP@ SP! DROP DUP SWAP  OVER 0< AND OR XOR UM+ TX!
?RX !IO $CODE $COLON $USER D$ $NEXT COLD IO?

9 MRForth bootstrap theoretical
----
@ ! + AND XOR (URSHIFT) (LITERAL) (ABORT) EXECUTE

9 Minimal Forth (3 optional)
----
>r r> 1+ 0= nand @ dup! execute exit

drop dup swap

23 MRForth primitives
----
C@ C! @ ! DROP DUP SWAP OVER >R R> + AND OR XOR (URSHIFT) 0< 0=
(LITERAL) EXIT (ABORT) (EMIT) (KEY)

20 Koopman high execution, Dynamic Freq.
----
CALL EXIT EXECUTE VARIABLE USER LIT CONSTANT 0BRANCH BRANCH I @ C@ R> >R
SWAP DUP ROT + = AND

46 Gforth
----
:DOCOL :DOCON :DODEFER :DOVAR :DODOES ;S BYE EXECUTE BRANCH ?BRANCH LIT @ !
C@ C! SP@ SP! R> R@ >R RP@ RP! + - OR XOR AND 2/ (EMIT) EMIT? (KEY) (KEY?)
DUP 2DUP DROP 2DROP SWAP OVER ROT -ROT UM* UM/MOD LSHIFT RSHIFT 0= =

36 eForth
-------
BYE ?RX TX! !IO doLIT doLIST EXIT EXECUTE next ?branch branch ! @ C! C@ RP@
RP! R> R@ >R SP@ SP! DROP DUP SWAP OVER 0< AND OR XOR UM+ $NEXT D$ $USER
$COLON $CODE

BrainFuck
-------

> < + - . , [ ]

Stutter LISP
----
car cdr cons if set equal lambda quote

generic LISP
----
atom car cdr cond cons eq lambda quote

LISP, McCarthy based
----
and atom car cdr cond cons eq eval lambda nil quote or set t 




Entry: next
Date: Tue Aug 12 09:11:43 CEST 2008

Maybe it's a good idea to leave the standard Forth idea alone for a
while.  It is definitely doable and an interesting challenge, but at
this moment, there are probably more useful things to focus on.
Additionally, having two different paradigms for Forth might be
needlessly confusing.  So let's move on.  To do:

  * Staapler

     - just continue the roadmap. next goal = device ID readout.

  * Reference documentation

     - The forms 'patterns 'compositions and 'substitutions.

  * Internal language standard

     - control flow primitives: document this when writing the 14 bit
       core port.

     - standard library: I'm not sure if this is useful yet.  Probably
       best to wait and see until there are more targets.  It would be
       nice to be able to share most of the monitor code though.


Entry: comp.lang.scheme
Date: Tue Aug 12 09:37:14 CEST 2008

Trying a different kind of announce here..
--

Hello,

Announcing the recent release of Staapl, a library for metaprogramming
microcontrollers.

It is centered around the concept of an ``unrolled'' Forth language
tower, impedance-matched to PLT Scheme's declarative module system,
and uses a stack-based pattern language to implement primitives for
code generation, partial evaluation of the pure functional target
language subset and parameterized metaprogramming.

The representation language is a thin layer on top of Scheme
implementing a concatenative language with threaded state which can be
used independently of Staapl.

Current implementation contains a Forth syntax frontend to the
concatenative macro language, a backend code generator for Microchip's
PIC18 architecture, a tethered interaction system, and a test
application implementing a sound synthesizer.

Download & Documentation at http://zwizwa.be/staapl

Enjoy!
Tom




Entry: debugger protocol
Date: Tue Aug 12 11:55:27 CEST 2008

Apparently the debugger protocol for PIC18 is proprietary, but for the
16F877 it's available here:

http://www.beyondlogic.org/pic/f877-6bk.pdf
http://ww1.microchip.com/downloads/en/DeviceDoc/51242a.pdf

The main idea behind the debugger is the use of a breakpoint register
and external halt.

Looks like this is for ICD and is obsoleted, replaced by ICD2.

Anyways, I don't really need it.  The use I've found for ICD2 is to
debug the debugger..  I might add some support for ICD2 later, but
let's focus on a more direct interpreter approach.



Entry: double debugging
Date: Tue Aug 12 12:21:49 CEST 2008

A problem I ran into during development of KRIkit is the double
debugger problem.  When writing an application involving a client and
a server, it is beneficial to be able to access both systems from the
same host.

I'm thinking about a simple daisy-chained system. The unused bits in
the boot monitor interpreter could be used as address bits.  

The next step is serial patch-through.
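
Assuming an invented one-byte address header on top of the
self-delimited messages, daisy-chained routing could look like:

```python
# Sketch of daisy-chained addressing: a node consumes messages
# addressed to it and passes everything else downstream unchanged.
# The one-byte address header is an invented convention.

def node(my_addr, handle, forward):
    def receive(msg: bytes):
        addr, payload = msg[0], msg[1:]
        if addr == my_addr:
            handle(payload)
        else:
            forward(msg)        # pass through, no interpretation
    return receive

seen = []
downstream = node(2, handle=lambda p: seen.append(('B', p)),
                  forward=lambda msg: None)
upstream = node(1, handle=lambda p: seen.append(('A', p)),
                forward=downstream)
upstream(bytes([1]) + b'ping')
upstream(bytes([2]) + b'pong')
assert seen == [('A', b'ping'), ('B', b'pong')]
```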





Entry: parameterized code
Date: Wed Aug 13 09:14:46 CEST 2008

Context: writing a synchronous serial slave for the ICD2 programmer
protocol.  This involves code parameterized by 'clock' and 'data' pin
macros, and provides 'read' and 'write'.

It is time to properly tackle the problem I tried to solve with
loading different code modules into a namespace.  Currently, the only
way to introduce new bindings is to write parsing words.  These are
essentially prefix words that expand into arbitrary (prefixed) Forth
code.

I'm not entirely happy with:

  * 'load' into a shared namespace only works for 1 instance.

  * this pattern is too important to have it specified on top of the
    Forth prefix parser.

 Q: Is it possible to write down a simple solution in Scheme/Coma and
    translate this to a Forth prefix solution?


Entry: Coma code + instantiation
Date: Wed Aug 13 09:29:23 CEST 2008

Maybe it's time to start moving things from Forth syntax to Coma/sexp
syntax.  What's currently missing is an instantiation syntax for Coma
words.  Something like:

;; Define code generators
(compositions (macro) macro:
  (a  1 2 3)
  (b  a c)
  (c  a b))

;; Declare which of them employ run-time instantiation.
(instantiate (macro) c b)

Problems:

  * Recursive macros.  During instantiation some recursive expansions
    might lead to infinite code size.  This needs a detection
    mechanism and possibly automatic instantiation.

  * Fallthrough and multiple exit points.  This needs special
    syntactic support.  Moreover, the ';' used in Forth is awkward to
    use in Scheme syntax.

  * Somehow it feels wrong to use Forth's structured programming words
    in the s-expression definitions.  Code blocks in the form of
    higher order macros seem to make more sense there..  Is this just
    aesthetics?

The fallthrough/local-exit problem could be avoided by not allowing
them in a simple version of 'instantiate'.  The cost of these features
needs to be analyzed more: they are not for free and significantly
complicate the code graph instantiator.
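
Detecting runaway recursive expansion can be sketched as follows (toy
macro table, not Coma's actual representation):

```python
# Sketch: keep the set of macros currently being expanded; re-entering
# one means inlining alone cannot terminate, so that macro must be
# instantiated (or rejected).

macros = {'a': ['1', '2', '3'],
          'b': ['a', 'c'],
          'c': ['a', 'b']}       # b and c are mutually recursive

def expand(name, active=()):
    if name not in macros:
        return [name]            # a primitive: emit as-is
    if name in active:
        raise RecursionError(f"macro '{name}' needs instantiation")
    out = []
    for item in macros[name]:
        out.extend(expand(item, active + (name,)))
    return out

assert expand('a') == ['1', '2', '3']
try:
    expand('b')                  # would inline forever
except RecursionError:
    pass
```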


Entry: problem with darcs-1 -> darcs-2
Date: Wed Aug 13 09:59:56 CEST 2008

I missed a patch on my laptop: the one that cleans up instantiate.ss.
How to fix?  Roll back the darcs-2 repo to the point right before this
patch, compute a diff and patch the new tree.

Alternatively: inspect the patch itself, see what changed and copy
over the files.

FIXME: the test doesn't work any more since the monitor changes. ok


Entry: Staapler change of plan.
Date: Wed Aug 13 18:52:06 CEST 2008

The plan has changed to move to PicKit2, since there's no way to do it
cheaper, and the platform seems open.  So what to do with Staapler?
Maybe just focus on using the ICD2 connector as a serial port.

Interesting: PicKit2 uses an 18F2550.  It might be directly
reprogrammable for Staapl use.



Entry: Interaction simulator
Date: Thu Aug 14 13:56:32 BST 2008

Maybe.. Instead of using a forth-style interaction mode, it is
possible to just completely simulate everything: interaction mode is
built on top of core Coma without target specialization, and the
resulting QW,CW code is interpreted.  Compiled definitions are kept in
the interpreter so they can be used in interaction.

This looks like a much saner model than the current one, and it allows
working towards some standard Coma semantics.
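
Interpreting non-specialized QW/CW code amounts to a trivial stack
machine; a sketch (Python, with an invented word table):

```python
# Sketch: QW (quote word) pushes a literal on the parameter stack,
# CW (call word) invokes a named word.

def interpret(code, words, stack=None):
    stack = [] if stack is None else stack
    for op, arg in code:
        if op == 'qw':
            stack.append(arg)
        elif op == 'cw':
            words[arg](stack)
    return stack

words = {'+': lambda s: s.append(s.pop() + s.pop())}
assert interpret([('qw', 1), ('qw', 2), ('cw', '+')], words) == [3]
```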



Entry: next
Date: Thu Aug 14 18:05:20 CEST 2008

  * specification of an internal language standard through simulation
    of non-specialized coma output.

  * create an instantiation syntax usable from scheme and write the
    forth-begin form on top of this.

  * think a bit about this whole csp/occam-pi thing.  figure out what
    the core automation problem is in the occam-pi compiler.  maybe a
    'manual' version can be included in Staapl?


Entry: books
Date: Thu Aug 14 19:37:02 CEST 2008

This is the collection of books I'd like to finish.  I'm being foolish
and reading books without doing the exercises, instead trying to
incorporate the knowledge into the design and implementation of
Staapl.  TSPL, EOPL and SICP were real eye-openers.


Done:
* TSPL http://www.scheme.com/tspl3/
* EOPL http://www.cs.indiana.edu/eip/eopl.html
* SICP http://mitpress.mit.edu/sicp/full-text/book/book.html (except logic)

Reading:
* CSP   http://www.usingcsp.com/
* TAOCP http://www-cs-faculty.stanford.edu/~knuth/taocp.html
* TAPL  http://www.cis.upenn.edu/~bcpierce/tapl/
* TAPOC http://www.comlab.ox.ac.uk/people/bill.roscoe/publications/68b.pdf

Todo:
* PLAI  http://www.cs.brown.edu/%7Esk/Publications/Books/ProgLangs/
* CTMCP http://www.info.ucl.ac.be/~pvr/book.html



Entry: dsPIC
Date: Fri Aug 15 08:48:46 BST 2008

Microchip is not really being very helpful, providing nothing other
than the .pdf programmer's reference.  So, let's see if there's a way
to get hold of the instruction set without typing it in.

The difference between dsPIC and the 8-bit PICs is the addressing
modes.  This chip has more of a classical RISC ISA.

Data memory hierarchy:

RAM, first word      (WREG)
RAM, first 16 words  (Wxx - Working registers)
RAM, first 4K        (File registers, Near RAM)
RAM, all 64K


Data addressing modes:

              | File Register  
   |- Basic --| Immediate       | No Modification
   |          | File Register   | Pre-Increment
   |          | Direct          | Pre-Decrement
   |          | Indirect -------| Post-Increment
   |                            | Post-Decrement
   |                            | Literal Offset
   |                            | Register Offset
   |                     
   |            |- Direct
   |- DSP MAC --|               
                |             | No Modification
                |             | Post-Increment (2, 4 and 6)
                |- Indirect --| Post-Decrement (2, 4 and 6)
                              | Register Offset


The only difficulty is to somehow encode the addressing modes
properly.  The generic template is:

 
 (file  (o b f d)       "oooo oooo obdf ffff ffff ffff")
 (lit10 (o b k d)       "oooo oooo obkk kkkk kkkk dddd")
 (lit5  (o b w k d)     "oooo owww wbqq qddd d11k kkkk")
 (alu3  (o b w q d p s) "oooo owww wbqq qddd dppp ssss")


lit5 -> lit4 for shifts
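
Such bit-pattern templates can be turned into an encoder mechanically;
a sketch (the field semantics are my reading of the manual's layout,
not verified against mpasm30 output):

```python
# Sketch: encode an instruction word from a template like the ones
# above.  Each letter names a field whose bits are consumed MSB first;
# literal 0/1 positions are copied through.

def encode(template, **fields):
    bits = [c for c in template if c != ' ']
    width = {f: bits.count(f) for f in fields}   # bits per field
    used = {f: 0 for f in fields}
    word = 0
    for c in bits:
        word <<= 1
        if c in '01':
            word |= int(c)
        else:
            used[c] += 1
            word |= (fields[c] >> (width[c] - used[c])) & 1
    return word

# The 'file' template with hypothetical field values:
w = encode("oooo oooo obdf ffff ffff ffff",
           o=0b101010101, b=1, d=0, f=0xABC)
assert w >> 15 == 0b101010101     # field o in the top 9 bits
assert w & 0x1FFF == 0xABC        # field f in the low 13 bits
```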

I'm not feeling much for typing it all in.. Isn't there a way to snarf
the assembler from a file generated by mplab?

I'll type in the addressing modes manually; the opcodes I can probably
get that way.  So, roadmap:

1. generate an ASM file with all opcodes
2. run mpasm30
3. interpret output (binary?)

Setting up mpasm30.. I have an XP image somewhere.. Wait, there are
linux binaries.


Entry: Architectures: where to draw the line?
Date: Fri Aug 15 16:22:27 BST 2008

dsPIC has a gcc toolchain:

http://iridia.ulb.ac.be/~e-puck/wiki/tiki-index.php?page=Cross+compiling+for+dsPic

http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-binutils_2.01-1_i386.deb
http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-gcc_2.01-1_i386.deb
http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-support_2.01-1_all.deb


It's a bit silly to try to compete with that.  If Staapl should
support dsPIC, it needs to do so on top of C.  

It might make sense to try to write some dsp-ish language that
compiles to assembler, but it doesn't look like there is much to gain
in writing an assembler + forth compiler in the same style as for
PIC18.  Once it's a multi-register RISC chip, C is really the way to
go.  Same for 32-bit ARM/MIPS.  Also, when there's a C compiler
available, not being able to integrate with it is commercial suicide.
For the small controllers, you're going to be the only tool in the
chain.  Not so for the bigger ones... there are going to be libraries
and C developers.

So, where should Staapl live?

- For 8-bit controllers < PIC18 Staapl = Forth based macro assembler.
  Implements native code generator.

- For 16/32-bit controllers that have a decent C compiler: Staapl
  provides Forth based scripting language + DSP-ish array processing
  languages.  Built on top of C.

- For 32-bit systems based on PPC/Intel: Staapl's PLT Scheme based
  meta system.

The unifying idea is concatenative languages: the bare metal macro
Forths for the low end, a linear typed Forth for the mid end and the
functional Scat/Scheme for the high end.



