This is the dev log for the BROOD project. For more information see http://zwizwa.be/brood

Project history:

BROOD 5.x (a full rewrite of basic structure, see http://zwizwa.be/ramblings/scat)
  - static name management: modules + lambda (no more hash tables)
  - separation of rpn macro language and forth syntax (parsing words)
  - base scat language: stack + hidden state threading
  - macro language: 2stack
  - more macro / postponed-word unification (i.e. multi-exit macros)
  - better target specialization: controlled redefine of core words
  - delimited continuations for state extensions
  - simplified assembler + improved code data structure
  - asm pattern matcher with static checking

BROOD 4.x
  - host language implemented as mzscheme macros (without interpreter)
  - made most macros hygienic, including pattern + forth preprocessor
  - purification: eliminated some side effects
  - added better state syntax + state combinators

BROOD 3.x
  - switched to host language with static binding
  - moved from implicit functional store -> explicit state binding (lifting)
  - introduced pattern matching language for peephole optimizer

BROOD 2.x
  - switched from forth to functional host language with dynamic binding
  - uses functional store
  - sheepsint 2.0 on 18f1220

BROOD 1.x (original BADNOP)
  - imperative forth host language
  - simple PIC specific
  - sheepsint 1.0 on 12f675

The idiomatic forth language (now called PURRR18) and the ideas behind the peephole optimizing compiler have remained fairly constant since the 1.x version. The forth is implemented as a collection of macros: functions which process a stack of assembly instructions. The thing which changed throughout the versions is the use of better (higher) abstractions to implement it, to make the code more understandable, and the application more usable and debuggable.

Note that this project is much more about functional programming and scheme -- plt scheme and the way it deals with macros and other abstractions -- than about forth.
However, the end result is nice to play with PIC chips, and it is used for real-world stuff.

Version 2 of BROOD contains a separate dev log. I started from scratch in version 3.

---------------------------------------------------------------------

WARNING. This file is a dev log. It contains notes about problems I encounter during the learning and development process, and mainly serves as an archive and sounding board for myself. It is largely unedited. Some notes might only make sense in my mind, especially if they are about things I don't understand yet. Some notes are just plain wrong, and this is not always indicated.

It is a story about how concrete ideas grow out of a puddle of mud, how a series of mistakes and half-assed understanding can lead to something beautiful in the end. This description of the process is something you rarely find in research papers. However, I've always found texts and talks about mistakes and the process of correcting them much more valuable than any shiny end result. This text is my contribution to that part of knowledge, and like most blogs, can be extremely boring if you don't know what to ignore.

The single rule for this log is: I am allowed to delete embarrassing erroneous entries if I explain exactly what went wrong in the reasoning.

---------------------------------------------------------------------

Entry: monads
Date: Sun Jan 28 14:43:30 GMT 2007

EDIT: i clearly didn't get it here.. much of the monad stuff i talk about is not what you find in Haskell. the thing is: a stack that's threaded through a computation behaves a bit like a monad. it solves some of the same practical issues, especially if you start to use syntax transformations that can convert pure functional code to code that passes a hidden top element. but it doesn't have the power of a generic bind operation. i talk about this later though.

i think i finally start to get the whole monad thing.
in layman's terms: it is centered around splicing together (using 'bind') functions that take a simple object to a container. in what i'm trying to accomplish, this is just compilation: take in some code and output an updated state.

maybe i should give up on the whole CAT thing after all, and concentrate on using scheme and some special structures to actually create a proper language and macro language. i already have a way to write concatenative code in scheme without too many problems (see macro.ss). the layer is probably just not necessary.. scheme is more powerful, and everything i now do in CAT i could try to move over to a virtual forth: write everything from the perspective of the forth itself.. something like: 'spawn this process on the host'.

another thing that's wrong with CAT is the lack of decent data structures. it's overall too simple for what i'm trying to do: proper code generation on top of a simple basic language.

let's go back to my recently restated goals for BROOD:

* basis = forth VM on a microcontroller for simplicity
* write cross compiler in a language similar to forth
* use an FP approach for the compiler language

the middle one can take on different forms. i still think it is very beneficial to have an intermediate language to express forthisms. but this language can just be embedded in scheme stuff. so let's just start to build the thing, right?

Entry: CAT design
Date: Sun Jan 28 17:28:56 GMT 2007

[EDIT: Sun Oct 7 00:20:32 CEST 2007
 - the pattern matching grew to a 'quasi algebraic types' construction
 - from forth -> machine code there are now a lot more passes
 - the shared forth elimination is made machine specific.]

the design is quite classic forth, but it might be simplified a bit. CAT consists of the following pipeline:

        (1)                              (2)
forth  -->  peephole optimized assembler  -->  absolute bound machine code

currently (1) is the compiler while (2) is the assembler. it might be more interesting to actually split it up in two parts.
introduce a peephole optimizer that can separate out the forth compiler to a higher level and the assembler to a lower, machine specific level, making it a bit more like a frontend/backend multiplatform compiler. also, several things could be made declarative. peephole optimization is basically pattern matching. currently CAT implements it as a pretty much imperative process: if last instruction is dup, then drop undoes dup, etc.. given the target we are using, it is possible to completely write the assembly language in forth style.

so, summary: split the peephole optimization in 2 parts:
- shared forth elimination (as a result of macro expansion)
- machine specific assembler optimization

Entry: declarative peephole optimizations
Date: Sun Jan 28 17:38:31 GMT 2007

basically, this is a rewriting system. currently i use a tree structure for this (ifte). this is a list of transformations:

( [(dup drop)  ()]
  [(dup save)  ()]
  [(save drop) ()]
  [(1 +)       (inc)]
  [(1 -)       (dec)]
  [((lit?) +)  ()] )

what to do with things that do not fit this? for example literals.. i really do need predicates. actually, i should make a list of all optimizations to make it a bit more clear. currently the code is way too dense.

Entry: rewriting
Date: Mon Jan 29 00:26:40 GMT 2007

funny.. a google search on rewriting led me to the pragmatic programmer. maybe i shouldn't read joel on software. especially not his rant about never scrapping a whole project and starting over.

anyways.. there are some serious things wrong with the way i'm trying to solve the compilation / optimization problem. i'm using a massive tool and am still writing in the guerrilla hack style most forths are written in. i have proper data structures now, so why not use them? why not make some minilanguages that do special tasks. it would be interesting to start moving functionality over to the lamb core as soon as possible. most of the code is optimization though, so.. i'm curious about this rewriting business.. looks like there's something to learn there.
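the declarative rule list above can be sketched as a small suffix rewriter. this is an illustrative sketch in python (not the actual CAT code): rules match a suffix of the accumulated output, predicates like lit? are modeled as callables.

```python
# a minimal sketch of the declarative peephole rewriter: each rule maps a
# suffix of the output code to a replacement, applied after every emit.
def is_lit(x):
    return isinstance(x, int)

# each rule: (pattern, replacement); a pattern element matches either
# literally or, if callable, by predicate.
RULES = [
    (("dup", "drop"), ()),
    ((1, "+"), ("inc",)),
    ((1, "-"), ("dec",)),
]

def match(pattern, tail):
    if len(pattern) != len(tail):
        return False
    return all(p(t) if callable(p) else p == t
               for p, t in zip(pattern, tail))

def emit(code, ins):
    """push one instruction, then rewrite the tail until a fixed point."""
    code = code + [ins]
    changed = True
    while changed:
        changed = False
        for pat, rep in RULES:
            n = len(pat)
            if len(code) >= n and match(pat, code[-n:]):
                code = code[:-n] + list(rep)
                changed = True
    return code

code = []
for ins in ["dup", "drop", 1, "+"]:
    code = emit(code, ins)
print(code)  # -> ['inc']
```

rewriting at every emit keeps the matching local to the tail of the code list, which is exactly the "if last instruction is dup..." behaviour, but stated as data instead of nested conditionals.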
i know it works in a naive way, since that's what i already have. but i'm curious if this can be taken further. with faster static code it might be possible to do a whole lot more (non deterministic stuff). some problems to face are literals and some.. one thing that worries me is 'how to prevent loops'. i know, if things get smaller, there can be no loops, but i can imagine some more fancy expand/collapse rules that might start looping with a naive approach.

looking at the optimizations, most are about reducing stack juggling and moving it to register transfers. this is almost universal on all machines. i do need to think a bit about a sort of 'base line forth' that will be the end of the optimizations, such that eventual compilation is straightforward. this seems like an elegant solution.

Entry: purely compositional approach (joy)
Date: Fri Feb 2 12:35:56 GMT 2007

whenever program text is read, it is immediately compiled. each symbol is replaced by its particular function, and each constant is replaced by a function that pushes the constant.

sym  -> (lookup sym)
data -> (lambda stack (cons data stack))

to get back the value of a data atom (practical issue), you pass any list, apply the function, and pop off the data. this can be done at compile time.

so, can we make do with data structures composed entirely of functions? probably yes.. probably this isn't even such a bad idea.. it looks like it is not a good idea to map composition to single lambda expressions, but to have an interpreter for it instead, so we can implement things like CAR efficiently: it is possible to implement CAR on an abstract function which represents a list by 'testing' it on a stack, however, this is a lot worse than just getting the left element of the first pair.. then, why not represent constants by constants instead of their wrapping functions? back to square one..
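the "compile at read time" idea above can be sketched in a few lines. this is illustrative python, not the CAT source; the dictionary entries are made-up stand-ins for real words.

```python
# each symbol becomes its function, each constant a function pushing it:
#   sym  -> (lookup sym)
#   data -> (lambda stack (cons data stack))
def compile_word(dictionary, atom):
    if isinstance(atom, str):
        return dictionary[atom]            # sym  -> (lookup sym)
    return lambda stack: [atom] + stack    # data -> push constant

def parse(dictionary, program):
    return [compile_word(dictionary, a) for a in program]

def run(code, stack):
    for f in code:
        stack = f(stack)
    return stack

DICT = {
    "+":   lambda s: [s[0] + s[1]] + s[2:],
    "dup": lambda s: [s[0]] + s,
}

print(run(parse(DICT, [1, 2, "+", "dup"]), []))  # -> [3, 3]
```

note that after parse, the program is a plain list of opaque functions, which is exactly the "constants wrapped in functions" problem discussed above: the data atom 1 can only be recovered by applying its wrapper to a stack and popping.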
Entry: jit compiler + parser
Date: Fri Feb 2 15:24:47 GMT 2007

if i'm absolutely sure that function names are static, it's possible to use a jit compiler without sacrificing this semantic property: leave them as symbols until they are encountered, then compile them. this would also eliminate the problem of forward/backward declarations etc. this seems to work very well in the simple first experiments..

the parser seems to work too: parse code = list of things. if one of the things is a list, parse it and wrap it in a lambda. so what about closures?

Entry: rewriting
Date: Sat Feb 3 10:52:54 GMT 2007

i'm working on the rewriting, and it looks like this is ideal to use a compositional mini language for.. so i've quickly extended the 'run' function to take a 'compiler' argument which will resolve symbols to functionality, still using the jit compiler. the 'compiler' term could be used to give context (lexical) dependent information about symbols. however, it should really be tied to the stored code then..

the idea is to represent pattern matchers as ordinary composed code, but using a special compiler (macro) to instantiate them. so this gives the list of problems for today:

- solve lexical compilation issues
- how to 'execute' from within a primitive

lol. this is again exactly the same thing as i already had: each forth word is a macro :) so the problem reduces to the lexical thing (namespaces) and how to compile a generic pattern matcher into a macro.

Entry: do i really need lambda?
Date: Sat Feb 3 11:24:04 GMT 2007

what i need is local names, just for the sake of code organization and different sublanguages. i don't really need lambda. i don't need runtime binding of symbols to names. the whole idea of combinatory/compositional/concatenative languages is to eliminate variable names...
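the jit idea above (leave names as symbols until they are encountered, then compile in place) can be sketched like this. illustrative python; the resolve function and word set are made up.

```python
# code is a mutable list whose cells start out as symbols and are
# overwritten with their compiled function on first execution.
def run(code, stack, resolve):
    for i, cell in enumerate(code):
        if isinstance(cell, str):          # still a symbol: compile in place
            code[i] = cell = resolve(cell)
        stack = cell(stack)
    return stack

compiled = []
def resolve(name):
    compiled.append(name)                  # record when compilation happens
    return {"dup": lambda s: [s[0]] + s,
            "+":   lambda s: [s[0] + s[1]] + s[2:]}[name]

prog = [lambda s: [21] + s, "dup", "+"]
print(run(prog, [], resolve))              # -> [42]
print(run(prog, [], resolve))              # second run: no recompilation
print(compiled)                            # -> ['dup', '+']
```

because resolution happens only at first execution, forward references cost nothing: a word can be mentioned before it is defined, as long as it exists by the time the mention is run. it also shows the flip side discussed in the 'dynamic code' entry: once a cell is compiled in place, the symbolic form is gone.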
Entry: macro semantics
Date: Sat Feb 3 12:37:36 GMT 2007

i have something like this now:

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup
  (lambda (asm . stack)
    (pack (pack `(movwf POSTINC) asm) stack)))

which can be executed as

> (run (parse macros '(dup dup)) '(()))
(((movwf POSTINC) (movwf POSTINC)))
>

so in this, compilation is the execution of one program to produce another program. let's stay in the forth syntax as long as possible, and rewrite this to:

(define-syntax forth
  (syntax-rules ()
    ((_ output () (rwords ...))
     (pack rwords ... output))
    ((_ output (word words ...) (rwords ...))
     (forth output (words ...) ('word rwords ...)))
    ((_ output (words ...))
     (forth output (words ...) ()))))

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup
  (lambda (out . stack)
    (pack (forth out (POSTINC1 movwf)) stack)))

> (run (parse macros '(dup dup)) '(()))
((movwf POSTINC1 movwf POSTINC1))
>

Entry: dynamic code
Date: Sat Feb 3 12:45:23 GMT 2007

note that once something is run once, it will be compiled in place and can never be accessed as data again. it is important to make it impossible to do things like

(1 2 3 + +) dup run

and then use the 2nd copy. this is easily solved by first creating a copy though, but that sort of defeats the current way the JIT compiler works. maybe i can make sure that the 'run' word, which is the interface to the internals, always makes a copy of a list whenever it encounters dynamic code? to summarize:

- parsed lists are safe. they are always pure code and can never be interpreted as pure data.
- anything that's 'run' at run time is not, so here a copy needs to be made.

the solution here is that 'parse' is necessary to run symbolic code:

(1 2 3 + +) parse run

and 'parse' already makes a copy of the list, since it is functional.
NOTE: (define (list-copy x) (append x '()))

so i defined interpret <- (parse run)

Entry: bind and stuff
Date: Sat Feb 3 13:17:03 GMT 2007

i think it's starting to dawn on me.. the disadvantage of functions like the above is that state accumulation is explicit: there is this chunk of accumulated state on the top of the stack while none of the functions actually need it to be there. enter the 'bind' concept. in order to get rid of these arguments, you define a macro according to ( -- code ), but automatically lift it to ( code -- code ).

ok.. this seems to work. what i have now is a way to generate simple substitution macros.

Entry: rewrite macros
Date: Sat Feb 3 15:54:08 GMT 2007

the next step is rewrite macros. this should be done in two steps. in order to make a single 'intelligent' macro, different patterns need to be combined into one function, and one function needs to have information about different patterns. a sort of 'transpose'.

- make a list of rewrite patterns
- compile it into code

rewrite macros are easier understood as operating on output forth code. i don't know. ok.. time to be stupid then. state the previous solution, then abstract it. the previous solution was explicit:

(dup drop) -> ()

drop is a function that discards anything that comes before it and produces a value (without other side effects: it's important to write macros so that the last operation is a mutation). so drop needs to be intelligent. ok. it's easy enough to implement this in exactly the same way as in CAT/BADNOP. however, there should be a more highlevel construct that eliminates the explicit if-then things.

Entry: compositional languages suck
Date: Sat Feb 3 17:07:00 GMT 2007

it's a feast to use them to glue things together, but more complicated things are easier expressed using lambdas.. i think the approach of writing the core algorithms in scheme with full fire power, and keeping the language itself mainly for interaction, is a valid one.
compositional languages are cool because they lift you from the burden of having to name things, and allow you to think in terms of structure (more geometrically) vs. random connections in parametrized things.

Entry: more lifting
Date: Sat Feb 3 17:44:56 GMT 2007

i ran into a new class of functions. i already had

. -> code

which are just constants. now i have

code -> code

which are code transformers that need to look at the current generated code state (never the source code!)

i added the default resolver for macros to be a quote to the forth output stack. i do need to change the way other types are handled though. it looks like this is better solved earlier in the process. to keep the JIT compiler like it is, the parser could be adapted to already compile constants to quoting procedures. this works nicely.

Entry: now for the meta stuff
Date: Sat Feb 3 21:16:12 GMT 2007

some questions remain. how to generate more boilerplate for some kinds of peephole optimizations, and how to check if it is actually possible to optimize towards a 'core forth' that can be straight compiled to assembly. let's find out by systematically porting some macros. the main question is: what about arguments?

123 ldl

means load literal in the top of stack register.

Entry: cat snarf
Date: Sun Feb 4 00:16:56 GMT 2007

porting stuff from old cat to new cat. seems to work really well. not having state on the stack to deal with makes things a lot easier.. but for badnop this means the database needs to be designed in a proper way. maybe for the assembler we use some kind of dictionary as a state? the thing is.. i'd like to keep as much of the 'functional OO' that was present in CAT. this makes it possible to do parallel stuff and backtracking in a very easy way, especially now that it's kind of fast.
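the two classes of macros above (( -- code ) constants vs. ( code -- code ) transformers) and the 'bind' lifting from the earlier entry can be sketched together. illustrative python with made-up instruction names:

```python
# lift turns a macro written as ( -- code ) into ( code -- code ), so the
# accumulated output never appears in the macro body.
def lift(macro):
    """turn a producer of code into a transformer of the code state."""
    def lifted(code):
        return code + macro()
    return lifted

# a plain substitution macro: knows nothing about accumulated state.
dup = lift(lambda: [("movwf", "POSTINC0")])

# a code -> code transformer: may inspect the generated code (never source).
def drop(code):
    if code and code[-1] == ("movwf", "POSTINC0"):
        return code[:-1]                   # drop undoes dup
    return code + [("movf", "POSTDEC0")]

out = []
out = dup(out)
out = drop(out)
print(out)  # -> []
```

the point of the lifting is that simple macros stay simple: only the few transformers that genuinely need to look at the generated code (like drop undoing dup) take the code state explicitly.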
Entry: intermediate language
Date: Sun Feb 4 10:51:20 GMT 2007

since i am optimizing for a register machine, it might be best to write the rewriter in terms of the register machine primitives i used in BADNOP. the main thing to decide is: is it easier to optimize code like this:

1 2 +

or this:

(dup) (lda 1) (dup) (lda 2) (call +)

it's definitely easier to do the former.. maybe i should implement the assembler now, so i can see this a bit clearer. the problem i'm trying to solve is: rewrite forth in such a way that assembly becomes trivial. things which make this problematic are folded constants: constants that are already bound to a machine operation as a literal. maybe i should just write them as 'pseudo forth code' but group them, like this:

1 2 + -> dup (drop 1) dup (drop 2)

here every grouped instruction is meant to be replaced later by one machine opcode. the advantage of this approach is that there are no 'self-quoting' things in the code after a first pass. considering the targets i'm using don't have an instruction to do dup+ldl in one go, i guess this idiomatic approach is a valid one. it is probably better to do this in more phases:

1. forth based semantic substitution (rewrite)
2. conversion to idiomatic representation (compile)
3. direct mapping from idiomatic cells to assembly code (assemble)

because 3 can be made invertible, it's possible to easily decompile, flatten and semantically optimize back!

Entry: pattern matching
Date: Sun Feb 4 13:41:49 GMT 2007

having a look at the plt pattern matching code. i really need this kind of stuff :) the basic thing is

(match x (pat expr) ...)

when x matches one of the pat, the corresponding expr is evaluated with symbols of pat bound to values in x. ok.. seems to work pretty well. but i still need to find out how to reverse a pattern.
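phases 2 and 3 of the pipeline above can be sketched as follows. illustrative python; the mnemonics and table are made up, not the real PIC18 mapping:

```python
# idiomize groups each literal with the dup that makes room for it, so
# every resulting cell maps 1-1 to one machine opcode.
def idiomize(forth):
    out = []
    for w in forth:
        if isinstance(w, int):
            out += ["dup", ("drop", w)]    # 1 -> dup (drop 1)
        else:
            out.append(w)
    return out

# direct (and hence invertible) mapping from idiomatic cells to assembly.
CELL_TO_ASM = {
    "dup": ("movwf", "POSTINC0"),
    "+":   ("addwf", "POSTDEC0"),
}

def assemble(cells):
    return [("movlw", c[1]) if isinstance(c, tuple) and c[0] == "drop"
            else CELL_TO_ASM[c]
            for c in cells]

print(idiomize([1, 2, "+"]))
# -> ['dup', ('drop', 1), 'dup', ('drop', 2), '+']
print(assemble(idiomize([1, 2, "+"])))
```

since assemble is a per-cell table lookup, inverting the table gives a decompiler for free, which is the "decompile, flatten and semantically optimize back" remark above.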
Entry: next
Date: Sun Feb 4 16:06:45 GMT 2007

* find out how to reverse patterns
* lift rewriters above 'rest' => compile patterns
* assembler
* state

Entry: compiling patterns
Date: Sun Feb 4 16:09:43 GMT 2007

i need to make my own pattern language to compile substitutors from a more highlevel definition like

((dup drop) ())
((a b +)    (,(+ a b)))
((a b xor)  (,(bitwise-xor a b)))
((a not)    (,(bitwise-xor a -1)))
((a negate) (,(* a -1)))
((dup dup (drop a) !) ((,a sta)))
((dup (drop a) !)     ((,a sta) drop))

using syntax-case. hence the latter 2 expressions will be merged into one '!' rewriter macro. as a preparation i can already try to see if all macros fit in this category. yes, they do. but i need to solve the problem of type matching first, since the arithmetic above only works if the numbers are immediate. type matching is part of the match.ss language, but i didn't figure out yet how to also bind a matched item to a name..

i think i solved the rewriter problem for the pattern language above. just need to sort out some macro issues, probably best to use syntax-case with some explicit rewriting.

Entry: syntax transformation
Date: Sun Feb 4 22:48:32 GMT 2007

1. one pattern i've been trying to solve is this

(definitions
  (some (special) structure here)
  (same (special) structure there))

it's easy enough to write the first transformation, but how do you do the next one without having to explicitly recurse using names etc.. ? in other words: "now just accept more of the same". this seems to be the answer:

(define-syntax shift-to
  (syntax-rules ()
    ((shift-to (from0 from ...) (to0 to ...))
     (let ((tmp from0))
       (set! to from) ...
       (set! to0 tmp)))))

one ellipsis in the pattern for every ellipsis on the same level. or something like that.. need to explain better.

2. what's the real significance of the argument of syntax-rules? "Identifiers that appear in <literals> are interpreted as literal identifiers to be matched against corresponding subforms of the input."

3.
how to get plain and ordinary s-exp macro transformers using define-syntax? i was thinking about something like this:

(define-syntax nofuss
  (syntax-rules ()
    ((_ (pat ...) expr)
     (lambda (stx)
       (match (syntax-object->datum stx)
         ((_ pat ...)
          (datum->syntax-object stx expr)))))))

(define-syntax snarf-lambda
  (nofuss (args fn)
    `(lambda (,@(reverse args) . stack)
       (cons (,fn ,@args) stack))))

but that doesn't work, since nofuss is not defined at expansion time.. but it should work. there has to be some way of doing this.

Entry: pattern language
Date: Mon Feb 5 11:30:59 GMT 2007

seems to work. some problems remaining though:

- default clause
- literal/parameter
- fix 'rest'
- type specific match

done

Entry: remaining problems
Date: Mon Feb 5 21:26:33 GMT 2007

two big problems remaining: the assembler and state storage.

the assembler is a bit nasty. lots of tiny rules to obey.. i wonder if i can make something coherent out of this. the state store is tightly coupled to the assembler. here i can probably do another trick of accumulating the dictionaries using some binding functions.

what are the tasks of the assembler?

-> creating instruction codes from symbolic rep
-> resolving all names to addresses (2 pass?)
-> making sure all jumps have the correct size

the result of assembly is a vector, and some updated symbol tables. the input is optimized and idiomized forth code that can be straight translated. it would be nice to use nondeterministic programming for choosing jump sizes, but that's probably overkill: moving code down is probably easier.

some observations:

* if i only shrink jumps instead of expanding them, the effects are always local: no bounds are violated, but the solution might be sub-optimal. for backward jumps, the correct size is known, only forward jumps need to be allocated.
* a proper topological representation which indicates where jumps go and where they come from is a good thing to have. a single cell has:
  - list of cells that go from here: max 2
  - list of cells that come here
  - instruction + arguments

* maybe that's overkill, since it can always be generated if analysis is necessary.. what about this:
  - assume incremental code: old code is not going to be jumping to new code.
  - within a code block under compilation, forward jumps are possible, so they need to be allocated: use maximum size. however, they should be rare: function definitions could be sorted beforehand?
  - recursively work down from the current compilation point, and adjust all jumps. backtrack if necessary. this can be done in a list

* yep.. it's probably simplest to just perform the 2 ordinary steps of backward then forward address resolution, and add as many passes as necessary to resolve the shrinking.

* there are 2 x 2 x 2 types of jumps wrt. a single cell collapse:

  abs/rel x start:before/after x finish:before/after

  -> relative : adjust if they cross the border
  -> absolute : adjust if they end past the border

Entry: tired
Date: Mon Feb 5 21:44:26 GMT 2007

probably all a bit over my head atm. feeling a bit sleepy. maybe do a bit of cleaning. like writing fold in terms of match instead of if etc..

(define (fold fn init lst)
  (match lst
    (()       init)
    ((x . xs) (fn x (fold fn init xs)))))

(fold + 0 '(1 2 3))

good coffee :) i'm feeling a bit ambitious.. instead of writing a flashforth based standard forth, it might be feasible to try to write a functional programming language for the micro.

Entry: ((dup cons) dup cons)
Date: Tue Feb 6 03:43:13 GMT 2007

just added 'compose', then i found out the quine in the title doesn't work any more. it does work in joy, so what's up? the problem here of course is that consing 2 quoted programs does not give a quoted program: these are abstract data types and not lists..
Manfred must be using some explicit definition of cons on quoted programs somewhere, or i don't really understand his list semantics. the problem with mine is that quoted programs are in fact lists, but they have a header containing a link to the symbol resolver. so, am i missing something about Joy? is the "quoted program is list" necessary? have to check that. in the mean time, i can get to quines by defining 'qons'.

there is a possibility of embedding lex information inside the list, so instead of

(lex a b c) -> ((lex a) (lex b) (lex c))

which might even be better, since it allows for mixing of dicts. this also makes it possible to use a simpler interpreter, since the lex state doesn't need to be maintained. hmm.. tried, but too tired.

but something is rather important. having lists and programs on the same level as in joy is nice, but requires a single semantics. since i already almost automatically introduced 3 kinds of semantics for symbolic code (cat, forth rewriter and forth compiler) this is not really feasible. so lists and programs should probably be separate entities, where programs are abstract and not dissectable, but compose and qons are defined to do run-time composition.

  list     program
  concat   compose (qoncat)
  cons     qons

where qons will take 2 quoted programs, and concatenate the quotation of the first with the second, or

(a) (b) -> ((a) b)

however, if i change the interpreter as mentioned above, the two columns will be identical, and programs can be manipulated just as lists, without giving up any other functionality.

so, good to learn. CAT is not really Joy because

- i'm using fully quoted lists instead of quoted programs to represent data lists: there is a clear separation between code and data.
- joy probably uses numbers directly in the lists, which i can't due to different number semantics for cat and purrr. for me, they have to be encapsulated in numbers.
- parsing from symbolic -> code is explicit because of different semantics: this allows reuse of the interpreter for different mini languages.

Entry: interpreter cleanup
Date: Tue Feb 6 11:33:15 GMT 2007

instead of using set-car!, it might be better to use delayed evaluation for the instruction type: everything evaluates to a procedure. that way it stays functional.

ok. this seems to work: () and nil are the same now. plus there is a structural equivalence: since compilation from symbol -> procedure/promise is 1-1, the size of a compiled program (list size) is equal to the original code list size.

ok.. now ((dup cons) dup cons) still doesn't work!! the reason is that nested lists in code get executed.. is this still valid? something smells here.. let's first change the other parser/compilers.. ok, if i swap around the quoting such that new definitions are always wrapped in a closure that will call 'run', i should be safe. this works, but i run into a difficulty: i cannot unquote a program wrapped in a closure

(interpret quoted stack)

the solution to this is to change back the semantics of 'parse' -> it will return a quoted program instead of an executable one, and put the unquoting in 'def'. i had to change this:

(define-word run (code . stack) (run code stack))

to this

(define-word i       (code . stack) (run-quoted code stack))
(define-word execute (code . stack) (run-unquoted code stack))

to only take quoted programs, not pure closures. 'execute' is only there for completeness, since it is rarely needed (it is equivalent to (nil cons i)). maybe that's again one of the key points? meaning, to make a very clear distinction between quoted programs = aggregation of primitives, and primitives.

ok.. this looks like it's working. this makes the language a bit more introspective. it looks like the quine works too now. this was quite surprisingly non-trivial! so what do i learn?
-------------------------------------------------------------------
it is necessary to explicitly distinguish QUOTED PROGRAMS from
PRIMITIVES. the latter is a black box, but the former is a list of
primitives. this structure is NOT recursive!
-------------------------------------------------------------------

having quoted programs obey the list interface adds very flexible introspection. this probably means that the difference from Joy is purely syntactic now.

well, well.. ((dup cons) dup cons) broke again... ok. the reason is that after constructing a program from another program, you need to 'compile' it before it can be run with 'i', so the quine is relative to 'interpret' and not to 'i'. i'm going to switch back to my previous notation and use 'run' instead of 'i'. the conclusion here is:

--------------------------------------------
{ QUOTED programs } is a subset of { LISTS }
--------------------------------------------

i think this is not the case in Joy. whenever you operate on a quoted program using list construction, you need to 'compile' it to a program again. this is a projection from the set of lists to the subset of quoted programs. so 'compile' really needs to be a projection, meaning (compile compile) and (compile) are equivalent.

it is possible to change this simply by having run-unquoted cons everything to the stack that's not executable. however, keeping this explicit allows more wiggle room in the semantics of different sublanguages. or maybe better: it is cleaner, since there is no 'default' behaviour (the way the 'cond' is set up in run-unquoted allows overriding.. that's not so clean).

so, final word. the interpreter implements:

* interpretation of a list of primitives as NOP,TC,RC
* lazy evaluation of primitives (JIT: delayed compilation)

all the rest needs to be implemented in the source transformers.

NOTE: [dup cons] is the Y combinator, sort of..

Entry: assembler
Date: Tue Feb 6 16:48:07 GMT 2007

alright. the assembler.
finding instructions is the trivial part. the hard part is finding addresses of jumps.

1. resolve backward references (+ find instructions)
2. resolve forward references

if there are multiple size jump instructions, and there are relative and absolute jumps, extra passes can be added that resolve efficient allocation of these. allocating all forward references with maximum range, and adjusting them one by one seems to be the best approach:

3-N. shrink forward references if necessary.

things get complicated if forward small offset relative jumps are used in compilation, since constructs to work around this are necessary. i need to find a way to abstract this kind of behaviour. basically mapping

A  O-O-O-O-O-O
     \_______/

to

      ___
     /   \
B  O-O-O-O-O-O
     \_____/

or the other way around. for PIC18 going from B to A reduces the jump distance by 2, since the long jump is eliminated. it can probably be just kept at 1 & 2 for now, with all jumps equal size, and fix that later.

Entry: bookkeeping
Date: Wed Feb 7 09:49:20 GMT 2007

the other problem is the bookkeeping. for the assembler i basically need a symbol table, which means the dictionary object from cat can be reused, and some binding operations need to be devised. there are things to separate:

- labels (write accumulate, read random)
- 'here' (read/write random)
- asm output (write only)

it is probably easier to do most of this in scheme, together with some syntax. let's see. maybe best to do everything in 3 steps:

1. assembly to polish notation, keeping symbol names
2. forward symbol resolve
3. backward symbol resolve

the last 2 are stateful, the first one is just pattern matching.

Entry: lifting problem
Date: Wed Feb 7 10:25:31 GMT 2007

how to call a generic prototype function within the body of a to-be lifted prototype? this is still one of the bigger problems i had when writing old cat: cannot execute from stack macros!

Entry: screen scraping
Date: Wed Feb 7 12:41:41 GMT 2007

ok, that was fun.
using emacs macros to convert text pasted from the pdf datasheet into lisp code :) but it doesn't work very well. i think i should just get the data from gpasm, hoping it's a bit more structured. (in the end i just typed it in.)

Entry: great success!
Date: Wed Feb 7 14:51:36 GMT 2007

writing the assembler, and i'm realizing something. scheme is really cool :) but i'm not sure if scheme is the core of what i'm finding cool. i think it's pattern matching. since a compiler is mainly an expression rewriter, this comes as no surprise in hindsight. the biggest mistake in the previous brood system was to attack the problem without pattern matching constructs. brood's approach (and the previous badnop) is really too low-level. for expression rewriting, lexical bindings are a must. since the permutations involved are mostly nontrivial, performing them with combinators instead of random access parameters is a royal pain, and the resulting code is completely unreadable.

i think this can be distilled into yet another "why forth?" answer, but in the negative. if the task you are programming involves the encoding of a very tangled data structure, then a combinator language is a bad idea, since you have to factor the permutation manually. so it's about this: forth is bad at encoding fairly random or ad-hoc permutation patterns like you would find in a language compiler/translator. and, don't forget: match & fold are your friends!

Entry: assembler working
Date: Wed Feb 7 21:14:25 GMT 2007

at least the part that's not doing symbol resolution. now for the interesting part: the assembler has some state associated with it:

- dictionary
- current assembly point

which has to be dragged along. i was wondering how hard it might be to solve this with some closure tricks in scheme..

Entry: lambda again
Date: Thu Feb 8 09:47:23 GMT 2007

trying to get my head around this lambda thingy.. there are a couple of problems, the most important one being the decision of whether lambda should be a form or a function.
* form: everything is compiled at compile time. this means lambda has to be a parser macro, and the only way to do that consistently is to have it be a prefix macro. this would compromise the semantic simplicity of the language by introducing syntax.

* function: lambda does runtime compilation, in which case the lexical environment has to be bound to the compiled representation of the lambda call. it also introduces runtime compilation. speed-wise this is no problem, since all dictionary lookups are postponed till later anyway, but conceptually it is different again.

maybe the latter is the lesser of the 2 evils. 'lambda' still needs to be a parser exception since it needs to capture the parse environment. so lambda is really delayed parsing. maybe that makes more sense. ok, following this:

- the argument needs to be symbolic, not a quoted program. (raw source)
- nested lambdas will work
- the run time part is called 'apply'

now, what does a compiled lambda expression look like?

'(A B C) '(foo B bar) lambda -> (bind-C bind-B bind-A foo B bar)

that's the easy part. now, where is the storage? clearly, storage is a runtime thing, so we can change the code to:

(alloc bind-C bind-B bind-A foo B bar)

now 'B', for which code is generated at compile time, needs to know where to find this storage. what about just putting it on the top of the stack, and modifying all code that's not accessing the parameters to ignore the bindings? some problems here with passing the lexical state to subprograms.. wait: this is always done by 'parse'. it's ok to think about lexical scope as dynamic scope of the parser. but... passing stuff on the data stack is kind of dangerous, since all subforms which have lambdas will do the same, so how do the inner forms find the values of the outer variables? the only real solution is probably to have the interpreter pass around an environment pointer.. maybe that's a good point to just stop, and leave out lambda entirely.
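to make the scoping problem concrete, here is a tiny Python sketch (not BROOD code; all names are made up) of the environment-pointer solution: each lambda gets a frame that chains to the enclosing one, so inner forms can still find outer variables, which is exactly what stack-passed bindings cannot do once lambdas nest.

```python
# Hypothetical sketch: chained environment frames for nested lexical scopes.

class Env:
    """A frame of bindings with a pointer to the enclosing frame."""
    def __init__(self, names, values, parent=None):
        self.bindings = dict(zip(names, values))
        self.parent = parent

    def lookup(self, name):
        # walk outward through enclosing frames until the name is found
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise KeyError(name)

# outer lambda binds x; inner lambda binds y but can still see x
outer = Env(["x"], [1])
inner = Env(["y"], [2], parent=outer)
assert inner.lookup("y") == 2
assert inner.lookup("x") == 1   # found by following the parent pointer
```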
Entry: monads
Date: Thu Feb 8 19:18:49 GMT 2007

i guess it's safe to say that 'bind' really is 'lift' as i defined it: take a function that maps values outside into the monad, and turn it into a function that can be composed.

Entry: lambda again
Date: Fri Feb 9 10:37:59 GMT 2007

let's see.. what does lambda do? actually two things:

* functions as values (delayed evaluation)
* locally (lexically) defined names

i already have the first one as quoted programs. so the problem i should be solving is not the lambda problem comprising both subproblems, but only the latter subproblem: lexical variable binding. this is forth's "locals".

some more ideas: write the interpreter in Y-combinator form (CPS?). this would allow the interception of invocations, basically allowing any kind of binding of the state that's passed around. maybe this is the interesting problem for today?

btw. i ordered friedman's "Essentials of Programming Languages", first edition. got it very cheap on amazon. Now reading "The Role of the Study of Programming Languages in the Education of a Programmer." Done. Gives me a bit of good faith that i'm on the right track. I just need to study and experiment more.. and learn to smell ad-hoc solutions. One of the things the paper mentions is that it is a good thing to learn to implement your own abstractions / language extensions / ... and to invest some time into learning the general abstract ideas behind language patterns, mainly (automatic) correctness preserving transformations.

It looks like the approach Friedman suggests is kind of radical. I've been doing this from a Forth and Lisp perspective for quite a while now, but it looks like i am getting stuck in certain simple paradigms. Rewriting BROOD kicked me out of that and made me think about better approaches, adopting pattern matching, a static language and lazy compilation. The idea with PF as one of the BROOD targets is probably a good one. It's going to be a hell of a problem to tackle though.
things to try:

- convert the dynamically bound code in BADNOP to something i can run on the new core.ss : this approach seems like a nice one and i can't really say why.. there's the idea that dynamic binding is bad, but it's quite handy from time to time (i use it in PF C code all over the place). why is this? and what should be the proper construct?

- see what CPS can bring. for one, it should make control structures a lot easier to implement. so THAT is what i was looking for. obvious in hindsight. but how to do this practically?

Entry: re re re
Date: Fri Feb 9 16:20:40 GMT 2007

so next actions.

1. is scoping important / feasible / desirable?
2. should i solve the assembler purely monadically?

one great advantage of NOT using static (or dynamic) scoping is the independence of context. it does make a whole lot of sense to actually just write the components as simple functions, and combine them later. what i have already is the core of the assembler: simple n-argument functions generated from an instruction set table. these functions return a list of opcodes generated from the instruction. currently this is executed as:

(define (assemble lst)
  (map
   (match-lambda
    ;; delay assembly
    (('delay . rest) rest)
    ;; assemble
    (((and opcode (= symbol? #t)) . arguments)
     (apply (find-asm opcode) arguments))
    ;; already assembled
    (((and n (= number? #t)) . rest) `(,n ,@rest))
    ;; error
    (other (raise `(invalid-instruction ,other))))
   lst))

instead of writing this as a map which is independent, i should write it as a for-each (an interpreter which accumulates state changes). ok, that was easy enough: the interpreter is split into 2 parts: one that does pure assemblers (independent of state), which are the ones generated from the instruction set table, and one that does impure ones.

now for the disassembler. it's probably easiest to organize this as a binary tree decoder. the argument decoding could be done working on the binary representation string.
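the pure/impure split can be sketched in Python (hypothetical names, nothing to do with the actual core.ss code): pure per-opcode assemblers are just table lookups returning opcode lists, and a stateful interpreter folds over the instruction stream, threading the assembly point and the label dictionary.

```python
# Illustrative sketch only: pure assemblers + a state-accumulating interpreter.

PURE_ASM = {
    # opcode -> function from arguments to a list of binary words (made up
    # except movlw, which really is 0x0E kk on PIC18)
    "movlw": lambda k: [0x0E00 | (k & 0xFF)],
    "nop":   lambda: [0x0000],
}

def assemble(instructions, origin=0):
    here = origin          # current assembly point (threaded state)
    labels = {}            # symbol dictionary (threaded state)
    output = []
    for ins in instructions:
        op, *args = ins
        if op == "label":              # impure: touches the dictionary
            labels[args[0]] = here
        else:                          # pure: just a table lookup
            words = PURE_ASM[op](*args)
            output.extend(words)
            here += len(words)
    return output, labels

code, labels = assemble([("label", "start"), ("movlw", 42), ("nop",)])
assert labels == {"start": 0}
assert code == [0x0E2A, 0x0000]
```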
Entry: values
Date: Fri Feb 9 20:23:22 GMT 2007

i never understood why 'values' would be useful. well, i think i understand now.. to compose 2 functions A and B

A (x y z) -> (x y z)
B (x y z) -> (x y z)

one would need to write (apply B (A 1 2 3)), with A returning a list. using values this becomes something like

;; values
(call-with-values
 (lambda ()
   (call-with-values
    (lambda () (values 1 2 3))
    (lambda (x y z) (values (+ x 1) (+ y 1) (+ z 1)))))
 (lambda (x y z) (values z y x)))

;; lists
(apply (lambda (x y z) (list z y x))
       (apply (lambda (x y z) (list (+ x 1) (+ y 1) (+ z 1)))
              (list 1 2 3)))

i'm not convinced about the values thing.. lists are easier for debug: they don't require a special call. i think what's easier to read is a straight composition, where every function passes a list to the next one, which is then appended to a list of arguments, like this:

(chain `(,ins ()) (dasm 4) (dasm 4))

maps to

(apply dasm (append '(4) (apply dasm (append '(4) `(,ins ())))))

(define-syntax chain
  (syntax-rules ()
    ((_ input (fn args ...))
     (apply fn (append (list args ...) input)))
    ((_ input (fn args ...) more ...)
     (chain (chain input (fn args ...)) more ...))))

(chain `(,257 ()) (dasm 4) (dasm 4))

ok. i got the disassembler body working. now i still need to do the search.. this binary tree search looks fancy but is it really necessary? might even be simpler actually. ok. i need some binary trees for that.. just made some code, but it's kind of clumsy: the tree is created on the fly if some nodes do not exist. less efficient, but easier, is probably to generate a full tree, and then just use set to pinch off a subtree somewhere. ok.. dasm seems to work. some minor issues with parsing multiple word instructions though.. will have to change the prototype.

so the next step is to move some code to runtime, and to unify the dasm and asm: basically they do the same thing: convert between bit strings and lists.
the real 'problem' is the permutation of the formal symbolic parameters into the order they occur in the bit string.

Entry: asm/dasm cleanup
Date: Sat Feb 10 09:34:48 GMT 2007

fix the multiple instruction problem: it's probably easier and cleaner to have one symbolic instruction correspond to exactly one binary word. all the targets i have in mind are risc-like. multiword instructions are then handled as multiword opcodes. once this is done, the asm and dasm pack/unpack could be combined into one single 'interpreter'. ok. maybe it's best to stop here. it's not 'perfectly clean' but i guess what's left of dirtiness can easily be cleaned up when i encounter another instruction set that's not compatible with this approach.

another thing i need to consider, or at least need a 'reason for ignorance' for, is: "why am i not generating pic assembly code?". the reasons are:

1. full control
2. have dasm available in core for debug
3. easier incremental assembly & linking

adding support for text .asm output is rather trivial.

ok... next: branches. the two passes, fairly simple.

1. backward branches can be immediately resolved.
2. forward branches need to be postponed.

this is a combination of the directives 'relative', 'absolute' and 'delay'.

Entry: PIC18 compiler
Date: Sat Feb 10 12:44:07 GMT 2007

time for the crown jewel :) but first, i need to clean up the core.ss register code to accept an abstract store with a default. ok, done. i don't like the way i've got the generic register compiler and the PIC18 compiler completely separated. it is good to share code, but in this case the sharing can probably be done better by just copy/pasting the patterns, or at least inserting them from a common include. what about keeping the register compiler as a general purpose example and figuring out how to do proper sharing once i have different architectures running? yep..
i think it's best to keep that idiomatic compiler for other experiments, and go straight for a proper pattern matching peephole optimizer.

Entry: more PIC18 compiler
Date: Sun Feb 11 09:10:13 GMT 2007

i think i made a mistake by writing it as just a pattern compiler.. this thing should be a proper language with recursion, otherwise i can't implement recursive substitution macros and other language patterns: one machine that maps forth straight to asm. the only preprocessing stage should be the reducer, which folds expressions like '1 2 +'. even better, this reducer should be part of the compiler too, so that expanded macros benefit directly from it. summarized: separate reduction and expansion phases might lead to suboptimal performance: it's probably best to condense all this into a single phase, and make an extensible pattern matcher. this would be the same design as before. there are more of these: the little interpreters for macro mode etc.. it was pretty good already, it seems. just the global variable thing was a mistake.

ok.. it probably pays to make the pattern matcher programmable. add a minilanguage there too. NEXT:

* control structures
* extend pattern language

the latter is not so trivial in the current implementation, since a nice thing to have would be a 1 -> many mapping. i could use a special 'splice' word for this, though maybe it's best to work around it. another thing i'm thinking: now that i'm no longer afraid of this pattern matching business, why don't i write my own? this would make it possible to do some of this at runtime, making it a bit more flexible for additions etc.. time to take a break.

ok.. what i have is 2 conflicting operations: a pattern replacement and a reverse. this needs to be sorted out properly: what exactly do i want the programmable part to do? ok.. it seems to work now. needs cleanup. i'm really curious about runtime though.. probably these are all written in terms of the syntax expander, and need to be syntax?
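the reducer that folds expressions like '1 2 +' can be sketched like this in Python (illustrative only; the real thing is a set of scheme pattern matching macros): each instruction emitted into the assembly buffer gets a chance to rewrite the tail of that buffer, so expanded macros benefit from the same folding automatically.

```python
# Rough sketch of the peephole/partial-evaluation idea: rewrite two literal
# pushes followed by '+' into a single literal push at compile time.

def emit(asm, ins):
    """Append an instruction, applying peephole rewrites on the buffer tail."""
    if ins == ("add",) and len(asm) >= 2 \
            and asm[-1][0] == "lit" and asm[-2][0] == "lit":
        a = asm.pop()[1]
        b = asm.pop()[1]
        asm.append(("lit", b + a))   # fold the addition at compile time
    else:
        asm.append(ins)
    return asm

asm = []
for ins in [("lit", 1), ("lit", 2), ("add",)]:
    emit(asm, ins)
assert asm == [("lit", 3)]
```

since the rewrite happens at emit time, macros that expand into `("lit", n)` sequences are reduced exactly like source literals, which is the point of merging the reducer into the compiler.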
Entry: merging dictionaries
Date: Sun Feb 11 14:10:36 GMT 2007

i'm trying to port the intelligent macros now. long standing problem.. should you merge macro and bottom language dictionaries, or keep them separate? i think the best way to go is to manually import or link what you need.

about variables and allocation: i think it's easier to just use variable names for this, and shadow them when they are changed. then after a compilation is done, the whole dictionary can be filtered. the other option is to use a functional store like before, which might be a good idea anyway. NEXT:

- functional store (it's cleaner, and might come in handy later)
- conditionals + optimization
- for loops + loop massage

Entry: stateful macros
Date: Mon Feb 12 01:25:53 GMT 2007

let's see if i actually learned something.. basically, i have two options now: to write all the macros as explicitly handling the asm buffer, or to have them spit out just a list of instructions. i don't think there is any code that has to look back at past asm state: all words that do that are written as pattern matching partial evaluators. so, let's write all control structures as producers, just like the other macros. so, i think i sort of disentangled the problem:

------------------------------------------------------------------
If there is a lot of state that has to be dragged along, split all
operations into classes that operate only on substates, or have a
simple, consistent way of operating on state, like concatenation.
Then, lift all these subclasses to a common interface that can be
composed.
------------------------------------------------------------------

The thing i'm using is really the Writer Monad.

Entry: monads
Date: Mon Feb 12 00:53:53 GMT 2007

about a year ago i made a decision to use a functional dynamic store to solve the problem of state, because i didn't understand the idea behind monads. this was a mistake, but i guess a necessary one. i probably wasn't ready for the ideas at that time.
now i think i sort of get it. monads (haskell style) are about dragging along state implicitly. the irony is, i implemented that! what i did was to have an implicit state object being dragged along as a top of stack element, invisible to some computations. this is the 'State Monad'. the mistake is: this is too general. it's better to use a smaller gun to solve the problem at hand on a more local scale, instead of basically using a state machine model (albeit one without destructive mutation). the smaller gun is mostly related to the 'Writer Monad': the operation that's made implicit is 'append'. i call this 'lift-stateful'. this, together with some other state dragging (if the data stack is not used, it can be dragged: some operations, like the pattern matching peephole optimizer, work on the produced code as a stack.)

the thing that's really interesting though is this: if you start to think about forth as a compositional language, then this whole monad thing is nothing more than a way to 'lift' words so they can be composed in linear code. basically: if the things you want to compose are operations A x B -> A x B, but what you have is operations like

A -> A
B -> B
A -> B
A -> A x B
B -> A x B
A x B -> A
...

together with a higher order function (hof) that will correctly lift them to A x B -> A x B, then what you're doing is abstracting away the trivial parts of such a map in this hof. for the writer monad, the trivial part is 'append'. replace 'trivial work' with 'hard work' and you get this:

http://lambda-the-ultimate.org/node/1276#comment-14113

"By using a monad, a simplified interface to the necessary functionality can be provided, while the hard work of maintaining and passing the context is handled behind the scenes."

so, what i need to do is to work out some abstractions so i can perform this kind of magic in straight cat without having to resort to scheme code.

Entry: backtracking
Date: Mon Feb 12 08:35:51 GMT 2007

in 2.x there are the for ..
next macros that perform an optimization for which a decision has to be made early on. does it make sense to use 'amb' for this? probably yes, because explicit undo is going to be more expensive than just going back to a previous point and re-running the compilation.. the tricky part is to keep it under control :) in an interactive interpreter, where state can be accumulated on the stack, having lingering continuations in the backtracking stack might be dangerous, since 'fail' effectively erases all changes made since the last success. i've provided 2 low-level words:

kill-amb!   reset the backtracking engine
amb         make a nondeterministic choice from a list

the code in amb.ss supports (possibly infinite) lazy lists in case i ever need them. so, let's make 'amb' binary. this way it's easier to implement lazy amb by embedding another call to amb in one (or both) of the alternatives. yep. this looks like a better idea.

haha. keep it under control! i've just been chasing a 'bug' where amb apparently didn't return properly. however, it was just waiting for input: the continuation had a 'read' in it, and the fail depended on a previous read, so it just wanted that read again. so, conclusion:

-------------------------------------------
be careful with amb and non-functional code
-------------------------------------------

i fixed the 'cpa' "compile print assembler" loop to read lines instead of words, so at least the backtracking is ok on a per-line basis.

Entry: commutation
Date: Mon Feb 12 16:44:30 GMT 2007

there are a lot of places where just swapping the order of instructions might be beneficial. i ran into a bug where it is not possible, although at first sight the operations seem independent:

((['movlw f] 1-!) `([decf ,f 0 1] [drop]))

because 'decf' has an effect on the flag that's used in the macro for 'next', this is not always correct! drop, being movf, sets the Z,N flags. however, decf sets the carry flag, so this could still be used. for now, i've disabled the optimization..
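the "re-run instead of explicit undo" idea behind amb can be illustrated with a Python sketch of binary amb (hypothetical code, nothing like the continuation-based internals of amb.ss): it re-runs the whole computation against a recorded list of choices, flipping the most recent un-flipped choice on each fail. re-running also makes the warning above obvious: any side effect in the computation is repeated on every retry.

```python
# Illustrative binary 'amb' by re-execution (not continuations).

class Fail(Exception):
    """Raised to reject the current combination of choices."""

def solve(program):
    choices = []                    # False = first alternative, True = second

    def run():
        i = [0]
        def amb(a, b):
            if i[0] == len(choices):
                choices.append(False)        # default to the first alternative
            picked = b if choices[i[0]] else a
            i[0] += 1
            return picked
        try:
            return program(amb)
        finally:
            del choices[i[0]:]               # drop stale choices of deeper runs

    while True:
        try:
            return run()
        except Fail:
            while choices and choices[-1]:   # exhausted: backtrack further
                choices.pop()
            if not choices:
                raise                        # no alternatives left at all
            choices[-1] = True               # retry with the second alternative

# toy program: succeeds only when both second alternatives are chosen
def program(amb):
    x = amb(1, 2)
    y = amb(10, 20)
    if (x, y) != (2, 20):
        raise Fail()
    return (x, y)

assert solve(program) == (2, 20)
```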
Entry: next actions
Date: Mon Feb 12 16:52:47 GMT 2007

- conditions
- variables
- constants in assembler

a variable allocation is just a dictionary operation, so it really should be an assembler step. i need to think about that a bit. something's wrong...

Entry: bored
Date: Mon Feb 12 23:18:29 GMT 2007

let's play a bit. generators.. a generator is easiest understood as something which, when activated, returns a generator and a value. in other words: a generator is a lazy list.

(((3) 2) 1) is a finite generator

manfred von thun has an interesting page about using reproducing programs as generators:
http://www.latrobe.edu.au/philosophy/phimvt/joy/jp-reprod.html

i wonder how to do this in lisp? suppose fn is a state update function: (fn init) -> generator

(define (gen fn init)
  (lambda ()
    (cons init (gen fn (fn init)))))

in cat it's quite simple too:

(gen (2dup run swap gen) qons qons)  ;; (init fn -- gen)

as mentioned by manfred, this is related to the Y-combinator:
http://www.latrobe.edu.au/philosophy/phimvt/joy/j05cmp.html

basically, a generator or lazy list is a delayed recursion. so in cat, applying 'run' to a lazy list has the same result as applying 'uncons' to a list.

Entry: misc ramblings
Date: Tue Feb 13 12:06:14 GMT 2007

i'm going to change terminology a bit so it's more Joy-like, if only for the reason that it makes joy code easier to read.

duck -> dip

http://www.nsl.com/papers/interview.htm

There is a ... combinator for binary (tree-) recursion that makes quicksort a one-liner:

[small] [] [uncons [>] split] [swapd cons concat] binrec

then for-each: i need to find the more abstract pattern, which is 'fold'. what about a fold over a lazy list?

Entry: lazy lists
Date: Tue Feb 13 12:22:52 GMT 2007

right now i use them in (amb-lazy value thunk), where 'value' is returned immediately, and thunk will be evaluated later. the question remaining is that of interfaces. if i say "a lazy list", do i mean thunk or (val . thunk)?
(there is another question about using 'force' and 'delay' instead of explicit thunks. for functional programs there is no difference, but for imperative programs there is. maybe stick to thunks because they are more general.)

i think 'amb-lazy' should be seen as a 'cons' which contains only forcing, and leaves the delay operation to the user. i provide 'amb' to construct a full list from this. unrolled it gives:

(amb-lazy first
          (lambda ()
            (amb-lazy second
                      (lambda ()
                        (amb-lazy third
                                  (lambda () (fail)))))))

for generic lazy lists: maybe using 'force' and 'delay' is better, since it allows 'car' and 'cdr' to trigger the evaluation. this enables the definition of lazy-car and lazy-cdr without fear of multiple evaluations that have different results, and it still allows for non-functional lists.

ok.. cleaned it up a bit, and moved most of it to lazy.ss. lazy operations have a '@' prepended to the name of the associated strict operations. i have @map, but @fold doesn't make sense since it has a dependency in the wrong direction. i should also change ifold to something else.. there has to be a proper lisp name for it. i renamed it to 'accumulate'. makes more sense:

(accumulate + 0 '(1 2 3))

the corresponding lazy processor makes sense, but only if it returns the accumulator on each call. so it's more like 'accumulate-step'. it's better to just create the @integral constructor, which gives a new list from an old one.

Entry: had this idea
Date: Tue Feb 13 21:28:02 GMT 2007

can you do something like:

1. resolve label
2. oops, can't do. save the 'cons' but continue.
3. run all pending conses with the obtained info.

now, this isn't much different from storing all unresolved symbols in a table and fixing them later, only this stores actions. (don't set a flag, set the action!) more specifically, suppose there's the input

x x x y z z z

where y is not resolvable. the way to solve this is to have y run z z z, and then try to resolve y and concatenate the results.
basically just swapping the order of execution.. something that could be done is to make the assembler essentially 2-pass, where the first pass performs normal assembly, but on the fly creates its reverse pass, which just resolves the necessary items and works its way backwards. talking about overengineering :) a simple 2-pass is probably good enough. but still.. this is more efficient, since the reversing which would happen in an explicit 2-pass is not necessary + the scanning of things already compiled can be avoided. so:

x1 x2 x3 lx y1 y2 ly z1 -> (... z1 (ly y2 y1 (lx x3 x2 x1 (...))))

Entry: backtracking -> an argument against dictionaries as sets
Date: Wed Feb 14 10:50:57 GMT 2007

another thing i didn't think about.. what's the actual cost of the continuations? i don't think it's much, because the data is mostly shared: asm is just appended to until it's completely finished, and the code list is just run sequentially. there's no rampant data copying going on: the garbage is created only at the compile end. so, it might actually be better to NOT keep dictionaries stored as sets, but just as shadowed association lists, to make backtracking memory efficient (in case i want to create lots of choice points). the redefining of 'current allocation pointers' tends to re-arrange and copy things on functional stores..

Entry: bit instructions
Date: Wed Feb 14 15:58:04 GMT 2007

there are a lot of bit instructions that are better handled in a consistent way. one of the problems with the assembler is that bit set and bit clear have different opcodes. i think it makes more sense to handle them as one opcode + argument. all bit instructions are polar: they take an extra 'p' argument, so they can be easily flipped as part of the rewriting process. the extra argument is placed first, to make composition easier.

bcf bsf     -> bpf
btfsc btfss -> btfsp
bc bnc      -> bpc
...
ok, it seems to be solved with a set of pattern matching macros, and a factored-out label pusher :)

;; two sets of conditionals
((l: f b p bit?) `([bit? ,f ,p ,p]))  ;; generic -> arguments for btfsp
((l: p pz?)      `([flag? bpz ,p]))   ;; flag -> conditional jump opcode
((l: p pc?)      `([flag? bpc ,p]))

;; 'cbra' recombines the pseudo ops from above into jump constructs
((['flag? opc p] cbra) `([r ,opc ,(invert p) ,(make-label)]))
((['bit? f b p] cbra)  `([btfsp ,(invert p) ,f ,b] [r bra ,(make-label)]))

then we have the recursive macro (if == cbra label) and the pure cat macro (label == dup car car swap). a lot more elegant than the previous solution. i like this pattern matching approach. NEXT:

* variable and other namespace stuff
* forth lexer
* parsing words
* intel hex format

Entry: forth lexer + parsing words
Date: Wed Feb 14 21:40:13 GMT 2007

which is of course really trivial. see lex.ss. i'm not doing '(' and ')' comments again, just line comments '\'. i think i know why i always had problems with my ad-hoc parsers and word termination etc.. splitting into lexing and parsing makes sense, because the first one is purely flat, while the second one can be recursive. it helps when in the 2nd phase there are no more stupid problems with word boundaries..

parsing words are, well, extensions of the parser :) since these will make things move away from straight 1->1 parsing, the parser needs to be rewritten as a recursive process / fold. ok, the scaffolding is there: written in terms of reverse/accumulate. now i need to really think about how to solve the 'variable' problem.

-> how to solve parsing words?
-> where to do the actual allocation?

Entry: ihex
Date: Thu Feb 15 00:26:02 GMT 2007

this used to be written in CAT, but was a mess. it's one of those simple things that are hard to express in a combinator language, because they drag along so much state if you want to do them in one pass. again, they are about merely re-arranging data!
maybe i should just try it again, but using a multipass algo, just to see if i learned something.. on the other hand, it would be nice to have this as scheme code, so i can use it outside of the project. ok.. it seems to work fine. got some binary operations for free that can be used in the loader too.

Entry: parsing words
Date: Thu Feb 15 09:55:27 GMT 2007

so, i need:

: variable 2variable constant 2constant

the thing which is different from the previous implementation is that i have separate compile (parse) and execute phases, so parsing words cannot be compilation macros. on the other hand, parsing words are always about quoting things, mostly names, so a simple map from names to the number of symbols to quote is probably enough. limiting the number of symbols to one makes it even easier. sort of got something going here with variables and constants, but there's another problem:

Entry: dictionaries
Date: Thu Feb 15 12:01:35 GMT 2007

i'm using a hash table to store 'core' macros: those that are fixed. however, a forth program can create macros, so these need to be defined somehow.. maybe make that a bit more strict? the same goes for constants.. i'm using fixed machine constants in a hash table, and some user defined stuff in other places. this needs some serious thinking..

constants can be implemented as macros which inline literals. so the only remaining question is: how to handle macros? macros are really compiler extensions. they are a property of the host, not of the target code. it would be really inconvenient having to split a project into two parts, so i should aim for macro defs inside source files.
however, a clear distinction needs to be made between host and target things:

target properties are related to on-chip storage == addresses
host properties are related to code generation only

the result is that there are 2 possible actions on a source file:

- reload macros + constants
- recompile = realloc code and data

to track the state of a project, the only thing that needs to be saved is the source code + a dictionary of target addresses. all the rest (macros) can be obtained directly from the source code. actually, this is a lot better than the old approach, where macros are stored in a project state file.

Entry: new badnop control flow
Date: Thu Feb 15 12:28:56 GMT 2007

in  = project sourcecode
out = compiled target code + dictionary

1. PARSE EXTENSIONS

Read all source files and extend the compiler to include the macros and constants defined in the source files. This effectively builds a new special purpose compiler for the code in the project.

2. COMPILE CODE

Convert all code definitions and data allocations to a form that is executable by the CAT VM, and run this code. This generates optimized symbolic assembly.

3. ASSEMBLE CODE

In a two-pass algorithm, convert the symbolic assembly to binary opcodes, allocating and resolving memory addresses. This process uses the current dictionary, reflecting the state of the target, and produces a new dictionary and a list of binary code.

Entry: parse extensions (borked)
Date: Thu Feb 15 13:13:17 GMT 2007

and presto, i'm writing a parser state machine again! amazing what a not-so-good night's sleep does.. let's do this a bit more intelligently, using my favourite one-size-fits-all hammer: pattern matching! seriously, the syntax is really simple, so i shouldn't be writing a state machine, just a set of match rules. one thing though: how to extend it? previous brood needed parse words to be written explicitly. i should do that now too..
just a dictionary of parse words, which output a chunk of cat code and the remainder of the input stream.

Entry: forth parser - different pattern
Date: Thu Feb 15 18:45:52 GMT 2007

ok. got some sleep. the thing is that this is a different pattern than all the other things i've been doing. the previous pattern matching code for the assembler is basically a partial evaluator, which looks backwards instead of forwards. so this needs new code! in short, i need a different kind of parser, or a preprocessor to map forth -> composite code.

let's try to arrange the thoughts a bit, since i feel i'm not seeing something really obvious.. i have an urge to write the parser as a state machine, or as a pattern matcher. both of them seem to lead to code with a similar kind of complexity, but with some obvious redundancy. i can't see the higher level construct. ok.. what i'm missing here is elements from SRFI-1. it's quite clear what i want to do: generic list pattern substitution. so basically, the prototype is:

(in) -> (in+ out)

with (out) being concatenated. let's call this the 'parser' pattern, and write an iterator for it. ok. it needs a bit of polish, but the idea is there i think..

Entry: ditching interpret mode
Date: Thu Feb 15 21:33:13 GMT 2007

what about ditching interpret mode and relying fully on partial evaluation? i can use the following trick: the partial evaluator does NOT truncate results to 8 bit during evaluation, only after. so in principle, there is a complete calculator available with the full numeric tower. maybe it's good to create some highlevel constructs for the partial evaluator.

literals are still encoded as symbolic assembly, which is ok, only somehow a bit dirty. this is effectively a second parameter stack.. to make this more explicit, the macros 'unlit' and '2unlit' are defined. these will reap literal values from the asm buffer and move them to the parameter stack.
the implementation of these macros is split into two parts: a pattern matching part, and a generic macro part '_unlit'.

Entry: more parsing
Date: Fri Feb 16 10:23:00 GMT 2007

so the basic infrastructure is there, now i just need to figure out how to put the pieces together. this host/target separation needs some more thought. the problem i'm facing atm is 'constant'. this should define a constant as soon as it's parsed, but the value comes from partial evaluation, which happens at macro execution time! maybe i shouldn't really care about this 2-pass stuff.. i can just compile code for its side effects, being the definition of macros..

another thing, which is related to the comment about the asm buffer being a second parameter stack: why not compile quoting words as literals instead of loading them on the data stack? this way a simple pattern matching macro can be used to implement the behaviour of parsing words.. i have to be careful though, since this arbitrary freedom must have some hidden constraint somewhere.. the hidden constraint is of course: literal stack encoding is machine-dependent! it's actual assembler dude! maybe keep it the way it is. however, 'forth-quoted' feels wrong. also the combination of literals coming from the asm buffer, and the symbol coming from the stack, feels awkward. but it does seem to be the right thing.. anyways.. it seems to work now.

Entry: dictionary
Date: Fri Feb 16 14:07:09 GMT 2007

so the only thing that's remaining is the runtime dictionary stuff: variables (ram allocation) and associated things. mark variable names as literals during parsing. done. i'm still not sure whether the mutating operations are such a good idea.. maybe a separate macro parsing stage is better after all.. as far as i understand, the thing which makes this difficult is the way that 'constant' works: it's dependent on runtime data (partial evaluator), so the definition needs to be postponed.. what about using some delayed eval here?
or i can use the same trick: reserve the name so it can be treated as a literal, but fill in the value later? so, on to the fun stuff.. dictionaries. basic functionality seems to work using the 'allot' assembler directive.

Entry: parse time macro definition
Date: Fri Feb 16 14:53:23 GMT 2007

what if i can:

- define all macros
- reserve all constant/variable names (which are just literal macros)

during parsing only? and fill them in whenever the data is there? the problem is how i'm handling 2constant now.. this can be fixed with a gensym. ok. this looks doable, but not essential. something for later.

Entry: forth loading and machine memory buffer
Date: Fri Feb 16 17:53:05 GMT 2007

two things i just did:

- added a function to load symbolic forth code
- draft for memory stuff

need to figure out where to do 'load'. load is a quoting parser, then just executes..

Entry: optimizations - need explicit unlit
Date: Fri Feb 16 23:40:27 GMT 2007

i'm running into several conflicting eager optimizations, which is normal of course.. i was thinking about making this a bit more manageable, by prefixing operations that have a lot of different combinations with virtual ops that will just re-arrange things for the better.. the most common mistake is to combine a dup with a previous instruction so the lit doesn't show any more. i think in 2.x there is an explicit 'unlit' that puts the drop back.. ok. this pattern matching is definitely an improvement for writing readable code, but it does pose some problems here and there..

TODO:
- intelligent then
- better literal opti (unlit)
- port the monitor
- device specific stuff
- code memory buffer
- host side interpreter

Entry: optimization choices
Date: Sat Feb 17 10:37:12 GMT 2007

instead of having 'stupid' backtracking, it might be easier to do 'intelligent' backtracking. this means: at some point a choice is made, but if at a later time it is realised this choice is the wrong one, then this particular choice needs to be changed.
the pattern i encounter is this:

1. do eager optimization
2. realize later this optimization was not optimal
3. undo the previous optimization to perform a better one

every time there is an 'undo' this could be solved by an automated backtracking step. what about a sort of 'electric save' ?? (it would also be interesting to somehow 'cache' the choices that have been made in the meantime, so when a whole subtree is executed again, the right choices are made first..) interesting stuff :) it looks like the search space is not really a tree, but more like a snake line: 10010011001, where at some point one of the choices is deemed wrong, for example 10010x11001. the remaining part 11001 then needs to be re-done, but using the same pattern might be an interesting optimization.

another thing is the storage of choices. backtracking needs a stack to operate. well, i already have one! the asm buffer serves that purpose quite adequately. this also solves the problem of the backtracking using mutable state. on the other hand, working purely algebraically does have the advantage of simplicity, but it requires the explicit construction of inverse operators.

Entry: literal opti
Date: Sat Feb 17 11:33:49 GMT 2007

instead of making pe operate on DUP MOVLW, let's make it work on MOVLW only, so the extra SAVE is not necessary. hmm.. i'm going in loops. the thing is that i'm using the literals in the asm buffers really as a compile time stack. simply making the partial evaluation respect 'save' would make it possible to keep that paradigm. otherwise the DUP in front of MOVLW (DUP MOVLW) needs to be handled explicitly every time. this then needs to be handled by a recombining DROP operation, which is really no different from handling SAVE properly... so back to the original solution. to keep everything as pattern matching macros, i could also run an explicit recombination after the literal operations.. quick and dirty. wait a minute.
i can just dump code in the asm buffer, and add a bit to the pattern macro to check for this, and execute it. then the only problem is: quoted code or primitives? probably primitives are best, since they are already packed into one item, and don't need 'run'. ok. that seems to work just fine :)

Entry: monitor
Date: Sat Feb 17 16:22:13 GMT 2007

ok.. seems i'm almost to the point where i can compile the full monitor code. some things are missing, like the chip specific configs, but i can see that the partial evaluator is going to help quite a lot to keep things simple: more things can be configured in the toplevel forth source code file instead of a lisp style project file. something that needs to change though is support for 'org'. this probably means that assembly code needs to be tagged somehow. ok. org is simply solved by embedding (org ) in binary code.

Entry: intelligent then
Date: Sat Feb 17 19:59:12 GMT 2007

since i don't exactly remember what the code does, and i can't read the old 2.x code just like that, let's decipher it. the problem is something like this:

  l4: btfsp 1 TXSTA 1 0 r
      bra l5 r
      bra l4
  l5:

which comes from

  begin tx-ready? until

which expands to

  begin tx-ready? not if _swap_ again then

the important part is the 'then', which should decide that it should flip the polarity of the skip and the order of the two jumps IF the first one corresponds to the symbol on the stack. this works not only for branches, but for any single instruction following after the forward branch.

ok. implementation. this doesn't fit the pattern matching shoe, since the label on top of stack needs to be incorporated in the check. however, it is possible to just compile the 'then', and perform the optimization afterwards, which is possible using a pattern matcher. ok. this works. i don't check the label though.. should do that, or prove that it can't be anything else..
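the compile-then-fixup idea above can be sketched roughly like this, in python for illustration (the opcode names, tuple encoding and polarity flip are assumptions, not the actual BROOD representation):

```python
# 'intelligent then' as a peephole fixup: compile the naive sequence
# first, then pattern-match the tail of the asm buffer and rewrite
#   skip / bra L2 / bra L1 / label L2   =>   !skip / bra L1
# i.e. flip the skip polarity and drop the forward jump.

def flip(skip_op):
    # invert a conditional skip, e.g. btfss <-> btfsc (PIC18-style)
    return {'btfss': 'btfsc', 'btfsc': 'btfss'}[skip_op]

def then_fixup(asm):
    if (len(asm) >= 4
            and asm[-4][0] in ('btfss', 'btfsc')   # conditional skip
            and asm[-3][0] == 'bra'                # forward jump
            and asm[-2][0] == 'bra'                # backward jump
            and asm[-1] == ('label', asm[-3][1])): # label matches fwd jump
        skip, back = asm[-4], asm[-2]
        return asm[:-4] + [(flip(skip[0]),) + skip[1:], back]
    return asm
```

note that this version does check the label, which the entry above admits the real code skips.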
Entry: reverse accumulate
Date: Sat Feb 17 20:27:38 GMT 2007

now, something that has been getting on my nerves is the reverse tree stuff.. there is absolutely no reason for it. the original reason was to split code into chunks of forth idioms, but i sort of lost that... this whole reverse tree stuff makes things too complicated, so it has to go. temporarily i will take out the reverse-tree function. ok. this seems to have worked. a lot of code is a lot simpler now.. no, there's still a bug. fixed.

Entry: tip
Date: Sat Feb 17 23:44:33 GMT 2007

(require (lib "1.ss" "srfi"))

yep. sometimes it takes a while to figure out the small things.. another thing: srfi 42 is about list comprehensions (loops & generators). seems worth a look.

Entry: time to upload
Date: Sun Feb 18 08:37:04 GMT 2007

looks like stuff is in place to start dumping out hex files. so, i need to make an effort to not fall into the same trap as before: it would be nice to have cat completely in the background, and do everything from the perspective of the target system. the easiest way to do this is to use the current debug interpreter, and plug in a proper 'interpret' mode for interaction.

yes, here there is some confusion. what about interpret mode? do i switch to compile mode explicitly? i kind of like the colorForth approach where there is only editing and commands, no command line editing. everything between : ... ; is always compile mode. the tricky stuff is what's before that, because i completely rely on conditional compilation for constants etc.. but, constants are really the only exception. if i make an interpret-mode equivalent of constant, then i could fake that. otoh. a proper compile vs interpret mode might be a better solution. it is definitely cleaner. so we converge on this?

-> compile mode = exactly the same as what's in files
-> interaction mode = all the rest

implemented as 2 coroutines.
Entry: state
Date: Sun Feb 18 08:59:04 GMT 2007

at this time, it becomes rather difficult to maintain all the state on the stack, so i probably need to move to a more general state monad. basically what i had before in 2.x, but without executable code. first, let's see what state needs to be accumulated:

- assembler buffer
- target dictionary
- forth code log?

data necessary in different modes:

  compile:   asm buffer
  assemble:  asm buffer + target dictionary
  interpret: target dictionary

i can probably avoid explicit monads (i don't know how to really do that: i'd have to lift a lot of code!), and just use a main driver loop that runs the applications with the dictionary dipped. what i have is a proper class based system:

- classes are cat dictionaries (implemented as hash tables)
- inheritance is based on chaining these dictionaries.
- objects are association lists.

so that's for later. i'm in no need for objects with encapsulated behaviour. the only thing i need is a local scope, so it's really just used as a data structure. this means i can start writing the main loop of the program, which is basically written as a method bound to state. the thing i need to be careful about though is tail recursion. this works with 'invoke'.

now that i'm here.. looks like this is an interesting way to implement the assembler too, by writing an object that's a list, and using a 'comma' opcode to compile instructions. thinking about this, there are really 3 major ways of symbol binding:

- method: aggregate
- lexically nested
- dynamically nested

ok. brace for impact. going to do the asm 'object' thing. ok... unresolved yet. this is too convoluted, precisely for the reason of recursive calls. i'm still thinking dynamic binding here.. but there's something to say for the idea.. trying again..

TODO
- i need a better way to create a compiler for compositions: (register, parse, name)
- should have a state base class with just: self self@ invoke

ok. done.
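the "classes are dictionaries, inheritance is chaining" idea above can be sketched in a few lines of python (illustrative only; the word names and the stack-function encoding are made up, the real thing is CAT dictionaries):

```python
# a 'class' is just a dictionary of words; inheritance is a chain of
# such dictionaries searched most-specific-first, like a prototype
# chain (or python's collections.ChainMap).

class Chain:
    def __init__(self, *dicts):
        self.dicts = dicts            # most specific first
    def find(self, name):
        for d in self.dicts:
            if name in d:
                return d[name]
        raise KeyError(name)

# hypothetical base/derived vocabularies; words are stack -> stack
base    = {'self@': lambda st: st}                 # identity: expose state
derived = {'comma': lambda st: st + ['instr']}     # 'compile' an instruction

obj = Chain(derived, base)
```

lookup walks the chain, so 'derived' can shadow 'base' without copying anything, which is all the inheritance needed when objects are only used as data structures with a local scope.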
Entry: passing state to quotations
Date: Sun Feb 18 15:32:47 GMT 2007

now for code quotations. how to recurse? the problem is that if quotations are executed using 'run', they will not obtain the state, so they need to somehow be wrapped such that running them passes along the state. is that at all possible? yes. using some kind of continuation passing.. instead of wrapping the code simply in 'dip'. so:

- quoted programs need to be parsed recursively
- they need to be modified such that running them results in the object being loaded on the stack.
- it is not possible to override every word that performs 'run' to incorporate this behaviour.
- this trick is only LEXICAL: no dynamic binding of words, only dynamic passing of state.

same old same old.. this goes way back :) the problem is of course in the shielding. as long as every primitive is really shielded from the state, there is absolutely no way to access it. so (blabla) dip is not a good approach. it should be hidden but accessible, and not shielded.

let's do this manually for now: when you want to use quoted code in a method definition, you have to explicitly parse, compile and invoke it. the default will be globally bound code only, and shielded execution for simplicity. the alternative is to compile quoted code as a method (recursive parse-state). this is kind of strange since the invocation has to happen manually. no 'ifte' for example. so unless i find a way to solve the 'ifte' problem and other implicit 'run' calls, there is no way to do this automatically: this is really a modification to the core of the interpreter. so i am going to let go of the scary bits, and conclude:

* only flat composition is done automatically
* recursive composition is possible using 'invoke'
* quoting method code is done manually using a special parser/compiler

so it all remains pretty much a state monad.
some special functions can be thrown into the composition to act on the state through some interface, while the rest is 'lifted' automatically.

Entry: fixing amb
Date: Sun Feb 18 16:15:23 GMT 2007

postponing the real work, i can try to fix amb to make it operate only on the assembler store. what i need to do to make this work is to return the continuation explicitly. so amb will do:

  amb-choose ( c1 c2 handle -- c1/c2 ) + effect of handle

here handle will store the continuation on a stack somewhere if c1 is chosen. if this continuation is called, c2 will be chosen without handler. ok. looks like it's working. still need to strip out the continuations in the assembler though. done.

Entry: the app
Date: Sun Feb 18 18:14:22 GMT 2007

time to write the main loop.

- based on the store monad containing:
  - asm buffer
  - forth input stream (per line)
  - state memory
- written from the target perspective
- compile mode / interpret mode

ok.. seems i'm at least somewhere. now i need to think about the design a bit more.. the state stuff is encapsulated in a small driver loop, the rest is still functional.

Entry: byte continuations
Date: Sun Feb 18 19:07:52 GMT 2007

i was thinking about a way to use more high-level functions in the 8bit forth. obviously, a jump table can be used to encode jump points as bytes. but why stop there? the return points can be mapped also, giving the possibility of encoding the return stack in bytes too, as long as code complexity is small enough. the compiler could do most of the bookkeeping. this would make sense in a setting where the code is simple, but the number of tasks is big, since that needs a ram return stack, which is better implemented as a byte stack anyway.

Entry: application
Date: Mon Feb 19 09:30:44 GMT 2007

some remarks. bin needed? probably not.. just keeping the assembler and generating assembly on the fly is probably best.
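the amb-choose protocol from the fixing-amb entry above can be sketched without captured continuations, using an explicit failure stack (python for illustration; the names and the thunk encoding are assumptions):

```python
# amb with an explicit failure stack: choosing c1 pushes a retry thunk
# that redoes the computation with c2; 'fail' pops the latest retry and
# runs it. this is the 'return the continuation explicitly' idea:
# the stored thunk plays the role of the continuation.

failures = []

def amb_choose(c1, c2, k):
    # try c1 first; remember how to retry with c2 (the 'handle' effect)
    failures.append(lambda: k(c2))
    return k(c1)

def fail():
    # backtrack: rerun the most recent choice point with its alternative
    return failures.pop()()
```

so `amb_choose(1, 2, lambda x: x * 10)` yields 10, and a later `fail()` reruns the same computation with the alternative, yielding 20.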
the basic editing step is:

- switch to compile mode, enter/load forth code
- switch to interpret mode -> code is uploaded

cat should only be for debugging. ok, so CPA = forth compile mode. this is to edit the asm buffer using forth commands. the asm buffer is stored in the 'asm file. in CPA mode it is possible to test the assembly by issuing 'pb'. however, this doesn't use the stored dictionary.. need to fix that. ok, what i have now are 2 modes, switched using ctrl-D:

* compile mode = compiled forth semantics ONLY, not even special escape codes for printing asm etc
* interpret mode = simulated target console. the target is seen as what it actually is + some interface to a server. the language used is forth lexed, but piggybacks on cat words.

looks like it's working fine this way. let's keep it.

Entry: literals again
Date: Mon Feb 19 12:05:39 GMT 2007

ok, i need to do this properly. back to the unlit strategy. basically: try to recover literals one by one, instead of massive combined patterns. let's try this:

  lits asm>lit asm>lit

ok. seems to work. still needs some explicit code that might be optimized, i.e. the literal patching. but i can live with it like it is now..

Entry: inference
Date: Mon Feb 19 14:27:02 GMT 2007

it should be possible to infer some more about the state of the stack, given there are no jumps from arbitrary places, which is a sane assumption.

Entry: another day over
Date: Mon Feb 19 18:01:19 GMT 2007

and i'm running it in the MPLAB simulator. it generates correct code at first sight. so, time to hook it up :) still some features missing: one of them is proper byte/bit allocation. so TODO:

- host side monitor
- state save/load

ok, i'm getting bytes back from the monitor running on the chip. time to start writing the monitor code.

Entry: dynamic code
Date: Tue Feb 20 00:41:11 GMT 2007

cleaning up a bit now. funny, what i need now is dynamic code :) anyways. it's easy enough now that i have a general purpose store.
all kinds of hooks can be added here, which can be saved later. they all go in symbolic form. to make them full circle (symbolic words in symbolic words) i probably need to add some kind of explicit interpreter..

Entry: parse
Date: Tue Feb 20 09:55:04 GMT 2007

to wake up today, i'm going to change all the 'parse' stuff to 'compile', since that's what it really does: parse+compile. 'bind' would be better maybe. thesaurus. well, 'compile' is really quite understandable.. so let's keep that. maybe i better make compile = (bind + parse), and turn 'bind' into a proper CAT function? this way the whole semantics and parsing thing can be handled in CAT code. the other thing to think about is CPS. does it make sense to use that? i'm still thinking about run vs invoke. maybe it's better to just keep it explicit until my current approach takes more shape and patterns fall out..

change 'unquoted' to 'primitive'

  parse: ( source binder -- compiled )
  find:  ( symbol -- delayed/primitive )

i changed names to the following protos:

  a couple of syntaxes:  cat-parse state-parse
  a lot of namespaces:   cat-find -find

ok, need to clean up this stuff later.. maybe tonight.

TODO:
- fix the toplevel interpreter stuff + reload
- on reload, macros should be reloaded from source files also. means compile + ignore asm.
- fix proto of binder (+ parser?)
- CPS with dynamic variables?

Entry: duality
Date: Tue Feb 20 13:54:52 GMT 2007

something interesting happened here.. 'state-parse' is now implemented as a delayed parse operation, which exposes the semantics:

  parse: list of things -> list of primitives
  find:  thing -> primitive

generalizing find's symbol -> primitive semantics. i could probably find a better name, but let's stick to this since it's all over the place. from now on 'find' means: map a "thing" to primitive behaviour, and 'parse' means: map a collection of "things" to a LIST of primitive behaviours, representing the functional composition of these primitives.
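the parse/find pair just described can be sketched in python (illustrative only; the representation of primitives as stack -> stack functions matches the text, but the dictionary and atom types here are made up):

```python
# 'find' maps one source atom to a primitive function; 'parse' maps
# structured source to a list of primitives; they recurse into each
# other for quoted programs. 'run' is composition: apply in sequence.

def find(atom, dictionary):
    if isinstance(atom, list):                  # quoted program:
        prog = parse(atom, dictionary)          #   parse recursively,
        return lambda stack: stack + [prog]     #   push it as a literal
    if isinstance(atom, int):
        return lambda stack: stack + [atom]     # numeric literal
    return dictionary[atom]                     # named primitive

def parse(source, dictionary):
    return [find(a, dictionary) for a in source]

def run(prog, stack):
    for prim in prog:
        stack = prim(stack)
    return stack
```

with a dictionary like `{'+': lambda s: s[:-2] + [s[-2] + s[-1]]}`, running `parse([1, 2, '+'], d)` on an empty stack leaves `[3]`.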
in case of a 1-1 relationship between source syntax and compiled code in list form, parse is really just (map find source). this is one of the properties of CAT source code. so there is something very simple hidden in all this..

---------------------------------------------------------------------

* PARSE: handle the structure or SYNTAX of source code. this will translate source code to a very basic COMPOSITE CODE representation, which is a list of primitive code elements, effectively reducing any form of syntax to a simplified one. in doing so, parse can use 'find' recursively to translate primitive source objects to primitive machine code.

* FIND: handle the meaning or SEMANTICS of source code. this will take a source code atom and translate it to PRIMITIVE CODE, possibly using 'parse' recursively to translate atoms comprised of structured source code.

this is the source code / compiled code duality.

  parse  code collection        <->  interpret primitive code list
  find   semantics of code atom <->  run primitive machine code

---------------------------------------------------------------------

here 'machine code' is the code representation of the underlying machine, which in this case is scheme, with primitives represented as functions operating on a stack of values. this is just eval/apply in disguise. the difference being that for lisp, the functionality is represented by the first element in a list, while here it is a composition.

  eval: (head more ...) == (apply (eval head) (list (eval more) ...))

Entry: next actions
Date: Tue Feb 20 17:13:23 GMT 2007

run time state? or where to store the file handler? do this non-functionally, since it's I/O anyway.. why not? that seems to work. got ping working too. and @p+

the next couple of things should be really straightforward, but i am missing one very important part: I CAN'T USE QUOTED PROGRAMS!!! so i need to do something about that.. again, as far as i understand, the problem is in 'run'.
if you hide information by 'dipping' the top of the stack, there is no way to get it back, unless you can bypass this mechanism somehow. the thing that has to change is the interpreter. ok. it should be possible by doing something like

  '(some app code) compile-app (for-each) invoke

making sure that the dict gets properly tucked away. the nasty thing is this is dependent on the number of arguments the combinator takes.

  (invoke-1 swap run)
  (invoke-2 -rot run)

invoke is bad for the same reason... something is terribly wrong with the way i'm approaching this.. no solution. too many conflicting ideas.

1. i need combinators to "just work"
2. i need to be able to run non-state code properly

possibilities:
- patch all quoted code -> parsed as state code
- do not patch combinators

maybe i should just try? this is crazy... i just don't get it. heeeelp! i don't know how to solve it.. but i can work around it :)

basically, the problem i have is that i can't use higher order functions in combination with the state abstraction: basically, because the abstraction effectively uses a different kind of VM. to solve it, i need to either accept i have to change the VM, or just make the data i'm using persistent. there are several options:

* turn the n@a+ and n@p+ into target interpreter instructions. this just makes them static, so i do not have to use references to dynamic state in the core routines. might be the sanest practical solution.

* just forget about the functional approach to the dynamic state and store this in a global variable. a bit drastic, and i will probably regret that later, since it feels like giving up on a good idea at the first sight of real difficulty...

i will go for the first one so i can at least finish the interaction code.. this has the advantage of making the monitor itself a bit more robust, since it will provide full memory access.
one thing i didn't think about though: making ferase and fprog primitives will make them a bit less safe (ending up sending random data). i should add a safety measure. ok, that seems to work.

Entry: monitor update
Date: Tue Feb 20 21:52:02 GMT 2007

triggered by some unresolved conflict between hidden dynamic state and the interpreter, i made most of the functions in the monitor available as interpreter bytecode. this makes it a bit more robust and apparently a whole lot faster also. still to fix is some kind of safety measure to prevent the erase from being triggered accidentally by some unlucky combination of input data. a password if you want :)

Entry: monitor progress
Date: Wed Feb 21 11:46:43 GMT 2007

got most of it working this morning. next actions:

- variable/bit alloc
- save/restore state
- sheepsint core compile + macros

i do rely a bit on parsing macros in the original sheepsint 3.x code. that's not so good. time to think about working out some abstractions a bit better. for isr:

  flag high? if flag low handler retfies then

now variables/bits. ok, no bits.. do that later, sheepsint doesn't use them: explicit allocation. next:

- state loading on startup
- interrupt handlers

Entry: getting tired
Date: Wed Feb 21 23:55:34 GMT 2007

yes, time to get it done.. overall, i'm quite happy with the result. it's a lot better than the previous two. i can't really see much further from here, other than elaborating towards higher abstractions (different language), and fixing some simple jump related optimizations. the bad guy is quoted method code, which has a strange conflict of concepts. more on that later.

another thing i miss is inline cat code, i.e. for generating tables. i think i better do this in a different file, and only in scheme: no more intermediate cat-only files.

  1 1.1 16 table-geom

then the lack of proper run-time semantics is kind of weird.
the partial evaluator replaces this, but in an implicit manner: not everything is accessible, and the bit depth is different. about literal opti: still not completely happy, since the patterns should do the literal preprocessing automatically.

looking at pic18.ss gives me a warm fuzzy feeling :) most of the knowledge is encoded in 2 patterns: assembly substitution patterns and recursive macros. language support is encoded in 2 more: some asm state monad and writer monad. the thing which would help a bit is reducing the redundancy of the rewriter macro specification. the way it is right now is very readable, but maybe a bit too cluttered. on the other hand, fixing that might be overengineering.

Entry: monads again
Date: Thu Feb 22 00:57:22 GMT 2007

http://en.wikipedia.org/wiki/Monads_in_functional_programming

  Alternate formulation

  Although Haskell defines monads in terms of the "return" and "bind"
  functions, it is also possible to define a monad in terms of "return"
  and two other operations, "join" and "map". This formulation fits more
  closely with the definition of monads in category theory. The map
  operation, with type (t -> u) -> (M t -> M u), takes a function between
  two types and produces a function that does the "same thing" to values
  in the monad. The join operation, with type M (M t) -> M t, "flattens"
  two layers of monadic information into one. The two formulations are
  related as follows:

  (map f) m  ≡  m >>= (\x -> return (f x))
  join m     ≡  m >>= (\x -> x)
  m >>= f    ≡  join ((map f) m)

-- isn't that what i'm doing? 'map' is my 'lift', it lifts a function operating on only a stack to one operating on a stack + state information. 'join' is i.e. concatenation of lists in the writer monad i'm using for assembly. 'return' i don't use? yes i do. it's how i initialize state, i.e. by loading an empty assembly list on the stack, and how some functions return a packet of assembly code.
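the map/join/bind correspondence above, applied to the assembly writer monad, might look like this (a python sketch under the assumption that a monadic value is a (result, asm_list) pair; not the actual BROOD code):

```python
# writer monad over assembly lists: 'unit' emits no code, 'bind'
# threads the result and concatenates the emitted code.

def unit(x):                      # return: a value with empty output
    return (x, [])

def bind(m, f):                   # >>= : run m, feed result to f, concat
    x, code1 = m
    y, code2 = f(x)
    return (y, code1 + code2)

def fmap(f, m):                   # map: lift a pure function into M
    return bind(m, lambda x: unit(f(x)))

def join(mm):                     # join: flatten M (M t) -> M t
    return bind(mm, lambda m: m)

def emit(instr):                  # 'a packet of assembly code'
    return (None, [instr])
```

here 'join' really is list concatenation of the two code layers, and sequencing two emits concatenates their assembly, which is the 'bind composes monadic functions' claim in concrete form.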
http://citeseer.ist.psu.edu/wadler92essence.html

the basic idea in monadic programming is this: a function of type a->b is converted to one of type a->Mb (monadic form). i.e. for assemblers: a function '(movlw 123) is converted to '((movlw 123)). 'bind' is there to compose 2 functions in monadic form. in the example of assemblers, 'bind' will do the concatenation of the assembly.

Entry: higher order pattern matching
Date: Thu Feb 22 09:56:59 GMT 2007

meaning: match pattern generation based on templates. it seems to work, but involves double quoting, which is a bit hard to wrap your head around.. there's one thing i've been trying to understand for a while: how to do this:

  `((['dup] ['movf a 0 0] ['lit] ,word)
    (,'quasiquote ([,opcode ,',a 0 0])))

without having to use the "quasiquote" symbol. maybe i should have a look at paul graham's "on lisp" again... ok, i think i got it:

;; ORIGINAL: explicit quoting of the quasiquote symbol
`((['dup] ['movf a 0 0] ['lit] ,word)
  (,'quasiquote ([,opcode ,',a 0 0])))

;; WORKS: using a name binding to avoid double quoting
`((['dup] ['movf a 0 0] ['lit] ,word)
  (let ((opc ',opcode))
    `([,,'opc ,,'a 0 0])))

;; MAYBE WORKS: pattern generated is (quasiquote (unquote (quote
;; thing))) instead of (quasiquote thing)
`((['dup] ['movf a 0 0] ['lit] ,word)
  `([,',opcode ,,'a 0 0]))

yep it works..

----------------------------------------------------------------------------------
the trick is to generate this:

  (quasiquote (... (unquote (quote thing)) ... ))

instead of attempting to generate:

  (quasiquote (... thing ...))
----------------------------------------------------------------------------------

to really understand this, it might be interesting to implement quasiquote. http://paste.lisp.org/display/26298

another thing to note: this merging of quoted/unquoted stuff is what the syntax macros actually do a lot better automatically..

Entry: interpret mode
Date: Sat Feb 24 10:15:38 GMT 2007

i got the synth core to run.
next actions:

- interpret mode
- setting interrupt vectors
- note table
- figure out line voltage + impedance
- identify

interpret mode seems to +- work. i'm using overriding: if a word is not in the target dictionary, it is executed on the host. maybe this will lead to some obscure problems? maybe i really need to separate the 2 a bit better, and use an explicit debug mode.

then state association. i should have an 'identify' command, so a connected target chip can tell the host which state file to load. but how to implement this? i could reserve some space in the bootsector for this actually. so, applications.. i was thinking about keeping the monitor independent. i don't think this is a good idea, since the boot code is really application dependent. so an application is everything, including the monitor.

Entry: syntax macros
Date: Sat Feb 24 21:45:46 GMT 2007

been playing a bit with macros.. i don't really understand them fully though.. especially the use of local syntax in syntax expansion etc.. also, it's really better to move the preprocessor hook out of pattern.ss DONE.

Entry: 16bit code
Date: Sat Feb 24 23:19:13 GMT 2007

looking at the sheepsint controller code.. there is no real reason not to make it 16-bit. all computation i'm doing is on 16bit numbers, and the overhead of switching everything to 16bit is probably minor. todo:

- 16bit interpret mode
- way to map symbols

Entry: toplevel workflow
Date: Sun Feb 25 08:41:42 GMT 2007

mainly about program organisation. a program consists of these parts:

1. boot block (first 64 bytes)
2. monitor
3. fixed application code
4. variable application code

a project is a directory. 2. should be made as standard as possible, and i really shouldn't care about the size, since that only matters for code protection. 3. should, if possible, stay on the target too (mark). 4. should have a 'scratch' character. empty = erase till previous 'mark', but no further than monitor code.
-> replace dictionary with saved dictionary
-> round 'here' up to the next 64-byte block
-> erase from there

DONE

also, the reset vector should jump to #x40 DONE

and i need to find a way to update the monitor code on the fly.

-> either copy the monitor as a whole (since it's place-independent)
-> or copy a minimal copy routine.

as good as DONE. reloading core with minimal effect on state = 'core'

setting interrupt vectors: should save 'here' etc.. -> interesting, since it really involves a run-time assembler stack. maybe it does make sense to de-scheme the assembler..

Entry: state monad
Date: Sun Feb 25 10:46:05 GMT 2007

http://www.ccs.neu.edu/home/dherman/research/tutorials/monads-for-schemers.txt

let's see. the problem i had was using 'for-each' in state code. because of the way the state needs to be passed, all higher order functions need to be aware of it. i just need a special 'for-each'. the way around this is to use the stateful functions ONLY to access the data, and use pure functions to do manipulation. i.e. 'logic' vs 'memory'. currently in the interpreter it works fine. this can seem like a drag, but in fact it is a good thing: functions are not unnecessarily infected by state.

so, monads are about order of execution, and really central to a compositional language, where there is only order! this in contrast to lambda languages, where there is an intrinsic parallelism in the order of evaluation. about interpretation: if you see a concatenative language as a series of sequential operations, it is 100% serial (the way it is implemented). however, if you see it as composition of functions, there is no evaluation order, because there is no evaluation, only composition. i need to look into list comprehension etc...

Entry: call conventions : de-scheming
Date: Sun Feb 25 10:56:48 GMT 2007

instead of doing some real work today, i'm going to have some fun. make the interpreter more reflective, meaning: convert all important routines to operate on a stack.
actually, that might not be a good thing.. i'm using non-stack functions for convenience, so primitives are simpler to code, and i can use the lambda abstraction instead of combinators.

Entry: time to do some work
Date: Sun Feb 25 13:36:34 GMT 2007

- interrupt vectors DONE
- a/d converter for board
- 16 bit interpreter
- constants and variables

Entry: alan kay oopsla lecture and stuff
Date: Sun Feb 25 21:36:56 GMT 2007

- 'core' should not destroy any DYNAMIC state AT ALL, only static background.
- every restart is a failure.
- need better debugging.

i need to be more observant of things that are annoying, and fix them immediately instead of chasing some short-term goal. the thing i'm trying to do is to build a better tool, not to finish some product. i need to always try to distill the important core idea, instead of bashing away to 'just make it work'..

Entry: variables
Date: Mon Feb 26 09:52:27 GMT 2007

i've got a problem with variable names: the dictionary does not make a distinction between flash and ram names, but the interpreter does need to treat them differently. this is solved properly by using two dictionaries, or a nested dictionary. probably 2 dictionaries is better.. requires a little rewrite though.

ok. i started to rewrite the assembler to separate assembly from dictionary operations, so it's easier to make the dictionary an abstract object. then i need to change to recursive operations '(ram here) instead of 'here etc..

seems to be fixed. the implementation is abstracted, and currently solved as a simple sub-dictionary. maybe move this to 2 separate dicts..

Entry: analog -> digital
Date: Tue Feb 27 08:25:45 GMT 2007

i don't know what's wrong, but it doesn't work properly. but first, documentation. let's copy and paste the previous one.
Entry: reasons
Date: Tue Feb 27 12:12:44 GMT 2007

all i want is lisp, but:
- cat is terse
- cat is more editable
- forth works on small things
- forth with linear data is predictable

the first two are from the point of writing software, and interacting with a system; the last two are practical solutions to needing a lot of programming power under constraints (small or RT)

something i learned though: it's bad to waste time writing combinators. in BROOD 3.x i solved this by writing some core things in scheme. basically they are combinators: interpreters for certain kinds of code propagating state. in PF and previous forth experiments i ran into this problem several times: trying to express something which really needs 'hidden state'. mostly i solve this using global variables, which is ok as long as there is no pre-emptive multitasking going on.

SCRIPT = toplevel organization: large amount of trivial code
ALGORITHM = small amount of nontrivial code

the idea is to use a scripting language to glue together algorithms, while the nontriviality of the algorithms is hidden, and the connectivity between them is made manageable by the features of the scripting language.

Entry: linear lists -> PF
Date: Tue Feb 27 12:18:46 GMT 2007

yes. it does make sense to rewrite malloc. malloc is not what i need if i'm using linear data structures. i don't need free, only free lists. and yes, it would be cool to have access to the page translation table too :)

Entry: compilation is caching
Date: Wed Feb 28 01:41:21 GMT 2007

compilation is really caching.. maybe i should find a way to add dynamic loading of code without full image reload, by using a custom made 'promise'. one that can be un-cached whenever a new word (or group of words) is defined, so code can be re-bound.

more about the caching.. this means that symbolic code is really the only representation of code. the compiled representation is an invisible optimisation, and should be hidden from the programmer.
if i replace all atoms with a struct containing their symbolic version and a possibly cached behaviour, i can re-interpret on the fly.. this should give all the benefits of late binding, without the drawbacks of having to reload the whole image all the time..

however, cache invalidation probably needs to do this anyway: invalidate a whole dictionary of code, unless all references can be found somehow.. probably not. so what's the difference? what would a proper cache offer? uncompilation for one.. it's probably good to keep the symbolic data and environment around..

Entry: no more quotation
Date: Fri Mar 2 14:08:38 GMT 2007

quotation sucks.. and it's really not necessary if i install a default semantics. my previous argument was: no default semantics (no defaults!) because i need more than one.. however, everything will run on the VM as primitives, so there is no real good reason to have no defaults: the symbolic representation might be "the bottom line", with compilation viewed as optimization/caching.

what needs to be done to fix this? i probably need a better object representation, a more abstract one. an object has properties, one of them being its cached rep. so.. what is an object?
- syntax, form.. this is the 'data' part
- semantics in the form of an associated interpreter object

optimizable properties:
- cached semantics.

this is really just OO. need to look at smalltalk.. maybe it's good to have some ideas propagate. data=object data, interpreter=class

summary: the idea is to parse into something which retains the symbolic representation, so semantics can be late bound, and compilation is still possible, but is done with memoization. clearing the cache is then possible by scanning the entire memory from the root and invalidating some bounds. this trick can also be used in PF. a linear language with late binding but aggressive memoization. hmm..
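the compilation-is-caching idea above can be sketched roughly like this (a python sketch with made-up names; the real thing is scheme): a word keeps its symbolic source plus an associated interpreter, and the compiled form is only a clearable cache.

```python
# sketch: a word retains its symbolic source (the "bottom line") and a
# late-bound interpreter; compilation is memoization into a clearable cache.

class Word:
    def __init__(self, source, interpreter):
        self.source = source            # symbolic representation
        self.interpreter = interpreter  # semantics: source -> executable
        self._cache = None              # invisible optimisation

    def compiled(self):
        if self._cache is None:         # compile on first use
            self._cache = self.interpreter(self.source)
        return self._cache

    def invalidate(self):
        self._cache = None              # next run re-interprets the source

    def run(self, stack):
        return self.compiled()(stack)

# toy semantics: a list of atoms acting on a data stack
def toy_semantics(source):
    def exe(stack):
        for atom in source:
            if atom == '+':
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            else:
                stack.append(atom)      # anything else is a literal
        return stack
    return exe

w = Word([1, 2, '+'], toy_semantics)
print(w.run([]))   # [3]
w.invalidate()     # re-binding: the cache is gone, the source survives
```

clearing the cache of every word in a dictionary is then exactly the "invalidate a whole dictionary of code" move from the entry above.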
i read something in "the essence of functional programming" http://citeseer.ist.psu.edu/wadler92essence.html about values versus processes. to paraphrase:
- in lambda calculus, names refer to values
- in compositional languages, names refer to functions

the first one only has values (while functions are a special case of values), while in compositional languages there are only functions, with values represented as functions. going the intuitive route: a name is a function, and only a function. an object is only a function. it has an associated action. data is represented by a generator.

Entry: a new PF
Date: Fri Mar 2 15:22:12 GMT 2007

summary:
- object oriented: objects are functions. each object has a 'syntactic representation S' and an 'associated interpreter I'. (the result of applying I to S is X, an executable program which acts on a data stack.)
- the basic composite data structure is a CONS cell.
- composite data is linear: no shared tails.
- the interpreter needs to be written in (a subset of) itself, to allow easy portability (to C).

problems: all the problems are related to the linearity of the language. to make things workable, some form of shared structure needs to be implemented. however, this can lead to dangling references.
-> continuations / return stack
-> mutual recursion

if i clean up the semantics such that dangling pointers are allowed in some form, like 'undefined word', this should be manageable. to keep things fast, this needs to be cacheable: it should be possible to detect whether an object is live etc..

to rephrase: looks to me like a completely linear language is really impractical. how do you tuck away non-linearity so behaviour is still real-time? i keep running into the idea of 'switching off the garbage collector'..

decompose a program into 2 parts: one that uses a nonlinear language to build a data/code structure, and a second one that runs the code: trapped inside the brood idea: tethered metaprogramming.
-> a predictive real-time linear core (linear forth VM + alloc)
-> a low priority nonlinear metaprogrammer (scheme)

together with the smalltalk trick to simulate the real-time linear core inside the metaprogrammer.

the VM:
- no if..else..then: only quotation and ifte
- no return stack access: use quotation + dip

this can be a lot more general than for next gen PF. i can run this kind of stuff on a microcontroller too, to have a different language. one with quotation, and no parsing words.. the idea is to make the VM as simple as possible: i already have a way to implement a native forth, maybe the catkit project should be just that: CAT is that thing that runs on the micro? linear CAT?

Entry: linear CAT vm
Date: Fri Mar 2 15:59:41 GMT 2007

- run: invoke interpreter
- choose: perform conditional
- quote: load item from code onto data stack
- tail recursion: this is really important
- continuations (return addresses) are runnable

using variable bit depth? code word bit depth is determined by the number of distinct words. an 8 bit machine is for small programs, while a 16 bit machine is for larger programs and/or programs that need to do more math. something in between is also possible. most practical is 12 bit. but the most important thing is: the data stack needs to be able to hold a code reference. for the 18f, i think it's best to go to 16 bit.

the forth is for inconvenient features, while the highlevel language should be just that: a highlevel language. in order to properly implement tail recursion, the caller should be responsible for saving the continuation.

Entry: direct threading
Date: Fri Mar 2 16:33:47 GMT 2007

i'm trying to write an interpreter with these properties:
- proper tail calls (caller saves continuation)
- continuations can be invoked by 'RUN'
- direct threading.

in direct threading, threaded code is a list of pointers that point to executable code, and a continuation is a pointer that points to a list of such pointers.
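the direct-threading model just described can be sketched like this (a python stand-in, not the PIC code; all names are made up): threaded code is a list of primitives, a continuation is a (code, index) pair, and the CALLER saves the continuation, so a tail call simply doesn't save one and the return stack stays flat.

```python
# sketch: direct threaded interpreter with caller-saved continuations.
# a composite word is a plain list; a primitive is a callable on the data
# stack; anything else is a literal.

def interp(code, ds):
    rs = []                       # return stack of continuations
    ip = (code, 0)                # instruction pointer: (thread, index)
    while ip is not None:
        thread, i = ip
        if i >= len(thread):      # thread exhausted: resume saved continuation
            ip = rs.pop() if rs else None
            continue
        op = thread[i]
        ip = (thread, i + 1)
        if isinstance(op, list):  # composite word: a call
            if ip[1] < len(thread):
                rs.append(ip)     # caller saves; omitted on a tail call
            ip = (op, 0)
        elif callable(op):        # primitive (machine code stand-in)
            op(ds)
        else:                     # literal
            ds.append(op)
    return ds

def dup(ds): ds.append(ds[-1])
def add(ds):
    b, a = ds.pop(), ds.pop()
    ds.append(a + b)

double = [dup, add]               # : double dup + ;
print(interp([1, 2, add, double], []))   # [6]
```

note that the call to `double` at the end of the program is a tail call: nothing is pushed on `rs`, which is exactly the "caller saves continuation" property from the list above.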
so yes, these constraints can be satisfied:
- composite = array of primitive
- continuation = composite
- a composite code can be wrapped in a primitive using a simple header

TBLPTR -> composite code
PC -> primitive code

see direct.f -- summary: the most important change is threaded code + proper tail calls by moving the continuation saving to the caller.

Entry: linear languages
Date: Fri Mar 2 19:13:23 GMT 2007

http://home.pipeline.com/~hbaker1/Use1Var.html

"A 'use-once' variable must be dynamically referenced exactly once within its scope. Unreferenced use-once variables must be explicitly killed, and multiply-referenced use-once variables must be explicitly copied; this duplication and deletion is subject to the constraint that some linear datatypes do not support duplication and deletion methods. Use-once variables are bound only to linear objects, which may reference other linear or non-linear objects. Non-linear objects can reference other non-linear objects, but can reference a linear object only in a way that ensures mutual exclusion."

what he describes a bit further on is an 'unshared' flag. a refcount = 1 flag, but it looks like this is more in the context of a mark/sweep GC. an attempt to make some patterns automatic? reverse list construction followed by reverse! is an example of a pattern that might be optimizable if the list has a 'linear' type: the compiler/interpreter could know that 'reverse!' is allowed as a replacement of 'reverse'.

so as far as i get it, baker describes a 'linear embedded language'. linear components are allowed to reference non-linear ones, but vice versa is not allowed without proper synchronisation. so in a RT setting, this means the only thing that is allowed to run in the RT thread is the linear part, while the nonlinear part can maintain its game outside this realm.

so, again:
- high priority linear RT core (forth)
- pre-emptable nonlinear metaprogrammer (scheme/cat)

the linear part contains only STACKS + STORE.
the nonlinear part can contain the code for the linear part. the compiler runs in the nonlinear part. the nonlinear part is not allowed to reference CONS cells in the linear part.

this can be implemented entirely inside of PLT. on the other hand, having this structure independent of a PLT image makes it more flexible: the core linear system should be able to do its deed independent of the metasystem's scaffolding.

baker calls my 'packets' nonlinear types: names with management information (reference counts): a strict distinction is made. this allows a nonlinear type to be decoupled from its (possibly linear) representation object. in PF this means: packets are references to linear buffers. the result is that the underlying representation can change a la smalltalk's 'become'.

conclusion:
- cons cells are linear
- packets are nonlinear wrappers for linear storage elements
- packet access: readers/writers access protocol: mutation is only allowed when there are no readers (RC=0). (functional ops)
- 'accumulation ops' use shared state + synchronized transactions.

Entry: standalone forth
Date: Sat Mar 3 13:25:46 CET 2007

maybe he didn't get it, but writing this compositional language and a standalone forth are conflicting ideas.. it's not so hard to give up on parsing words, other than true quoting words: there will be only one left, let's call it '{'. what's worse is that i need to dumb it down a bit. i'd rather define a new language, but an ANS forth might be better for teaching, for the simple reason that i don't need to write such an extensive manual. maybe it still makes sense to run both languages on the same VM?

another forthism that's not really necessary: since i'm sharing code between the lowlevel subroutine threaded forth and the direct threaded forth, why not make the VM primitives equal to subroutine threaded forth, instead of them being directly linked to a NEXT routine. in other words, why not have an explicit trampoline?
this will be slightly slower, but uses less code since the primitives don't need a separate binding, which would just call the other code anyway.

conclusions:
- interpreter loop allows primitives == native code (STC forth)
- 'enter' uses short branch -> code needs duplication
- primitives need no IP saving!! (compiler needs to distinguish between primitives and highlevel words)

the last one is a consequence of doing continuation management on the caller side: the caller cannot be agnostic! it should be possible to pass this information to 'enter' somehow, so enter can save/restore depending on some flag. carry flag? that's ok, as long as this machine state is guaranteed to be saved.. however, in this case, the primitive needs to call 'EXIT' in case the carry flag is not set! so still, some compiler magic is necessary, or all words need to terminate with an EXIT call, independent of whether they terminate with a tail call. this is a bit messy...

let's try to summarize: the flag is called the NTC flag: non-tail-call.
- EXIT = leaves current context
- WORD -> ENTER conditionally saves the context (carry flag)
- PRIMITIVE: needs EXIT if TC flag is set.

again.. there are 4 cases: PRIM/COMP and TC/NTC. what i'd like is to solve the PRIM/COMP completely in 'enter', such that the interpreter can be agnostic about highlevel words.

an instruction = primitive + NTC flag

what does the NTC flag mean for the interpreter? nothing. it's just extra information passed to 'ENTER': it means the rest of the code thread can be safely ignored. the interpreter completely ignores it, and just runs forever, assuming the code stream is infinite. all threading changes are implemented by other primitives.

so, given the current implementation, a solution is to always compile EXIT, together with a bit that indicates an instruction is a tail call. this is not very clean.. the exit bit should be universal.
semantics: the bit indicates that the current thread can be discarded BEFORE passing control to the primitive. then the primitive can always just save the continuation. (a possible optimization is to overwrite the continuation, but let's do the former first since it's conceptually simpler). this is different in that the interpreter is not agnostic about the return stack, but effectively implements 'EXIT'.

Entry: is code composite? run or execute? yin or yang?
Date: Sat Mar 3 16:37:37 CET 2007

in CAT it seems i've converged on only using composite code = list of primitives as the quoted programs that can be passed to higher order functions. however, original forth does not use this stance: threaded code is a list of execution tokens, and execution tokens are the canonical representation of quoted code, when treated as data.

this is wrong. why? reflection becomes more difficult. the stuff on the return stack is the saved IP. this should be "a real program". the inner interpreter deals with arrays of primitives, and such arrays can be wrapped in a primitive by prepending them with ENTER. however, the data representation of code should really be composite, so no primitive address, but a composite address. primitives == internals.

it's better to treat primitives as singleton composites, than to treat composites as primitives. in the inner interpreter, the reverse view is better. i think this view originates in original forth, and is mainly historical: primitives came first. composites were treated as primitives. i can't think of another reason really..
conclusion:

------------------------------------------------------
programs are composites = array of threaded primitives
------------------------------------------------------

i'm going to reflect this in the following change:
- execute is reserved for primitives
- run is reserved for composites

also, if you look at native code, the picture is pretty clear: primitives are machine instructions, and you simply cannot 'execute' them, they always need to be inside a code body.. composite code = list of instructions, referred to by an address. it's just the same...

Entry: reflection
Date: Mon Mar 5 02:26:40 CET 2007

have to think about this a bit more. something strange going on with this primitive/composite thing. what about only having highlevel code: composite code just links to more composite code.. there's no way to plug in primitives here. for purely pragmatic reasons, using primitives, and highlevel words wrapped in primitives, is workable..

Entry: essentials
Date: Thu Mar 8 16:57:26 EST 2007

* symbolic -> ast + room for it
* possibility to 'uncompile' an AST
* use abstract types (structures) and 'variant-case' in AST

Entry: delayed list interpretation
Date: Fri Mar 9 10:48:54 EST 2007

thinking about eopl: i need more data abstraction. car and cdr are nice, but they really are quite lowlevel. there's too much implementation leaking through. the asm monad is a good example of abstraction, but code probably needs it too..

about using symbolic code and caching: parsing a list, it can be either code or data, depending on how it is accessed. maybe it should really have these 2 identities? if accessed using list processing, it behaves as a list, but if accessed using 'run', it behaves as code -> jit compilation cache. the benefit of this is that the semantics can change. so a list is really an object with different identities. all list processors should be modified to take an abstract list object.

Entry: platforms
Date: Mon Mar 12 18:54:28 EDT 2007

ai ai ai...
i'm spending money again, surfing on ebay.. discovered this nice ARM7 board on sparkfun made by olimex. it has a 128x128 color lcd, usb client mode and ethernet. a dream platform for brood, especially since this is THE standard 32bit chip, getting really cheap too.. so i want: 8 bit PIC18, 16 bit DSPIC30 and 32bit ARM7.

Entry: itch
Date: Tue Mar 13 12:47:23 EDT 2007

want to start changing some abstract data type implementations.. the most important one is probably 'quoted program' or 'composition'. this has to be distinguished from a chain of cons cells in that it has more structure. a composition can always be converted to a chain of cons cells, and a chain of cons cells can be converted to a composition if
1. it's a proper list
2. some interpreter semantics is attached

so a quoted program is the above: a proper list with attached interpreter semantics. the changes this requires are:
- all data operations that modify lists need to accept the 'composition' data type and convert automatically.
- the parser needs to produce compositions instead of chained cons cells.

it's getting old.. but the structure of program (source compile cached) is probably better written as (primitives) where each primitive has its own semantics and cache: word (atom compile cache)

there are several options for word
(source thunk/cache)
(source thunk cache)
(source compile cache)
(source compile env cache)

in general: do we want the environment to be explicitly specified, or is an abstract representation of the binding operation enough? one of the requirements is the ability to rebind, so at least cache and binder need to be separate, since the binder uses (in the current implementation) some mutable state.
probably the following model is close enough to the current one (basically the same as 'delay' but with the possibility to clear the cache) to not need a lot of changes (or enable incremental ones) and will do what's required:

prim = (source atom, interpretation thunk, cached compilation)

so:
- code can be re-interpreted by just clearing the cache
- all code, independent of semantics, can be specified as lists
- a 'data' mutator can be defined that strips a list of all executable semantics

so i guess the conclusion is:
- composite code is a concrete list of abstract primitives
- primitives contain memoization info

this brings me to restate that in BROOD, the compiler itself is written using OO techniques with mutable state, but the target compilation is completely functional. the reason is this:
- the host language is mostly about organization -> mutable OO
- the target compiler is mostly about algorithms -> functional + monads

the main practical reason for using a functional approach in the compiler is the ability to work with continuations for very flexible control structures. the 'constant' part is implemented as an OO system.

Entry: reflection
Date: Tue Mar 13 13:46:25 EDT 2007

another thing i keep running into is the mixed use of 2 calling conventions: scheme N->1 and cat stack->stack. it would be nice to have scheme only provide primitives, and have all other utility code be out in the open. however, given the way some algorithms are implemented now, that is impractical. i can have all the reflection i want, but not necessarily from CAT, since that would make it harder to use scheme to implement the core of things..

maybe it's good to keep this in mind: CAT is just a minilanguage inside scheme, and all the things i need to bring out can easily be brought out if necessary. full reflection is not necessary yet. probably the CPS chapter in EOPL will make this a bit more clear..

ok.. getting rid of the parse/find abstract interface. it complicates things too much..
one thing i didn't think of is that 'find' maps thing -> word. so the compiler for a symbol + find is something that looks up a word AND dereferences the implementation.

yep. i noticed it is really a good thing to use closures instead of explicit structures.. of course, this does mean that all the red tape moves to the other side: all things that provide closures need to do the binding.

it's not going as smoothly as expected: all through the code it is assumed a primitive is either a function or a promise. so i guess it's a good idea to change it now. the main problem is the 'find' as expressed above, since i have an extra level of indirection that distinguishes 'find' from 'compile', with explicit delayed compilation (interpretation) instead of implicit. i can probably work around it by providing:
- compile lifting
- special primitive registrars

i need to sleep over this.. current problem = pic18-literal: used in a lot of places. produces primitive but should produce word. i changed this, need to check. the rest should be straightforward.. also writer-register! is wrong, due to the lift not being wrapped.. maybe better to just lift words instead of prims? all this freedom!!

Entry: spaghetti
Date: Wed Mar 14 08:33:03 EDT 2007

the change above brings up some conceptual confusion.
- a word is a representation of an atomic piece of code. it retains its source representation, and a translator which defines its semantics.
- lifting is done on the primitives, not on the words. maybe that should change? NO
- the pattern (register! name (atom->word name compiler)) has 2 occurrences of name. this is ok. the first one is an index, while the second one is there to recompile the word if necessary.

ok, it seems to work now..

Entry: reload
Date: Wed Mar 14 09:48:10 EDT 2007

i run into problems with redefining the structs: data lingering after a reload is not compatible with type predicates and accessors. this means code cannot survive a reload.
this is a bit ill defined, so i need to make a decision.
- if i don't redefine 'word' on reload it can never be redefined, which is a bit of a nuisance.
- the other solution is to redefine 'word' and change all the data on the stack to reflect this change: the stack is the only thing that survives a reload, so it needs to be properly processed. it looks like i need a temporary structure to solve this.
- find a better way to implement reload.

the struct thing is really annoying. i need to find a solution to that soon. all the rest works with reload, even using a 'symbolic' continuation passing: after load, the repl loop itself is recompiled. maybe i should separate the files into init and update. that way it is possible to perform incremental updates.

ok. the solution seems to be to install a 'toplevel' continuation that is passed the entire application state (stack). 'load' can then be called with a symbolic code argument == continuation.

Entry: TODO
Date: Wed Mar 14 11:01:25 EDT 2007

got it hooked up. lots of things to fix:
- pic programmer endianness bug for high word?
- fix reload / scheme modules (using different files with include) DONE
- create a monitor/compiler for the 16bit threaded interpreter

a compiler for the threaded code would:
- map a list of words to their respective addresses
- perform tail call optimization

i should go straight for cat-like code with code quoting. yep. the most important things to tackle now are modularity and platform independence. aspect oriented programming :) maybe i should leave the module stuff for later, since reloading is not that easy... loading inside a module namespace might be possible though.

Entry: languages
Date: Thu Mar 15 09:40:09 EDT 2007

how to combine 2 different languages in one project? i'm trying to write the purrr language in terms of purrr/18, and i need an easy way to switch between them. i need a methodology.. what is a threaded forth? basically a set of primitives.
so what i need is:
- a list of primitive names
- a way to compile 'enter'

things to look at:
* unify all toplevel interpreters, so i can have more
* separate console.ss into machine specific things.

a toplevel interpreter is
- a string interpreter
- an exception handler
- a continuation

what about i store all the modes in the data store, symbolically? seems to work.

now, about the VM. i think it's best i standardize on the VM mentioned above: call threaded with return (jump / tailcall) bit. this can be written in C too, so should eliminate most porting problems, with only optimization problems remaining.

ok. back to where i started. should i allow 2 different languages on one attached system? why would i want to do that? debugging of course, but what else? there's a bit too much freedom here. 2 languages: native + ST forth will complicate things, but will also make things a lot easier to use.. and a threaded forth compiler isn't so incredibly hard to build.. so, i work with one core language purrr/18 and build a threaded forth on top of that using a different mode.

so.. next problem = representation for words. i'm using a simple name prefix = underscore. maybe there's a better way to do this? name prefixes allow use in the lower level language.

ok.. rambling on. the way to do it is to just translate the highlevel language into lowlevel forth, and pass it on to the compiler.

(1 2 +) -> (ENTER ' _lit 1 ,, 2 ,, ' + EXIT)

here EXIT is a special word that installs the return bit in the MSB of the last word.

Entry: TODO
Date: Thu Mar 15 13:17:53 EDT 2007

- variable abc abc 1 +
- flash addresses as literals
- exit bit
- write a paper about the absence of '[' and ']' and the relationship between literals and (dw xxx)

the first 2 are similar: it would be nice to partially evaluate some code that uses words from the ram and flash dictionaries next to constants.. this introduces another dependency. currently the partial evaluator only resolves constant symbols.
it requires a new dependency [ dict -> compiler ] to resolve this problem. there is a possibility to delay the evaluation of the optimization until assemble time, by using closures.. there's a deeper problem here: name resolution needs to be fixed..

let's see.. partial evaluation can't fail in a sense that is recoverable: if the literal optimization fails, it's a true error that fails the entire compilation. this means the evaluation itself can be delayed until the environment is ready, since the control flow does not depend on the result. a delayed evaluation has the form \env -> value while env is: name -> value. the more i think about this, the better i like the idea.

ok, so the first 3 can be solved using some form of delayed evaluation until exit.

Entry: delayed evaluation forced in assembler
Date: Thu Mar 15 15:56:08 EDT 2007

there is already one kind: symbolic constants. the addition that needs to be made is generic expressions. there are several forms to choose:
- symbolic lisp expressions
- symbolic cat expressions
- scheme closures

the first ones are nice since they are symbolic, so easier debugging. the last one might be simpler to implement. lisp style expressions make more sense here since they have a single value, not a stack.

now, this can be combined with the paper on partial evaluation: partial evaluation should then be transformed to compile time meta code evaluation. actually:

1 2 + -> [ 1 2 + ]L

following colorForth, executed code always results in a literal on the ->green transitions.

ok, so this fixes the question above:
* delayed code is symbolic cat.
* assembler does final evaluation of this code

so what is the context?

machine constant -> number
variable name -> data addresses
forth words -> code address

operations come from some dictionary, probably cat, but need to be escaped somehow. let's say: search meta first, then variables, words, constants.
this needs some changes:
* the assembler needs to be a CAT word, so the stack can be used as context.
* it's probably better to wrap all symbolic names in a list, so the evaluation is uniform: either numbers or lists.

this seems to work pretty well.

Entry: 16bit threaded forth compiler/interpreter
Date: Thu Mar 15 18:06:23 EDT 2007

let's give them a name: the highlevel forth is PURRR and the lowlevel forth is PURRR/18. here i use the shorthand names threaded and native resp.

first problem is the parser, since the forth needs its own parsing words. that should be the only real problem. since this forth is mainly for higher level stuff, i don't need machine constants: all machine access is solved on a lower layer. actually, the different namespace is a nice excuse for some simplification.

second problem is running code from the brood console. this needs a little trampoline, since the only way to get out of a running interpreter is to call 'bye':

' bye -> IP
_run
CONT

Entry: added pattern debug
Date: Fri Mar 16 10:13:07 EDT 2007

the pattern compiler now has a debug method which dumps the source rep of the patterns into the asm buffer for inspection. this is implemented in the form of a match rule that matches ['pattern].

Entry: added robust reloading + logging
Date: Sat Mar 17 09:38:56 EDT 2007

error on reload: the console waits, then reloads. this is to give a chance to correct syntax errors without losing state. also added 'cat.log' output, which enables the use of emacs' compilation-mode. just run 'tail -f cat.log' as compile command.

Entry: trampoline
Date: Sat Mar 17 09:40:45 EDT 2007

ok, i got something wrong... words stored in the dictionary are primitives. invoking a highlevel word from within a lowlevel native context requires the use of these primitives. remember: highlevel 'run' ONLY takes composite words, while lowlevel entry ONLY takes primitives.
IF a primitive contains a highlevel definition AND the primitive points to an ENTER call, THEN the rest is highlevel code. the correct way to build a lowlevel -> threaded trampoline is:

* set the highlevel continuation, saving the current one, to highlevel code that does (bye ;)
* call the primitive
* call the interpreter to invoke the continuation

Entry: delayed eval in assembler
Date: Sat Mar 17 10:18:14 EDT 2007

i should go to an architecture where the number of passes in the assembler is not fixed, but just enough so all expressions can evaluate with correct dependencies. maybe one pass is necessary to at least find out if all the labels are defined.

Entry: literals
Date: Sat Mar 17 12:06:44 EDT 2007

should literals be handled in the interpreter? problems:

* a _lit instruction cannot drop the current thread, because the value needs to be accessed from the current thread. so the code "123 ;" translates to "LIT 123 NOP|EXIT"
* moving this to the interpreter by encoding the value in the opcode is possible, but then large values need 2 words. it also requires 2 different lit instructions, implemented as:
  - LOAD + LOADHI
  - LOADLO + LOADHI
  - DUP + LOADSHIFT
  - LOAD + LOADSHIFT

the one with DUP requires only one explicit lit, but it always needs 2 words, even in cases where the data would fit into a single word. i think code density is more important than any other constraint, except conceptual simplicity of the language, which is independent of the VM implementation. so it's probably going to be 2 different lits. so how to implement that? it's easy to detect if the high byte of an address is zero, since the flags will be set. this could be the clue. the address space this overlays is just the boot monitor, so that's no problem. some bit-twiddling. 'x>IP' uses movff so it doesn't affect flags, which means the zero flag can be tested after the carry flag. are literals important enough to give them half of the address space?
the answer is probably yes:

- they occur a lot
- if they are as cheap as constants, you don't need constants
- 14 bit signed words will cover most use for numbers (counters/accumulators)
- other literals are addresses: make sure the memory model respects this optimization

so how do we give them half the address space:

- effectively: use only 32 kb -> enough for now
- align words to a 4 byte boundary (and possibly reclaim the storage..)

let's go for the first one: only 32kb address space. the other half could maybe be used for byte access? so, encoding primitives as [ EXIT | LIT | ADDR/NUMBER ] gives EXIT -> c and LIT -> n after one shift. this looks nice:

  \ inner interpreter loop
  : continue
    prim@/flags              \ fetch next primitive + flags
    exit? if x>IP then       \ c -> perform exit
    literal? if 14bit ; then \ n -> unpack literal
    execute continue ;       \ execute primitive

  \ interpret doubleword [ 1 | 14 | 0 ] as a signed value.
  : 14bit
    _c>>                     \ [ c | 1 | 14 ]
    1st 5 high? if           \ sign extend
      #xC0 or continue ; then
    #x3F and continue ;

after fixing a bug in 'd=reg' it seems to work.

Entry: parsing
Date: Sat Mar 17 19:00:52 EDT 2007

just fixed the interpreters and direct->forth translators using parser.ss. somehow something doesn't feel right though.. parsing words feel 'dirty'. i'll try to articulate why, since i don't think anything can be done about it. internally, quoting is no problem: you just build a data type (word) that supports quoted functions/programs/symbols.. in CAT this is done by creating primitives that map stack -> (thing . stack). however, in program source code it is problematic: non-quoted compositional code has a 1->1 correspondence symbols<->semantics, and the semantics of successive words is not related. quoting is about modifying the semantics of symbols. one example where this is done very nicely is colorForth: here the color of a symbol is part of the source code, and represents information about how to interpret a symbol name.
in textual form this would be something like

  (red drie) (green 1) (green 2) (green +) (green ;)

here a pair of (color word) represents a single semantic entity. in ordinary forth however, it's not done this way: not all words have a prefix (a color). another way to say it: most words use the default 'color'. so, in a sense, the thing that is 'dirty' is the default semantics. this is not so bad for convenience's sake, but it does require a parser that introduces the semantics. otherwise we would have

  : drie number 1 number 2 word + word ;

which is really what it is parsed to in the end.. the thing i'm being anal about is that CAT has a 1->1 correspondence between syntax and semantics, inspired by Joy. although, this is not entirely true. a syntactic shortcut in the form of (quote thing) is introduced to be able to quote lists and symbols. but this is not entirely necessary:

  '(1 2 3) == (1 2 3) data
  'foo     == (foo) data bar

with these operations being a bit less efficient. that concludes the rant.

Entry: quasiquote
Date: Sun Mar 18 09:00:31 EDT 2007

which leads me to the following. it does make sense to have lists of programs in CAT, where quasiquote would come in handy.

  `(,(+) ,(-))

Entry: program->word
Date: Sun Mar 18 09:09:50 EDT 2007

some nitpicking about constant->word. before, i had quoted programs wrapped using constant->word. this doesn't make sense, since the 'constant' is really a parsed thing, and not a source representation. however, it does enable 'data' to do its work. but why don't i just quote the source of the entire program, and store the parser as semantics? that would be cleaner, but something doesn't feel right there either.. well, actually, i can just delay parsing completely! that seems like the right thing to do: the source can just be retained in its original form, and initial recursion during parsing is avoided, which directly solves the problem of setting! an atom's semantics.
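a quick sanity check of the [ EXIT | LIT | 14-bit ] cell layout from the literals entry above, as a python sketch (the bit positions are my reading of the notes, not verified against the real purrr/18 vm):

```python
# sketch of the threaded-code cell: bit 15 flags exit, bit 14 flags a
# literal, and the low 14 bits hold a signed number (or an address when
# LIT is clear).  illustrative only.

def decode(cell):
    """decode one 16-bit cell -> (exit?, lit?, payload)."""
    is_exit = bool(cell & 0x8000)
    is_lit  = bool(cell & 0x4000)
    n = cell & 0x3FFF
    if is_lit and n & 0x2000:            # sign-extend the 14-bit literal
        n -= 0x4000
    return is_exit, is_lit, n

assert decode(0x4003) == (False, True, 3)
assert decode(0x7FFF) == (False, True, -1)     # all-ones 14 bits = -1
assert decode(0x8100) == (True, False, 0x100)  # exit, address 0x100
```

this also shows why 32kb of address space is enough: addresses only get 14 bits once the two flag bits are spent.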
Entry: lazy eval
Date: Sun Mar 18 10:06:18 EDT 2007

i think i start to see why a lazy language can be so convenient.. i spent quite some time trying to figure out when it's best to evaluate some expression. if this is always "as late as possible" this work should disappear. nevertheless, it's an interesting exercise. for the assembler it might be interesting to write it completely lazily, including the optimizations necessary for jumps, which i still need to implement.

Entry: disassembler
Date: Sun Mar 18 14:40:43 EDT 2007

the disassembler needs to be smarter. i probably need to add some semantics to the fields, and have a platform-specific map translate them: resolver closure + asm code -> [ shared code ] -> disassembled -> prettyprint.

Entry: open files
Date: Sun Mar 18 17:15:50 EDT 2007

something is terribly wrong with the open files.. fixed by manually closing. i think i need to read about how ports get garbage collected, or not.. indeed, they are not: they need an explicit close or an abstraction: http://list.cs.brown.edu/pipermail/plt-scheme/2004-November/007247.html

Entry: where to go from here?
Date: Tue Mar 20 01:01:02 EDT 2007

enough mudding about. roadmap:

- get dtc working with host interpret/compile
- make it self hosting
- combine with synth
- dspic asm + pic18 share/port

Entry: a safe language?
Date: Tue Mar 20 01:22:58 EDT 2007

  [ 1 + ] : inc
  [ 2 + ] : inc2

is it possible to make a safe language without too much trouble? something like PF, without pointers..

  [1 2 3] [1 +] for-each

the interesting thing is that i can use code in ram if i unify the memory model.
i think it's time to start to split one confusing idea into 2:

- a 16/24bit dtc forth for use with sheepsynth dev: control computations
- a self contained safe linear language for teaching and simple apps

safe means:

* no raw pointers as data
* no accessible return stack, so it can contain raw pointers
* no reason why numbers need to be 16 bit: room for tags
* types:
  - number  [num | 1]
  - program [addr | 0]

features:

* symbols refer to programs, special syntax for assignment
* assigning a number to a symbol turns it into a constant
* for, for-each, map, ifte, loop

  [ 1 + ] -> inc
  1 -> inc
  [[1 +] for-each] -> addit

now.. lists? the above is enough for structured programming, but map and for-each don't make much sense without the data structures.. so programs should be lists, at least semantically. since flash is write-once, a GC would make more sense than a linear language.. so what about: purrr/18 -> purrr -> conspurrr. maybe it's best to stay out of that mess.. cons needs ram, not some hacked up semiram. what about using arrays? if programs are represented by arrays instead of lists, not too much is lost:

  [1 2 3 4] [PORTA out] for-each ;; argument readonly = ok
  [1 2 3 4] [1 +] map            ;; argument modified in place (linear)

the latter one needs copy-on-write.

  [[+] [-]] [[1 2] swap dip] for-each

what about

  [1 2 3] [1 +] map -> test

1. arrays are initially created in ram, as lists?
2. when assigned to a name, they are copied to flash
3. assignment is a toplevel operation, effectively (re)defining constants
4. flash is GCd in jffs style.
5. words can be deleted.

in ram: one cell is 3 bytes: 2 bytes for contents + 1 byte next pointer. this leaves room for 256 cells, or 768 bytes. it might be interesting to make assignment an operation that's valid anywhere: persistent store.. on the other hand, that encourages misuse. so..

- free lists make no sense in flash
- they do in ram
- persistent store rocks

in order to make this work, i need to write a flash filesystem first.
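one way to read the number [num | 1] / program [addr | 0] tagging above in code (a sketch; the exact bit layout is illustrative, not a spec):

```python
# sketch of low-bit tagging: the low bit distinguishes a number from a
# program address, leaving 15 bits of payload in a 16-bit cell.

def tag_number(n):    return ((n & 0x7FFF) << 1) | 1
def tag_program(a):   return (a & 0x7FFF) << 1
def is_number(cell):  return bool(cell & 1)
def untag(cell):      return cell >> 1

assert is_number(tag_number(42)) and untag(tag_number(42)) == 42
assert not is_number(tag_program(0x200))
assert untag(tag_program(0x200)) == 0x200
```

this is exactly the "no reason why numbers need to be 16 bit" trade: one bit of range buys a safe type check on every cell.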
problem: does redefining a word redefine all its bindings? it should. so each re-definition needs to be followed by a recompilation. nontrivial. this gets really complicated... can't we represent code as source, and cache it in ram? it looks like variable bindings should really be in ram. but what with the persistent store? damn. dumbing it down ain't easy. i think maintaining the late binding approach is infeasible. maybe it's good enough to clean up the semantics a bit? 1->1 syntax->semantics mapping (i.e. choose is the only conditional) so code can be used as data using 'map'. maybe that does make sense.. 'map' as 'interleave'. ok, that's enough.

Entry: language in the morning
Date: Tue Mar 20 09:02:11 EDT 2007

after 4 hours of sleep: it's hard to say goodbye to nice ideas when they don't work for practical reasons.. still there's something here. i think i just need to read the PICBIT paper by Marc Feeley and Danny Dube, and base it on that. it looks like i just need to be wasteful: everything is source code, the flash is a filesystem, and the ram contains executable code. the most important of all: it should be towered: purrr/18 -> purrr -> cat/18. so to distill again:

- cons cells in ram
- a flash file system

it is interesting how the linear/nonlinear language thing i'm using and this linear ram and nonlinear flash memory model coincide. the approach in PICBIT seems interesting: using fixed size cells of 24 bits = [2 | 11 | 11], with the types:

  00 PAIR
  01 SYMBOL
  10 PROCEDURE
  11 one of the others

Entry: distributed system
Date: Tue Mar 20 09:53:19 EDT 2007

i was thinking: this tethered approach makes a whole lot of sense in the case of one host controlling a huge number of identical cores.

Entry: back to dtc
Date: Tue Mar 20 10:42:24 EDT 2007

got compile + interpret working. time for control structures. i'm seriously considering only using code quoting. but how to implement? same as in PF?
it's actually not so hard:

  x x x { y y y } z z z
  |
  x x x quot L123 ; y y y ; : L123 z z z

this does require a stack / recursion to associate the labels. another way to deal with it is to solve it in the parser, and use real lists. or the lowlevel forth could be extended to use something like this, which is probably easiest.

Entry: hands on pic hacking
Date: Thu Mar 22 17:10:19 EDT 2007

playing with the synth board. it resets from time to time. found that touching the PGM pin causes this. this pin is floating in my board, so i guess that's where the problem is. the datasheet says:

  CONFIG4L 300006 bit 2: LVP enable=1 / disable=0

indeed, this is on. as long as this is enabled, normal port functions are disabled. moral of the story: disable it, or tie it high, or enable weak pullups.

Entry: sheepsint
Date: Thu Mar 22 17:38:38 EDT 2007

after fixing the PGM bug (LVP disabled now), it still crashes from time to time. i suspect it's some kind of interrupt thing.. let's disable stack reset and see if it still crashes. tss... the watchdog timer was on. stupid.

Entry: modeless interface
Date: Thu Mar 22 18:55:22 EDT 2007

- a modeless interface (unix socket) for emacs to send brood commands
- normal boot vs interpreter based on activity on reset

i should find a decent protocol to interrupt an app: to attach a console easily, but to have it running most of the time.

Entry: partial evaluator
Date: Fri Mar 23 00:50:37 EDT 2007

i'm probably just getting tired, but isn't it a lot better to do partial evaluation on source instead of assembly code? there is some elegance to the greediness of the algorithm. somehow, this feels ok.. but if i type 1 2 +, it's always going to be equivalent to 3.. if literals can be identified at the time they are compiled, their compilation can also be postponed.. i don't really have a good explanation. what i do know is that this works because it is fairly decentralized.. the price paid is "literal undo", which is not so hard, and also works for pure assembler.
don't know if this is going to make sense.. a symbol's semantics is only defined by what machine code it will be compiled into (concrete semantics). for forth, this is either a function call or some inlined machine code. since the latter is highly machine specific, it doesn't really make much sense to separate that out into partial evaluator + optimizer, since the optimizer is going to add some bit of partial evaluation anyway.. it's better to put some effort into making the code separable: some patterns go for all register machines, some go for all pic chips, ... as i found so far:

1. abstractions will arise whenever they are hinted by redundancy or "almost redundancy".
2. if you build an abstraction you don't use later, you lose. abstractions make code more complicated, and are only justified by frequent use.
3. don't hesitate to keep towering abstractions until the redundancy is gone. some problems really do need several layers to encode comprehensibly.

what i'm intrigued by:

4. solve only one thing per layer (one aspect). if the abstractions do not stack, find a way to disentangle them, and weave them back together automatically.

Entry: compiler compiler
Date: Fri Mar 23 10:00:04 EDT 2007

seems you can't really use macros to write macros without extra effort in mzscheme. it defines level 0 and level 1 environments (normal and compiler), but a level 2 (compiler compiler) cannot be easily used without the use of 'require-for-syntax'. the thing i ran into is this: i want to use a macro to generate a pattern matching expression inside a define-for-syntax function that is used to implement a macro that generates a pattern matching expression. maybe it's best i just switch everything to using modules, and reload the full core when i'm reloading. i'm getting a bit tired of these kinds of problems. questions:

1. is it possible to reload a module?
2. how to only recompile what's changed to reduce load time?

Entry: cat as plt language?
Date: Fri Mar 23 13:54:06 EDT 2007

ok, but what is apply in that case?

  (apply fn args) == (run-composite stack composition)

in other words, exchange single code multiple data for single data multiple code. apply then still means: convert data + code into data.

Entry: modularizing cat
Date: Fri Mar 23 14:35:26 EDT 2007

this brings up a lot of problems.. some of the macros i'm using, like snarf-lambda, are not very clean wrt names and values.. i also 'communicate using global values', which is not a very good idea.. so it's going to take a bit longer than expected, but the code should be a bit cleaner when it's done. ok, now for the big one: pic18.ss. generic forth stuff: need to spend some time to separate out the sharables, which is a lot.. i do wonder if i really need both writers and asm state monads.. it is cleaner, but also a bit of a drag.. i need a proper mechanism to do this separation. but first, get this thing to load properly.. got some bugsies here and there. seems the compiler works fine, but the assembler has got some problems. ok, seems to work now. also compilation seems to work.

Entry: macro namespace
Date: Fri Mar 23 19:02:35 EDT 2007

there is really no reason to have multiple macro namespaces. i mean: namespaces are defined using hashes. it's easier to just load the generics, then overlay the specifics, instead of having a lot of special names in the dictionary.. in other words: the pic18* words should be replaced by globally unique things, denoting the fixed functionality:

* machine constants
* simple/full forth parser
* macros
  -> recursive
  -> pattern matchers
  -> writers
  -> asm state modifiers

all specific functionality is added on top by overlaying the code. this used to be done with "load" but is now done using "require". is order of execution preserved with require ???

Entry: double postpone
Date: Fri Mar 23 21:16:39 EDT 2007

i'm running into problems with macro-generating code.. fixed some.
cleaned up some in vm.ss. now i have an interesting problem with delayed eval: macro defs (side effects) get delayed till after the macros are used.. ok, i think i got it.. what about tagging names that are supposed to be cat semantics in a certain way? ok.. this concludes a long run. from the top of my head, things are better now because:

- badnop is better defined as a forth compiler with the fixed functionality mentioned above
- code makes a clear indication if functions are used as cat semantics == code that compiles something into a stack primitive.
- 'compile' and 'literal' are now CAT macros
- the state monad uses a more highlevel wrapper

things to do still:

- constants for disassembler
- disassembler
- core restart
- clean up source file layout, maybe split in more modules + docu

funny... running into an evaluation order problem again.. maybe i should use some kind of module / scheme namespace trick to get rid of this? because load/eval/parse order is kind of arbitrary now.. -> nothing to worry about. it was a stupid typo. got the meta-patterns macro working too. this is actually an interesting idiom: just wrap a single macro around a body of 'define' statements to alter the way they are used: it allows proper syntax highlighting + individual testing.

Entry: so what is badnop?
Date: Sun Mar 25 16:02:37 EDT 2007

a native forth compiler for register machines, with provisions for harvard architectures, and provisions to build a dtc interpreter on top of a native wordlength forth. the platform specific parts are: assembler generator, pattern matching peephole optimizing code generator, and some recursive macros.

Entry: persistent store
Date: Sun Mar 25 18:59:45 EDT 2007

so.. it would be way easier to just have the compiled forms cached on disk. but i guess if that's really necessary i can always write out scheme files and compile them. for the rest: all persistent data should be SYMBOLIC.
this means:

- no compiled CAT code (word)
- no continuations in asm

this seems really important.. an area where compromise leads to unnecessary complexity. i'm going to leave it open, and implement restart by reload, giving only the parameter. this is turning into a "where to put stuff" quest again.. ok: keep it like it is, and put the data stack in the state store + perform some checking to see if data is serializable before writing it out.

Entry: debugging tools
Date: Mon Mar 26 15:10:12 EDT 2007

need more debugging tools:

- some safe way of dealing with the bootblock (mainly isr)
OK- on-demand console: interrupt app
OK - proper disassembler
- 'loket'
- documentation: how to document the language?

dasm needs some thought. the interrupt app is as simple as polling the rx-ready flag, i.e. "begin params rx-ready? until"

Entry: i need something new
Date: Tue Mar 27 10:22:26 EDT 2007

the dasm might be interesting.. maybe i should do that. but i'd like to do something exciting today :) wrote some badnop docs, changed some names.. maybe i should have user definable semantics accessible in CAT itself? (more reflection?)

Entry: the road to PF
Date: Tue Mar 27 11:13:48 EDT 2007

ok, time to write PF in forth, by gradually bootstrapping into different languages. the lifts are:

1. vector -> linear lists
2. non-managed -> refcount managed
3. untyped -> typed/polymorphic
4. proper GC
5. scheme

the first lift is the same as the one i already did, which is lifting native code to a vectored rep. the lower interpreter's composites become the higher interpreter's primitives. however, if data is also being lifted, the change is in no way trivial: primitives won't accept the data until it's moved to a linear stack. so maybe this needs to be separated: the lift to lists is different for data than it is for code. on the other hand, it does look like a nice place to insert some type checking code. need to think a bit more..
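the "lower interpreter's composites become the higher interpreter's primitives" lift can be illustrated with a tiny sketch (python stand-in, all names invented):

```python
# two-level interpreter sketch: an interpreter is just a dict of
# primitives mapping stack -> stack.  a composite at level n, wrapped
# in a callable, registers as a primitive at level n+1.

def make_interpreter(primitives):
    def run(program, stack):
        for word in program:
            stack = primitives[word](stack)
        return stack
    return run

low = make_interpreter({
    'dup': lambda s: s + [s[-1]],
    '+':   lambda s: s[:-2] + [s[-2] + s[-1]],
})

# the low-level composite [dup +] becomes the high-level primitive 'double'
high = make_interpreter({
    'double': lambda s: low(['dup', '+'], s),
    '1+':     lambda s: s[:-1] + [s[-1] + 1],
})

assert high(['double', '1+'], [5]) == [11]
```

the data-lifting caveat from above shows up here too: this only works because both levels happen to agree on the stack representation.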
Entry: multimethods
Date: Tue Mar 27 11:49:14 EDT 2007

i had this idea of representing types using huffman coding, in a binary tree. this requires a set of fixed types and some information about which ones are used most, but it might be quite optimal. there is a lot of room for optimization here, moving type checks outside of functions etc.. but it will probably require some type specs.

Entry: poke
Date: Wed Mar 28 10:57:34 EDT 2007

let's write poke again, the PF vm. the first thing i need to do is to generate C code from some sort of s-expression. expression conversion seems trivial, just need to distinguish between the builtin infix operators and prefix expressions with comma separated argument lists. statements are more problematic. bodies are straightforward, but how to handle special forms like for/while/do? seems i got most of it running now. main features:

- an s-expression interpreter with a primitive and a composite level
- used to implement 2 interpreters for statements and expressions

now i was thinking if it would be possible to create some kind of downward lambda. i can't use the gcc extension.. yes, but i do need to allocate ALL functions in structures, meaning explicit activation records, and use lexical addresses. if this is used, it's better to completely forget about any local C variables.

Entry: downward funargs
Date: Thu Mar 29 16:12:22 EDT 2007

so, an attempt to create a 'downward lambda' for poke, allocating on the stack for now, with the later possibility to allocate on the heap. how hard is this to have in some form? simplifications:

- all cells are the same size
- values are pointers to 'object'

this needs quite a bit of support:

- environments
- closures

the function bodies themselves take:

- environment
- arg list (part of environment?)
a function invocation is:

- create environment extension
- run function
- cleanup environment extension

  {
    object_t env[3];    // parent + 2 variables
    // invoke a function 'FUN'
    ({
      // create new environment
      object_t ext[2];
      ext[0] = env;     // link parent
      ext[1] = 123;     // init first and only arg
      FUN(ext);         // invoke fun
    });
  }

this resembles PICO. ok.. going a bit too far here. what about introducing these features when they are really needed? one question though.. if only downward closures are needed, why not use dynamic binding instead? nuff.

Entry: back
Date: Thu Mar 29 17:50:44 EDT 2007

back to the code generator. the reason i wrote this was twofold. one is to have a portable target for brood forth. the main idea there is to rewrite mole into something more graceful, and have a basis for (re)writing PF. and two: i need a language for expressing the signal processing code in PF. this should not be forth, but a multi -> multi dataflow language. maybe just forth + protos? so, i think the next step should be to transform the current cgen (poke) so it has an extensible name space. maybe it is a good time to look into defining new languages inside PLT, since that's what i'm doing basically, instead of mucking about with explicit environment hashes and interpreters. something to iron out: it's not a new language, it's a cross-compiler: you want to define functionality accessible in one name space using functionality accessible in another name space.

Entry: extending cgen name spaces
Date: Fri Mar 30 10:31:15 EDT 2007

i don't really need to make the hash tables available. it's much easier to just create a new interpreter function which falls back on the basic one defined in cgen.ss. hmm.. i got myself in trouble again. the above doesn't work since statement/expression are mutually recursive. in addition to that, statement uses closures. maybe i do need a hash? ok, i think i got it ironed out a bit.
using a hook for both the expression and statement formatters, and calling this hook recursively, does the trick.

Entry: compiler structure
Date: Fri Mar 30 15:01:41 EDT 2007

so.. basically, a compiler/assembler/whatever has the following 'natural' structure:

  T = target language
  S = source language
  C = compiler language

it's best to separate the S -> T map into:

  primitive macros  S -> T  (small)
  composite macros  S -> S  (big)

you want to write both S -> T and S -> S maps in C. the reason you want an S -> S map is because it contains higher level code than an S -> T map. one pitfall is to shield functionality in C by not properly mixing in the T name space. the most straightforward way to implement both maps is quasiquoting: quoted S or T and unquoted C. including the compiler language is more precise:

  primitive: C,S -> T
  composite: C,S -> S

badnop is already organized this way: the primitives are peephole optimizing pattern matchers, where C is scheme. writers and state modifiers are composite, with C being cat. and the recursive macros are a cleaner S -> S map, with C empty.

Entry: lifting
Date: Fri Mar 30 15:22:03 EDT 2007

now for the ambitious part. the thing that got my whole forth/PF thing started is a desire to generate automatic control structure for video DSP building blocks. basically:

  IN:  a highlevel description of how pixels are related through operations
  OUT: a compiled representation processing images / tiles

the core component here is loop folding:

  (loop { a } then loop { b }) -> (loop { a then b })

the win is a memory win: intermediates should not be flushed to main memory. so compilation generates the control structure. compilation 'lifts' the pixel building blocks into something interwoven with the control structures.
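the loop folding transform above, as a minimal sketch (the representation ('loop', count, body) is made up for illustration):

```python
# sketch of loop fusion: two adjacent loops over the same range are
# merged into one, so intermediates can stay in registers/cache
# instead of being flushed to main memory between the loops.

def fold_loops(prog):
    out = []
    for form in prog:
        if (out
                and isinstance(form, tuple) and form[0] == 'loop'
                and isinstance(out[-1], tuple) and out[-1][0] == 'loop'
                and out[-1][1] == form[1]):       # same iteration count
            prev = out.pop()
            out.append(('loop', form[1], prev[2] + form[2]))
        else:
            out.append(form)
    return out

prog = [('loop', 64, ['a']), ('loop', 64, ['b'])]
assert fold_loops(prog) == [('loop', 64, ['a', 'b'])]
```

real folding also has to check that 'b' only consumes what 'a' produced at the same index; this sketch skips that dependency check entirely.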
Entry: grid processing
Date: Fri Mar 30 16:01:57 EDT 2007

the possible optimizations depend tremendously on the amount of information available on the individual processors, so the idea is to keep the primitive set really simple, and look at their properties:

* associative (a): n-ary op consisting of n-1 binary ops
* commutative (c): binary op
* linear (l) / linear in the first argument (l1)

  +    l a c
  *    l a c
  /    l1
  abs

the typical structure to look at is a one dimensional FIR filter, since this can be extended to 2D (space) and 3D (space+time) filters.

  (* gain (+ x (n x 0 -1) (n x 0 +1)))

let's analyze. 1 and 3 are constants, so (/ 1 3) can be evaluated. x is used in a 'n' expression, which we use to denote membership of a grid. let's make all parameters into grids:

  (* (gain) (+ (x 0) (x -1) (x +1)))

so (gain) is a 0D grid, (x 0) is a 1D grid, (x 0 0) is a 2D grid, etc.. composite operations can be specified, for example

  (processor (a b) (+ (a) (b 0) (b 1)))

this means all parameters need to be declared, since we need to know the order. the syntax i'm using here requires ordered parameter lists. i prefer this over keywords, since it is more compact, and we need to fill in all inputs anyway (no explicit defaults). another interesting operation on an expression is to compute its reverse: an expression represents a dependency graph, which can be inverted. however this is only interesting for multiple inputs, which we won't use yet: apply explicit subexpression elimination and graphic programming. ok, so we need parameter names. another interesting operation is fanin: how many times is a single value used? this is important for memory management (linearization). note that linearization and operation sequencing is almost equivalent to translation to forth. maybe it's time to go for the first iteration binder. we map a single function to an explicit iterator, i.e.

  (+ (a 0) (a 1))

it has a single 1D grid input, and produces a single grid. ah! something i forgot: what's the output type?
a grid of dimensionality equal to the maximum of the input grids. so, an n-dimensional grid is placed on the same notational level as an n-ary procedure. ok. the above can be transformed to the loop body

  (+ (index a (+ 0 i)) (index a (+ 1 i)))

where a runs over the line. the rest is border values:

  (+ left (index a 0))
  (+ (index a w) right)

so the idea is to make the loop body and the 2 borders. implementation (see ip.ss):

  implicit -> explicit
  (a 0 1 -1) -> (a ([I 0] 0) ([I 1] 1) ([I 2] -1))

where the [I n] tags mark loop depth.

Entry: thinking error
Date: Fri Mar 30 19:53:11 EDT 2007

the error i made previously was to 'precompile' things: bind stuff to tiles, then bind some stuff later in an interpreter. the problem with this is that you're solving the same problem twice. not very good.. a much better idea is to keep everything in a highlevel description, then compile it as composition goes on. one thing i'm dreaming about is to build things in a pd patch, then hit 'compile' for an abstraction, and it will compile an object that performs the operation. so, the other error was to use low level reps. forth has benefits, but not for writing compilers, which is mainly template stuff: mixing name spaces. you really need quasiquoting and random parameter access. EDIT: this is what's so nice about the scheme macro system: the mixing of compiler and target namespaces works really well.

Entry: monads and tree accumulation
Date: Sat Mar 31 10:44:58 EDT 2007

writing the source code analysis functions i run into the following problem: map a tree, but also run an accumulation. now of course it's easiest to just use local side effects here, since they behave functionally from the outside (linear data type construction). but just out of curiosity, what kind of structure is necessary to do this functionally? the basic idea of monads: if you don't save 'extra' data in the environment, save it in the data. this requires 'map' to be polymorphic, so it can act on this type accordingly.
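for reference, the functional version of "map a tree while accumulating" just threads the accumulator through the traversal explicitly; a sketch (python stand-in for the scheme version, all names invented):

```python
# map over a tree (nested lists) while threading an accumulator, with
# no side effects.  this is exactly the state-threading that a state
# monad would hide behind a polymorphic 'map'.

def map_accum(f, acc, tree):
    """return (new_tree, new_acc); f maps (acc, leaf) -> (new_leaf, new_acc)."""
    if isinstance(tree, list):
        out = []
        for node in tree:
            node, acc = map_accum(f, acc, node)
            out.append(node)
        return out, acc
    return f(acc, tree)

# example: double the leaves while counting them
def count_and_double(acc, leaf):
    return (leaf * 2, acc + 1)

tree, count = map_accum(count_and_double, 0, [1, [2, 3], 4])
assert tree == [2, [4, 6], 8]
assert count == 4
```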
i don't think it's worth the trouble here.

Entry: boundaries
Date: Sat Mar 31 12:14:13 EDT 2007

border values: using finite grids, borders need to be handled. basically, invalid indexing operations need to be replaced by valid ones. some strategies:

  constant: (a -1 0 0) -> c
  repeat:   (a -1 0 0) -> (a 0 0 0)
  wrap:     (a -1 0 0) -> (a (wrap -1) 0 0)

how to name border regions? there are several distinct cases, for example a square grid has these:

  (L L) (I L) (H L)
  (L I) (I I) (H I)
  (L H) (I H) (H H)

  L  low boundary
  I  bound to iterator
  H  high boundary

with looping indicated by { ... }, a full 2D loop looks like:

  (L L)        ;; top left
  { (I L) }    ;; top
  (H L)        ;; top right
  {
    (L I)      ;; left
    { (I I) }  ;; bulk
    (H I)      ;; right
  }
  (L H)        ;; bottom left
  { (I H) }    ;; bottom
  (H H)        ;; bottom right

that basically solves the problem. note that it's best to lay out the code in an L I H fashion to keep locality of reference. on to representation. the loop body is a serialization of an N-dimensional 3-grid (a 2-grid is a hypercube). it's serialized into a ternary tree. how to represent ternary trees? the following representation looks best in standard lisp notation:

  ((L . H) . I)

other variants have the dot in an awkward place. another possible rep is (I H L), which can be written in mzscheme's infix notation as (H . I . L). i'm going for the former, as it allows the use of (B . I) in case L and H are the same, in order to generate the full loop body. EDIT: it's easier to just use s-expressions: (range H I L), and have 'range' be a keyword.. loop borders can be constructed using the data structure provided by 'src->loopbody'. ah! it's possible to separate the operations performing loop order allocation and pre/post expansion, but probably not very desirable.. so let's combine them, so we can get rid of using natural numbers. note: i found out that whenever i need index lists, i'm doing something wrong: applying a certain order on things...
so, in order to generate the tree above, we consume coordinates from left to right. all loop transformations need to be done on the source code before generating loop bodies. Entry: lexical loop addresses Date: Sat Mar 31 14:33:53 EDT 2007 i need a notation for addressing loop indices. currently i'm converging on not updating pointers in a loop, but using indexed addressing, since that's something that can be done easily in hardware. an optimization here is to use relative addressing only for the inner loop, so only one index needs to be added, and cache the computation for all other relative accesses. each loop has exactly one index that's being incremented. the depth of the loop determines how many indices are bound. what i'm trying to do is to generate the border conditions that don't have all indices bound. how to do that? loop a { ... data (a) ... loop b { ... data (b c) ... loop c { ... data (a b c) ... } } } the inner loop here needs to be split into 3 parts data (l b c) data (a b c) data (h b c) then the 2 unbound parts can be moved out of the loop. so, basically: BODY -> (nonfree . free) as an example, take (+ (a 0) (a 1)) split into (+ (a 0) (a 1)) ;; border (+ (a (i 0)) (a (i 1))) ;; body since code is originally in unbound form, it might be more interesting to perform binding inward. start from the relative description, and split this into a partially bound and partially filled structure. border <- relative -> bound then iterate downward. before this is possible, all code needs to be translated to a full 'virtual full grid' form. later on, it can be substituted back to its original form.
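the L/I/H region bookkeeping above is easy to model. a minimal sketch (python as neutral pseudocode, not brood code; 'regions' and 'loop_depth' are made-up helper names):

```python
from itertools import product

# each dimension of a finite grid splits into three regions:
# L (low border), I (iterated bulk), H (high border)
def regions(ndim):
    """enumerate all border/bulk regions of an ndim grid."""
    return list(product("LIH", repeat=ndim))

def loop_depth(region):
    """a region sits inside a loop nest of depth equal to the
    number of indices bound to an iterator."""
    return region.count("I")
```

for a square grid, regions(2) yields exactly the 9 cases listed in the boundaries entry, and only ('I', 'I') is the full-depth bulk.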
Entry: breadth-first Date: Sun Apr 1 16:52:47 EDT 2007 i think this is the first time i ever encountered a problem that's more easily solved using breadth-first expansion. hmm.. that's probably plain bullshit.. it's just my particular approach at this moment using an 'infinite' expansion with an escape continuation: (define (expand e) (call/cc (lambda (done) (let ((expand-once (lambda (f) ... (done e)))) (expand (expand-once e)))))) basically, this just iterates expand over and over, and backtracks to the last correct expansion 'e' whenever some termination point is reached in expand-once. ok, abstracted this into 'expand/done' Entry: separation of concerns and exponential growth Date: Sun Apr 1 19:15:54 EDT 2007 was thinking.. separation of concerns: hyperfactoring, whatever you call it, is a means to move from linear -> exponential code dev.. once you can separate things into independent parts A x B, increasing functionality in either will increase total functionality by the same multiplication factor. if they are not separated, an increase in complexity doesn't translate to an increase in functionality. this is very badly explained, but i think i sort of hit a spot here. compare the payoff of time invested in building independent/orthogonal building blocks that can be combined, against the payoff of tweaking a small part of a huge system. the added complexity (information, code size) might be the same, but the added expressivity (possible reachable behaviours) is hugely different. multiplication in the first, and addition in the second. it's the difference between adding a bit in state encoding (exponential), and adding a state (linear). Entry: the inner loop Date: Sun Apr 1 19:26:11 EDT 2007 how to encode the innermost loop? for example start with (+ (a (I 0) (I 1)) (a (I 0) (I 0))) with the inner loop being the last index (arbitrary choice). the main question to answer is: "relative or absolute addressing?"
either one uses explicit pointer arithmetic, or one uses index registers. for the outer loops, increments occur infrequently, so it's best to use pointers. a -> pa (+ (pa (I 1)) (pa (I 0))) so, the number of registers used for addressing in the inner loop is equal to the number of grids (including the output one), and one loop index. if addressing modes like BASE+REL+OFFSET are not available, extra pointers or indices are needed. i seem to remember that incrementing pointers using the ALU is bad on intel, and it's better done using the AGU.. i guess there's a lot of room for doing this right or wrong depending on the architecture. and i swore never to write intel assembly again :) if C is the target language, i guess some experimentation is in order. for simple processors, it seems quite straightforward how to subdivide things so maximum throughput can be attained. i guess the next target is to generate actual code. that should iron out the conceptual problems.. Entry: inner loop cont Date: Tue Apr 3 09:47:24 EDT 2007 the problem is, the indentation shown by 'print-range' is not the same as the indentation for the C code loop blocks. setup code needs to be moved out of the loops. going from inner -> outer: (+ (grid a (I 0) (I 0)) (grid b (I 0) (I 0))) needs to be translated to (update a 0 (I 0)) (update b 0 (I 0)) (+ (grid a (I 0) 0) (grid b (I 0) 0)) (downate ...) effectively updating the pointers before the loop is entered. i was thinking about just shadowing a single variable 'i'. in that case, what is necessary is to make sure each expression referencing I has only one occurrence (or an occurrence in the same position). instead of constructing an intermediate range representation, it might be more valuable to generate the loop structure directly, following the same approach as before.
(a (0 1 2)) -> (a (L 0) (1 2)) (a (I 0) (1 2)) (a (H 0) (1 2)) -> (let ((a (L a 0))) (a 1 2)) (let ((a (I a 0))) (a 1 2)) (let ((a (H a 0))) (a 1 2)) so, basically just specializing variable names. this boils down to computing pointers. so, to resume: the downward motion is (expr (+ (a 1) (a 0))) -> (bind ((a_p1 (S a 1)) (a_p0 (S a 0))) (expr (+ (a_p1) (a_p0)))) ... ok, i think i got somewhere: > (p '(+ (a 0 0) (+ (a 1 0) (a 1 1)))) { int i; for (i = 0; i < (400 * 300); i += 300) { float* a_p1 = a + (i + (1 * 300)); float* a_p0 = a + (i + (0 * 300)); float* x_p0 = x + (i + (0 * 300)); { int j; for (j = 0; j < 300; j += 1) { float* a_p1_p1 = a_p1 + (j + 1); float* a_p1_p0 = a_p1 + (j + 0); float* a_p0_p0 = a_p0 + (j + 0); float* x_p0_p0 = x_p0 + (j + 0); *(x_p0_p0) = (*(a_p0_p0) + (*(a_p1_p0) + *(a_p1_p1))); } } } } now, there are quite a few possible optimizations or simplifications. one is to leave the inner level as indexed pointers. another is to replace stride multiplication with addition. Entry: scheme syntax Date: Tue Apr 3 22:35:32 EDT 2007 today i (re)discovered: (define ((x) a b) (+ a b)) and was surprised that it also works for (define (((x)) a b) (+ a b)) first saw it used in SICM Entry: accumulation / values Date: Tue Apr 3 23:29:08 EDT 2007 i need an abstraction for (linear) accumulation. no need to mess with monads. the pattern i'm finding is: * substitute expression in tree + accumulate a set i want a function that returns 2 values, the original expression and the accumulated set. note that use of assignment like this isn't so bad, because it's encapsulated (linear): there are no references to the object until it's ready. also note (again) that using monads requires polymorphic versions of generic list processing operations, and is overkill.
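the substitute-and-accumulate pattern could be sketched like this (python as neutral pseudocode, not the scheme used in brood; 'map_accum' is a made-up name):

```python
def map_accum(f, tree):
    """map f over the leaves of a nested-list tree while
    accumulating the set of leaves that were rewritten.
    returns two values, (new_tree, accumulated_set) --
    the assignment is local and linear, invisible outside."""
    acc = set()
    def walk(node):
        if isinstance(node, list):
            return [walk(n) for n in node]
        new = f(node)
        if new != node:
            acc.add(node)
        return new
    return walk(tree), acc

# substitute symbol 'a' by 'a_p0', collecting what was touched
tree = ["+", ["a", 1], ["a", 0]]
new, touched = map_accum(lambda x: "a_p0" if x == "a" else x, tree)
```

the closure over 'acc' plays the role of the encapsulated side effect: from the outside the function is pure.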
the 'lifting' technique used in the compiler does need monads, because they are open: each operation modifies a state, and intermediates are accessible, so pure functional programming is a good idea to keep backtracking/undo tractable. Entry: aspect oriented programming Date: Wed Apr 4 08:56:29 EDT 2007 1972: Parnas "On Decomposing Systems" 1976: Dijkstra introduces the term "Separation of Concerns" 1982: Brian Smith introduces "Reflection" 1991: Metaobject Protocols 1992: Open Implementations 1993: Mini Open Compiler 1997: First paper on AOP 1997: D 2001: AspectJ 2004: JBoss http://www.cs.indiana.edu/dfried_celebration.html Anurag Mendhekar: Aspect-oriented programming in the real world Entry: back to sheepsint Date: Wed Apr 4 12:49:40 EDT 2007 i need to restart the board design soon, but i do need a fully functional dev env before i can do that. some more things are necessary: proper stateless message interface (CAT = object) for sending code and performing command completions. Entry: summary Date: Wed Apr 4 19:21:27 EDT 2007 THINKING ABOUT PF been looking into bootstrapping PF from a lowlevel forth core. aspects: polymorphism and types (clos), linear memory management (lazy copy), transition from vector -> list. the latter is interesting since it contains 2 parts: code: needs a new interpreter, data: needs a lot of new primitives, maybe combined with type checking. i wonder whether it's easier to just start from a cons cell VM directly. C CODE GENERATION * separate statements and expressions * plugin expression transformers POKE * using a non-blocked version of C gen LOOP CODE GENERATION i think i have the general idea: * c code generation working * functional specification mapped to assignment * nested loops: blocks to bind locally cached index pointers * additive index arithmetic * inner loop uses a single index the scheme code looks simple, and well factored. gut feeling says the code is simplified enough for gcc's optimizer to tackle it.
i still need to do the border conditions. this will need to be example driven. next month i might try to plug in some code. Entry: from forth to PF Date: Wed Apr 4 19:37:52 EDT 2007 1. data a PF primitive written in forth looks like: - (force) collect arguments (list -> vector) - method lookup - perform primitive forth code - (lazy) push arguments (vector -> list) so the stack is implemented like: [ list | vector ] the vector actually needs to be a circular buffer, because it behaves as a deque: traffic between list and vector is at the bottom end, while primitives operate at the top end, unless the primitives accept their arguments reversed. 2. code fairly straightforward. because of the difficult impedance match between list and vector machines, i think it makes sense to forget about building one on top of the other, and write only the vm. an interesting question is whether this can be abstracted. and also, can i write the VM in itself? been tinkering a bit with poke.ss and mole.ss. got the basic permutation worked out. Entry: alan kay name dropping Date: Wed Apr 4 19:59:37 EDT 2007 from "Proposal to NSF - Granted on August 31st 2006 - Steps Toward The Reinvention of Programming" i'm curious about the albert thing. what i read i don't understand though.. better next time. motivation and inspiration: John McCarthy LISP ... bootstrapping Entry: persistence & late binding Date: Thu Apr 5 16:11:14 EDT 2007 so, elaborating further on that article.. i ran into the problem of saving parsed code, because semantics is stored as a procedure. what about replacing this by a symbol? assuming data will only be read by a system that has the bindings in place, this is a valid approach. then bootstrapping can be solved differently, and all internal representation is just cache. so.. a word = code object * a source representation * a symbolic semantics (other word?)
* a cached transformer procedure (concrete semantics) * a cached meaning = lambda expression the cycles in this representation need to be broken somehow. hmm.. this is actually a lot harder than it sounds, since the cache really needs to be a cache. probably needs a from-scratch approach. ok. started the 'symcat' project. for the current project i think i can live with non-savable parse trees, since it's always possible to save source code, and i have a working 'reload core' command for use during compiler development. all in all, the system i'm writing is fairly straightforward. so no more about this really cool idea here. see symcat. Entry: name spaces Date: Thu Apr 5 16:59:24 EDT 2007 something that's getting on my nerves a bit is CAT namespaces. small special purpose apps can benefit from the simplicity of a single namespace and short names, but for CAT i'm not so sure any more. also, i'd like to catch undefined names early on. Entry: standalone Date: Thu Apr 5 17:06:15 EDT 2007 time for the standalone forth. one of the things i've been wanting to try for a while, but never got to.. i should have a look at flashforth and also retroforth for inspiration. roadmap: * 'accept' terminal input into buffer * 'parse' words * 'find' a word in the dictionary compilation is straightforward, but requires some thinking since stuff will need to go to ram first. (it's multipass, i.e. if .. then). Entry: reflection Date: Thu Apr 5 19:37:03 EDT 2007 the ideas of reflection and metacircularity probably go hand in hand.. in CAT i'm getting a bit annoyed by having to choose between implementing something as a scheme function, or as a cat function. for example: semantics is implemented as a scheme function, so it's technically not accessible from CAT. let's re-iterate. the point of CAT... usually, a forth compiler is written in forth. a cross compiler poses problems in this sense, since the normal 'local feedback loop' doesn't work. the (re)constructed rationale: 1.
forth is extremely modular: a function is a composition of functions 2. a forth compiler is most naturally expressed in the same way: a forth compiler is a composition of compilers (macros). 3. most naturally, forth is implemented metacircularly. 4. i can't do that because the target is too simple -> simulated 5. the metalanguage best reflects the same structure: compositional 6. choosing a functional language (CAT) -> monadic composition 7. CAT is written in scheme to avoid its own bootstrapping problem the last one actually reads as: CAT is an impedance map from scheme to a compositional language, to make it easier to implement an extensible optimizing forth compiler. if CAT is metacircular, there is no need for scheme. this approach is not used because: - (plt) scheme is packed with features - i use a fair amount of scheme to provide primitives. in fact 'primitives' is not really a good word for it.. so it's best to see CAT as scheme in disguise, and as a vehicle for a decentralized compiler/interpreter, bound together by monadic composition. the possibility of writing new CAT words is mainly for extensibility (writing the compiler), not for the CAT core. Entry: nested scope Date: Thu Apr 5 19:55:49 EDT 2007 as i've learned, these features are really necessary to write a compiler: * lexical variables * quasiquotation * pattern matching however, they mostly serve to adapt to a representation that is inherently imposed, i.e. assembly language syntax. anything that is non-compositional is better handled with something like scheme. however, if you can design everything from scratch, it's probably quite doable to get by with a couple of combinators and aggressive factoring. but, in the end, some form of lexical scope should be possible, if only for the practical problem of name clashes.. there is only one question. are names functions or values?
in lisp, they are values, because functions are explicitly invoked: if a variable is in the head of a list, it's a function. in a compositional language it would involve something like 'i'. ((a b c) locals ... a i ... b i ...) treating things like values makes them more natural. an abstraction could be added to do the other (bind as program). then, how to handle the environments? NOTE: got lexical variables and quasiquotation working in symcat, but only by a more direct cat->scheme translation. i don't think it's really necessary here, since i do most of it in scheme. also, some name space issues are still not resolved. maybe i can switch for the next rewrite tho :) Entry: back in the solder lab Date: Mon Apr 23 17:10:12 CEST 2007 things i need to get working before the end of the week: * sheepsint input switches * room for xtal on pcb * capacitors on pots random hacking: * 3.3V serial interface * usb? Entry: emacs integration Date: Mon Apr 23 20:03:23 CEST 2007 this screams for a 'once and for all' solution. i'd like to keep brood portable, so using unix sockets for a console, as is done for pf, is not the way to go. since we're running a lisp in a lisp editor, it's probably best to keep the one 'default' interface on stdin/stdout as a lisp channel, and run the console logic in emacs. maybe a bit in the style of slime? ok.. following slime to ielm.el, modified to connect it to a running scheme process. slime is too big for me to make sense of, i might return later for some features, but i need to get something running first. what i need is multiple languages on the same console, or maybe different buffers? the whole idea is to have most of the parsing in emacs, so emacs can make the editing a bit smarter. maybe i should have a look at: Entry: erepl Date: Wed Apr 25 14:18:32 CEST 2007 looks like it's working reasonably well.. things to add: * tab completion * multiple languages either parser in emacs, or sending out raw lines.
the former allows better line editing; it already does that really.. the latter is better because i don't need to rewrite anything, though forth parsers are really simple and i'm not using a tremendous amount of special plt read syntax. i wonder if emacs read syntax is extensible? anyways, what i do need is a way to switch the mode in emacs, and not in the target scheme image. Entry: fresh install Date: Thu Apr 26 09:16:42 EDT 2007 i tried a fresh install, but apparently my compile script tries to compile stuff in the plt dist, starting with the deps of "match.ss". "sudo ./go" should work.. so, how to install? should i keep all the source files as 'writable'? should i keep it in dev land for a while? maybe best. Entry: project directory Date: Sat Apr 28 19:24:10 CEST 2007 i need to solve the following problems: - core should be installed system-wide - project directory should contain multiple projects the idea is that 'clicking' on a state file should bring up everything. let's try to make sense of this: the brood system is aimed at developers. in that sense, it is encouraged to hack the system, which means the scheme files should not be stored system-wide, and they should be writable. this allows the compilation cache to remain as it is. the source dir has a subdir called 'prj' which contains subdirectories, one for each project. these individual subdirectories could be managed using darcs. it's absolutely essential to find a way to have the TARGET determine which project to load. in order to do this, we use the reply of 'ping' as the name of the project. there is one default project for each architecture, which serves as an example. -> compilation from scheme: right now i invoke mzc, it's probably better to do so from a scheme script. all this seems to work.
next problems: * windows / osx : emacs + serial port config * using snot : rewrite all language repls to a standard interface : one line (string) at a time, requiring from snot that it be 1 or more valid s-expressions. for the last one, i think i found it: just have 'prompt' display the prompt and accept the next line of input; this can be done using a simple coroutine/continuation trick. Entry: getting to working usb Date: Sun May 6 13:04:44 CEST 2007 roadmap: * constants as forth file * platform dependent constants * 2550 init * get serial monitor working * ... Entry: usb debugging Date: Mon May 7 13:36:31 CEST 2007 got the kernel messages going etc.. looked at doc/usb/asmusb.asm (johannes adapted this from C code) to find out i need to enable full speed instead of low speed: #0x14 -> UCFG. now i get transactions. time for the highlevel protocol. Entry: usb device descriptors : usb.ss Date: Sat May 12 13:25:30 CEST 2007 looks like it's working: i can compile device descriptors from a more reasonable highlevel description. next step is to organize the tables in flash. ignorant of content, the thing it needs to do is to map device -> (n,addr) (string,i) -> (n,addr) (config,i) -> (n,addr) the logic then needs to transfer the buffer in chunks so i need a proper tree structure in flash. preferably one that can handle errors so the device is a bit robust. these things are read-only, so they can be implemented directly as code.
for example: device/string/config ( id -- string ) which is encoded as : device 3 word-table addr0 ,, addr1 ,, addr2 ,, : addr0 length , 1 , 2 , : addr1 length , 3 , 4 , 5 , here 'word-table' does bounds checking and throws an exception for error handling: it's probably easier to just use 'min' to limit the offset, then install the last redirect as an error handler, so: : config 3 min route config0 ; config1 ; config2 ; error ; Entry: conditionals < and >= Date: Sat May 12 14:54:37 CEST 2007 in pic18-comp.ss they are implemented as macro predicates, following the standard forth comparison operators: consume 2, leave condition. ( a b -- ? ). these can be followed by if. i've been looking into a more general way of using the CPFS[EQ|GT|LT] opcodes, by mapping them onto the conditional jump implementation. been avoiding this for a while, because i have unsigned 'max' and 'min'. the thing is 'cbra'. it consumes a condition, and compiles a conditional branch. does this really make sense? the other conditionals can be inverted, these cannot: only by swapping jump targets. so: - change 'not' to support a new pseudo op - change 'cbra' to do this branch based swapping looks like it's working. an optimization is possible in case of single opcode instructions, but it's probably better to just code them as macros. needs some thought. Entry: usb descriptors again Date: Sat May 12 15:53:12 CEST 2007 it's probably best to just keep using 'route' in combination with 'min' and an error handler. let's standardize a 'buffer' or a 'string' to what i already use for the 'ping' command: : my-flash-buffer string>f length , 0 , 1 , 2 , 3 , ; this means that the word 'my-flash-buffer' sets the current flash object (the f register). a string is a flash object which has its length stored in the first byte. so '@f++' on a string object will give the length, and leave f pointing to the raw bytes, so successive '@f++' will read out the bytes.
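the min+route combination amounts to a clamped jump table where the last entry doubles as the error handler. a rough python model of the idea (names made up for illustration, not brood code):

```python
def route(index, handlers):
    """clamped dispatch, like '3 min route ...' above:
    indices past the end of the table are clamped onto the
    last entry, which serves as the error handler."""
    return handlers[min(index, len(handlers) - 1)]()

# table of three config words plus a trailing error word
handlers = [lambda: "config0", lambda: "config1",
            lambda: "config2", lambda: "error"]
```

any in-range index hits its word; any out-of-range index, however large, falls into the error word, so no separate bounds check is needed.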
the usb descriptors should be stored in exactly the same way: device, configuration and string should just set the current flash object, which is understood to be a purrr string. so, the following output ((device (16 1 16 1 0 0 0 8 216 4 1 0 4 3 2 1)) (strings ((23 3 68 101 102 97 117 108 116 32 67 111 110 102 105 103 117 114 97 116 105 111 110) (19 3 68 101 102 97 117 108 116 32 73 110 116 101 114 102 97 99 101) (5 3 48 46 48) (10 3 85 83 66 32 72 97 99 107) (28 3 77 105 99 114 111 99 104 105 112 32 84 101 99 104 110 111 108 111 103 121 44 32 73 110 99 46))) (configs ((9 2 25 0 1 0 0 160 50 9 4 1 0 1 3 1 1 1 7 5 128 160 8 0 0)))) can be transformed into: : device string>f , ... , : string 5 min route string0 ; string1 ; ... ; string-error ; : config 1 min route config0 ; config-error ; : string0 string>f , ... , : config0 ... maybe it's easier to just eliminate the intermediate names, since there is a notion of arbitrariness involved. they are just local labels, as used with if ... then. all in all, just generating a couple of names is probably easiest. ok, done. now loading. the thing to fix next is a global path for any kind of file loading mechanism. Entry: some weird bug with forth parsing Date: Sun May 13 13:35:02 CEST 2007 apparently, for parsing macros (color macros) like 'load' and 'path', there is a problem when the macro that's implementing the behaviour, popping the name from the data stack, is not defined.. i don't know why.. maybe i need to make that macro parsing part a bit more transparent. currently parsing words are a bit of a hack. i need to get to the core of the problem and fix it. again: * forth macros are cat words, as such they are 1-1 semantic/syntactic * forth parsing transfers parsing words to quoting code: something forth source cannot represent, but parsed cat code can. maybe i need a symbolic intermediate form, where lists are quoted explicitly? like PF. with a mapping like: (load file.f) -> (('file.f load) run) hmm..
it's probably just a bad day to make decisions. ok. calmed down a bit. load-usb is working now. next: hands-on transfer. Entry: state machine or task? Date: Sun May 13 15:55:33 CEST 2007 a task that does usb transfers makes sense. however, since i'm still debugging i think a more lowlevel approach is better. once i get it running, i can write everything in blocking form. Entry: jump bits Date: Sun May 13 15:56:55 CEST 2007 words use relative addressing. this can lead to trouble. what about this: * just assemble, but when an address doesn't fit, keep it symbolic. * 3rd pass: gather all addresses, and compile words which contain a goto statement to the words that were called, but not reachable. this will keep code small, and the assembler simple: no need for variable size goto instructions inside words. the rationale is: this forth is for lowlevel stuff. for highlevel things, use a DTC on top of this: there you don't have a problem. Entry: stamp dead Date: Sun May 13 19:57:41 CEST 2007 serial port driver dead or something? i don't know. it doesn't seem to be a software problem. chip isn't doing anything. without a scope it's hard to debug... so plan B 1. brood + snot (1 evening) 2. sheepsint buttons + audio out port (1 evening) -> leuven for scope and other stuff.. Entry: stamp back Date: Sun May 20 12:01:59 CEST 2007 something going on here.. i tried stamp 2, which refused to work a couple of times, until i got it going. then replaced with the original 'broken' stamp, and now that one works too. maybe it's just my breadboard.. since i did have to move 2 pins to the left on the breadboard because the 2nd stamp's pin header is too big. Entry: late binding Date: Sun May 20 12:47:48 CEST 2007 what i need next is some form of late binding to do incremental debug. the code runs fine up to a point from which i need to make small changes to the code. reloading there is a drag, so i need a proper construct.
defer broem 2variable broem-hook : broem broem-hook run-hook ; some premature optimizations: since these variables don't really need to be accessible, it's maybe better to put them somewhere behind the ram bank, for example shadowed by the FSR registers.. this way a hook can be represented by a 1-byte XT. Entry: color macros Date: Sun May 20 18:30:49 CEST 2007 what i mean by color macros is macros that modify the 'color' of subsequent words. currently i have no way to implement new parsing words in forth. this is not a good thing.. something is broken, but i don't know what exactly. probably my understanding... problem: parsing words use automatic name mapping. this is bad, since it's viral. meaning, once you start doing things like that it's all over the place: there is really no clean way to nest parsing words. so i need a different approach: extend the partial evaluator to include symbols. the deal is this: the PE uses the assembly buffer as a data stack. because some words use the CAT data stack for 'data' items, things get confusing. so, the thing is: i need a single macro that quotes the next atom in the input stream as a literal, and then use that. Entry: partial evaluation revisited Date: Sun May 20 19:31:53 CEST 2007 i ran into a pattern: the assembler buffer can be used as a data stack to perform partial evaluation. i don't have a proper way to make this sound, but it seems to eliminate the need for an 'interpret mode' in the sense of classical forth. the interpret mode is replaced by a set of rewrite rules that will perform compile-time evaluation. so instead of [ 1 2 + ]L we just have 1 2 + with the same result: 3 being compiled. actually, in the latter case purrr will produce [movlw (1 2 +)], so the evaluation can be delayed as long as possible. this can be extended to the following pattern: allow target forth values to be richer than just numbers, but require that they can be combined into lowlevel constructs.
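the rewrite-rule view of interpret mode is essentially peephole constant folding over the assembly buffer. a toy python model (not brood's actual implementation; instruction names are made up):

```python
# toy model: the assembly buffer is a stack of instructions.
# literals stay symbolic as ("lit", n); '+' folds two literals
# at compile time instead of emitting a runtime add.
def compile_word(buf, word):
    if isinstance(word, int):
        buf.append(("lit", word))
    elif word == "+" and len(buf) >= 2 \
            and buf[-1][0] == "lit" and buf[-2][0] == "lit":
        b = buf.pop()
        a = buf.pop()
        buf.append(("lit", a[1] + b[1]))   # partial evaluation
    else:
        buf.append((word,))                # opaque runtime instruction
    return buf

buf = []
for w in [1, 2, "+"]:
    compile_word(buf, w)
# '1 2 +' folds to a single literal, like '[ 1 2 + ]L'
```

when either operand is not a compile-time literal, the '+' is simply emitted as an instruction, so runtime code falls out of the same rule set.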
since i use this trick a lot, why not make it a feature instead of an optimization? currently the postcondition of compiling a literal is valid assembler code. what about relaxing this to a delayed literal stack, and introducing a 2nd pass to comb out all the remaining, non-optimized literals? once i have this, partial evaluation becomes better defined: quoted symbols can be included and can be used in parsing macros. the CAT data stack can then be used for control operations only. big change. probably requires a temporary fork. NEXT: 'lit' macro preprocessing step is it possible to make 'lit' a pseudo-asm operation? yes, but the disadvantage is that it's not 1->1. is this required? yes. the asm is 1-1 sym<->bin, so this needs to be solved in the compiler. considering the percentage of code that intersects with delaying 'lit', i guess it's best to wait until after the big deadline, and work around the macro stuff now. as a matter of fact, i can still do it the old way, just adding a single 'quote' operator, for example backtick `. that's a good idea, as long as there's a [`] too, meaning macros can have literally quoted symbols in them. with those 2 primitives, all parsing words can be implemented. Entry: back to debugging -- deferred words Date: Sun May 20 20:37:11 CEST 2007 if the idea is just to get debugging working, it's easy: execute will do enough. Entry: back to thinking about the literal stack.. Date: Sun May 20 23:53:26 CEST 2007 there's a juicy fruit on the tree somewhere.. but i can't see it through the thick leaves. a literal stack is an interesting idea, and so is the commutation of some constructs with the literal stack.. i noticed that a problem atm is hardcoding of [lit a b] instructions: the number of arguments is hardcoded. could be fixed with a postproc step, but have to be careful there.. Entry: parsing macros Date: Mon May 21 11:42:18 CEST 2007 forth parsing words require an input to be attached.
my model does not allow that: it requires parsing macros to live in a separate class. hmm.. this is really kind of complicated. what about providing a mechanism to create parsing macros as pure symbolic macros? hmm.. ok, i got symbolic expansion macros now, but that's not the same as recursive parsing macros! i'm having difficulty getting my head around all this.. next step is to write a macro mode which recursively calls the parser. ok, i think i found it now: the trick is to allow composition. the best way to do this is probably to write the parsers as CAT words. Entry: parsing Date: Mon May 21 14:16:48 CEST 2007 i think i got it now. i'm just doing parsing wrong: each parser should have an explicit 'read' and 'write' operation. then some glue can be constructed to compose all of them. 'read' reads the next input atom, and 'write' outputs CAT code in parsed or symbolic form. i need to really let this go and get the usb driver working.. rewrite stuff accumulated thus far: - explicit literal stack with compile postprocess - parser with recursive composition anyway, the bigger picture becomes visible: 3 different interpreters - compiler is kept in compositional mode: every source atom corresponds to a single action in CAT - before: parser converts multiword constructs into single word constructs - after: assembler uses localized arguments -> not compositional, just a sequence of independent commands Entry: grounding problems Date: Mon May 21 16:24:16 CEST 2007 very strange: if i touch the table, the pic resets. some kind of EM interference. i don't really know what's going on, but putting the stamp in a cage worked: just a grounded metal top of a metal box. if i stick the probe in the carpet, i can measure a 25 V peak-to-peak 50Hz signal. maybe i should just ground my table? ok, i connected the TV cable shield plugged into the cable modem to the case of zzz. without this cable there's 114V ac across. this seems to fix the problem: no more 50Hz on the carpet.
Entry: defer
Date: Mon May 21 17:11:41 CEST 2007

hmm... the only thing i really need is to 'overwrite' a function. using a separate ram table for deferred words might be a good solution if a lot of them are needed, but it sure does complicate matters. moreover: it requires loading values to ram etc.. what i need is really a cheap hack:

: someword nopf 1 2 3 ;

the 'nopf' could be overwritten, since it's #xffff. this opcode can then refer to the next definition.

Entry: usb debugging
Date: Tue May 22 13:49:39 CEST 2007

using usbmon, i get this as first failure after the first request, which is a device request:

d97cb540 144438646 S Ci:000:00 s 80 06 0100 0000 0040 64 <
d97cb540 145068664 C Ci:000:00 -84 0

the odd thing is the request length, which is set to 64 and not 8. status code is -84, which means
http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg25936.html
so why doesn't it respond at all? maybe i need to acknowledge the TRNIF before sending a response?

Entry: the a & f registers
Date: Tue May 22 16:50:37 CEST 2007

i need a proper coding style. let's try the following: the caller is responsible for saving the current object context. this means it's regarded as a low level feature, and bad coding style to pass arguments in the a and f registers. conclusion: use them only in small lowlevel words, and use functional words or different object representations on higher levels. CURRENT OBJECT = BAD !!

Entry: different macro implementation
Date: Tue May 22 16:57:42 CEST 2007

or better: an extension. currently 'macro' in forth only takes names from the macro dict. what about allowing runtime behaviour here?

Entry: ram copy
Date: Wed May 23 11:17:14 CEST 2007

funny, but i don't have any ram copy facility! the reason is of course that there is only one free indirect addressing register to use. in order to make a faster one, interrupts need to be disabled and one of the two other regs needs to be used.
no time to think about that now, so i'm going to avoid mem->mem copy, and save only what i need.. (SETUP request is what i'd like to copy)

Entry: something wrong with XINST
Date: Thu May 24 12:19:49 CEST 2007

this is probably the cause of a lot of my misery: somehow access bank variables don't work right when XINST indirect addressing is enabled. for the workshop i switched back to the old instruction set, with access bank. need to figure out what's going on there later: somehow fetching/storing address 96 doesn't work either.. if i stay low, it works.

Entry: bouncing ball physics
Date: Thu May 24 15:22:02 CEST 2007

a bouncing ball can be made using the natural rollover from 255->0, combined with some coordinate mapping.

A---B      A---B---A
|   |  ->  |   |   |
D---C      D---C---D
           |   |   |
           A---B---A

so, using the high bit to signify whether a coordinate is reversed, the operation simply becomes:

: bounce clc rot<>c else rot>>c #x7F xor then ;

or even simpler:

1st 7 high? test if -1 xor then

Entry: johannes config bits
Date: Sat May 26 18:52:37 CEST 2007

low voltage program off
HS oscillator
power up timer off

Entry: meta workshop notes
Date: Mon May 28 09:25:16 CEST 2007

all went really well after day 1 of total chaos, very happy with the result in the end. some remarks:

1. need a proper 'erase-all' in case the chip is messed up
2. need interaction words composition -> all symbolic
3. more docs or reference words -> find some automated mechanism
4. need simpler conditionals
5. maybe distinguish between @high? and high? -> btf are odd ducks
6. investigate extended instruction set troubles
7. automate 'expose'

Entry: quoting symbols
Date: Mon May 28 09:35:39 CEST 2007

so, why not use syntax for this?

`hello : 1 2 3 ;

i think i need to preserve parsing words for the simple reason that ':' is a parsing word. changing that behaviour makes things very different from standard forth. however, internally the parsing words should compile to the literal stack. the code above is actually quite clean.
it has a symbolic representation as CAT code, in the form of lisp's quote form. this could be translated to forth in a minimal way. i could use this symbolic representation as the output of the parsing stage. an alternative lexer could then be used to make use of the more functional forth described above (one without parsing words, only some symbolic quote mechanism, where macros are purely concatenative). note that since it's not legal to have a literal symbol that is not optimized away, the ':' is redundant: symbols present after compilation are just labels. maybe even better: symbols are always labels. so why not get rid of the space?

:help 1 2 3 ;

so, if parsing macros are symbolic transformers, interactive macros could be the same. 'test words' if you want. this could lead to a better simulator. the first version looks better, and has ` compile as a literal.

Entry: literal stack
Date: Mon May 28 10:01:19 CEST 2007

just a quick look at what it would take:

1. abstract all literal patterns
2. make a local change in the abstraction

so this boils down to writing a generic pattern generator for literal opti, and a mechanism to execute arbitrary macros as a pattern. this is already there, but a bit of a hack. maybe it should be the default? ok. there is already a 'lit' defined in comp.ss, which can be extended to take multiple arguments.

Entry: cache
Date: Mon May 28 10:09:10 CEST 2007

an annoying thing in the current code is having to reload everything when the implementation of a word changes: the cache never invalidates. or rather, it's not a cache. so i need to change the implementation of 'word' to include a cache mechanism. it would be interesting to plug into the cache mechanism of scheme, but that would require either a lowlevel thing, or something with namespaces.

Entry: bug fixing day
Date: Sun Jun 3 12:32:20 CEST 2007

time to clean up some minor annoyances:

* serial port settings: use 'system' + platform script when port is opened
* faster upload (faster baudrate?)
* snot integration + better emacs integration
* fix parser -> parse to symbolic code
* create interpret macros
* sheepsint: build board tests for proto

Entry: monad stuff
Date: Sun Jun 3 17:15:08 CEST 2007

one problem i have with the way i perform function lifting (monads), is that it's not mixable: i can't just 'tag on' another monad. maybe this should be made a little more explicit. the next thing i need to implement is parsing macros: symbolic preprocessors to map forth to something closer to 1-1 cat code. last time i got lucky: i was able to use code as one of the input streams. now that's not so easy any more: there is an input stream which is not code. i guess the easiest way to tackle this is to just define a prototype CAT function for a parsing word, and work from there.

in rout -> in+ rout+

with rout a reversely accumulated list of atoms. it's like the assembler proto, but with an extra 'in' state value. the default parser moves an atom from in -> rout. it would be nice to be able to compose parsing macros, so they really should be a special kind of macro: one built on top of ordinary macros, with the input stream on top of stack, and a primitive 'read' which takes an input object.

Entry: snotification
Date: Sun Jun 3 20:55:19 CEST 2007

-> entry point = load state + enter main loop
-> main loop = event dispatch

got it mainly working, but i'm experiencing problems with asynchronous messages.. maybe i should get rid of the dots?
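the entry-point / main-loop split above can be pinned down as a trivial dispatch loop. a python sketch, purely illustrative: all names (main_loop, handlers, the event tags) are invented here, not taken from snot or brood:

```python
def main_loop(state, handlers, events):
    """main loop = event dispatch: each incoming (tag, payload) event
    is looked up in a handler table; the handler transforms the state."""
    for tag, payload in events:
        handler = handlers.get(tag)
        if handler is not None:       # unknown events are ignored
            state = handler(state, payload)
    return state

# entry point = load state + enter main loop
handlers = {'inc': lambda st, n: st + n}
main_loop(0, handlers, [('inc', 1), ('inc', 2), ('noise', None)])
```

the point of the shape is just that the loop owns no behaviour of its own; everything interesting lives in the handler table, which is what makes the asynchronous-message problem a handler problem rather than a loop problem.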
Entry: faith, evolution and programming languages
Date: Tue Jun 5 11:21:39 CEST 2007

by Philip Wadler, April 27, 2007, Google Tech Talks
http://video.google.com/url?docid=-4167170843018186532

a bit over my head, but things to look into:

- a logic corresponding to a programming language
- contracts
- haskell type classes for polymorphism

about logic & programming languages:
http://video.google.com/url?docid=-4851250372422374791

Entry: boot config
Date: Wed Jun 6 19:06:17 CEST 2007

i'm looking for a better default for the boot loader, to make sure a project is always in one of 2 states:

virgin: run purrr interpreter on boot, no interrupts
app: fresh reset vector + isr installed

if i make it so that 'scratch' can safely erase the boot sector, things might get more robust. things that can go wrong:

- reset not defined, but isr defined
  solution: always define them in the same macro
- reset or isr defined, but target code is gone
  solution: always erase the boot block on 'scrap'

Entry: forthtv pro
Date: Tue Jun 12 04:18:08 CEST 2007

let's see. what i need is a dual processor 18f1220 system

VIDEO
- low bandwidth I2C master (pull: poll at line frequency)
- video using USART out
- audio sampled at video line frequency

HUB
- low bandwidth I2C slave (push)
- keyboard interface (bitbanged)
- USART for host serial

this is a nice excuse to prepare brood for multicore projects. note that the 18 pin chips do not have I2C, so i need to go to the 28 pin versions.

Entry: bank select
Date: Tue Jun 19 15:44:50 CEST 2007

keeping bsr at a fixed value, the extra bit in the instructions that access the register file can be used as an address bit. note the 1x20 has only 256 bytes of ram.

Entry: sheepsint core todo
Date: Wed Jun 20 14:52:53 CEST 2007

- standalone boot + fall into debugger
- battery operated
- brood async io
- 16 bit math for control ops
- note/exponential lookup tables
- pot denoise?
- keep it working (*)

(*) don't know if i can do that yet.
what i can do is freeze the software: make a fork of brood. i also can't fix the boot block. but i can fix the app block.. maybe i should go that way.

Entry: text interface
Date: Wed Jun 20 16:07:24 CEST 2007

i probably need to take a deep breath and change the monitor from binary to text. this would make it a bit easier to standardize, and also make it usable without the brood system, for debug purposes..

Entry: application control flow
Date: Wed Jun 20 16:38:38 CEST 2007

1) boot
2) mainloop (contains RX check)
3) on RX, fall into interpreter

then from the interpreter, 2) can be entered. this works like a charm. sheepsint runs fine on 2 AAA batteries too, using an 18LF1220. summary:

- empty bootblock -> fall into interpreter ('warm')
- application -> install reset and isr vectors (best at the same time)

something which is important though: if there's no serial TX connected to the pic RX pin, something needs to pull the line high. on the CATkit board, the easiest way is to insert a jumper between RX and TX.

Entry: DTC
Date: Wed Jun 20 18:03:14 CEST 2007

time to do the real job: a dtc forth. what i'd like to do is to chop the chip up in 2 pieces. first half is kernel + audio, second half is DTC on top of that. this is not as easy as it looks :) but.. it might be more robust. basically i have the following choice:

A. go brood/snot and finish that interface (requires emacs + plt)
B. go binary and use just a terminal emulator

i basically promised B. which, for education and not-too-sophisticated use, is what we need. getting A. ready to the point where i can teach it is too much, so i have no choice really. i need a real forth! and i need it before i can do more synth stuff.. or better, while doing it. so what is necessary?

1. terminal input with XON/XOFF
2. dictionary
3. compile link to ram
4. copy ram->rom

Entry: conditionals
Date: Wed Jun 20 21:16:41 CEST 2007

ok..
using the flag macros can be fast, but it's also really really hard to use if a condition always needs to be a macro. so i need basic '=' etc.. using nfdrop, and a proper 'if' that accepts any kind of byte. not completely tested, but the asm looks ok.

Entry: mini module system
Date: Wed Jun 20 22:19:50 CEST 2007

basically, do something similar as in PF: a 'provide' word will skip loading the current file if the word already exists.

Entry: terminal.f
Date: Wed Jun 20 23:10:17 CEST 2007

thinking about this XON/XOFF thing: there is really no way around doing this with interrupts and proper buffers. the problem is really that when we send an XOFF, a byte can already be in progress. in fact, if there's no break, and the host is sending full speed, it probably is. so a proper interrupt/buffer scheme is necessary. time to dig up those cool 15 byte buffers again :)

Entry: read/write pattern
Date: Thu Jun 21 12:02:31 CEST 2007

something which occurs a lot is an update to memory which i'd like to put in a macro. till now i always solved this using a macro which expects a memory address. maybe that's the only sane solution? need to think about this.. a bit of a hack, but something that might be interesting: have a 'lastref' macro which compiles a ref to the last referred variable.

Entry: workshop
Date: Thu Jun 21 12:41:26 CEST 2007

this serial terminal thing is not going very fast.. maybe i should focus on finishing the 16 bit words first, then build a tethered DTC on top of that? maybe indeed best not to stress too much. it is working. i just need to add some control to the synth.

Entry: multiplication
Date: Thu Jun 21 21:38:45 CEST 2007

the first thing to do is to create a generic unsigned multiplication, and derive the other muls from that.
let's call 'z' an 8 bit shift (256). we need to compute

(x0 + x1 z) (y0 + y1 z)

all coefficients are 0 - 255. this gives

z^0   x0 y0
z^1   x1 y0   x0 y1
z^2   x1 y1

the lowest of the 4 result bytes is unaffected by the 3 upper partial products, the second of 4 only by the top one. so, i'd like to do this

- fast
- functional, so no temp variables

the variables are presented as x0 x1 y0 y1. every number is used twice. now the juggling is done: i gave up on not using ram. it's probably possible to just use the stacks, but it's really inconvenient due to the 'convolutive' nature of multiplication. what i mean is: multiplication has all-to-all data dependencies, and is not easily serialized. if it is serialized, it needs random access (variable names) or at least relative indexing. forth is not good at that.

Entry: refactoring
Date: Fri Jun 22 13:18:53 CEST 2007

some things that need to change in brood to make it easier to understand and modify:

- words need to be cached, not delayed evaled, so incremental loads are possible
- parser macros -> purely symbolic, using only a 'quote' word for some 'pure forth'
- partial evaluator needs to be properly defined, so more elaborate operations are possible. i.e. explicit literal stack + commutation of operations with literals.

so, in short: CACHE, PUREFORTH intermediate (without parsing words), and explicit PARTIAL EVALUATOR. the PE needs to work together with the PUREFORTH, to be able to have symbols as "ghost values".

Entry: forth vs DSP
Date: Fri Jun 22 13:27:09 CEST 2007

following the remark above about multiplication: most DSP stuff is like that, so i wonder if it makes much sense to write a forth for the dsPIC. anyways, it shouldn't be too hard once i clean up the compiler code a bit.

Entry: sheepsint next
Date: Fri Jun 22 16:34:41 CEST 2007

ok, DTC and multiplier are working. time to get busy :) maybe i do need to think a bit about the memory model though. might be interesting to have full device control.
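the partial product scheme worked out in the multiplication entry above can be modeled in a few lines of python to check the byte bookkeeping. this is a model of the arithmetic only, not the actual pic18 code:

```python
def umul16(x, y):
    """16x16 -> 32 bit unsigned multiply built from 8x8 -> 16 bit
    partial products. with z = 256:
      (x0 + x1 z)(y0 + y1 z) = x0 y0 + (x1 y0 + x0 y1) z + x1 y1 z^2"""
    x0, x1 = x & 0xFF, x >> 8        # low / high byte of x
    y0, y1 = y & 0xFF, y >> 8        # low / high byte of y
    acc = x0 * y0                    # z^0 term: the only one that
                                     # touches the lowest result byte
    acc += (x1 * y0 + x0 * y1) << 8  # z^1 terms
    acc += (x1 * y1) << 16           # z^2 term
    return acc
```

the model also shows the all-to-all data dependency complained about above: every input byte feeds two of the four partial products.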
Entry: memory model
Date: Fri Jun 22 17:40:02 CEST 2007

what about simply:

* kernel is overlayed with RAM + EEPROM
* all the rest is flash

note that only the first 32kb can contain VM code, due to the VM using 2 bits. the other 32kb is addressable, but only usable for tables etc.. not important now for the PICs i use. so the ram max address space is #x1000; for data eeprom i'm not sure there is a limit, but we have only #x100 and it's not used. what about we map the flash to the upper 32 kb, and ram from the start? then eeprom could be added later.

Entry: vm macros
Date: Sat Jun 23 22:47:10 CEST 2007

basically, i need control words. so i need a mechanism for vm macros. ok, in place. next is to just write macros, and to add a mechanism for loading. actually, this is kind of interesting, since it requires 'control stack' operations. to re-iterate, i have these kinds of macros:

- peephole optimizers (asm buffer + used as literal stack)
- control operations (use data stack as control stack)
- recursive macros
- simple incremental macros (writer monad)
- whole-state assembler macros (i.e. global optimization)

if i make the stacks a bit more obvious: literal and control stack need to be independent. the control stack is sort of a literal return stack. so i just need to write accessors that bridge the literal stack (asm buffer) and the control stack (data stack). the more general thing that interests me is to make more functionality available at the forth level, so more powerful macros can be written straight in forth, without having to resort to tricks. in short: i need a meta-forth, not a meta-cat, so cat can be tucked away as an implementation/intermediate language.

Entry: compilation stack and word names
Date: Mon Jun 25 11:40:58 CEST 2007

i find some standard forth words a bit confusing. it's probably easier to start calling the compilation stack 'c' and be explicit about the traffic.
there are only 2 label operations: localsym>c (generates a new label) and label>c (compiles a label reference for the assembler). in ordinary forth, labels can be patched, effectively implementing a dual pass assembly. since we're not using mutation, we just generate a label at the first occurrence (instead of reserving an empty cell and pushing its address) and bind it to opcodes as required later. these symbols will be bound by the multipass assembler later.

Entry: writer macros
Date: Mon Jun 25 11:56:23 CEST 2007

these are confusing. maybe i really shouldn't distinguish between 'writer' macros and 'asm buffer' macros. the writer thing is clumsy and a bit hard to understand. so i'm taking it out.

+ it's simpler: i'm using some I/O style monad '>asm'
- writer macros can't be isolated any more (the assumption needs to be: modifies the whole state, not just concatenation.)

this doesn't seem to be a big disadvantage. it's probably better to use some kind of tag system to classify macros according to properties. the only thing i use it for is optimization, where missing a classification means some optimization can't be done, so it won't cause fatal errors.

Entry: make-composer
Date: Mon Jun 25 13:24:19 CEST 2007

another thing i'm running into is my terminology about namespaces. if i have a collection of words, i'd like to specify:

- source dictionary (semantics)
- destination dictionary (def)
- parser (syntax)

currently that's make-composer, but the names used are a bit confusing. this can be done better. maybe i should just rename make-composer to define/parse/find.

Entry: parser words
Date: Mon Jun 25 13:38:01 CEST 2007

this needs a thought about what to do with parsing words, mostly quoted symbols. i guess it's safest to put them on the compilation stack, so i don't need any literal optimizations.

Entry: todo
Date: Mon Jun 25 13:38:54 CEST 2007

- take out all writer stuff OK
- rename asm-buffer-find to find-asm-buffer, asm-buffer-register!
to register!-asm-buffer, and state-parse to parse-state
- fix parser macros: decide on lit/comp stack
- fix assembler evaluator

i'm not going to change the find/register!/parse names. this is just cosmetics.. about fixing the assembler evaluator: what about requiring all literal arguments to be cat code?

Entry: literal stack + compilation stack
Date: Mon Jun 25 14:37:20 CEST 2007

the important thing about stacks is that you need two of them, i once read. which seems to be the case. currently i'm trying to figure out what should go where by default. the idea of the 'literal stack' is simply to be able to do some computation at compile time. a nice feature here is that a lot of operations become more natural. for example:

1 2 +

is really just 3. and this is a mandatory optimization in badnop. something you can rely on as a feature. standard forth would make this explicit:

[ 1 2 + ]L

the reason i don't use the above is that my meta language is not forth. it's CAT. more importantly, CAT is much more powerful than the simple 8 bit forth is. so, the idea goes:

- mandatory literal optimization (compile time evaluation)
- forth extended with 'ghost' types

the ghost types are things that make no sense for the microcontroller, but when they are combined with other ghost types, result in things that do make sense. the most obvious one is assembler labels:

' foo

will compile code that loads the (symbolic) address of foo. if this is followed by a macro that consumes it, the whole can be reduced to code that does have a meaning on the microcontroller. i'm not 100% convinced this is a good idea (not being explicit), but it does feel like one. what i'm looking for is to give it a decent meaning, and to find out when to use the literal stack, and when to use the compilation stack. another thorn is the way the literal stack is implemented, but that can be fixed later. right now i need to get the semantics right. i'm not asking the right question.. what's the real problem here?
the target chip has a clear separation of ROM and RAM. this is both convenient (code is persistent), and not (they need to be treated differently). what i'd like to do is to make a source file correspond to only ROM. standard forth doesn't do that: loading a file both writes code and initializes data. i guess this is the main reason why things are different for me:

harvard: ram initialization (run-time code) and meta compilation (compile-time code) are strictly separate.

von-neumann: both can be done at the same time (program load time), and blur together.

so what does this have to do with the literal stack?

- the meta language is not forth
- i'm trying to disguise this

basically, i'd like to not think about this thing being a cross-compiler, and act as if everything runs on the target. one way of doing that is to require compile time evaluation whenever it is possible. as a result, the simple recursive macro system, which does not refer to the real meta language directly, becomes more powerful: required partial evaluation gives it some run-time power, instead of merely being passive concatenation of code. so the real question is: how to simplify the target language such that no explicit reference to the meta language is ever necessary, and all macros have a compositional semantics. the way that seems most natural to me is:

- partial evaluation is the default: act as if everything is done at run time (like "1 2 +"), but write the macros such that they perform compile time evaluation + raise an error when higher level things can't be resolved at compile time.

- some constructs use the COMPILATION STACK, referred to as 'c'. this is mainly intended for code blocks, and serves a bit the role of the return stack. this also gives the solution for parsing words: their default semantics is to map something to a literal compiler.

a common problem i encountered is a macro which has 2 references to the same name. this is now easily solved using the compilation stack.
so the key is really in the words '>c' and 'c>'

Entry: vm words and literals
Date: Mon Jun 25 15:18:13 CEST 2007

so, looking at the remarks before.. the literal stack is really more than just literals. it could contain words too. words in their normal meaning are calls. so: the assembler buffer is just a stack of symbols, bound to semantics (literal, call, jump). what i really need is 2 new opcodes: lit and word, that will be resolved in the assembler, but that can be used in the optimizer and partial evaluator without too much trouble. so i think i see the roadmap now:

1. fix the assembler to take these opcodes:

   cw  call word (code)
   jw  jump word
   qw  quote word (data)

   which are really just the primitives used in the VM

2. fix the peephole optimizer to operate on those words

this will give a proper semantics to the literal stack: basically it will then contain words + their meaning, code or data. again a simple pattern: delay low level representation as long as possible. ok. now i need to check first if the monitor code still runs.. it does. time to fix this. it's probably easiest to create an extra assembly step which filters out the pseudo ops. could be interesting to clean up the assembler a bit. i'm writing pic18-compile-post now, and will start using 'values' to do the expansion. at first i thought this values thing was a bit clumsy, but having to wrap things in a list is usually more work: it's better to do this in the consumer using call-with-values than in the producer, when there are a lot more producers than consumers. which is the case here..

Entry: literals : save
Date: Mon Jun 25 16:02:53 CEST 2007

oops. too much coffee, going too fast.. i AM doing SAVE for each literal, so maybe the postprocess step should perform the save too? this is a bit more complicated than i thought.. so: when to do SAVE? currently save does:

((['drop] save)            '())
(([op 'POSTDEC0 0 0] save) `([,op INDF0 1 0]))
((save)                    '([dup]))

what about just a second compiler pass with the word 'save' ?
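to see what such a second pass would do, here is a python transliteration of the three patterns quoted above. the instruction names (POSTDEC0, INDF0, drop, dup) are from the patterns; the function name, the list-of-lists instruction encoding and everything else are invented for illustration:

```python
def save_pass(code):
    """resolve the pseudo-op 'save' in a second pass over the
    instruction list:
        drop save               -> (nothing)
        [op POSTDEC0 0 0] save  -> [op INDF0 1 0]
        save                    -> dup"""
    out = []
    for ins in code:
        if ins != ['save']:
            out.append(ins)
        elif out and out[-1] == ['drop']:
            out.pop()                              # drop + save cancel
        elif out and len(out[-1]) == 4 and out[-1][1:] == ['POSTDEC0', 0, 0]:
            out[-1] = [out[-1][0], 'INDF0', 1, 0]  # write without popping
        else:
            out.append(['dup'])                    # default expansion
    return out
```

the pass only looks one instruction back, so it keeps the peephole character of the original patterns.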
that 2nd pass seems to work. the problem now is that a lot of the literal macros do need their arguments. am i going to try to fix that now? maybe think a bit about how to do that in a smart way.. ok. roadmap:

- just added c> == '(qw) op>asm
- replace all lit macros by a qw macro, and remove them from the expose hack

done, now for the calls. TODO:

- replace branches and calls with pseudo ops
- fix vm control ops
- start working on the control part of the synth

Entry: multiple passes: pseudo assembly language
Date: Wed Jun 27 10:41:04 CEST 2007

so the pseudo assembly language is a bit more explicit now. between forth and real assembler there is a representation where the opcodes qw, cw and jw are used. they give a proper typed stack meaning to the assembly buffer. as a consequence, quoting in macros can be eliminated, and pure postfix notation can be used, using the 'word>c' operation, which takes a code word from the assembly buffer and moves the tag to the compilation stack. in short, instead of

' word <...>

one can do

word word>c <...>

where <...> handles the symbol/address. now wait.. if quote is no longer necessary in the compile time semantics, why use it? (it is still necessary in the run time code. back to that later.) the whole idea seems to be: because all behaviour is postponed (compilation), it doesn't have to be stopped before it happens. meaning: if i enter 'broem' at a command prompt, it will execute -> damage done. if i don't want to run it, i need to 'quote' it, which means postpone execution. during compilation, everything is postponed, so there's no need for quotation! wonder if i can make that a bit more formal. this does ring a bell somewhere. been reading about the macro/module system in PLT scheme. something with keeping run time and compile time separated to make dependencies explicit.. anyways. oops. not completely true: if it's a macro, and you want to refer to it, it needs to be quoted. an 'almost right' thing here.. if there are no macros, it's right.
macros are code, the rest is data during compilation. if there's no code, true. back to my original point: quoting is postponing execution. maybe i should just try it to see if i get into situations that are awkward, because it does look promising.

Entry: vm compilation: one word to change semantics of parsed code?
Date: Wed Jun 27 11:25:18 CEST 2007

since the compilation buffer already contains the code/data (quote/call) distinction, only a single word is necessary to convert any operation to its vm equivalent. this word should leave macros alone. the problem here is that i need a type (pattern) matching word, so not yet.. simpler: i'd like to remove the quote in 'vm->native/compile' and in the vm-core.f file, so i can easily compose macros. quote is really a preprocessing thing, which is necessary to get from source -> forced data semantics. once parsed to intermediate, no quote is necessary. argh.. so i don't really need to remove the quote there, since it's exactly that: a preprocessing step to generate native forth code. it's ok that this includes a quote operation. so '_literal' and '_compile' take data atoms on the literal stack, which means quoting is necessary for code atoms. this allows the VM semantics (decision for code/data) to be different from the lower level language, which is a good thing. check vm-core.f for some explanation. summary: "purely compositional macros == good thing". it's the basic idea of CAT. see the notes below. the question is though: can i make these macros powerful enough to have some kind of lambda construct? postponed macros basically? the only thing i want to solve now is conditionals, but better to aim for the bigger thing.
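the "quoting is postponing execution" point can be made concrete with a toy concatenative evaluator. a python sketch, all names invented, nothing here is the actual CAT or PURRR18 implementation:

```python
def run(program, stack, words):
    """plain symbols execute immediately ('damage done');
    ('quote', x) postpones x by pushing it as data;
    everything else is a literal."""
    for atom in program:
        if isinstance(atom, tuple) and atom[0] == 'quote':
            stack.append(atom[1])       # postponed: just data
        elif isinstance(atom, str):
            words[atom](stack)          # immediate: execute now
        else:
            stack.append(atom)          # literal
    return stack

words = {
    '+':       lambda s: s.append(s.pop() + s.pop()),
    'execute': lambda s: run([s.pop()], s, words),
}
```

with 'execute' in the table, a quoted word is exactly a postponed word: `1 2 'quote +' execute` ends up with the same stack as `1 2 +`.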
Entry: language path
Date: Wed Jun 27 11:41:28 CEST 2007

FORTH with parsing words = symbolic
  ---> FORTH with only quote = symbolic
  ---> pure FORTH without quote = compositional CAT code
  ---> intermediate assembler = effected macros (real asm) with pseudo asm (qw, cw, jw)
  ---> symbolic machine assembler
  ---> binary

i should give these a name

PURRR18/forth (quote + parsing words)
  ---> PURRR18/quote (pure + quoting word)
  ---> PURRR18/pure (purely compositional macro language, as CAT code)
  ---> PURRR18/asm (PIC18/asm augmented with pseudo ops)
  ---> PIC18/asm (my version of the symbolic assembler language)
  ---> PIC18/bin (binary machine code)

Entry: i want it all
Date: Wed Jun 27 12:14:37 CEST 2007

what about postponing macros? i basically want conditional branching at compile time, but full lambda (quoted macros) would be interesting too. now what did i expect? forth is not CAT. this is a game of syntax, in the end.. i'm trying to cram a meta language into the language syntax, without using its quoting mechanism: lists. it looks like i can't make it too powerful without introducing quoting syntax, which is what i'm trying to avoid to keep it simple. the problem which sprouted this line of thought is the VM return operation. the words "_then _;" don't work because ";" expects a word. so i'm going to need an extra primitive to solve this conditional execution. maybe there is only one real solution: make the ' operation a syntactic one, like in lisp.

- if quote is syntax, an intermediate language is not necessary.
- if it's not, a parsing stream needs to be available

the last one is obviously worse, since it makes composition harder. so that's what it will be: quote needs to be syntax, and ' is a special character. so, pure forth in s-expressions is

<program> ::= ( {<atom>} )
<atom>    ::= <literal> | <word>
<word>    ::= <symbol> | ( quote <symbol> )

to preserve previous syntax, the run-time semantics of "' word" still is "load address of word on parameter stack". so, to summarize again: is quote a lexing operation, or a syntactic operation?
the answer seems to be the former. the problem this solves is this: syntactically, code and data are distinct. the full domain is split in 2 parts, but semantically, code is a subset of data. introducing quoting at the lexing level gives:

- a better mapping to CAT (using the same lexical trick)
- saner semantics: quote is defined independently of an input stream
- quoting can be used in macros, using forth syntax, keeping the compositional property

in the language path above, the 'pure' and 'pure+quote' levels will now be the same, so i have

Entry: updated language path
Date: Wed Jun 27 13:31:57 CEST 2007

PURRR18/forth (quote + parsing words, symbolic form is not CAT)
  ---> PURRR18/pure (purely compositional macro language, has symbolic CAT form)
  ---> PURRR18/asm (PIC18/asm augmented with pseudo ops)
  ---> PIC18/asm (my version of the symbolic assembler language)
  ---> PIC18/bin (binary machine code)

so the entry point is there to preserve original forth syntax, i.e. ": abc". for internal processing, this will be mapped to "'abc make-:" or as s-expression ((quote abc) make-:). the 'make' name i still need to think about.. the reason for having ' as a lexing operation, instead of a parsing one, is that it eliminates one parsing layer + it maps better to CAT. this is different from forth, but in a way that will probably hardly be noticed.

Entry: again?
Date: Wed Jun 27 16:16:34 CEST 2007

so why not just a parsing step? i need types to do this properly

macros: pure+quote -> pure
forth: forth -> pure

parsing words are merely frontends for pure. the alternatives are:

1. lexing produces a stream of symbols and numbers. then there are 2 different parsers that map this to pure forth.
2. lexing already produces quotes

the first option is really simpler, so let's keep that.

Entry: parsing
Date: Wed Jun 27 16:36:30 CEST 2007

so now i need to redo parsing. currently, it's a bit of a hack. it's not extendible. but do i really want it to be extendible? i need a different 'kind' of word. a parser is not a macro..
they operate on different levels. so let's abstract it out a bit. 2 steps need to be separated:

  forth -> symbolic cat
  symbolic cat -> parsed cat

both are parsing operations structurally, but it's maybe best to give them different names? i got it, except for the quoting stuff.. now, a problem i ran into is that ' abc actually compiles a byte address. i wonder where this will fail if i change that.

Entry: bytes or words
Date: Wed Jun 27 18:25:42 CEST 2007

some conflict here:

  bytes: ' abc org needs byte addresses
  words: "' abc" can be used as just a symbol.

maybe quote is more important. maybe we need to have "execute" take word addresses everywhere? that's also better for the VM. the thing is: data is always byte addressed, while code is always word addressed. a unified address space (bytes) would be nice, but makes things complicated since quoting is not just quoting.. so best seems to me:

* execute takes word addresses
* monitor JSR will also take word addresses
* quoting a symbol name has default semantics to load the word address on the stack

Entry: cosmetics
Date: Wed Jun 27 18:40:35 CEST 2007

TODO:
- make dtc intermediate code a bit more readable
- fix prj path as mutable state (arbitrary.. maybe see it as a constant?)

the last one isn't so important.. the first one requires some kind of loopback, and i think it will make things too complicated.. need to think about it.

Entry: dtc control primitives
Date: Wed Jun 27 20:49:18 CEST 2007

i need 'run' and 'jump' prims.. time to get confused about primitives and programs again. if i remember correctly, the lesson is to never let primitive addresses leak into the higher level code: it's not convenient to have to deal with 2 kinds of code words. in cat, i only use programs (lists of primitives), never primitives directly. same here. just like for primitives, i need to choose some kind of basic representation: byte or word addresses for composite code?
the only thing i need to take care of is that continuations (return addresses) are compatible with "run". i'm getting confused.. i guess i just need to write if/then/else and we'll see how to continue. it does look like there's no easy way other than:

  LIT L0 BRZ L0:

and

  LIT L0 ROUTE LIT L1 RUN; L0: L1:

ok, so be it. can't win them all.. maybe a good opportunity to use ifte instead of if .. then .. else.

so.. primitives. can't 'run' primitives. can run programs. so the idea is that quoting code always quotes programs, so i need something like PF's { and } words. for conditional branching i can use 'route' as a basic word. a cloaked goto or something.

  route \ ? program --

Entry: assembler bug
Date: Thu Jun 28 00:07:21 CEST 2007

performing meta evaluation needs to happen in the 2nd pass, because of the presence of code labels. time to clean up the assembler, and sort out all the different meanings. the bug is simple: just retry if there's an undefined symbol. then another problem: literals take 14 bits, but quoted programs are byte addresses. can we resolve this somehow? if i really need the return stack to contain word addresses, that can still be fixed later. now i'm going for 'run' and 'run/b'. ok, it seems to work now.

Entry: vm optimization
Date: Thu Jun 28 09:36:41 CEST 2007

now it's time to reduce code. it's not very fast anyway, so no reason to start spilling bytes. but this is for later. got some stuff to get ready now. i'm happy with how it's looking though. some minor things need fixing, probably the most important one being return stack alignment. something to focus on is to limit the number of macros. i probably only need conditionals, the rest can even be written in forth. macros are only necessary for marking jumps.

Entry: sheepsint 8 bit interface
Date: Thu Jun 28 15:24:27 CEST 2007

so. i need a synth control layer. going to use the ordinary 8 bit forth.

Entry: loading dtc forth
Date: Thu Jun 28 16:03:59 CEST 2007

problem.
the mapping from vm -> native forth is not just syntactic. it uses knowledge about target words being macros (as native macros) or dtc target words. this means 'load' will not work properly. so this decision needs to be postponed. easiest is to load both symbols (word and semantics) on the literal stack, and have a macro determine the semantics. ok, seems to work.

Entry: problem with dup and literals
Date: Fri Jun 29 09:49:11 CEST 2007

123 dup 456 doesn't give 2 literals on the stack.. if i let dup copy the literal, some other things go wrong.. maybe it's best to have dup copy the literal, and solve the other problems in a second pass? i found an optimization that solves it in one pass, by realizing

  1 (2 3 !) -> <...> 1

where <...> stores the value with stack effect = 0. other places where this might go wrong are where an explicit dup is expected.. there are none outside of '!' i think.

Entry: sheepsint core
Date: Fri Jun 29 10:51:39 CEST 2007

things to fix:
- noise
- sample playback

then for control, i need to find ways to map parameters to meaningful ranges. this is where multiplication and exponential table lookup come into the picture, which might be an interesting advanced topic. ok, there's a problem with the buffering: i don't have a fixed sample rate any more, so computed values need to be sent out immediately: i have no idea when the next event will output the previous state! ok, just moved it to the end of the isr.. now there's a bit of jitter, but probably not really noticeable. noise still isn't working. i can't find the problem. probably needs a fresh look. also, notes aren't working..

Entry: unified namespace and rolling back
Date: Sun Jul 1 15:47:32 CEST 2007

for target stuff.. meaning: something defined as a variable should be able to be redefined as just a target word. or not? this is not so easy since all meta objects are compiled into the core, and are not really seen as data..
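the partial evaluation behind the "problem with dup and literals" entry can be sketched in python. this is a hypothetical peephole pass, not the BROOD macros: a compile-time literal stack lets 'dup' copy the literal instead of emitting a runtime dup.

```python
# Minimal sketch (illustrative, not BROOD code) of dup copying a
# compile-time literal.  qw/cw are the pseudo ops from the language path:
# qw = quote word (push literal), cw = call word.

def peephole(tokens):
    """Compile a token list to pseudo-assembly, keeping known literals on
    a compile-time stack so 'dup' can be partially evaluated."""
    out = []        # emitted pseudo-instructions
    lits = []       # compile-time literal stack
    for t in tokens:
        if isinstance(t, int):
            out.append(('qw', t))
            lits.append(t)
        elif t == 'dup' and lits:
            out.append(('qw', lits[-1]))   # copy the literal: no runtime dup
            lits.append(lits[-1])
        else:
            lits.clear()                   # unknown word: literals unknown again
            out.append(('cw', t))
    return out

# '123 dup 456' now yields three qw instructions and never a runtime dup:
print(peephole([123, 'dup', 456]))
```

with this, "123 dup 456" gives the two copied literals plus the third, solved in one pass.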
there is also a conflict between forth's "first find" and my meta language's "last redefined". maybe the project file should index macros somehow? so they too can be reverted.. this would be cool for variable names etc..

Entry: VM and TBLPTR
Date: Sun Jul 1 15:57:09 CEST 2007

maybe it's not such a good idea after all.. the deal is this: the VM should be easy to use. anything that needs speed can simply be moved to primitive code, completely eliminating interpretation overhead. i put some effort into making both layers interoperable, so why not use it? it seems as if each 'useful' feature of the VM makes it a lot slower. why do i care? the whole idea is to make some kind of standard. why not write the VM on top of the memory model for instance?

Entry: swapf
Date: Sun Jul 1 17:53:10 CEST 2007

something is wrong with the nswap macro: ok, i found it: nothing wrong with the macro. there was an error in the assembler binary opcode.

Entry: control slides
Date: Sun Jul 1 18:17:29 CEST 2007

linear & exponential. in-place updates? probably best to go out-of-place. with wrap-around?

Entry: control timer
Date: Sun Jul 1 18:37:05 CEST 2007

the previous sheepsint had some fixed sample->control rate timer. here i'm using a fixed sample rate for the noise generator (a bit less than 8 kHz), which increments a 32 bit counter once every tick. this can be used as a general fixed time source. ok, trying to sync to bits of the 32 bit timer, i'm using this code:

  \ control at 244 Hz
  : wait-control
      begin tick0 6 high? until
      cli tick0 6 low sti ;

but the cli/sti isn't necessary: the timer increment is atomic: there's no read-modify-write. one problem though: if the counter is reset, higher bits will never get set! so a better strategy would be to wait for a bit to go low, then wait for it to go high, so the transition is captured.

Entry: fix macro loading
Date: Tue Jul 3 12:07:42 CEST 2007

really annoying to have these not synced to the project.. maybe include them directly in the project file.
also need caching: timestamps would work together with mark points. a problem point is the missing variable and function name spaces. once something has been a macro, it will remain a macro. a single dictionary stack is easier to use.

Entry: transient controller
Date: Wed Jul 4 12:43:16 CEST 2007

this is fairly simple if it only needs to save the mixer config (one byte). saving oscillator frequency state requires 6 bytes more. what about making the transient word itself responsible for saving current state, and just using the x stack. if the time base is fixed (32 bit tick timer), control words become fairly simple. remaining question: who is responsible for syncing to the note tick? this is a question of composition: i.e. hihat + kick at the same time requires the hihat word to sync to the note, not kick. best to keep control syncing independent of note syncing.

Entry: AD conversion
Date: Wed Jul 4 13:53:39 CEST 2007

2 things to determine:
- acquisition time (sample/hold settling)
- TAD (per bit sample time)

TAD should be as short as possible, but greater than the minimum TAD, approximately 2us for the 18F1220. the datasheet says for the F version at 8MHz to use 16TOSC, and for the LF version to use 32TOSC. it was on 16TOSC, 20TAD.. put it to 32TOSC, but can't see a difference. maybe the pots are too noisy. i tried to add a capacitor, 100n and 10u, but no difference..

Entry: noise
Date: Wed Jul 4 15:49:18 CEST 2007

noise is probably more useful as one of the oscillators instead of a fixed 3rd one, just like the sampler. using the 8 bit timer only for the control time base frees up some resources, and decouples the noise frequency from the control frequency. best seems to be OSC1, keeping in mind the formant mixer. changing the mixers: silence, xmod, formant. and having OSC1 do noise/square/sample.

Entry: bootsector
Date: Wed Jul 4 16:02:52 CEST 2007

maybe it's best to reserve some functionality for chip erase, so i don't need to worry so much about messing up the bootsector.
basically, i just need a single piece that never changes, which has the ability to influence the booting process to run the interpreter. probably an 'ack bang' or something?

- keep boot sector free for fast isr
- reserve 2nd block for reset vector?

seems the core of the problem is that the boot vector and isr vectors are in the same block. what if:

- default reset vector = jump to second block
- add an application vector after this
- second block contains some kind of checking code to determine activation of application or debugger

Entry: metaprogramming
Date: Fri Jul 6 13:03:44 CEST 2007

more things from forth. i've been using the first couple of macros that use the compilation stack explicitly. i could probably move more code to be accessible from the forth macro language. to have a forth-like [ and ] section would make sense. the point where i want to stop is s-expressions: once i'm introducing that syntax into forth, there's nothing stopping it from becoming something completely different. one of the aims really is to keep out s-expressions. however, it's not so hard to have some kind of 'begin ... end' construct that maps directly onto cat code.

Entry: noise as osc1
Date: Tue Jul 10 22:22:35 CEST 2007

tested. seems to work.

Entry: macros and cat
Date: Tue Jul 10 22:25:32 CEST 2007

name space mixing in macros. the ultimate goal is to have a forthish CAT that i can just include in PURRR/18 code. currently the 'c' words, combined with the literal stack, work pretty well. i need to think about cleaning up the semantics a bit. there are a lot of nice things hidden here.. one of those is: you need 2 stacks. mapping behaviour in an asymmetric way (i.e. return stack / data stack) is arbitrary "human meaning" to ease understanding of components so they can be composed.
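the "you need 2 stacks" remark can be illustrated with a tiny threaded interpreter sketch in python (hypothetical, nothing like the actual PURRR implementation): literals and intermediate results live on the data stack, while calls into user-defined words push their continuation on the return stack.

```python
# Minimal two-stack threaded interpreter sketch (illustrative only).
# 'words' maps names to word bodies (lists); 'dup' and '+' stand in for
# a primitive set.

def run(program, words):
    ds, rs = [], []                 # data stack / return stack
    code, ip = program, 0
    prims = {'dup': lambda: ds.append(ds[-1]),
             '+':  lambda: ds.append(ds.pop() + ds.pop())}
    while True:
        if ip >= len(code):
            if not rs:
                return ds
            code, ip = rs.pop()     # word exit: resume the caller
            continue
        w = code[ip]; ip += 1
        if isinstance(w, int):
            ds.append(w)            # literals go on the data stack
        elif w in words:
            rs.append((code, ip))   # the continuation goes on the return stack
            code, ip = words[w], 0
        else:
            prims[w]()

print(run([3, 'double'], {'double': ['dup', '+']}))   # → [6]
```

mixing the continuations and the parameters onto one stack would force every word to know how deep the caller's frame sits, which is exactly the asymmetry the entry above calls "human meaning".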
Entry: nand synth
Date: Tue Jul 10 23:45:29 CEST 2007

works like this:
- 4 schmitt-trigger based oscillators, cap select (decade) + pot
- chained: 2nd AND gate input turns oscillator off
- the NOT in the chain prevents subsequent oscillators from being OFF at the same time

so, the first oscillator A produces a square wave. during A's ON period, the second oscillator B produces a square wave; during A's OFF period, the second oscillator gives ON. and so the story continues...

  ....AAAAAA....AAAAAA....AAAAAA....AAAAAA
  BBBB.BB.BBBBBB.BB.BBBBBB.BB.BBBBBB.BB.BB
  C.C.CC.CC.C.C.C.CC.C.C.CC.CC.C.C.CC.CC.C

etc. this can give a quite complicated pattern after a couple of steps. one thing is missing though: there is no resync. all capacitors keep state between oscillator ON/OFF switches, so no formant-like tricks.

Entry: noise as sync source
Date: Wed Jul 11 12:12:56 CEST 2007

it's possible to use 'filtered pitched noise' by using the RESO mixer together with a noise OSC1. however, the opposite, an oscillator resynced by noise, i don't have atm. maybe OSC0 should be able to do noise too? OK, that's a different game.

Entry: boost converter hack
Date: Mon Jul 16 19:13:12 CEST 2007

as mentioned before (probably in brood 2 ramblings.txt), it is possible to use a protection diode as rectifier for a signal -> power converter, by connecting a signal with a large enough duty cycle directly to an input pin, and connecting a cap across the power pins. related: it should be possible to convert that scheme into a boost converter, by connecting a power supply to an input pin using an inductor, and using the pin's output stage as a switch to charge the inductor (by connecting the point to ground). when the pin is switched to input, the inductor discharges the energy stored in the magnetic field, and charges the capacitor through the protection diode. this way, the uC can regulate its own supply voltage.
this scheme just needs an initial push to charge the capacitor, such that enough energy is stored to boot the program that starts the feedback mechanism.

Entry: filter bank on PIC18
Date: Mon Jul 16 22:13:34 CEST 2007

so, if i want to run a digital filter on the PIC18, for, say, some demodulation, what performance am i looking at? running on 5V and a xtal, i can get to 10 MIPS. for audio rate signals, say up to 5kHz, this gives 2000 instructions per sample. that's not quite nothing. using half of this for the filtering, and the other half for the decoding and the actual application, we're looking at 1000 instructions of DSP to burn. looks to me there's plenty of room. to make it sound good, tones need to be quite stable: at least 1/16th of a second. at say 6.4kHz, this is about 400 samples.

Entry: PSK31 and meshing
Date: Mon Jul 16 22:27:06 CEST 2007

i think for the waag, we need to keep the basic objective simple: PSK31, as it is tried and true, and there is decoding/encoding software to actually test it.

Entry: human naming nature
Date: Tue Jul 17 01:21:33 CEST 2007

One of the things that's nice about a compositional language like CAT is that it forces you to aggressively factor, simply because programs become too hard to understand if you don't. Factoring is really identifying (naming) substeps. In a compositional language, factoring is really totally arbitrary, from a machine point of view at least. Not for the programmer. Since function arguments are not named, names have to be introduced elsewhere. This is that extra bit of 'meaning' in a program which transforms it from the mess a computer just executes, to some meta-executed thing represented in a human mind.. Those are really not the same. Being able to program something and 'knowing' how it works are different things. The 'knowing' is hard to explain sometimes. It's just a force of (human) nature, really..
For a program to be actually readable, a bit more than the connectivity (topology) is necessary: the information encoded in the names themselves seems to help the human brain understand the connectivity, or at least give it some analogy. Maybe a bit like embedding a topological thing in a geometry to make it more 'real': programming is embedded in the real world of thoughts by associating some natural language with it. The two ways to do this are either the lambda calculus (lexical scope) or combinators.

Entry: get off that lazy ass
Date: Thu Jul 19 13:49:18 CEST 2007

i think i'm not made to idle around. it depresses me. people tell me i need to try harder, give it a couple of weeks of idling to find out the true joy of life. i don't have time for that :) so.. the next things to tackle are:

* fix the boot loader so the ICD2 can stay safely in the box for really stupid mistakes.
* interaction macros
* SNOT and sending code from emacs
* the slow highlevel forth on virtual memory

Entry: the boot block
Date: Thu Jul 19 13:54:04 CEST 2007

conditions:

* BLOCK 0 = empty OR 0000 and 0006 contain jumps to BLOCK 1 (soft reset). this ensures that an empty boot block is valid + interrupts and application invocation result in a reset when they are not defined.
* during boot, a DEBUG condition is checked. this will force it to run the interpreter to await commands.
* if the DEBUG condition is false, the application (addr 0002) is executed. if there's no application, a soft reset is run (so eventually the chip responds).
* installing a new application:
  - clear boot block
  - install security jumps
  - install isr code
* possible conditions:
  - a pin
  - a boot wait + serial activity
  - break condition on serial port

--

installing the bootblock can be done in a single interaction macro: compile an init macro, then when this succeeds wipe the bootblock, and upload a new one. the deal is that the boot sequence up to the DEBUG check is NEVER changed!
it's not enough to have your application perform such a test. this can go wrong in its boot sequence before the check is executed, or even during the check. get it right once, then keep it like it is. another possibility is to have the serial port operate from an interrupt. that way sending a break signal could actually stop the program. however, this is more complicated and reduces freedom for custom isrs.

--

thinking about it, why the one at 0006? ok, it prevents problems if there's a reset vector but no application vector installed. better to be safe. ok, the default really is an empty boot block: means the app is gone. whenever APP and ISR vectors are installed, the 'reset-vector' macro needs to be included.

Entry: new stuff
Date: Mon Jul 23 13:44:31 CEST 2007

done doing goto10 admin stuff. time to make a list of things that need a different approach.

BROOD:
* streams (don't save intermediate state)
* macro namespaces
* interaction macros
* clean up pattern matching macros
* SNOT
* clean up / document / reflect on the forth macro semantics (partial evaluation + parsing words)

PURRR:
* boot block updates
* highlevel forth on virtual memory

Entry: name spaces
Date: Mon Jul 23 13:54:02 CEST 2007

i guess i need proper name space mixing for the macro system. it should all be just scheme functions, not hashtables full of structures. currently i have the following name spaces: cat, state, store, meta, asm-buffer, forth-parse, macro, badnop. so.. let's see if i actually understand the plt scheme namespaces. a namespace is something that maps symbols to storage cells, for words like 'eval' and 'load'. so instead of using hash tables and explicit lookup, using namespaces one could use 'eval'. the advantage is that run time 'eval' could be avoided, and macros could be used where possible. so, what do i want really..

* access macros using scheme names in scheme code.
* compile (eval?)
a symbolic cat function straight to a scheme fn
* be able to change cat macro name bindings just like scheme

questions i need answered:
* can an entire namespace be hidden in a module?
* is it possible to dynamically add stuff to a module? (i guess so, using module->namespace)
* how to 'merge' namespaces?
* can i abstract the rather awkward symbol prefix merging?
* is prefix merging really awkward?

name spaces in scheme:
* once evaled/compiled, an expression is bound to a certain name space and independent of the current one

Entry: callout
Date: Mon Jul 23 22:30:28 CEST 2007

i need some knowledgeable people to discuss this stuff with. don't know where to find them though. things to try:
* plt list
* comp.lang.forth
* picforth list
* gnupic list

Entry: BROOD 4 takeoff
Date: Tue Jul 24 00:00:00 CEST 2007

EDIT: this is where the ramp up to brood 4 starts, with the move from interpreter -> macros.

Entry: really on top of scheme
Date: Tue Jul 24 19:02:23 CEST 2007

so, i need to get rid of the explicit interpreter. or not? i'm mostly concerned with name spaces here, not implementation.

  (1 2 +) -> (lambda stack (apply cat:+ (cons 2 (cons 1 stack))))

what about preserving the original source form? do i actually still use that? yes, when printing code. for example, doing (1 2 +) creates a quoted CAT program, which when compiled doesn't have a source form. so, how to associate original source form to lambda expressions? i really should define my interface first. i don't need to use raw functions as representations. the 'stuff' that's bound to names can just as well remain a word structure. in the end, i'm doing nothing but replacing hash tables by name spaces. so..

* modules: separate code into logical entities
* namespaces: allow run-time eval/compile

the latter part is not really necessary for the core! so, i should build macros first, make sure i have a direct map from: CAT (or any monad language derivative) -> 'raw' cat -> scheme. raw cat is just cat with scheme words.
so how to do this?
- all CAT code is compiled: use modules
- how to separate name spaces: (i.e. how to prefix names?)

so.. it's seeping through. names are compile time stuff. macros are compile time stuff. anything that juggles names should be a macro. so (cat +) is a macro, which expands to a lambda expression, or a variable. it's not enough to have it expand to just a lambda expression. storage should be shared, so (cat +) should return a binding in case of a single expression, or a composition (cat 1 +) in case of multiple arguments. so, what about this: any CAT-like language uses the (: ...) syntax, where the macro : (i.e. 'cat:') transforms the code into a function that maps stack -> stack. this way everything is directly accessible from scheme. for example (cat: 1 +) is a lambda expression. neat. even, ':' could signify THE cat. then 'cat-compile' is no more than (eval (cons 'cat: src)). note that i don't really need to ever run any programs. cat is just functions, and in scheme, they can be applied to data. the thing is, i don't need an interpreter. i just need a proper way of associating compiled code to the original source form (reflection). this does mean giving up some reflection: the current source/semantics association probably needs to change. it's not a small rewrite..

Entry: the macro way..
Date: Thu Jul 26 11:17:42 CEST 2007

let's start with some basics. apparently structures can be used to implement behaviour of procedures, using struct-type properties. this should be enough to convert completely to macros. i started cat-base.ss so, here we go.. all the freedom is there again.

* i'm starting with one modification: the low level CAT source representation is reversed. this makes writing the macros a bit easier. this makes (a .
b) be 'compose a AFTER composition b', so:

  (pn-compose a b c) == (apply a (pn-compose b c))

* 3 phases are separated:
  - compile: atom -> representation of behaviour (apply/cons)
  - compose: list of words -> nested apply/cons
  - abstract: application -> lambda expression
  compile can be recursive due to the presence of quoted programs

* reversal is introduced early on: it's too confusing to have it around after the nested 'apply/cons' is in place. i'm switching from pn- to rpn- prefix at the point of abstraction (converting code to a scheme lambda expression).

* snarfs can be stolen from the previous implementation. maybe the code reversal should use a generic reverse macro too. (done)

* now all that's left is to solve the name resolution.

Entry: separating syntax from semantics
Date: Fri Jul 27 13:05:09 CEST 2007

I got the syntax working. Now i'd like to build an abstraction that takes a binder macro, and produces a compiler macro: cat-bind -> cat::, assuming the structure of the language remains the same. The problem is i keep running into compilation phase problems and i don't really know why. It's quite intriguing, this macro programming. Not quite the same as regular lisp hey :) It's a bit like a lazy language with pattern matching. Maybe it is a lazy language? Would be nice to read a bit about this.. Anyways, i do start to see some programming patterns. I have a problem in that i'd like to keep both semantics and syntax abstract. Currently, i pass around 'compile', but it's too general. I'd like to specialize only some compile behaviour, and keep the rest open. So: message passing! That seems to work quite well. Now, on to semantics.

Entry: macro expansion
Date: Fri Jul 27 16:36:51 CEST 2007

One problem i run into is that (cat: ....) seems to be looking for symbols in the toplevel. I guess if i know why, i'm a big step further in understanding this whole module/namespace stuff.. From the manual: 5.3 Modules and Macros "...
uses of the macro in other modules expand to references of the identifier defined or imported at the macro-definition site, as opposed to the use site." This looks like the 'no surprises' rule, or the 'dynamic binding is evil' rule to me. The toplevel can still be used for dynamic binding, hence the macro expands to (#%top . xxx::+). So it looks like i have only one choice: either i make sure the names are available at the point where the macro body is defined, or i put them in the toplevel explicitly. Let's see if the former is doable. Ok, trivial but still feels a bit weird. Maybe i'm too much accustomed to late binding by 'load/include', which is, as far as i get it, exactly what the module system tries to avoid.

* Circular dependencies are allowed within a module
* Not in between modules
* Undefined symbols in a module are not allowed.
* Any late binding is to be done in the toplevel (but feels dirty)

Ok, time to clean up the utility code.

Entry: control structures
Date: Fri Jul 27 18:24:03 CEST 2007

.. become a lot easier to implement:

  (define (xxx.choose no yes condition . stack)
    (cons (if condition yes no) stack))

Entry: where to store the functions?
Date: Fri Jul 27 18:34:08 CEST 2007

This remains a question. I thought it was necessary to have them in a scheme name space. Not true. As long as they can be identified at compile time, and mapped to storage, all is well. Not true, and also not convenient, because i really can't find a good way to do it except for explicitly creating an empty name space and dumping all the references there. Another thing: i don't really need the extra level of indirection a name space cell provides: it is ok to just mutate the word structure that's permanently attached to a certain name. It already behaves as a cell: instead of

  NAME -> CELL -> WORD

we could just have

  NAME -> WORD

since every cell is a word. So why not just dump stuff into hash tables? If (compile function sym expr) returns a word structure, all is well.
Since my language doesn't have anything else than words, each name simply IS a word. Make that nested hash tables, so i have a mutable real store to go with the functional store. Maybe i can even unify them?

Entry: macros really are better
Date: Fri Jul 27 18:46:44 CEST 2007

* no VM, no custom control structures that invoke the interpreter. just 'apply'.
* functionality can still be stored in a hash table: each name refers to a fixed cell = word struct.
* the hash table needs to be available at compile time

Entry: 2 stores
Date: Fri Jul 27 19:10:32 CEST 2007

Why not store the functions in the functional store? The main reason is that the functional store is supposed to be dynamic, and the mutable store static, never mutated, except for debug purposes. But debug is always! So is there a better reason?

* It's not serializable.
* It's fully derived from source, and just a cache.

So a better division is:
- everything that's completely derived from source, and doesn't change during a regular, non core-sev session goes into the hash store.
- all the rest, the real state which is the result of computations (like assembler labels), goes into the functional store.

Entry: compile time hash
Date: Tue Jul 31 20:14:25 CEST 2007

let's do this namespace thing: a hash module, used at compile time and later at run time to solve all binding problems. something i forgot: a namespace has both runtime and compiletime semantics; however, i need to transfer everything explicitly from compile time to run time if i want to use a hash.. now i am really confused. does this even matter? the hash is not accessible at run-time, but it is possible to have it around at compile time and just have a macro spit out some values.. the real problem is: modules can be compiled independently, and all state accumulated over such a run needs to be saved if it is to be used somewhere else. so what i'm trying to do will probably not work.
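the NAME -> WORD idea from the "where to store the functions?" entry can be sketched in python (illustrative names, not the real word structure): the word struct itself is the mutable cell, so references captured by compiled code see redefinitions without any extra indirection.

```python
# Sketch (hypothetical, not the actual BROOD structures) of NAME -> WORD:
# no intermediate cell, redefinition mutates the word struct in place.

class Word:
    def __init__(self, name, fn, source=None):
        self.name, self.fn, self.source = name, fn, source
    def __call__(self, stack):
        return self.fn(stack)

dictionary = {}                       # NAME -> WORD, no NAME -> CELL -> WORD

def define(name, fn, source=None):
    if name in dictionary:
        w = dictionary[name]          # redefine: mutate the existing word
        w.fn, w.source = fn, source
    else:
        dictionary[name] = Word(name, fn, source)
    return dictionary[name]

define('1+', lambda s: s[:-1] + [s[-1] + 1])
captured = dictionary['1+']           # e.g. a reference baked into compiled code
define('1+', lambda s: s[:-1] + [s[-1] + 2])   # debug-time redefinition
print(captured([10]))                 # → [12]: the captured reference sees it
```

this is exactly the "every cell is a word" point: the word already behaves as the mutable cell, so the debug-time redefinition reaches code that captured the word earlier.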
Entry: got snot working async
Date: Wed Aug 1 22:49:16 CEST 2007

so now it's time to do some real work. i still don't want to give up the idea of putting cat names in modules, and using eval to compile code at runtime. it really can't be that hard. it would be a good exercise to find out what a namespace needs, next to being empty, to just compile code..

Entry: cat and #%top / lexical variables?
Date: Thu Aug 2 09:00:15 CEST 2007

what about this: redefine #%top in the cat syntax expander to go look in the cat namespace. this should enable the use of lexical scope to do name resolution. i found something easier: using 'identifier-binding', names that are not lexical can be drawn from a namespace object. this gives maximal scheme<->cat interplay, while keeping the namespace mechanism we had before. so:

- compilation to lambda expressions
- top level name resolution

are now separate. at this point it looks like i'm where i was before, only with the word rep changed a bit, and lexical scope.

Entry: namespace again
Date: Thu Aug 2 10:19:01 CEST 2007

so all name resolution is a runtime thing. at runtime, a tree of hash tables is available which contains permanent bindings to word structures. the code expands to forms that get bound to this word structure whenever they are executed, using 'delay' forms. so, with this delay mechanism in place, is there a need for storing semantics in word structures? probably not. .. something is not right:

- can't have (apply (delay expr) body ...)
- can't insert a word structure at compile time either

i wanted to do the latter to avoid a delayed expression. the only solution is to use a different apply. ok, i got it now. just using delay in the macro and force in the applicator.

Entry: lot of work
Date: Thu Aug 2 18:40:57 CEST 2007

got myself into a lot of work because i'm not respecting interfaces.. maybe fix that temporarily? it was necessary for the control structures because they're low-level, but maybe not for the rest of the code?
next: the 'compositions' macro, parameterized by:
* source name space
* target name space
* compilation macro

maybe it's best to take a step back, and respect the interfaces.. it looks like this is going to work, so i can just as well make the step and replace the entire vm code.

Entry: weird macro bug
Date: Thu Aug 2 20:52:12 CEST 2007

  ;; This driver could be generalized into eager evaluation for macros.
  (define-for-syntax (process-args op stx stx-args)
    (datum->syntax-object stx
      (map (lambda (x)
             (if (and (list? x) (eq? ': (car x)))
                 (op (cdr x))
                 x))
           (syntax-object->datum stx-args))))

  ;; This utility macro calls another macro with an argument list
  ;; reversed if it is tagged with ':'. This is necessary for PN <->
  ;; RPN conversion.
  (define-syntax reverse-args
    (lambda (stx)
      (syntax-case stx ()
        ((_ m . args)
         #`(m . #,(process-args reverse stx #'args))))))

The code above doesn't work.. Something about the syntax gets lost maybe? Expanding the macro seems to do the right thing though..

Entry: base functionality working
Date: Fri Aug 3 10:34:04 CEST 2007

got cat/cat.ss as an absolute minimum: anonymous and named functions (like lambda and define).

Entry: macro weirdness
Date: Fri Aug 3 10:38:41 CEST 2007

i'm confused again.. syntax-rules macros are like normal order application: in (macro arg1 arg2), the arg1 and arg2 forms are left alone until after the expansion of macro. This is how it should be i guess (the only way to get non-eager evaluation in scheme is by constructing macros). But somehow it's hard to switch between both ways of writing code.. One of the things i miss is to parametrize a macro with an 'anonymous macro': something that behaves as a transformer, but does not have a name. More specifically:

  (compositions (lambda-macro ...) ....)

Is this possible, or am i just confused about something??
and another one: why is it so difficult to get this working:

  (define-syntax lex/cat-compile
    (syntax-ns-compiler cat-ref (cat)))

  (define-syntax syntax-ns-compiler
    (syntax-rules ()
      ((_ ref (ns ...))
       (syntax-rules (global)
         ((_ c global s e) (apply-force (delay (ref '(ns ... s))) e))
         ((_ args (... ...)) (cat-compile args (... ...)))))))

i'm importing the module that has 'syntax-ns-compiler' as require-for-syntax, but i get the error:

  ERROR: cat/stx.ss:146:10: compile: bad syntax; function application is
  not allowed, because no #%app syntax transformer is bound in:
  (cat-compile lex/cat-compile dispatch 3 (pn-compose lex/cat-compile (2 1) s))

but this works:

  (define-syntax define-syntax-ns-compiler
    (syntax-rules ()
      ((_ name ref (ns ...))
       (define-syntax name
         (syntax-rules (global)
           ((_ c global s e) (apply-force (delay (ref '(ns ... s))) e))
           ((_ args (... ...)) (cat-compile args (... ...))))))))

i don't get it..

Update: the answer might be that the latter is a pure rewriting macro, and thus doesn't need any phase separation.. The former does, and the problem is just that i don't understand the separation here..

Entry: list operations on code
Date: Fri Aug 3 13:44:12 CEST 2007

since all compiled code should have its source rep still attached, generic list operations are possible. i'm inserting a call to 'source' for most of them.

Now, why not have 'run' accept data? This will make the language simpler, and representation just a matter of optimization. So.. a consequence here is that there always is a default or base semantics. Maybe that's better.
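the source-rep idea above can be sketched with a toy word struct (the names and the rep are mine, not the actual brood code):

```scheme
;; Toy sketch: a compiled word keeps its source rep attached, and
;; 'source' falls back to the data itself, so generic list
;; operations work on both compiled words and literal code.
(define-struct word (source fn))

(define (source w)
  (if (word? w) (word-source w) w))   ; plain data passes through

(define w (make-word '(1 2 +) (lambda (stack) (cons 3 stack))))
(car (source w))       ; => 1
(source '(a b c))      ; => (a b c)
```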
Entry: Conclusion
Date: Fri Aug 3 14:52:34 CEST 2007

Maybe a bit early since i don't have the old stuff ported yet, but the main conclusions seem to be:

  * name space storage can be kept abstract: it's ok to do part of the binding at runtime, as long as this behaviour is abstracted (cat/lang.ss)

  * defining a new language as syntax instead of explicit interpretation is good, because scheme's scoping stuff carries over: it's possible to only replace the global name space, but to keep lexical variable bindings.

And, macros can be simple, if you stick to syntax-rules. The more general syntax-case can become very confusing very fast. The most important thing to remember for syntax-rules is that it is a DIFFERENT language than scheme! It is normal order (breadth-first) instead of applicative order (depth-first).

So.. time to look into CPS a bit more. There's this SRFI 53 i might have a look at, but before that, i had a go at rev-k and rev-arg in stx-utils.ss. seems to work..

Entry: and beyond
Date: Sun Aug 5 01:14:43 CEST 2007

So.. Maybe it is time to make a proper module based CAT language. Modules really are a nice way to factor a design.. and i am already running into the simplest of problems: name space clutter. A lot of temp functions i'm using are littering the name space.

Entry: porting
Date: Sun Aug 5 16:08:52 CEST 2007

so, i started porting badnop to the new cat core. the first nontrivial problem i run into is 'state-parse'. Maybe i should keep 'define-symbol-table'. This needs some thought, since the whole namespace thing changed. In effect, it's the same: there are still hash tables with functions.

Wait.. the 'make-composer' things need to be macros now.. so, what's needed is sourcedict, compiler, target. currently, 'cat' is sourcedict+compiler, and 'cat!' adds destdict. i need a better naming for this, since it's so general..
Entry: mzscheme things to look into
Date: Mon Aug 6 10:35:47 CEST 2007

  * what is a 'transparent repl'
  * moving more snot functionality to scheme
  * snot and syntax coloring

Entry: anonymous macros
Date: Tue Aug 7 00:30:38 CEST 2007

is it at all possible to have anonymous macros? what i need is to parametrize one macro with an implicitly defined other macro. maybe this is not necessary: it is possible to have 'local' macros, meaning macros defined by other macros, with names from syntax templates. those names never clash, so it serves the purpose.

  (define-syntax compositions
    (syntax-rules ()
      ((_ (gen-def! . gen-args) . definitions)
       (begin
         (gen-def! CAT! . gen-args)
         (compositions CAT! . definitions)))
      ((_ def! (name body ...) ...)
       (begin (def! name body ...) ...))))

Entry: lifting
Date: Tue Aug 7 09:09:06 CEST 2007

when i want to do lifting, a decision needs to be made based on whether a symbol is present in one namespace or not. this is a run-time decision, since i'm using late binding. that doesn't look too difficult. i think i have it now, overriding 'global' and 'constant' methods. the rest should just work.

but. it's good to have a better look at monad theory and the 'lifting' formulation to clean up my terminology a bit. Let's see:

  map    (a -> b) -> (M a -> M b)
  unit   a -> M a
  join   M (M a) -> M a

Setting a 'stack' as the base type t, the monad type M t will be a stack with added state. map is trivial and already used, however, the other two operations are hidden somewhere else: in the words that implement the monad dictionary. Does it make sense to make them explicit? The thing that confuses me is that i am doing the 'lifting' automatically, based on a name space distinction. All the functions inside the monad dictionary actually do the mapping, joining and returning, but in a way that's not factored into those 3 operations.
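restricted to the stack-with-state case above, the lifting can be sketched in a few lines (a toy model, not the actual implementation; 'lift' plays the role of map, threading the state around a pure stack function):

```scheme
;; Base type: a stack (a list).  Lifted type M t: (state . stack).
(define (unit stack) (cons '() stack))        ; a -> M a, empty state

(define (lift fn)                             ; map: (a -> b) -> (M a -> M b)
  (lambda (state+stack)
    (cons (car state+stack)                   ; state passes through
          (fn (cdr state+stack)))))

;; a pure stack word:
(define add (lambda (s) (cons (+ (car s) (cadr s)) (cddr s))))

((lift add) (unit '(1 2 3)))   ; => (() 3 3)
```

join would collapse nested state, but as noted above it stays hidden inside the words of the monad dictionary.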
Entry: state lifting works
Date: Tue Aug 7 11:54:47 CEST 2007

now i need to think about some proper abstraction names, so the 'compositions' declarations look nice and readable. maybe it's best to standardize on the following syntax:

  (compositions (syntax (dst ...) (src ...) ...) def ...)

  * 'syntax' refers to the macro used to compile the body of the code. this is actually a compiler which needs source semantics.
  * '(src ...) ...' are the namespaces representing the source semantics used by the compiler.
  * '(dst ...)' is the namespace used to store the resulting code object.

Entry: program quoting and lifting
Date: Tue Aug 7 13:29:37 CEST 2007

i ran into this before.. in lifted code, how do i quote programs? because of automatic lifting, the only sane way is to default to non-lifted cat semantics. so i need to fix it up a bit.. looks like it's fixed now.

Entry: things that need fixing
Date: Wed Aug 8 12:10:32 CEST 2007

Probably the parser in forth.ss needs to be rewritten.. maybe as macros? The thing that needs to change is that the parser always returns symbolic cat code. No tricks with inserting internal representations.

another thing i need to fix is default semantics: what to do if a symbol is not found? maybe using a parameter? done..

so the parser macros. if it's entirely built on top of the ordinary cat macros, i could disentangle them and get them to work first, then rewrite the parser macro preprocessor. so let's start top-down.

Entry: macro.ss and literal + compile
Date: Wed Aug 8 21:09:02 CEST 2007

now i get it: they need to be in (asm-buffer), and c> and c>word in (macro) need to refer to them. that way 'macro-prim:' can be used together with lexical binding. ok, that works. actually, it's quite cute this way. lexical scope to mix scheme and cat code is nice.. this makes me think: if i implement the preprocessing macros also as lexical extensions, that property remains. maybe that's overkill?
maybe the current code is ok, as long as i make it fully symbolic?

Entry: hygiene and the rewrite-patterns macro
Date: Thu Aug 9 11:35:50 CEST 2007

It's fairly complicated, but the name bindings introduced are only:

  make-word-compiled
  lift-macro-executable
  lift-transform

what if i factor it into 2 parts:

  - a nonhygienic part that creates just the match clauses
  - a hygienic part that binds the function and macro names

It looks like this is sort of working. Now what about preserving syntax information in the expression parts of the match clauses?

  (match --- (pattern expression))

so the expression part can refer to lexical variables etc.. let's do that, but first see if this non-hygienic version works.

one important question: when peeling off syntax with syntax-e, and using datum->syntax-object to put it back, is the original syntax that wasn't peeled off preserved? it really has to be.. seems to work.. at least the expansion does, but i can't see what can go wrong with the quoting..

Entry: reduce
Date: Thu Aug 9 14:34:52 CEST 2007

transforming

  ((a . 1) (a . 2) (a . 3) (b . 4) (b . 5))

into

  ((a . (1 2 3)) (b . (4 5)))

is called 'reduce', at least that's what i recall... but, i think maybe the more general 'fold' is also called reduce sometimes.. so i'm going to call it 'collect' for now.

Entry: require-for-syntax
Date: Thu Aug 9 17:55:48 CEST 2007

look at the macro compiler-patterns. find a way to put the utility functions in a module without getting the error:

  pattern-core.ss:94:11: compile: bad syntax; function application is not
  allowed, because no #%app syntax transformer is bound in:
  (begin (ns-set! (quote (macro +)) (make-word-compiled (quote +)
  (lift-macro-executable (lift-transform (lambda asm (with-handlers
  (((lambda (ex) #t) (lambda (ex) (pattern-failed (quote +) asm))))
  (match asm ((((quote qw) b) ((quote qw) a) . rest) (appen...

i don't get it. when i make them local to the transformer expression, all is well, but using 'require-for-syntax' doesn't work.
i tried the following isolated case:

  ;; Utilities for syntax object processing.
  (module stx-utils mzscheme
    (provide (all-defined))
    ;; Reverse a syntax list.
    (define (reverse-stx stx)
      #`(#,@(reverse (syntax-e stx)))))

  (module test mzscheme
    (require-for-syntax (file "~/plt/stx-utils.ss"))
    (define-syntax reverse-quote
      (lambda (stx)
        (syntax-case stx ()
          ((_ list) #`(quote #,(reverse-stx #'list)))))))

and this seems to work fine, so i'm doing something else wrong..

Entry: CPS macros are fun
Date: Thu Aug 9 18:29:06 CEST 2007

but not really practical when syntax-case is around. now that i'm understanding it a little better, there isn't any reason to keep the CPS macros for list reversal. the other thing to consider is the 'compile' macro. i'm using something akin to CPS there too, only it's more like message passing: pass the current object (self).

Entry: datum->syntax-object
Date: Thu Aug 9 19:17:06 CEST 2007

thinking a bit more.. i'm still not convinced that

  #`(#,@(syntax-e #'some-list-stx))

is doing what i think it is doing: the manual says datum->syntax-object is used, but does it see the syntax substructure?

reading the manual again, now that i know what i'm looking for:

  "(datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx])
  converts the S-expression v to a syntax object, using syntax objects
  already in v in the result."

for (with-syntax ((pattern stx-expr) ...) expr):

  "If a stx-expr expression does not produce a syntax object, its result
  is converted using datum->syntax-object and the lexical context of the
  stx-expr."

then for quasiquoting syntax:

  "If the escaped expression does not generate a syntax object, it is
  converted to one in the same way as for the right-hand sides of
  with-syntax."

so i guess we're ok!

Entry: symbolic macro names
Date: Thu Aug 9 20:05:26 CEST 2007

something i run into is the (macro x) function from comp.ss: (macro:) won't work because the variables in the patterns are symbolic! this is really confusing..
i'm replacing the symbolic function with 'macro-ref' to make it more clear this is a run time symbolic lookup, not something that can be bound once.

Entry: lexical quoted
Date: Thu Aug 9 20:41:58 CEST 2007

with the new syntax approach, i can use lexical variables like

  (let ((xxx (lambda (a b . stack) (cons (+ a b) stack))))
    (base: 1 2 xxx))

Which is convenient. However, i ran into at least 2 cases where the more convenient thing to do is to insert a constant instead of a function. However, the semantics of a symbol is always a function in CAT. Except.. when it is quoted! So what about this

  (let ((yyy 123))
    (base: 1 'yyy +))

Meaning

  (base: 1 '123 +)

??? This is very convenient, but looks a bit weird. The reason is of course that stuff after base: is NOT SCHEME. Quote in the cat syntax only means: "this is data". The benefit of this is that it somehow resembles pattern variable binding as in syntax-rules.

A better explanation is this: The scheme and cat namespaces are completely separate: scheme has toplevel and module namespaces, while cat has everything in a separate hierarchical namespace. The only way they can interact is through lexical variables: this is the only set of names that is fully controllable. In cat expressions:

  * free identifiers come from the associated name space
  * identifiers bound in scheme are
    - used as functions when they occur outside of quote
    - used as data when they occur inside of quote

This can be implemented by mapping quote -> quasiquote, and unquoting a symbol whenever it is lexical. It seems to work fine. Quote for macros is now also fixed.

Another attempt to justify myself: The quote operator in the cat language is NOT the same as the quote operator in scheme code. More specifically: lexical variables will be substituted whether they are quoted or not. i.e. both (base: abc) and (base: 'abc) will be substituted if the variable abc is bound.
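the quote -> quasiquote mapping described above boils down to this (a plain scheme illustration, not the actual expander):

```scheme
;; (base: 1 'yyy +) with yyy lexically bound behaves as if the
;; quoted program were quasiquoted with lexical names unquoted:
(let ((yyy 123))
  `(1 ,yyy +))   ; => (1 123 +), i.e. (base: 1 '123 +)
```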
The quoting just indicates the atom is not to be interpreted as a function, but to be loaded on top of the data stack. The substitution is there to make metaprogramming easier.

Entry: pattern transformer extensions
Date: Fri Aug 10 10:23:06 CEST 2007

I'm trying to perform the pattern extensions properly. A true test about this phase separation thingy, since i have a couple of phases here:

  0 matcher runtime
  1 execution of pic18-pattern transformer
  2 execution of pic18-pattern transformer generator

an extra problem is that i'm matching transformer names -> syntax transformers. this gets a bit complicated, because the name of the pattern generator, i.e. 'unary->mem', is used both as a macro template, and as a function name, so the transformer generator needs to be generated!

too many levels of nesting: this has to be simplified somehow.. wait: the thing that needs to be generated is a pattern expander function, which can be used in pic18-comp.ss to create the extended compiler-pattern macro.

ok, i'm running into the problem again: if i put pic18-meta-pattern and pic18-process-patterns in a different module, and require-for-syntax it, i get the #%app error again.. so until i figure that out, maybe best to always use local transformer procedures? i guess it has something to do with binding identifiers. here the problem seems to be 'lit', in the binary-2qw pattern..

i ran into the problem again, in pattern-utils.ss / extended-compiler-patterns, and it was something like:

  #`(namespace . #,( ---- ))

which needs to be

  #`(namespace #,@( ---- ))

because the #, expands to an s-expression, which is then just inlined too, leading to 'process-patterns' not being quoted. weird stuff can happen.. ALWAYS CHECK EXPANSION when this #%app thing occurs!

wait.. that's not it.. damn! i'm going to leave it at not using any syntax for it.. it's not too bad, and understandable.
Entry: for-syntax
Date: Fri Aug 10 14:08:57 CEST 2007

I still get into trouble with higher order stuff where i completely don't understand what's going wrong. Well, i guess it will come with time. I'm glad i got syntax-case to a point where i don't need to use unhygienic constructs any more. And, if i get into trouble with name bindings, it's always possible to put local functions in a transformer expression. I'm done for today though.. head is hurting :)

Entry: preprocessor
Date: Fri Aug 10 14:50:57 CEST 2007

Time to adapt the parser/preprocessor, and change it to something purely symbolic. Seems to work. It's a lot simpler too now that the macro representation supports quoting etc..

Entry: cat name space organisation
Date: Fri Aug 10 17:45:16 CEST 2007

Things changed:

  * i found a way to easily debug modules and scheme code using snot
  * cat code is now fully embeddable in scheme
  * things got separated out a bit more

So, what i'd like to do is to separate pure scheme code from stuff that accesses the global 'stuff' name space. This doesn't include private name spaces that are only written in a single module, like 'asm' or 'forth', but it sure does include 'base'.

base is full of junk.. maybe that's the real problem? Maybe i should just implement more in scheme, and have this base thing as a scripting language only... Or. I need to add an easy syntax for creating 'local words' using the lexical stuff i have now.

  (letrec ((a (base: 1))
           (b (base: 2))
           (c (base: 4)))
    (begin
      (ns-set! '(base broem)  (base: a b + c /))
      (ns-set! '(base lalala) (base: a b c))))

  (local ((a 1) (b 2) (c 4))
    ((broem  a b + c /)
     (lalala a b c)))

this requires a different syntax, since the anonymous compiler needs to be available always. very straightforward. works like a charm.

Entry: wordlist search path
Date: Sat Aug 11 15:21:18 CEST 2007

i need to change the 'find' macros so they accept multiple paths.. one thing i'm wondering about is how 'force' is implemented..
somehow i suspect that the thunk is not erased... it probably is.. found it:

  (define (force p)
    (unless (promise? p)
      (raise-type-error 'force "promise" p))
    (let ((v (promise-p p)))
      (if (procedure? v)
          (let ((v (call-with-values v list)))
            (when (procedure? (promise-p p))
              (set-promise-p! p v))
            (apply values (promise-p p)))
          (apply values v))))

the thunk is erased. the only thing to optimize is to not use the values stuff, but a single return value. probably not worth it. Haha, something i didn't see at first there: the p is either a procedure or a list, so i can't make a single atom of it, because it could be a procedure :)

another trick: creating the ':' macro at the spot where the '(language)' dictionary is created and populated with primitives uses the module dependency system to somehow enforce dependencies on namespaces, which are not checked.

Entry: next: namespace
Date: Sun Aug 12 21:10:45 CEST 2007

more specifically, it's time to start using eval on the 'macro:' syntax, and i run into the problem that this happens in a toplevel where it's not defined. can you tie 'eval' to a module context?

update: what about making this namespace explicit? i just need a single namespace object which contains all the relevant compilers. hmm... it looks like the easiest way to implement this is to require each 'lang:' macro to be associated with an 'eval-lang' function. argh... can't do that, since those also need eval.. looks like i can't escape this namespace thing..

Entry: lazy data structures
Date: Mon Aug 13 01:33:23 CEST 2007

so, what do i need to make the assembly process lazy? match needs to work on lazy lists.

Entry: disentangling
Date: Mon Aug 13 10:43:14 CEST 2007

i can't just separate out all the code that defines names in the namespace, because the namespace is used for other things. there's some conflict here.. what about i start doing it anyway:

  - more namespaces
  - populate each name space at the same place where the scheme code that uses it is exported.
the real trick is of course to see the direct specification of namespaces as 'internal'. this should be wrapped by functions.

Entry: name space trick doesn't work
Date: Mon Aug 13 15:56:35 CEST 2007

the problem seems to be that data constructed using the struct from rep.ss is different from data constructed using run time evaluation in the separate namespace.. i need a different approach. first to identify the problem:

  - mzscheme has strict phase separation
  - "(eval '(macro: ,@src))" works when the runtime name space, the current one when that code is executed, has that macro available.
  - somehow, 'rep.ss' gets loaded twice, since i run into incompatible instances.

now, why is it loaded twice? one thing i could eliminate was a for-syntax dependency on rep.ss, so the messages are a bit less confusing now. it looks like the namespace trick creates a new instance. that's where my trouble is.

let's simplify it a bit. the only thing that really needs dynamic compilation is 'macro:'. so i'm going to put that dependency in the code itself. but this should be independent of the problem, so i leave it like it is.

i found a solution: just make sure the current namespace has the right symbols. there's no other way. currently i just dynamic-require them in, but i don't know if this is better than just requiring them at load time, or namespace creation time.. it does feel a bit dirty though.. why use modules if they start injecting stuff in your namespace? Maybe i should just pass the macros upstream.. whatever..

Entry: quoting parsers
Date: Tue Aug 14 12:59:37 CEST 2007

They seem to work now. Map well onto the literal stack / typed macros approach. The question is how to map them. It doesn't look like a good idea to keep the same symbol, but how to change it then? I'd like both ' load and load to work. Prefixing them with 'def-' seems not right. What about "/load"? A single symbol seems the right thing.. ~#$%& are ugly. "*load" seems a good compromise.
I got it sort of working, and factored it out a bit. However, it might need a bit of name cleanup to distinguish the following source representations:

  - a filename
  - a list in symbolic forth format
  - a list in symbolic macro format
  - the latter compiled into executable code

I switched to the following naming convention:

  - files use 'load'
  - strings use 'lex'
  - list -> compiled code uses the ':' prefix
  - all others operate on lists directly

Entry: default semantics
Date: Tue Aug 14 13:55:56 CEST 2007

Also, i start to wonder if it's a good idea for 'run' to take literal lists as an argument. The only real benefit is Joy-like introspection, but since in badnop most source reps are macros, this doesn't make sense: source is not the only thing, semantics needs to be added, so default semantics might not be good.

Entry: long run times
Date: Tue Aug 14 14:46:10 CEST 2007

seems to go into a loop somewhere.. time for a break. it seems it's just really slow! and all the time is spent during compilation. looks like i got some quadratic things going on in the expansion.. so, i suspect this is syntax-rules.. i'm using a lot of rewriting to avoid syntax-case.. maybe i should just come back from that? already eliminated the if-lexical? macros.

ok.. let's see. first make the expansion a bit less dramatic: some things can be abstracted in a function. then i replace rpn-compile by a single syntax-case macro. it still calls the compile macro, which can be customized by stuff built on top.. there's no difference in speed, so i guess it's somewhere else..

so it's this:

  (rpn-compile *forth* 'macro:)

if *forth* is about 150 atoms, there's about a second delay. maybe it's the nesting of the macro? i wonder if i can write a macro that's faster.. let's try something different. currently, the rpn-abstract macro is using fold. it's still calling the 'compile' macro. looks like that's what i need to replace. so, how to implement modified behaviour? instead of using macros, why not use functions?
i do need proper phase separation to do this. let's see if that's possible by moving stuff out to for-syntax-rpn.ss

Entry: running into #%app trouble again
Date: Tue Aug 14 22:15:17 CEST 2007

This is the smallest example i could find that doesn't work as i expect..

  (module for-stx mzscheme
    (provide (all-defined))
    (define (break-stx fn args)
      #`(#,fn #,@args)))

  (module test-stx mzscheme
    (require-for-syntax "for-stx.ss")
    (define-syntax (bla stx)
      (syntax-case stx ()
        ((_ fn . args) (break-stx #'fn #'args)))))

Then, putting the 'break-stx' definition inside the define-syntax def works fine.. Ok, if i change the quoting mechanism above to:

  (define (break-stx fn args)
    (datum->syntax-object fn (cons fn args)))

It does work. Now i'm really confused. I found this on the plt list:

  http://groups.google.com/group/plt-scheme/browse_thread/thread/327013d5c6f61017/9a12e93d683a5f94?lnk=gst&rnum=2#9a12e93d683a5f94

(require-for-template mzscheme) in the module that generates syntax seems to solve it.

So, on to replacing the old 'compile' macro with a functional approach, which works a lot better. There's really no reason to mess with syntax-rules for anything else than simple patterns.

Entry: disentangling
Date: Wed Aug 15 14:19:05 CEST 2007

  rpn-tx.ss       lowlevel syntax generation, parameterized by 'find'
  rpn-runtime.ss  runtime support for the above
  rpn.ss          bind a 'find' closure generator to lowlevel syntax
  ns-utils.ss     support code for namespace lookup, to be used in find closures
  state-stx.ss    namespace namespace -> state syntax "language:" compiler
  base-stx.ss     namespace -> base syntax "language:" compiler
  composite.ss    create named words from compiler

Entry: mission accomplished
Date: Wed Aug 15 15:15:54 CEST 2007

Looks like i got it back online. The transformer works a whole lot faster now. Let's repeat the conclusions:

  - don't use syntax-rules if you need CPS tricks. it's ad-hoc and slow. use syntax-case with real functions instead.
  - when using complicated syntax-case macros (compilers for embedded languages), separate out the transformer procedures and the template runtime support into different modules, so they can be tested separately. I did this for pattern.ss -> pattern-tx.ss and pattern-runtime.ss

Entry: better error reporting
Date: Wed Aug 15 16:21:00 CEST 2007

so... it would be great to be able to relate errors to where they occur in the source code. however, to use the builtin syntax readers i need to move both the lexer and the parser so they can operate on / generate syntax objects. pretty clear what's to do next then:

  - rewrite the forth parser so it operates on syntax objects + create a proper 'forth:' macro that goes with it.
  - make the lexer behave as 'read-syntax'

when this is done, i should be able to compile forth files straight away.

First part was easy: driver works. The rest should be straightforward. However, moving this to compile time requires some phase magic... I was thinking about doing a proper phase separation in the forth code too. Instead of defining macros as a side effect, it's probably better to isolate them.

Entry: predicates->parsers
Date: Wed Aug 15 21:57:18 CEST 2007

I don't remember why exactly the map 'vm->native/interactive' is not purely syntactic. it really should be.. refer to previous code to find the previous functionality, but i'm breaking it and taking out the 'dict' dependency, and will replace the predicates->parsers with something that doesn't evaluate.

the previous 'predicates->parsers' behaviour is too dense. took me a while to understand it. better to separate it out into different mechanisms:

  1. syntactic transformation
  2. run time symbol lookup

Entry: produced first monitor.hex
Date: Thu Aug 16 00:31:51 CEST 2007

looks like i got it mostly running now. didn't test the code yet since a lot of things are still broken, mostly the interactive part. but it looks ok.
Entry: brood 4
Date: Thu Aug 16 00:36:17 CEST 2007

enough things changed, and i've been in a broken state for a bit now. this means it's time to up the version, and rewind the 3.x archive to a working state. it's archived as brood-3 on apatheia. this is the last patch included:

  Mon Jul 23 21:29:12 BST 2007  tom.goto10.org
    * namespaces and next projects

at that time i was changing stuff to the boot block.. i'm not sure if that code actually works.. might be better to revert a bit further back, till after the workshop.

Entry: next
Date: Thu Aug 16 01:01:53 CEST 2007

  - test the target code, see if the monitor still works
  - fix the interaction code
  - fix the vm interaction/compile code
  - fix snot for interaction/compile mode
  - factor some badnop code: use local words

Entry: separate compilation
Date: Thu Aug 16 10:51:12 CEST 2007

got me thinking: can't i do the separate compilation trick for macros? i already ran into the non-transparency problem several times: trying to define some code with some macros not defined.. one of the problems is 'constant': it needs run time compilation so i can't just do this... another is that macros defined immediately start influencing compilation of code after their loading. but.. can the loading of forth files be made free of side effects? or at least somehow separated? let's see what kind of side effects we got:

  constant-def!
  2constant-def!
  macro-def!

those are easily isolated into separate dictionaries to separate 'core' from 'project' macros and constants.. as long as project macros are loaded AFTER core macros, they can be safely deleted as a whole.

the short version: it's impossible to change it now without real phase separation..

Entry: literal pattern matching
Date: Thu Aug 16 11:20:58 CEST 2007

Patterns like

  ((['qw a] ['qw name] *constant)
   (begin (def-constant name a) '()))

are a bit redundant..
a better notation would be "(a name *constant)".

Entry: assembler cleanup
Date: Thu Aug 16 11:36:25 CEST 2007

Can't i get rid of the 'constants' namespace? Again, why are they different from macros? To postpone symbol -> number conversion until assembly time. So they can't be macros, because at assembly time all macros have run.

Entry: compilation syntax
Date: Thu Aug 16 13:42:21 CEST 2007

i'm thinking about adding some syntax to compile code using different syntax.. (a b c) is still quoted code with default semantics, but (lang : a b c) is interpreted as compiled with 'lang'. or maybe (lang: 1 2 +)

let's see if i can do this first on the rep.ss level: just store a symbol naming the rep. probably the first thing that needs to change is to change state-stx to take an anonymous compiler as 2nd op.. it's really annoying to be at the border of compile/run the whole time! first, the above is not really possible since the state-stx fallback code is not derived from a named compiler.

Entry: override semantics
Date: Thu Aug 16 15:05:36 CEST 2007

Introduced the (language: ...) syntax for overriding language semantics while quoting code. It's implemented as follows: the default 'program' compiler checks if the first symbol in a list ends in ':'. if so, the whole expression is passed to the scheme expander, otherwise the default 'represent' method is used to compile the code anonymously. It's a small step from here to a 'lambda:' macro.

I also fixed the semantics annotation. However, it is possible to run into code which doesn't have the semantics annotated because it works with an anonymous macro.. This could be cleaned up, but i guess it serves the debugging purpose for now: 'ps' displays macros as (macro: ....)

Entry: name mangling
Date: Thu Aug 16 16:01:27 CEST 2007

Maybe i should give the name mangling a go again.. If i recall, the thing i did wrong last time was to get rid of syntax information for names, so they were mapped to toplevel names.
This helper seems to do the trick:

  (define (prefix pre name)
    (->syntax
     name  ;; use original name info
     (string->symbol
      (string-append
       (symbol->string (->datum pre))
       (symbol->string (->datum name))))))

So basically now i have a mechanism to use the mzscheme module system for handling namespace and dependency management. I bet i can use some kind of 'module-local?' predicate on the syntax to find out if a name is local to a module, and if so use that instead. I guess it's a good time to find out if i have the namespace stuff sufficiently abstracted.

Something about naming conventions: the 'rpn-' modules do not need or depend on the namespace implementation. I do need a different kind of 'compile' macro, but for the rest it works perfectly. Maybe time to rename some things..

All well and good, but how do i combine them? Doesn't seem like a good idea right away.. This works better as all or nothing..

So, combining runtime namespace lookup and static modules.. how to? One of the things to change is to not inherit from a namespace, but from a named compiler macro.

What about starting from the ground up? Making the base language static, then moving things from dynamic -> static? Starts with snarfing. Instead of snarfing to a dictionary, snarf to a prefix. Start with separating primitive.ss into snarf.ss and ns-snarf.ss

So in principle, it should be really easy now to move the implementation of base.ss to static functions without anybody noticing. That is, if i can somehow make delegation work using just a language: macro instead of namespaces..

Entry: the royal DIP
Date: Thu Aug 16 17:35:27 CEST 2007

i guess the solution is to use 'dip' from base to create state syntax abstractions. and maybe, to add an optimization that (+) does not create an extra lambda, but returns the primitive + right away. i guess the optimization can be left until later..

so.. the idea is to make the delegate compiler abstract. this requires quite some change, but should make the code a lot simpler..
it would also fix the annotation problem mentioned above. so

    (ns-base-stx badnop: (badnop) base:)

instead of

    (ns-base-stx badnop: ((badnop) (base)))

let's call this 'extend-base-stx'

haha. gotcha! of course, the delegate: is a static thing, and the namespace delegation is a dynamic thing, so there's no way to compile this: the information necessary to decide about delegation is not available at compile time, when the delegation needs to be frozen.. it needs to work the other way around!!! if a symbol is not defined at compile time, the resolution can be postponed until runtime.

so i guess the gentle way to move things to a static implementation is to use the 'module-local?' predicate mentioned before. that way module-local symbols can bind first, and cannot be overridden.

the number of methods in the compiler is getting larger. maybe use real objects? prototypes? also, if i use a decent prefix, symbol capture is not a real problem, so i can turn it on always? maybe just a dot or a pound sign..

Entry: pff... done coding..
Date: Thu Aug 16 20:49:59 CEST 2007

today was a bit intense. i'm starting to get a bit more of this syntax / lexical / static stuff.. it would be nice to make more things static. there are only a couple of places that have 'plugin' linking. one of them is 'literal' and 'compile' in the macro-prim dictionary, so it looks as if i do need some dynamic binding. however, i wonder if it's not better to solve this using units. more standard tools = better, now that i know what i want at least..

one thing is bugging me though. a somewhat paradoxical thought: i'd like to define words that fall back on another dictionary. however, using static linking there is no such thing: a symbol is there, or it is not. and there's no override.. maybe i should stick to dynamic.. it's really different and there's no easy migration to static.
Entry: if i go static
Date: Thu Aug 16 23:29:47 CEST 2007

one name-mangled namespace is enough, since i can use ordinary modules to organise code and hide details, just like in scheme. let's stick to 'rpn.'

so i built that in: names like 'rpn.xxx' that are visible at compile time get used as functions, and bind variables 'xxx' in the compositional code, just like lexical variables.

it looks like delegation from dynamic -> static parts is not possible. since this is quite a deep thing to change, i'm not going to. it's still possible to move highly specialized code into modules to shield them from the main dictionaries.

what is possible is to add a static interface to words in 'base'. they could still include code to register with the dynamic space also, but at least this would make it possible to freeze some functionality. so, maybe this: all base words are exported
- as rpn.xxx variables from the rpn-base.ss file
- in a dynamic dictionary from base.ss, which gets the functionality from rpn-base.ss

it is a bit confusing.. maybe leave it as is..

Entry: because i can
Date: Fri Aug 17 00:05:30 CEST 2007

there's a lot of 'because i can' code in tethered.ss ... as i found out, some tasks are just easier to code in scheme. if it's anything algorithmic, meaning intricate data dependencies, you're usually better off writing a scheme program. scheme programs are easier to understand, probably because they are a bit more verbose, and because 'automatic' permutation and duplication of names avoids mental gymnastics for stack juggling. there is nothing in the way now that i have both 'base:' and 'prj:' in scheme, and 'scheme:' in cat.

what is the cat code useful for then? simple patching and scripting, there it clearly wins. as long as not too much data juggling is needed, cat is really easier to patch things together. also, imperative code looks nicer in cat. because cat is just composition, it looks sequential. it happens that all (most) imperative code i use is for communication.
in scheme imperative code always seems ugly.. maybe it's because synchronisation is easier to imagine in a linear instruction flow: threads of execution joining together at certain points, breaking the linearity of composition?

Entry: joy
Date: Fri Aug 17 01:28:07 CEST 2007

added a joy interpreter. it doesn't have much, but it can do

    ((dup cons) dup cons) i

Entry: interaction
Date: Fri Aug 17 10:45:30 CEST 2007

got a bit off track again.. time to fix interaction. first thing to do was to put the 'tinterpret' and 'tsim' code in prj.ss together with the supporting code dip/s and ifte/s

so.. why is this so ugly? by default, quoted programs and run + ifte use a functional context to limit surprises. however, sometimes i want to do things like:

    (tsim (prj: dup tfind not) dip/s (prj: tinterpret) ifte/s)

the xxx/s words are the analogs of xxx but pass stack + state to the programs, and 'prj:' compiles state words. is there a way to do this automatically? probably not using my current setup, unless i make 'run' understand state words, which means they should be type tagged. since that only takes away the /s notation, i'm not going to do this. so the convention: functionals do NOT pass state to quoted programs, while the corresponding /s words DO.

but... what if one uses types to do this automatically? that would mean the core 'apply' routine should be made aware of state, and rep.ss should implement some kind of tagging for state words.. what would be the real problem?

Entry: Monads
Date: Fri Aug 17 12:10:45 CEST 2007

i don't know much about type theory, but i think i understand how my ad-hoc approach relates to monads using the unit-map-join formulation.

    X is state type
    S is stack type
    ( . ) is cons

    unit :: S -> (X . S)
    map  :: (S -> S) -> ((X . S) -> (X . S))
    join :: (X . (X . S)) -> (X . S)

so 'unit' introduces a new state object on the data stack. 'map' will create a function that does what the original did, but ignores the X part, and 'join' will accumulate one piece of state into another.
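the unit/map/join signatures above can be sketched directly as list operations. this is a hypothetical toy model, not the actual brood code: stacks are lists, the state object X sits on top, and 'join' is parameterized by the function that accumulates one state into another (append, in the case of assembly lists).

```scheme
;; unit :: S -> (X . S) -- push a fresh state object
(define (unit initial-state)
  (lambda (stack) (cons initial-state stack)))

;; map :: (S -> S) -> ((X . S) -> (X . S)) -- run fn underneath the state
;; (named smap here to avoid shadowing scheme's map)
(define (smap fn)
  (lambda (xs) (cons (car xs) (fn (cdr xs)))))

;; join :: (X . (X . S)) -> (X . S) -- accumulate one state into another
(define (join combine)
  (lambda (xxs) (cons (combine (car xxs) (cadr xxs)) (cddr xxs))))

((unit '()) '(1 2))             ; => (() 1 2)
((join append) '((a) (b c) 1))  ; => ((a b c) 1)
```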
the first two are trivial, and i do use them fairly explicitly. but the last one seems to be hidden a bit deeper, because i never use it explicitly: every state dictionary has a couple of words that bring stuff into the monad, but they have type:

    A is assembly opcode
    asm :: (X . (A . S)) -> (X . S)

here 'A' is not the same as 'X', but in spirit it does the same flattening operation. looks like i'm missing some of the fun. clearly the 3-law formulation has some benefit due to a higher level of abstraction, but what would it bring me to make this a bit more explicit?

first of all, i need a proper type system. the monad objects should be somehow tagged. that way 'unit' and 'join' can be made polymorphic. 'map' should not be polymorphic, given i implement monads as 'things on the top of the stack'.

    ;; map
    (lift (dip) curry)

the other two are problematic. 'join' is possible to do, since monad types could be tagged so it _could_ be made polymorphic. but 'return' / 'unit'.. such polymorphism won't work because i can't infer the type! i.e. 'return' is normally plugged into some expression that expects a monad type. i have no way of determining something like that, so 'return' should have an explicit annotation, probably best using just a different name. for example, the assembly 'return' for a single opcode would be

    (asm:return '() cons)  ;; wrap the single opcode in a list
    (asm:join append)      ;; concatenate the 2 state lists

what i do is just combine those 2 operations into one that conses a single opcode onto the assembly list. i think i sort of get the gist of it.. or not?

so, the other formulation uses bind:

    (X . S) -> (S -> (X . S)) -> (X . S)

so bind is like 'join' in that it combines a monad data type with a function that maps from outside the monad to inside, and returns a monad type. note: because i have only one type (a stack: each function maps stack -> stack):
* in the general case, the source and destination monads for the 'bind' operator do not need to be the same, but in my approach they are, since there is only one type that can be "monadified"
* i do not have the concept of a type constructor (types do not have an abstract representation), so i can leave that out.

so, a stupid question maybe: how do you get stuff OUT of a monad? i think there's something i didn't get. the type signature of bind is:

    M t -> (t -> M u) -> M u

so i guess if M u == t, bind can get things out of a monad. in general, it can get t out of M t (multiple times!), apply the function (t -> M u) (multiple times), combine 'stuff' from M t with the (multiple) M u, and return an M u.

conclusion: not having 'real types' makes all this a bit difficult to formulate.. it might be a nice exercise to try to do it anyway. a nice base for some more reading on the subject might be "Monadic Programming in Scheme" http://okmij.org/ftp/Scheme/monad-in-Scheme.html which talks about the case where there's a single monad, or where types of different monads do not get mixed.

Entry: source annotation
Date: Fri Aug 17 16:45:25 CEST 2007

really.. does it make sense to NOT have the source annotation be formal, if with a little more effort it can be? It's sort of formal now.. Things that are not compilable have #f semantics; the others are created straight from the named macro, so should be right, or by composition from such, so should be right because all code is syntactically concatenable.

It sort of strikes me as odd that i can't have 'curry' or 'lift' defined in a generic way, because quotation of data is not standard. I could try to force it. Anyway, for 'lift' i only need base semantics/syntax. Wait, lift is possible if semantics is defined, but it requires that quoted programs are always available (even in forth macros!)

Entry: and so on..
Date: Sat Aug 18 01:48:03 CEST 2007

time to get back to pic programming...
i didn't really anticipate this static change and the move to brood 4. but things are really better this way. once the pic part is back online, it's time to look at interaction macros, or how to create interactive meta functions. so, timeline:
- interaction macros
- the standard 16bit forth (requires interrupt driven serial I/O and an on-chip lexer + dictionary)
- write something about compile/runtime and the different ways to fake the single machine experience.

Entry: state
Date: Sat Aug 18 14:34:18 CEST 2007

got interaction working. i changed it so the commands available in interaction mode need to be specified explicitly. this has to be done for commands that take arguments anyway, so why make an exception for 0cmd? see interactive.ss

so, i've been doing the snot-run thing, which works quite well. it's a blessing that state is stored elsewhere, so my function core can just be reloaded. however, there are a few spots where i'm still using state.. one is IO. since it's non-functional anyway, storing the name of the serial port couldn't hurt, right? wrong.. on restart, it needs to be reset. i made the 'boot' word which loads all the macros from source. this is slow, i guess because of the constants? so maybe i should just put the constants back as a scheme file..

Entry: KAT and TAK
Date: Sun Aug 19 14:05:15 CEST 2007

I'm looking for a better way to explain the pattern matcher. Usually generalizing helps. The reason why it seems special is that it is only used with "macro pattern" and "quasiquoted scheme template".

Entry: no phase separation
Date: Sun Aug 19 15:16:21 CEST 2007

Now that i finally understand the point of phase separation, i wonder if i can do something similar with the forth? Maybe it's not necessary for small projects, but it does feel a bit weird to first struggle to write scheme code that obeys mzscheme's phase separation rules, to see that it's a good thing, and then to go back to some non-separated way.
I see a roadmap on how to do this: just turn everything into scheme syntax. The result after loading is a single function that generates the program when evaluated. That way i know i'm going to get there. On the other hand, i do not know what to give up then. My whole design needs to change.

Another way is to do it incrementally. First make sure i can separate code into macro definitions and the rest. For just macro definitions this is not so difficult. However, constants are a different story, since they require compile time computation.. I guess that's where the problem is:

Entry: constant
Date: Sun Aug 19 15:27:10 CEST 2007

What are constants? Phase separation violation! In contrast to normal macros, which obey separation because they do not use any values created at compile time, macros generated by 'constant' join 2 phases. The 'constant' word could be termed a "phase fold". The compiler after 'constant' is not the same one as before: it is extended with a macro.

This kind of behaviour prevents modularization of code, because it is not clear what the definition of the new macro depends on. The only thing that can be assumed is that it depends on all the previous code, and that all the following code depends on the new macro. The solution is that this behaviour needs to be unrolled: instead of updating the compiler on the fly, an extension phase (where macros are defined) needs to precede a compilation phase (where macros are executed).

There is a general way to unroll 'constant': split the code in 3 parts: the part before, the definition of the new macro, and the code after. This is rather cumbersome and entirely unnecessary.. However, in the case of Purrr18 it is usually possible to transform the code to a macro definition.
Instead of writing

    1 1 + constant twee

one could write

    macro : twee 1 1 + ; forth

This enables the macro definition to be distinguished from the rest of the code, to clarify the dependencies of a file's plain code on the macros defined in that file. The only reason not to do it the second way is that it loses the name 'twee' in the eventual assembly code.

Removing 'constant' could lead to better transparency in the code: compiled macros could then be seen as 'only cache'. Note that i would do this just for more transparency, not to eliminate undefined symbols: macro name binding is still late.

Entry: phase separation
Date: Sun Aug 19 16:34:26 CEST 2007

So, a forth file contains both macros (M) and forth (F) code. The forth code always depends (->) on the macros (M -> F). If a forth file depends on another forth file, the macros from the former depend on the macros of the latter, and the forth code depends on both macros and forth code from the latter. Due to transitivity, the arrows from M -> F in between files can be omitted, so one gets something like

    Ma -> Fa
    |     |
    v     v
    Mb -> Fb

where the arrow from Ma -> Fb is left out.

What this would buy me is that i solve the problem of keeping the macros consistent with the state of a target: Target state is a consequence of compiling all the Forth code in a project. However, as a side effect, a project defines macros that are used to generate this code in the first place. There needs to be a clean way to 'reload' these macros from the source code, so we can connect to a target with the macros instantiated.

I'm trying to see how to make this more rigorous: how to make incremental compilation work without having to manage dependencies yourself? Basically, how to map the nice module system of mzscheme to incremental Forth development. This is clearly not for now. It requires a lot of change.
One of them would be management of storage on the controller: if dependencies of separately compilable modules are fully managed, incremental uploads are still possible, and become 'transparent'. I.e. changing a module but not its dependencies still makes it possible to update a system on the fly, but in a transparent way. I'm still quite happy with the ad-hoc hacked-up way of incremental development. But knowing this is possible might make the itch a bit stronger.

Entry: dynamic updates and functional programming
Date: Sun Aug 19 16:53:53 CEST 2007

I guess most of this train of thought started after i got to using sandboxes with SNOT. Currently it works more or less like this: SNOT (the bootloader)
* manages memory: stores project state in a single toplevel variable
* manages a purely functional sandbox
* implements a REPL outside of the system

the edit-compile cycle runs like this: changes are made to the collection of functions that acts on the state, and a compiler recompiles those that have changed. then 'restarting' the system is almost instantaneous: the state remains, only the operations on the system change. the requirement for this is of course that all state is stored in a fairly concrete way: representation must not change from one version of the system to the next. if representation changes, a small 'converter' could be made..

What i'd like is something like a smalltalk environment, but for scheme. A lisp environment with incremental loading comes close, but transparency is necessary. Smalltalk solves this by being completely dynamic: compilation is just cache, and code can be edited on the fly. There is no 'off', it's always running. MzScheme solves this by being static, but with well-managed dependencies and separate compilation to make 'restarting' cheap. There is an 'off', but it can be made small. Using the approach above, managing ALL state separately renders a virtually always-on system.
The off period can approximate zero since it's just "swapping a root pointer" once the code is compiled. Compilation can take longer if changes to core modules are made, but there remains a 1-1 correspondence between the system and the source code. I guess it's possible and not even too difficult to delay compilation in the scheme case, making compilation behave more like a cache.

Entry: purification
Date: Sun Aug 19 17:17:13 CEST 2007

So i need to eliminate state. There are 2 cases where i've introduced state because i thought it "wouldn't hurt"..
* the target IO port
* the project search path

The rest really behaves just as cache. So if i'm allowed to be really anal about eradication, these things need to change. The project search path is the easiest. Target I/O is more difficult because it requires moving from a functional to a monadic implementation.

Entry: eliminating global path variable
Date: Sun Aug 19 17:41:15 CEST 2007

to be able to eliminate the path state, i probably need dynamic variables (parameters). is this cheating? not really.. since i'm using with-output-file already, and that doesn't really feel like cheating. this would also solve the problem with IO of course.. still i'm not convinced it's not cheating.. one could say it's not cheating because the value has finite extent?

so why not implement monads as dynamic variables? because dynamic variables are not referentially transparent, which you would want when you 'run' a monad: it should act just on the state provided, not on something else... so why are parameters different then? are they less evil when they are constant? they represent 'context'.
* one thing is sure: they are less evil than global variables due to limited extent.
* if they are constants, they are less evil than when they are not

The way to really answer the question is to implement dynamic variables with monads, and see how they are different.
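the 'finite extent' argument can be seen in a few lines using mzscheme's built-in parameters: the new value only holds inside the parameterize body. (the name search-path here is just an illustration, not the actual brood variable.)

```scheme
;; a parameter with a default value
(define search-path (make-parameter '("/usr/local/lib")))

;; inside parameterize the parameter is rebound...
(parameterize ((search-path '("/tmp")))
  (search-path))   ; => ("/tmp")

;; ...and afterwards the old value is back: finite extent.
(search-path)      ; => ("/usr/local/lib")
```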
The problem i'm facing in my ad-hoc state hiding approach is that i can't combine monads: when i'm running something in the macro monad, i can't access anything else. To have access to the path, the monad should be bigger and include 'compilation context'.

The real solution is of course to make compilation independent of file system access. Source code needs a preprocessing step that expands all 'include' statements. Since it's only one keyword, this can be implemented in the forth-load function. That function already implements 'file system dereference', so why not include path searching?

Ok, made it so. 'load' is now a load-time word so file system access is concentrated in one point. 'path' is removed: this needs to be specified in the state file, because it really is a meta-command.

Entry: cleaning up interaction
Date: Sun Aug 19 19:07:52 CEST 2007

This is the biggest change. Probably best to separate it out into a different monad. The state associated with interaction is:
* I/O port
* target address dictionary
* assembly code

With this data it can start assembling code and upload it to the target. But.. looking at the contents of the state file, there is not much else!

    (forth)       ;; might come in handy for interaction
    (file)        ;; in case we want to access the file system
    (config-bits) ;; on the fly reprogramming? some day probably
    (consoles)    ;; this is the only real meta data not necessary for interaction..

maybe it's not worth it to split interaction off of prj. maybe it's even just a bad idea: you'd want the 'fake console' to have power over the whole project. impossible without giving it all the state. let's just clean up tethered.ss and move functions out to badnop.ss

but... i'm using with-output already. so why not just have the i/o commands do the same? done. this immediately solves the problem of having more than one device attached, i.e. a distributed system with all identical devices.
Entry: side-effect free macros
Date: Sun Aug 19 20:00:19 CEST 2007

i was thinking: if macros are side-effect free, constants can be eliminated, because it's always possible to see if a macro is just a constant: execute it, and if the result is '((qw )) it is! the only thing you would still need constants for is to 'uncompile'. another thing: what about making the partial evaluator reference macros if it can be guaranteed that the macros perform only computations that can be completely reduced to values? i need to disentangle this a bit..

Entry: no values
Date: Mon Aug 20 15:12:00 CEST 2007

i owe this to Joy: it's really good to have no "function value quoting", i.e. just (foo) instead of something like 'foo

this leaves ' free for quoting literals, and has the benefit of a simple abstraction syntax.

Entry: distributed programming
Date: Mon Aug 20 15:15:37 CEST 2007

The next hardware project is going to be krikit. It's going to be a distributed system of small devices.

Entry: done
Date: Mon Aug 20 16:27:08 CEST 2007

yes, i guess so.. no pressing changes ahead, except for the macro/code separation, side-effect free macros, and maybe dependencies.. which is a biggie. another thing is interaction macros. so the todo looks like:
- move the words "constant, macro/forth, variable" and the 2-variants to the preprocessor stage, which can separate code into macros and forth code.
- add interaction macros

Entry: brood.tex
Date: Mon Aug 20 23:03:09 CEST 2007

i'm starting an explanation of macro embedding with a purely functional approach. while i'm on the right track with my notion of compilable, the effectful part is less obvious. the idea is this: [ a 1 2 + b ] can be simplified to [ a 3 b ] if a and b are effects. somehow i'm missing something important. maybe the situation is symmetric? instead of having language and metalanguage, which both share some evaluation domain, they also have functions that act on their full domain only.
i think i sort of got the duality now: the target depends on run time state which is not representable, meaning only pure functions can be evaluated.

...

my explanation is not completely sound.. when i'm talking about target and host language, i never make the explicit conversion. there's something wrong there. almost right, but not quite. a compilable macro is something which can be 'unpostponed'. meaning, it is a function that all by itself produces a program that can be evaluated on the target.

...

another thing is that macros, in the way i implement them, are not the macros i'm describing in the paper. my macros are EAGER, they are a combination of the partial evaluation strategy AND their original meaning. the macros in the paper, at least the partial evaluation strategy, are monadic. for compilable macros this makes no difference, but for other algorithms, order does matter.

Entry: monoids and stacks
Date: Wed Aug 22 16:09:11 CEST 2007

something which has been tickling me for a while because i haven't formalized it in my head: functional programming with stacks.. how does this work, really? what's the relationship between state and stack?

so, compositional programming languages use compositions like [fg] to express programs. all functions are unary. that's nice for giving some framework about evaluation order (it being arbitrary, if there's a representation of composed functions). so: Functional compositional languages make it easy to talk about partial evaluation: it's just the associativity law. Whether this is of any practical use depends on whether we can partially evaluate FUNCTIONS to something simpler.

so let's start with inserting that thought in the paper.. then the other one is about locality of action. the fact that a language is compositional doesn't really do much about this. you need a way to ensure separation. this is where stacks come in. but this is more about continuations than about being able to perform partial evaluation..
really, the only thing i need to know is
* POSSIBLE: that [1 +] is equivalent to [1] followed by [+]
* ECONOMIC: that the representation is actually simpler

that's the end of the story. the fact that the thing uses stacks is relevant to proving that [1 +] is equivalent.

i need to clean up notation.. i'm using two different notations for application: one rpn, and one pn. let's stick to pn, because i use functions somewhere else, and reserve rpn only for compositions.

...

there's another thing that's really wrong in my explanation. something i noticed yesterday already... macros are about the IMPLEMENTATION of partial evaluation. i really have only a single language! that's what it feels like when programming too. so i think i can plow over my whole text again... frustrating, but i'll get there eventually. maybe this is why i like programming so much. making sense is only defined from the point of works/not-works. math is too free for me.. i am not strict enough.

ok. the plan: get rid of the notion of 'macro' and introduce it only later. keep everything abstract, just show a way to translate forth into a functional language operating on state + metastate. looks like i'm getting somewhere.. and this is going to turn up some conceptual bugs. looks like i needed to spend this time plowing through misconceptions..

again, this is wrong.. ARGH! the compiler is not a map. it's a syntactic transformation. what i call a compiler now is just the property 'compilable'. so the compiler is something that proves a program is compilable! ok, i've got it sort of explained now. so this composite function thing is about semantics, which leaves more room to talk about the implementation of the proof constructor (compiler).

just added a note about function definitions. creating new names is either something which happens outside of a program, or has to use side effects. currently it's the latter, but i'd like to move to the former.

Entry: real compositional language?
Date: Wed Aug 22 22:37:47 CEST 2007

actually, the step to a real compositional language is not so big any more. just adding the parsing words '[' and ']' for program quotation, and possibly an optimization for ifte -> if else then conversion, should do it. all other constructs can then be translated into higher order functions.

Entry: phase separation
Date: Wed Aug 22 22:44:48 CEST 2007
Name: phase_separation

i guess now that base.tex seems to be more or less bull-free, the next step is phase separation for forth files. basically this means:
1. collect all names and macros. this includes constants, variables, AND the macros used for compiling function calls.
2. compile the code.

so.. it should in principle be possible to have proper semantic separation of names before a source file is compiled. currently, words have a default semantics (target word). however, i could catch undefined names if i catch all occurrences of ':', and register a macro for each of them that will compile a procedure call. that way i can remove problems with macro/code confusion...

so 2.. a name always maps to a macro explicitly. otherwise it is not defined. no more default semantics. the macro might choose to compile a call instruction using a symbolic reference. this means the language becomes a bit less flexible: : (2)variable (2)constant are no longer accessible from forth, and become preprocessor directives that change the code into a form:

    (macros
      (a 1 2 3 +)
      (b 5 -))
    (constants
      (c 1)
      (d 2 5))
    (tape
      ((broem) a bla (lalala) bla broem))

where the tape is the layout of code memory with labeled entry points. this structure is there to preserve multiple entry points (fallthrough) and multiple exit points. if macros are side effect free, constants can be eliminated: they are simply macros that evaluate to a literal sequence, if they evaluate at all.
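the "execute it and look at the result" test for constants could look something like this. a hypothetical sketch, assuming macros are functions from an assembly list to an assembly list and literals are tagged with 'qw as above; the actual brood representation may differ.

```scheme
;; does running the macro on an empty assembly state yield a single literal?
(define (constant-macro? macro)
  (let ((result (macro '())))
    (and (pair? result)
         (null? (cdr result))        ; exactly one instruction...
         (eq? 'qw (caar result)))))  ; ...and it is a quoted literal

(define twee  (lambda (asm) (cons (list 'qw (+ 1 1)) asm)))  ; like "1 1 +"
(define blink (lambda (asm) (cons '(call blink-led) asm)))   ; not a constant

(constant-macro? twee)   ; => #t
(constant-macro? blink)  ; => #f
```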
i can even keep the current context for 'constant'. suppose a forth file starts with the code:

    1 2 + constant broem

the loose code "1 2 +" can then be interpreted as a macro. the consequence is of course that it's not possible to define constants after the first function. hmm.. i do need constants if i want constants in the assembly, because to get them there, every constant needs to have a macro associated with it that will compile the constant value.. so let's leave them in, but employ the mechanism above to give them macro semantics. maybe a constant is a macro that evaluates to a literal, so the actual macro code can be stored somewhere else? maybe the more important thing is to unify compile-time constant evaluation with macro execution? not really.. ai ai.. time to go to bed..

Entry: set & predicate
Date: Wed Aug 22 23:56:09 CEST 2007

it never occurred to me, but a set is indistinguishable from a predicate function. operations on sets are then

    (define (union a b)
      (lambda (x) (or (a x) (b x))))
    (define (intersection a b)
      (lambda (x) (and (a x) (b x))))

a thing you can't do here is iterate over the elements.

Entry: a day in bruges
Date: Fri Aug 24 10:32:26 CEST 2007

tourist in my own country.. anyway, i made some notes:

* partial evaluation/optimization: replacing a composition [fg] by a specialized function is always possible in a compositional language. the reason why it doesn't work for me is mainly because of 'hidden quotation': for example the sequence "1 THEN +" contains a jump target, which is not purely compositional. solution: only pure quotations. all branching should be optimization. Forth is too dirty, i need a syntax preprocessor. is there a way to have "[ 1 + ] [ 1 - ] ifte" as the base form, and translate it into "if 1 + else 1 - then"? should i move all macros that break compositionality to a different level?

* terminology/concept cleanup: define compilability in terms of the existence of a retraction.
* proper credit
MOORE: required tail recursion, multiple entry (fallthrough) and exit points.
VON THUN: program quotation + combinators, program = function composition and constants are functions, monadic extensions: top of stack is hidden.
DIGGINS: typed view + things you can't do (whole stack ops kill stack threading)
FLATT: separate compilation + phase separation

* semantics of jumps? they get in the way of the FCL formulation. a jump could be a non-terminating evaluation? is there a way to make this sound?

* closures versus quoted programs: note that quoted programs are not closures, since they are not _specialized_. for closures you really need dynamic behaviour: at run time, some values need to be fixed. something that could emulate closures is the consing of an anonymous function with a state atom. this operation is called 'curry' in kat. it could be combined with a monadic state for a more elaborate emulation of closures & objects.

Entry: i hate it when this happens
Date: Fri Aug 24 13:31:09 CEST 2007

i have something in my head about the relationship between compositional stack languages, monads, virtual machines for the implementation of functional languages, and the lambda calculus and combinators.. but i can't quite express it due to lack of literacy.. argh.. again i've heard the bell toll, but i don't know where the clapper hangs.. so:

1. compositional language -> put partial evaluation and meta programming in a simple framework. independent of the set!
2. elaborate on the set's substructure.

--

so, 1. gives a framework on how to build a compiler. but without stacks, composition isn't really useful. so the stacks are needed as a tool to create general functions that can be applied in several concrete settings. so these functions need to somehow be independent of SOMETHING. that something is the way in which run time data is organized. need to find a better explanation..

something that hit me just now: a computation in a stack language always involves saving some state, and recombining it later..
there are 2 ways this happens:

* most functions leave the bottom of the stack intact
* 'DIP' leaves a part of the top of the stack intact

this is probably related to normal order and applicative order reduction.

--

another problem.. why is it so hard to get this formulated correctly? in my exposition about parsing words, i cannot really use "variable abc" as a good example, because it really is not compositional code; that needs to be disentangled first.. the conclusion is right though: in order to disentangle this system, it is necessary to remove some reflection, to 'unroll' the dependencies. and the picture is really about dependencies. functional programming is more about getting your graph free of cycles than anything else.. maybe that's the reason for stressing the Y combinator: how to introduce cycles, but not really. there's another example in dan friedman's book: essentials of programming languages. i can't find it now, but somewhere about implementing an environment there is a need for a circular reference, but he uses a trick to avoid it.. maybe it's about how to make things static. to keep them from moving so they can be looked at in peace and quiet :))

--

basically:

- stack = environment (de bruijn index)

Entry: so.. what's the most important thing now?
Date: Fri Aug 24 16:06:47 CEST 2007

a lot of ideas need some fermenting still. but there's one that's quite clear: names cannot be created dynamically, because that kills the representation as a declarative language. so i need a preprocessing step that takes out all creation of new names. this makes some things problematic. one of them is multiple exit/entry points. multiple entry points can be translated:

  : foo a b c : bar d e f ;
->
  : foo a b c bar ;
  : bar d e f ;

then at the point where '(label bar)' is assembled, the jump to bar can be eliminated.
multiple exit points need to be translated to an else clause:

  : foo if a b c ; then d e f ;
->
  : foo if a b c else d e f then ;

so it looks like it's not just names, but also 'implicit names' or labels.

Entry: environment and stack
Date: Sat Aug 25 09:14:26 CEST 2007

let's elaborate on this a bit more. the stack can be seen as related to an environment, which is a way to implement substitution in lambda expressions. to simplify, suppose we have only unary lambdas.

  (lambda (a) (lambda (b) ((+ a) b)))

this can be rewritten using de bruijn indices (starting from 0) as

  (lambda (lambda ((+ 1) 0)))

where the numbers refer to an index into the environment array. this gives an easy way to represent a closure as a (compiled) lambda expression and an environment. maybe the missing ingredient in my understanding is the SK calculus?

Entry: paper again..
Date: Sat Aug 25 10:42:23 CEST 2007

in fact, i need to distinguish between syntax and semantics a bit better. a compiler works on syntax (a representation). von thun has some text about this.. again, i'm amazed by how untyped you can be in scheme! i'm just performing operations on lists, without ever having to clarify what things are.. interpretation is a consequence of what functions you apply to the symbols.. so, let's say that "working with symbols" is always untyped. symbols are a universal tool of delayed semantics. maybe that's the idea behind formal logic, right? by just specifying HOW to operate on symbols, you never need to explain what you are actually doing.

Quite an adventure, trying to provide a model for the language and compiler.

* read Flatt's paper about macros again
* logic and lambda calculus.
* monads and their relationship with compositional programs.
* a purrr module system + compositional language

http://zhurnal.net/ww/zw?StokesTheorem

Funny. I have that book on my shelf, and i tried to start reading it on thursday. I guess it has a major truth.
Once the necessary structure is in place, the conclusions are often trivial. So all the effort is in the creation of structure. Sounds like programming. Try "Once things are clearly defined, the solution is at most a single line.", "Write the language, and formulate your solution in it.", "Ask the right question."

Entry: fully declarative and compositional
Date: Sat Aug 25 11:45:01 CEST 2007

declarative: all names defined in a source file are to be known before the body of the code is compiled. that way, a program is a collection of definitions.

compositional: make all branching constructs fit the compositional view by using combinators only.

both are largely independent, but should lead to a better representation. advantages:

D - side-effect free macros
  - detection of undefined words
  - possibility of modularization (later)
C - correct optimizations in the light of branching

let's learn a lesson from the past.. i can't afford to break it again. the changes that need to be made can be made without changing the semantics so much that a radical rewrite of forth code is necessary. all constructs used at this moment need to be preserved. is there an incremental path? the following syntactic transformations are necessary:

1. constant -> macro
2. variable -> macro
3. word definition -> macro
4. split a file into macro + code

Entry: monads in Joy
Date: Sat Aug 25 13:56:47 CEST 2007

http://permalink.gmane.org/gmane.comp.lang.concatenative/1506
http://citeseer.ist.psu.edu/wadler92essence.html

so that's what i've got to do today. after reading manfred's comments, i think i need to read more of his work before i attempt to re--invent his ideas. the paper by wadler gives some relation between monads and cps. might contain what i need to explain the relation between monads and stacks. probably reaching the conclusion that stacks are monads. let's see if i can learn something from this. for each monad, provide bind and unit.
one complication is that functions in cat return a stack. let's see if that makes things worse.

  unit: x -- M x
  bind: M fn -- N

bind extracts values from the monad, applies fn to each of them, and constructs a new monad from the output. it's easier to use 'join', since 'map' is so trivial. wait, is this really the case? map is

  (a -> b) -> (M a -> M b)

from http://en.wikipedia.org/wiki/Monads_in_functional_programming

  (map f) m  ≡  m >>= (\x -> return (f x))
  join m     ≡  m >>= (\x -> x)
  m >>= f    ≡  join ((map f) m)

this is a little different from what i've been talking about before.. maybe it's best i try to formulate this in scheme first. See brood/mtest.ss

-- the misconceptions

  M a -> (a -> M b) -> M b

does not mean the monads are different! it merely means: unpack, process, repack. so what is 'map'? map really is map! see the next entry for more interesting stuff about monads in scheme..

Entry: monads in scheme
Date: Sat Aug 25 16:18:59 CEST 2007

  ;; Monads in scheme.
  (module mtest mzscheme

    ;; Monads are characterized by
    ;; - a type constructor M
    ;; - unit :: a -> M a
    ;; - bind :: M a -> (a -> M b) -> M b

    ;; In words: something that creates the type (ad-hoc in scheme),
    ;; something that puts a value into a monad (unit) and something
    ;; that takes values out of a monad, applies them to generate
    ;; several instances of the monad, and combines them into one.

    ;; Let's create some monads in scheme, using ad-hoc typing:
    ;; representation is not abstract, and there is no type check.
    ;; Start with the list monad.

    (define (unit-list a) (list a))
    (define (bind-list Ma a->Mb) (apply append (map a->Mb Ma)))

    ;; Using monads, functions need to be put into monadic form.
    ;; Simply wrapping them with 'unit' is usually enough.

    ;; (bind-list '(1 2 3) (lambda (x) (unit-list (+ x 1))))

    ;; So what is 'map' for the list monad? Haha! It's map!
    (define (map-list a->b) (lambda (l) (map a->b l))))

so now introduce polymorphy.
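the single-dispatch idea can be sketched outside scheme too. a minimal python version, just to check the mechanics; all the names here (Monad, Instance, unit, bind) are made up for illustration, they are not brood's:

```python
# A 'monad' is a record of unit + bind; each instance carries a pointer
# back to that record, so one generic 'bind' can dispatch on it.

class Monad:
    def __init__(self, name, unit, bind):
        self.name, self.unit, self.bind = name, unit, bind

class Instance:
    def __init__(self, monad, rep):
        self.monad, self.rep = monad, rep

def unit(m, a):
    return Instance(m, m.unit(a))

def bind(Ma, f):
    m = Ma.monad
    # unwrap what f produces before recombining, so the concrete
    # monad's bind only ever sees raw representations
    return Instance(m, m.bind(Ma.rep, lambda a: f(a).rep))

Mlist = Monad("Mlist",
              lambda a: [a],
              lambda rep, f: [y for x in rep for y in f(x)])

# the letM* example from this log, written out as nested binds:
r = bind(Instance(Mlist, [1, 2, 3]),
         lambda a: bind(Instance(Mlist, [10, 20, 30]),
                        lambda b: unit(Mlist, a + b)))
print(r.rep)  # [11, 21, 31, 12, 22, 32, 13, 23, 33]
```

same flattened result as the scheme version below gives.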
instead of storing stuff in a hash, it's easier to just store a pointer to the monad structure in the record for a certain monad, i.e. use single dispatch OO. so i got a polymorphic bind, and a fairly decent interface that abstracts away the polymorphism, so 'unit' and 'bind' for each monad can operate on the representation only.

  (define-monad Mlist
    (lambda (a) (list a))
    (lambda (Ma a->Mb) (apply append (map a->Mb Ma))))

so.. what can i do with this? maybe best to try to translate some examples from wadler's paper into this mechanism. or to define a 'do' macro. the Haskell code for the list monad

  do {x <- [1..n]; return (2*x)}

is a bit too mysterious.. let's try something simpler. the maybe monad. wait, all my functions are unary.. damn. how to take multiple values into a monad? can't really do that.. will need explicit currying. this uses letM* from http://okmij.org/ftp/Scheme/monad-in-Scheme.html

  (define-macro letM
    (lambda (binding expr)
      (apply (lambda (name-val)
               (apply (lambda (name initializer)
                        `(>>= ,initializer (lambda (,name) ,expr)))
                      name-val))
             binding)))

so i transform this to my code.. try this:

  a = do x <- [3..4]
         [1..2]
         return (x, 42)

  a = [3..4] >>= (\x -> [1..2] >>= (\y -> return (x, 42)))

now

  (define-syntax letM*
    (syntax-rules ()
      ((_ () expr) expr)
      ((_ ((n Mv) bindings ...) expr)
       (bind Mv (lambda (n) (letM* (bindings ...) expr))))))

leads to this:

  (letM* ((a (Mlist '(1 2 3)))
          (b (Mlist '(10 20 30))))
    (unit Mlist (+ a b)))

  #(struct:monad-instance
    #(struct:monad Mlist # # #)
    (11 21 31 12 22 32 13 23 33))

wicked. the macro expansion gives

  (bind (Mlist '(1 2 3))
        (lambda (a)
          (bind (Mlist '(10 20 30))
                (lambda (b) (unit Mlist (+ a b))))))

let's see if the type of 'return' can be inferred in a structure like this. no. the return type of the entire expression is determined by the return in the letM* block. this type is arbitrary and only determined by the context of the expression, to which we have no access in scheme.
one possibility to fake this is using a dynamic variable.

so, i guess it makes more sense to switch to map and join as basic operations? no. i ran into a problem with double wrapping of structures that requires the 'join' operation to be aware of the wrapping. so i'm going to revert the changes.

next exercise: the state monad. i never really understood this. a state monad contains a function that will return a value and a new state.

  -- "return" produces the given value without changing the state.
  return x = \s -> (x, s)

  -- "bind" modifies transformer m so that it applies f to its result.
  m >>= f = \s -> let (x, s') = m s in (f x) s'

EDIT: see monad.ss

Entry: kat monads
Date: Sat Aug 25 21:35:12 CEST 2007

the problem seems to be that 'return' and 'bind' need to be formulated in a way that properly deals with the stack. somehow it seems to get in the way. let's take a new look at it, modeling things on 'map'.

  fmap    s.a->s.b   Ma -- Mb
  join    MMa -- Ma
  return  a -- Ma
  bind    s.a->s.Mb  Ma -- Mb   (bind fmap join)

the thing which bothers me is 'map'. something is smelly about map in joy, because of the stack "doing nothing". it's strange that 'for-each' feels really natural, because it has threaded state. but map somehow feels wrong..

Entry: for-each is left fold
Date: Sat Aug 25 21:45:30 CEST 2007

for-each is foldl is sort of 'universal iteration'.

  '() '(1 2 3) (swons) for-each  ==  '(3 2 1)

foldr is more like 'universal recursion', and i don't have a direct analog in kat. maybe i should create one like this:

  '() '(1 2 3) (cons) foldr  ==  '(1 2 3)

Entry: state monad
Date: Sun Aug 26 13:08:41 CEST 2007

a state monad is a nice example of a computation. nothing 'happens' as long as the monad is not executed explicitly by applying the value to some initial state. i think this is a nice starting point to formalize what i'm doing, since it's about the same principle: build a composition that represents the compilation, and execute it on an initial state.
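the return/bind above transcribed to python, to make "nothing happens until you apply an initial state" concrete. the push/get words are made-up examples, not brood code:

```python
# state monad: a computation is a function  state -> (value, state).
# composing computations builds a bigger function; nothing runs until
# an initial state is supplied.

def unit(x):
    return lambda s: (x, s)

def bind(m, f):
    # run m, feed its value to f, thread the new state through
    def computation(s):
        x, s1 = m(s)
        return f(x)(s1)
    return computation

# tiny example: 'push' grows the state list, 'get' reads it
def push(v):
    return lambda s: (None, s + [v])

get = lambda s: (s, s)

prog = bind(push(1),
            lambda _: bind(push(2),
                           lambda _: get))

value, state = prog([])   # only now does anything 'happen'
print(value, state)       # [1, 2] [1, 2]
```

which is exactly the shape of the compilation idea: prog is a composition, and the initial state plays the role of the empty assembly buffer.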
so, really, monads are a way to formulate any computation as a function composition. doesn't that sound familiar? the thing to find out is:

* how my very specialized way of state passing fits in the general monad picture.
* why does 'map' feel so strange in Joy/KAT?
* what is a continuation in KAT?

the last one i can answer, i think. it's a function that takes a stack, and represents the rest of the computation. so the continuation of 'b' in [abcd] is just [cd]. i've added call/cc to base.ss

let's re-read von thun's comments

Entry: closures & stacks
Date: Mon Aug 27 14:49:10 CEST 2007

something to think about: a compositional language can have first class functions without having first class closures, and without this leading to any kind of inconsistency. like 'downward closures only'.

this brings me back to linear vs. non--linear. a key observation is that linear data structures are allowed to refer to non--linear ones, as long as the non--linear collector can traverse the linear data tree (acyclic graph in the case we work with reference counts as an optimization). but non--linear structures are NOT allowed to refer to linear structures (because otherwise they would not be able to be managed by the linear collector). this makes the non--linear collector trivially pre--emptable by linear programs. PROVE THIS!

Entry: linear memory management
Date: Mon Aug 27 17:57:00 CEST 2007

something to think about is how to embed a linear language in scheme as a model. as long as its primitives never CONS, this should work.

i'm trying to formulate a machine that can express the memory management part of a linear language, if it is given a set of primitive functions. see linear.ss. this is an attempt to make poke.ss work, but from a higher level of abstraction. something i did wrong on the first attempt is to change the tree structure WHILE still using the old addressing mode. permutation of register contents needs reference addressing, so my macros are wrong.
this means i need a different representation of REFERENCES. let's say a reference is:

- a pointer to a cons cell
- #t for CAR and #f for CDR

funny. i'm running into a problem numbering binary trees. the most visually pleasing numbering is breadth first

             1
        2         3
      4   5     6   7
     8 9 10 11 12 13 14 15

this corresponds to the binary encoding 1abcd... where a is the first choice, b the second, etc... the one i chose intuitively was 1...dcba, which is not so handy, but is more efficient to implement when the labeling doesn't really matter that much.

ok i got it working. i have a tree permutation 'engine' which is accessed by numerical node addresses. now what does this buy me? a simple way to talk about embedding linear trees. in practice, some of the nodes are constant, and are better put in registers.

Entry: binary trees
Date: Mon Aug 27 21:51:58 CEST 2007

still not 100% correct.. i'm losing nodes. ok. i'm making a mess of it, but i think i can conclude the following:

1. it is possible to use a tree as the data universe
2. normal forth operations can be written as binary and ternary permutations on a tree
3. such a tree is conveniently addressed numerically

what i'm about to do is:

- create an embedding of normal forth operations in a single tree, by:
  * fixing the positions of the stacks
  * associating each operation to a permutation
- find a way to efficiently generate code for these operations, with the possibility of mapping some fixed nodes to registers.

AHA! one pitfall i knew, and i ran right into it. there's one operation which is not allowed: if R points to a cons cell, it is not allowed to swap the contents of R with CAR or CDR of that cell, because this creates a circular link, effectively losing the cell. more generally, it is not allowed to exchange R1 and R2 if they are in the same subtree. baker's machine contains no operations that can lead to such permutations. it only talks about exchanging the contents of registers with cons cells. this is different.
i'm trying to write the permutation for '>r'. written as (D . R), the following sequence of permutations is legal:

  ((d . D) . R) -> ((d . R) . D) -> (D . (d . R))

which is (5 3) followed by (2 3). can this be written as a single cycle (2 3 5)? one would say yes.. so i guess i had a bug? since it created a circular ref in my previous implementation. now i can get (2 3 5) to work, but (5 3 2) doesn't! i think i don't understand something essential here.. this is getting interesting!

i think i see the problem now. one is that my permutations are inverted, and two is that (2 3 5) is not legal, but (5 3 2) is. how to distinguish legal from illegal permutations? and the inverse of (5 3 2) is not (2 3 5) but (2 3 7). it looks like this encoding of the nodes is not very useful for tree permutations.

Entry: legal permutations
Date: Mon Aug 27 23:39:23 CEST 2007

it looks like a more interesting approach is to start with operations that are legal and invertible, and find their closure. the difference with baker's machine is that i'm trying to use only one root. hmm.. there has to be a way to see if a permutation is legal.. why is (2 3 5) not legal? because 5 gets the value of 2, which points to 5. so a condition is that a register x cannot receive the contents of a register y if x is in a subtree of y. in (5 3 2) no such assignment happens:

- 5 is not a subtree of 3
- 3 is not a subtree of 2
- 2 is not a subtree of 5

'subtree of' can be computed by comparing box addresses

  [1] [2|3]   [4|5]     [6|7]
  [1] [10|11] [100|101] [110|111]

a is a subtree of b if b matches the head of a. this way, no circular refs can be introduced. instead of thinking about cons cells, think of binary trees. it indeed does not make sense to swap nodes if one node is a subtree of another node.

what about enumerating all legal binary permutations on an infinite binary tree?
  ()  identity
  (2 3)
  (2 6) (2 7) (3 4) (3 5)
  (4 6) (4 7) (5 6) (5 7)
  (2 12) (2 13) (2 14) (2 15)
  (3 8) (3 9) (3 10) (3 11)
  (4 12) (4 13) (4 14) (4 15)
  (5 12) (5 13) (5 14) (5 15)
  (6 8) (6 9) (6 10) (6 11)
  ...

back from tree rotations, which are not general enough... in binary

  ()
  (10 11)
  (10 110,111) (11 100,101)
  (100,101 110,111)

back to numbers

  level (bits)
  1     /
  2     (2 3)
  3     (2 6,7) (3 4,5) (4 5,6,7) (5 6,7) (6 7)
  4     (2 12,13,14,15) (3 8,9,10,11) (5 8,9,12,13,14,15)

it's quite hard to specify without exclusion statements.. but i guess i got what i was looking for: limited to only binary permutations, the legal ones are easy to characterize.

what about using multiple coordinates, and then embedding them in a numeration? it is always possible to encode an n-tuple of natural numbers as a single one by interleaving the bits. a legal binary permutation from node A and node B (A < B) can be written as the tuple (A - 2, s, d) where s denotes the same level trees and d the depth from it. this is really clumsy and doesn't work..

it looks like what i am looking for is a primitive dup and drop. the reality is, these are not primitive!
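before moving on: the "b matches the head of a" subtree check from the previous entry is easy to sketch. a minimal python version of the test on 1abc... node numbers (function names are mine):

```python
# with the breadth-first 1abc... numbering, node a lies in the subtree
# of node b exactly when the binary digits of b are a prefix of the
# binary digits of a.

def subtree_of(a, b):
    return bin(a)[2:].startswith(bin(b)[2:])

# node 5 (binary 101) is a child of node 2 (binary 10):
assert subtree_of(5, 2)
# nodes 4 (100) and 6 (110) live under different level-2 nodes:
assert not subtree_of(6, 4)

# a binary swap (x y) is legal when neither node is inside the other
def legal_swap(x, y):
    return not subtree_of(x, y) and not subtree_of(y, x)

assert legal_swap(2, 3)       # the two level-2 subtrees
assert not legal_swap(2, 5)   # 5 sits inside 2's subtree
```

this reproduces the (2 3 5) diagnosis: the assignment of 2's contents into 5 is illegal because 5 is in 2's subtree.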
Entry: different primitives
Date: Tue Aug 28 00:55:17 CEST 2007

so with a 2-stack system (D . F) with D rooted at 2 and F at 3, the primitives are:

  D  = 2    D+ = 5
  D0 = 4    D1 = 10
  F  = 3

the free list needs to be flattened. this can be done when reserving a new cell or when dropping a data structure. the latter is probably best since it is

* more predictable: deleting a large structure takes time
* all references to externally collected objects can be removed

so, i do have a need for rotation! if the CAR of the free list is not NULL, rotate the free list, then DROP the newly exposed top and DROP the part we rotated to the stack.

  : >free (D+ F) (D F) ;
  : swap  (D0 D1) ;
  : free> (D F) (D+ F) ;  \ [a.k.a. nil / save]
  : drop  null? if >free ; then rotate drop drop ;

like baker remarks, a lot of operations can be coded so they avoid copying of lists. i have a lot of this code in PF already.. the moral of the story is:

* this linear stuff is quite nice to build a language on top of, but you need a decent layer below it to create a proper set of optimized primitives to make it work efficiently.
* using a single tree works just fine, but is probably not necessary if the basic structure (like where the D, R and F stacks are) doesn't change.
* only use binary permutations of disjoint trees. disjoint trees are easier to spot for binary permutations.
* numbering trees in 1abc... fashion works well, and is easy for drawing diagrams.
* drop needs to deconstruct its argument.

the hash consing thing in this paper i don't get: http://home.pipeline.com/~hbaker1/LinearLisp.html

but but... about ternary permutations. they are easier to understand, because the rotation i'd like to perform has to be factored in a non-intuitive way.. maybe it's just the rotation operation that's difficult to express that way? instead of focusing on the movement of the data stack's first CONS cell, it's easier to focus on the movement of the cell we want to get rid of.
so in the picture painted above, the operation 'rotate' is actually 'uncons' and would be (9 5) (4 5). i think that settles most of the questions. the rest is fairly straightforward to fill in.

Entry: next
Date: Tue Aug 28 14:23:06 CEST 2007

after this small detour about trees, NEXT on the list:

* clean up syntax preprocessing & purely functional macros
* investigate HOF syntax for Purrr18
* determine if Purrr is a valid project, or if it's best to aim for Poke.

Entry: ANS Forth - poke - PF
Date: Tue Aug 28 14:26:07 CEST 2007

the last question is quite an important one.. if i'm planning to write a language for education, do i really want ANS Forth? the only reason would be to have something 'standard', but for what reason? better documentation? i never used ANS Forth, and the more i get into this language simplicity thing, the more i start to dislike it. i think i have all the elements for a decent linear VM ala PF. should fit on a pic18 and.. a cleaner language is easier to teach. moreover, a poke language can be made safe. is it worth stopping somewhere in the middle to use a slightly more optimal language, instead of one based on CONS cells? this is not something to decide in an instant, but i think life is already complicated enough to fill it with problems created by weirdness in ANS that i don't use.. Forth is dead. long live KAT & PURRR :)

Entry: Haskell
Date: Wed Aug 29 13:15:22 CEST 2007

just watched Simon Peyton-Jones' OSCON 2007 tutorial, which clarified a lot of things. he talked mostly about type constructors, type classes, and the IO monad.

* IO a is world -> (world, a)
* a type class is implemented as a record of functions that 'travels independently' from values, i.e. dispatch based on return type.
* type constructors are also used for destructuring. this generalizes the 'list' constructor, and tuples (which are not constructors i think..)

Entry: hash consing
Date: Tue Aug 28 20:23:22 CEST 2007

so what's that all about.
see: http://home.pipeline.com/~hbaker1/LinearLisp.html
"Reconstituting Trees from Fresh Frozen Concentrate"

first, that section is not about hash consing, but about something different: "our machine will be as fast as a machine based on hash consing". i don't get it..

Entry: compositional and?
Date: Wed Aug 29 14:30:25 CEST 2007

i was wondering what the deal is with the compositional view. it allows a simple framework for metaprogramming, but that's all.. i made this a bit more clear in the paper.

Entry: curry-howard
Date: Wed Aug 29 16:42:46 CEST 2007

quite remarkable. i'm running into cases where operations in the code that i thought were merely a hack, like the 'snarf' operation, turn out to be quite important for a monadic formulation of a stack language. in other words: i'm extracting some mathematical structure by naming the types of all the transformations that are present in the code. i think i'm just going to do this exhaustively.. in other words, by hacking around semi-blindly, following just an ideal of 'elegance', i end up with a nice description of what i'm doing in a categorical sense.

Entry: arrows
Date: Fri Aug 31 00:29:36 CEST 2007

reading 'programming with arrows' by hughes. this 'dip' business is really arrows.. just rewrote brood.tex to give a categorical relationship between a TUPLE language and a STACK language. what remains is to explain their difference...

it's been quite a day.. what did i learn really? given a tuple language, mapping it to a stack language makes explicit the need for run time 'cons' if the tuple language can create closures. ok, i need to go over this again since i lost direction a bit.. the CTL -> CSL bit is good though, since it reflects a 'real' part of brood, namely the relationship between scheme and kat. I'm still not really satisfied about the explanation.
I probably need some more time thinking about closures and dynamic memory:

- how to combine a low level language with just stacks and function compositions, both implemented as vectors, with a linear memory model that supports closures.
- how to add 'constant trees' to a linear memory tree.
- what about trees and reference counts.

Also, i need to read Hughes' paper about arrows.

what about this vague rambling:

- data stack = future data
- return stack = future code

Entry: stacks and continuations
Date: Fri Aug 31 18:19:05 CEST 2007

from wikipedia http://en.wikipedia.org/wiki/Continuation

  Christopher Strachey, Christopher F. Wadsworth and John C. Reynolds
  brought the term continuation into prominence in their work in the
  field of denotational semantics that makes extensive use of
  continuations to allow sequential programs to be analysed in terms
  of functional programming semantics.

for the linear memory case, i need to implement:

- closures (== cons)
- continuations (== a stack copy)

to do this efficiently, i need baker's approach to linear data structures, which can be implemented using reference counts because they cannot be circular. something tells me i'm chasing something really obvious.. i guess the next thing to tackle is to describe the linear language, and write a C model for it. i.e. to implement POKE.

Entry: CSL vs CTL
Date: Fri Aug 31 22:03:26 CEST 2007

i talked myself into a pit.. what about "1 2 3 +"? how can this be seen as a CTL? only by making + operate on more than 2--tuples. this means all arrows T_i -> T_j are also in T_{i+n} -> T_{j+n}

Entry: linear
Date: Fri Aug 31 22:29:04 CEST 2007

the next thing to do is to create closures without garbage collection. this would make PF interesting. so the deal is: tree structured data allows for 1--ref structures which can be optimized using reference counts. i guess this is the hash consing business.
hash consing =

- a table of CONS cells
- on (cons a b) -> check if the cell is already in the hash: if so, increment its refcount, else allocate a new one

so that should be able to speed it up.. it's a bit smelly though.

Entry: poke
Date: Sat Sep 1 12:28:16 CEST 2007

yep.. time to get practical. this linear thingy is the most problematic one.. i guess the thing i need to investigate is:

- write a linear memory manager in terms of a low-level set of operations (forth machine)
- write the linear machine's interpreter in itself.

i'd like to take a different approach with this: first write it in a testable highlevel setting, then just map it to lowlevel code.

remarks:

* by making the code storage nonlinear, a large problem is already solved: the return stack does not need to copy continuations. the return stack is a program == a primitive program | list of programs.
* CDR coding. all code in flash consists of CDR-linked lists, but encoded such that they can be represented as vectors. this works very well with the remark above. it looks like this solves my earlier problem of vectors vs lists.
* no branches. only combinators.
* types:
  - primitive
  - integer
  - cdr-coded nonlinear cell
  - ram cell
* type encoding: since there are only 4 types, 3 of which are memory addresses, it can be solved with a memory map, and N-2 bit integers.

there's one important part i forgot: VARIABLES. those don't really fit in the picture..

Entry: partial application vs. curry
Date: Sat Sep 1 23:35:14 CEST 2007

curry: ((a,b)->c) -> (a->(b->c))

then partial application is e.g. curry (+) 123. so maybe i should follow christopher in http://lambda-the-ultimate.org/node/2266 and call what i'm calling curry 'papply'. and apparently, partial evaluation != partial application. so how do they differ?
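the curry vs. papply distinction can be sketched in a few lines of python (using 'papply' as in christopher's naming; 'add' is a made-up example):

```python
# curry turns a function on pairs into nested unary functions:
#   ((a, b) -> c)  ->  (a -> (b -> c))
def curry(f):
    return lambda a: lambda b: f(a, b)

# partial application just fixes some arguments now and waits for
# the rest; no change of the function's shape beyond that.
def papply(f, *fixed):
    return lambda *rest: f(*fixed, *rest)

add = lambda a, b: a + b

inc = curry(add)(1)     # curry first, then apply to one argument
inc2 = papply(add, 1)   # directly fix the first argument

assert inc(41) == 42
assert inc2(41) == 42
```

as for the last question: the usual distinction is that partial application merely closes over the known argument, while partial evaluation goes further and specializes the code of f for it.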
Entry: XY and stack/queue
Date: Sat Sep 1 23:53:44 CEST 2007

the [d r] thing i described about continuations yesterday is made explicit here: http://www.nsl.com/k/xy/xy.htm -- XY by Stevan Apter

Entry: goals
Date: Sun Sep 2 12:44:01 CEST 2007

the reason brood.tex doesn't work well is that i'm not setting goals for the project. i started wandering when talking about categories... so the goals are:

- create a language based on the ideas behind Forth, which
  * is easily mapped to a target (i.e. has very lowlevel elements)
  * is less resistant to static analysis than Forth.
  * requires small resources in base form (i.e. just some stacks)
  * contains some highlevel constructs that can be easily optimized, i.e. quoted programs ala Joy.
  * serves as an implementation language for a CONS based language, either a linear or nonlinear one.

Entry: references
Date: Sun Sep 2 13:15:59 CEST 2007

time to collect some references.

Entry: language levels
Date: Sun Sep 2 13:46:47 CEST 2007

- macro assembler / virtual forth machine: purely static. macros do not rely on any run time kernel support.
- macros with run--time support: some constructs that cannot be translated to straight assembler require run time support code. for example indirect memory access using '@' and '!'
- dynamic memory: cons

what i'm guessing is that i need to get my dependencies straight. this means:

- get rid of side--effects in macros (all names are identified in the first pass)
- create a purely compositional base language with 'required optimization'

so where to start? it's a big job, but it really needs to be done before i start implementing linear CONS. it looks like the end result here is going to be quite different from what i have now. i'm basically moving from a linear to a block structured language.
Entry: block structure
Date: Sun Sep 2 13:54:45 CEST 2007

the real question is: should i implement the block structured language on top of the linear one, or provide a set of macros to translate forth into a block structured language, which is then transformed back into a linear one? it seems reasonable to keep the forth layer as the lowest one, and translate into it. so basically i need a lexer with list support. time to factor out the basic problems:

* stream.ss
* stream-match.ss

Entry: lazy lists
Date: Sun Sep 2 16:35:20 CEST 2007

added stream.ss and a corresponding matcher. funny how reverse accumulation is no longer needed when you use lazy lists! maybe i should propagate this to the asm buffer? there is one problem with the asm buffer though: it is used as a stack. anyways.. i can make the lexer lazy. DONE. it's simpler now.

Entry: on lazy lists
Date: Wed Sep 5 17:29:30 CEST 2007

let's see if i can say something intelligent about this.. what i notice is that streams make you avoid the following pattern:

* read list, process, accumulate as push.
* reverse the list

lexing/parsing fits this shoe nicely. so.. are streams processes? instead of using '@cons', one could just as well write:

- read
- process
- write

so what is the difference? it looks like the lazy list approach is less general, since it has only one output. multiple outputs need to be handled using multiple lists, while the process view uses one process and multiple streams. and yes, these are processes, since the non-evaluated tails act as continuations. every '@cons' should be read as write+block.

so, what about the asm? it still needs to be used as a stack, however, multiple passes can now be done lazily.

Entry: onward
Date: Wed Sep 5 22:27:58 CEST 2007

i keep getting distracted.. i got some work to do! first one is elimination of side effects in macros: all side effects in the brood application are to be cache only.
this is an important part that will open the road for more interesting changes, hopefully leading to a fully compositional lowlevel language with a module system.

Entry: monads and map
Date: Wed Sep 5 22:40:42 CEST 2007

so.. what about writing a macro for this 'generalized map - not quite a real monad - collect results in a list' pattern? i guess this is just unfold.. no it's not.. got this macro + usage:

(define-syntax for-collect
  (syntax-rules ()
    ((_ state-bindings terminate-expr result-expr state-update-exprs)
     (let next ((l '()) . state-bindings)
       (if terminate-expr
           (reverse! l)
           (next (cons result-expr l) . state-update-exprs))))))

(define (@unfold-iterative stream)
  (for-collect ((s stream))
    (@null? s)
    (@car s)
    ((@cdr s))))

but it looks just ugly, so i'm going to forget about it.. i guess, if this pattern shows up in code, it means i'm not using a proper hof. what about writing it as a hof instead of a macro? i think i'm getting a bit tired.. just reinvented unfold.. no, it's unfold*

Entry: linear parser
Date: Thu Sep 6 00:25:26 CEST 2007

the parser can definitely be moved to streams. the fact that it contains syntax streams is not really relevant to the structure of the algorithms.. for example: i'm using 'match' in forth.ss. it changes a lot: the prototype of the parsers is now @stx -> @stx, but the code should be a lot easier. due to the linearity of forth / compositional code, writing a macro transformer as a stream processor instead of a tree rewriter makes a lot of sense actually.. the preprocessor will translate a token stream -> s-expressions. occurrences of syntax-case can be replaced by @match, which is exactly what i avoided in a previous attempt.. maybe i should just create a @syntax-case macro that's similar to the @match macro, taking partially unrolled syntax streams. hmm.. pure syntax-case is a bit clumsy.. but the 'no rest' parser macro i'm using does fit pretty well.
something i've been talking about before:

syntax-case: matcher for compilation: merges 2 namespaces (pattern var + template)
match:       matcher for execution: only a single lexical namespace

i don't know how to make the pattern more explicit, but it boils down to something like this: if you're using match together with quasiquote, you're actually COMPILING something, not computing something. in that case, pattern matching using syntax-case might be more appropriate, even if you're not using scheme macros, because of the merging of template and pattern namespaces (which otherwise have to be mixed explicitly using quasiquoting). actually: syntax-case matches 3 namespaces:

- pattern
- template
- transformer namespace

Entry: SRFI-40
Date: Thu Sep 6 10:16:17 CEST 2007

it's been fun, but time to move to a standard implementation: http://srfi.schemers.org/srfi-40/srfi-40.html it would indeed be strange if this were not somehow standardized.. (require (lib "40.ss" "srfi")) but, 40 has problems: http://groups.google.com/group/plt-scheme/browse_thread/thread/637cc74047a7ada9

anyway, thing to remember: streams can be ODD or EVEN http://citeseer.ist.psu.edu/102172.html i'm using EVEN style: (delay (cons a b)) instead of (cons a (delay b))

so what exactly is the problem with http://srfi.schemers.org/srfi-45/srfi-45.html ? it can be seen in @filter, as explained in the srfi-45 document: a sequence of (delay (force (delay (force ...)))) is not tail recursive. this is because 'force' cannot be tail recursive: it needs to evaluate, and cache the value before returning. srfi-45 solves this by introducing 'lazy'. easy to see in:

(define (loop) (delay (force (loop))))

ok. so i'm sticking with my own lazy stream implementation. most of it should be fairly easy to replace with some decent standard library later. i don't think i'm doing anything special..
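to make the ODD/EVEN distinction concrete, here is a minimal sketch. the names (@cons, @car, ...) mimic the ones i use, but this is not the actual stream.ss code:

```scheme
;; EVEN streams: the whole pair is delayed -- (delay (cons a b)).
;; ODD streams would be (cons a (delay b)): the head is computed
;; one step too eagerly, which is where srfi-40's trouble starts.

(define-syntax @cons
  (syntax-rules ()
    ((_ a b) (delay (cons a b)))))

(define @nil (delay '()))
(define (@null? s) (null? (force s)))
(define (@car s)   (car (force s)))
(define (@cdr s)   (cdr (force s)))

;; an infinite even stream of ones: nothing is computed until forced.
(define ones (letrec ((s (delay (cons 1 s)))) s))

;; force the first n elements into a plain list.
(define (@take s n)
  (if (zero? n)
      '()
      (cons (@car s) (@take (@cdr s) (- n 1)))))
```

usage: (@take ones 3) gives (1 1 1); the rest of the stream stays unevaluated.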
Entry: linear parser begin
Date: Thu Sep 6 15:36:48 CEST 2007

- all parsers are @stx -> @stx
- parser-rules: easily adapted (used by predicates->parsers)
- named-parsers

i'm forgetting something.. a parser needs to distinguish between 'done' and 'todo': the driver will stitch the stream back together. otherwise each parser needs to explicitly invoke the driver routine as the second argument to '@append'. the reason we use a driver is to make each individual parser agnostic of its environment.. concretely: the current implementation can be largely reused, but list tails need to be replaced by streams. then the remaining question is: does a primitive parser return 2 streams, or a list and a stream? again:

- if a parser does 1 expansion, it needs to return 2 streams.
- if it does multiple, it suffices to return only one.

it's best to let the driver decide, so the first one is more general. making both streams also makes the interface simpler. looks like the only thing this needs is a proper syntax-case style syntax stream matcher so i'm not juggling too many syntax<->atom conversions. need to think about that a bit better, to see what the prototype needs to be.

Entry: parser rewrite
Date: Thu Sep 6 22:46:20 CEST 2007

the end is near.. the code seems to simplify a lot. need to write 2 more generic parsers:

- delimited
- nested

interesting.. this stream business is deeper than i thought. i do run into a problem though: (values processed rest): what if rest is only determined when processed is completely evaluated? by moving the 'append' to somewhere else, the forcing order can no longer be trusted. does this really matter?? i need a break.

ok.. i got it worked out as '@split' which returns 2 values: the first is the stream before a delimiting value, and the second is the stream after it. the code i have now needs a certain evaluation order. i can make it independent of that by forcing until the rest-stream becomes true. that works.
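a sketch of what '@split' computes; this is my reconstruction, not the actual stream.ss code. to keep the result independent of evaluation order, the prefix is forced up front (and returned as a plain list here, for simplicity):

```scheme
;; minimal EVEN-style streams: the whole pair is delayed.
(define (list->@ l)
  (delay (if (null? l)
             '()
             (cons (car l) (list->@ (cdr l))))))

;; @split : stream x predicate -> (values before after)
;; 'before' collects elements up to (not including) the first
;; delimiting element; 'after' is the stream past the delimiter.
;; the prefix is forced eagerly, so the caller's consumption
;; order cannot matter.
(define (@split s delim?)
  (let loop ((acc '()) (s s))
    (let ((p (force s)))
      (cond
        ((null? p)        (values (reverse acc) (delay '())))
        ((delim? (car p)) (values (reverse acc) (cdr p)))
        (else             (loop (cons (car p) acc) (cdr p)))))))
```

usage: splitting the stream (1 2 0 3 4) on zero? gives the list (1 2) and the stream (3 4).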
also got @chunk-prefixed working: it separates a prefixed stream into a stream of prefixed streams.

Entry: macro mode
Date: Fri Sep 7 13:24:38 CEST 2007

i found out that ';' can just as well be used in macro mode for 'jump past end', if macro mode can only contain prefixed definitions. this will bring multiple exit points to macros. can change this later. anyways.. all parsers are now token (syntax) stream processors. it should be really straightforward from here to:

- separate macro and code definitions
- perform separate compilation for forth files (macro definitions)

about the use of ';' in macros: this probably needs some dynamic variable because of context: a macro representing a forth file != a normal macro. in a forth file ';' means return to sender, in a macro it means jump past end.. maybe i should avoid this?

Entry: bored
Date: Wed Sep 12 21:28:51 CEST 2007

i had some days off writing an article for folly, and my mind is wandering away from the lowlevel forth stuff.. talking to a friend yesterday i realized i need something different. i'm getting stuck. let's rehash the problems i'm facing right now:

- i need pure functional macros: no side effects except those hidden in cache / memoization. this requires a true code dependency system. doing this half-assed makes no sense, so i should at least have something like mzscheme, possibly piggy-backed on top of it. that however is not easy, since this will probably mess up my namespace stuff. so i'm a bit stuck because i can somehow foresee the problems that are coming after i fix up my macros.

- i want to give up on the portable ANS forth idea, and design a safe PF-like linear language. the stumbling block there is variables, since they are incompatible with the linear idea, at least when done using references to cells. maybe i can use some trick here? can variables be managed externally so they never need to be deleted? can they be seen as data roots like machine registers? something is not right in my intuition here..
EDIT: Mon Oct 8 21:06:17 CEST 2007 Pure functional macros work now, and make things a lot better, but this linear language variable thing i'm still quite puzzled by.

Entry: sticking to forth as basis
Date: Sat Sep 15 05:07:40 CEST 2007

reading http://lambda-the-ultimate.org/node/2452 forth in the news

i'm more and more convinced that forth should be the lowest level, not some block structured higher level construct, which would require more elaborate optimizations. it's best to have the pure control structs (i.e. for next) as direct macros, and implement the higher code block quoting constructs in terms of them. forth has this way with return stack juggling that's very powerful for making new control structures. this is hard to do efficiently when you tuck it all away in combinators..

Entry: brood paper
Date: Sun Sep 16 14:47:05 CEST 2007

actually.. it would be interesting to go over my ramblings and make a list of things i got really wrong, or saw too simplistically. then see what solution i got or how i came to understand the issues.

- monads are not just hidden top of stack items
- the relationship between closures and CONS
- syntax-rules and composition
- pattern matching and algebraic types
- lazy lists vs. generators: lists remain 'connected'
- 'natural' compiler structure: scoping rules, quasiquoting and syntax-case (3 levels)
- more specifically: quasiquote vs. syntax-case: when to use macros? is it code or data?
- looping and boundary conditions (i.e. image processing)
- cdr coding and lists as arrays
- importance of side-effect free 'loading' + relation to phase separation.

Entry: linear structures, variables and cycles
Date: Mon Sep 17 16:06:58 CEST 2007

in a linear structure (a tree, or an acyclic graph if hash consing is used) cycles are not possible. so how do you represent data structures that have some form of self-reference?
the thing we're looking for here is something akin to the Y combinator: instead of having a function refer to itself, a separate combinator is applied to the function to "tie the knot". let's start with: http://scienceblogs.com/goodmath/2006/08/why_oh_why_y.php i'll try to put it in my own words, see next post. the link above has an interesting comment on self-application. also, the wikipedia page has some interesting links: http://en.wikipedia.org/wiki/Y_combinator

so how do you apply this trick to data structures? my guess would be to start from data structures in the lambda calculus, and then make things more concrete.

Entry: Y combinator
Date: Mon Sep 17 18:55:07 CEST 2007

a fixed point p of the expression F satisfies F(p) = p. the Y combinator expresses p in terms of F as p = Y F. combining the two we get:

F (Y F) = (Y F)

simply expanding this gives exactly what we want:

Y F = F (Y F) = F (F (Y F)) = F (F (F (...)))

where the dots represent an infinite sequence of self applications. that's all folks. in order to implement useful recursion, simply write the 'body' F, and Y will take care of the rest.

let's make this a bit more intuitive. suppose we want to create a function f which is defined recursively in terms of f. look at F as a function which produces such a function f, F : x -> f. the recursion is a consequence of the infinite chain of applications:

f = Y F = F (F (F ...)) = F f

so what are the properties of F? first it needs to map f -> f. second, if a finite recursion is desired, it needs to do this in a way that creates a 'bigger' f from a 'smaller' one, eventually starting from the 'smallest' f which does not depend on f: this leads to a finite reduction when normal order reduction is used.

let's solve this problem in scheme, for Y F = factorial. so we know that:

factorial = F (F (F (...)))   or   factorial = F factorial

in words, F is a function that returns a factorial function if it is applied to a factorial function.
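spelling this out in scheme: note that scheme is applicative order, so the literal Y from the derivation above diverges; the eta-expanded variant (usually called Z) delays the self-application until it is needed:

```scheme
;; Z combinator: eta-expanded Y, usable under applicative order.
;; (lambda (a) ((x x) a)) is ((x x)) wrapped so it is only
;; evaluated when actually applied.
(define Z
  (lambda (F)
    ((lambda (x) (F (lambda (a) ((x x) a))))
     (lambda (x) (F (lambda (a) ((x x) a)))))))

;; F maps a factorial function to a factorial function:
;; one reduction step, with the recursive call abstracted out.
(define F
  (lambda (factorial)
    (lambda (x)
      (if (zero? x)
          1
          (* x (factorial (- x 1)))))))

(define factorial (Z F))
;; (factorial 5) => 120
```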
so the factorial function is a fixed point of F. the Y combinator finds this fixed point as factorial = Y F. the rest is fairly straightforward: a nested lambda expression which uses the provided 'factorial' function to compute one factorial reduction step:

F = (lambda (factorial)
      (lambda (x)
        (if (zero? x)
            1
            (* x (factorial (- x 1))))))

the thing which always tricked me is 'fixed point', because i was thinking about iterated functions on the reals used in many iterative numerical algorithms like the newton method. in the lambda calculus, there are only functions and applications, so a fixed point IS the infinite nested application, since that fixed point value doesn't have another representation, while a fixed point of a function on the reals is just a point in the reals.

Entry: algebraic data types
Date: Tue Sep 18 13:44:48 CEST 2007

look no further.. plt-match.ss actually has this kind of stuff, at least the pattern matching associated with algebraic types. and i think it is extensible. http://download.plt-scheme.org/doc/371/html/mzlib/mzlib-Z-H-34.html http://en.wikipedia.org/wiki/Algebraic_data_type

"In computer programming, an algebraic data type is a datatype each of whose values is data from other datatypes wrapped in one of the constructors of the datatype. Any wrapped data is an argument to the constructor. In contrast to other datatypes, the constructor is not executed and the only way to operate on the data is to unwrap the constructor using pattern matching."

Entry: pic network
Date: Tue Sep 18 20:40:10 CEST 2007

1. simple: 2 wires
2. robust: working boot loader

Entry: parser-tools lexer
Date: Thu Sep 20 19:20:11 CEST 2007

i'm replacing the lexer with the one from parser-tools. this is a lot easier than writing your own. what a big surprise; too bad i postponed it for so long..

Entry: message passing
Date: Thu Sep 20 21:15:05 CEST 2007

hmm.. message passing concurrency seems to be the real solution for tying a core and metaprogrammer together.
i should find out how to formalize message passing (i.e. Peter Van Roy and Seif Haridi's book "Concepts, Techniques, and Models of Computer Programming" http://www.info.ucl.ac.be/~pvr/book.html)

Entry: work to do
Date: Sat Sep 22 19:42:35 CEST 2007

* documentation
* bootloader (+- DONE)
* independent of emacs?

preparing for waag & piksel, the most important problem to solve is to make the bootloader robust. this is probably best solved as:

serial cable plugged -> start console
unplugged (i.e. with jumper to gnd) -> start app (at 0x200 hex)

all interrupt vectors moved to the 0x200 block. then this block can be made write-protected, so there's absolutely no way to mess it up -> can eliminate the ICD2 connector on boards.

Entry: purrr manual questions + necessary fixes
Date: Sun Sep 23 13:30:49 CEST 2007

* can i get at least a 16--bit library running without making it stand-alone?
* how difficult is it to unify macros and words from a user perspective? -> interaction always compiles a 'scrap' function.
* is it possible to write all control structures in terms of tail recursion?

the more philosophical ones:

* exceptions are imperative features.. is this bad? when is this bad? it's like using continuations, which is interesting for backtracking etc. i'm leaning toward pure functional programming, but some imperative features are really OK as long as they are shielded. i.e. global mutable variables are clearly not. (namespace: single assignment = ok + possible to hack for debug).

Entry: new bootloader fixes
Date: Mon Sep 24 12:37:41 CEST 2007

i got the monitor working, now i need to get the synth back up. some things that need fixing on the debugging side:

* a correct jump assembler (+- DONE: throws exception)
* a correct disassembler (+- DONE: lfsr broken)
* constants in console (DONE)
* cache macro compilation
* a command to erase a block of code during upload

a note about field overflows: for data values, it should be ok: it's quite convenient to assume they are finite size.
for example, banked addressing. for code it's an error, since you don't have any control over this while programming.

Entry: error reporting
Date: Mon Sep 24 14:15:54 CEST 2007

yes, i am at fault here. never really gave it much thought, but it's starting to become a problem: my error reporting sucks. one of the most dramatic problems is the loss of line numbers relating errors to original code. a solution for this is to use syntax objects everywhere. second is the way errors are handled in the assembler. currently i have some code that's a bit hard to understand: i got used to hygienic macros, and symbol capture looks convoluted to me. maybe i just need to rewrite that first? hmm.. what about systematically replacing 'raise' with something more highlevel. one of the things that is necessary is a stack trace. there was some talk on the plt list about this recently. let's have a look. there is (lib "trace.ss") which doesn't really do what i need, since it's active. what about taking this error reporting seriously, and giving it its own module? it would be good to eventually document all possible errors etc. what about the following strategy: every dubiously reported error will be fixed, no matter what it takes.

>> c> ERROR: #: no clause matching 1 argument: (qw)

this is a stack underflow error. i was thinking about installing an error translator in rep.ss, but this kills the tail position. therefore, errors need to be translated at the top entry point, which in this case is in prj.ss. it's really not such a simple problem.. i need to define what information i'd like to get: errors need to be reported at 'interface' level, which is either compile/run of files/words. compile errors are the most problematic since they need to be related to a source location..

Entry: state mud
Date: Tue Sep 25 14:05:35 CEST 2007

the prj.ss file should do nothing more than fetching/storing state and passing it to pure functions.
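the shape i'm after, roughly: one mutable slot at the edge, with everything inside it pure state -> state functions. a minimal sketch with hypothetical names, not the actual prj.ss code:

```scheme
;; the single place where project state lives (a cache-style
;; side effect at the edge of the system).
(define *state* '())

;; lift a pure (state -> state) function over the slot: this is
;; the only operation that mutates.
(define (with-state! f)
  (set! *state* (f *state*)))

;; a pure operation: everything below this line is functional.
;; here state is modeled as a simple association list.
(define (annotate key val)
  (lambda (state)
    (cons (list key val) state)))

;; usage:
;; (with-state! (annotate 'target 'pic18))
```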
i am a bit appalled by the way things work in prj.ss, because this state binding tends to swallow everything.. maybe it's not such a good idea after all? i guess it is still a good idea, but its only function should be to manage state. let's rehash the state stuff:

* only prj.ss contains permanent state
* I/O uses read-only dynamic scope for the read/write ports
* macros etc.. are supposed to be read-only cache
* all the rest is functional

UPDATE: Thu Sep 27 22:56:03 CEST 2007
- moved some functionality to badnop.ss
- adopted a left/right column notation for state/function

Entry: boot code and incremental upload
Date: Tue Sep 25 15:05:23 CEST 2007

the basic rule for forth is: code is incremental. if you need to patch backward, you need to do an erase + burn cycle. how to do this automatically? it's probably not so hard to solve by performing (CRC) checks on memory.

Entry: core syntax
Date: Tue Sep 25 18:05:36 CEST 2007

just writing the purrr manual and i got back to this language tower thing... i really need a core s-expression based syntax for code with multiple entry and exit points, instead of forth.

Entry: or
Date: Tue Sep 25 19:44:24 CEST 2007

Something that's really handy in scheme is a short-circuiting 'or'. i'm in need of something like that to define interactive word semantics: try executable words first, then variable names, then constants (or later macros). In scheme this is easy because variables can be referenced multiple times; in CAT this is awkward due to the explicit copying/restoring of the argument stack. Some backtracking formulation would be nice, but generic backtracking is overkill. It also requires explicit handling of the continuation object. Escaping continuations work fine here, and they can be stored in a dynamic parameter, so no explicit manipulation of continuation objects is necessary.
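in scheme terms, the escape-continuation-in-a-parameter idea looks roughly like this; a sketch of the mechanism only, not the actual CAT code (names 'attempts' and 'check' borrowed from the design):

```scheme
;; the escape continuation for the branch currently being tried
;; lives in a dynamic parameter, so branch code never handles
;; continuation objects explicitly.
(define current-abort (make-parameter #f))

;; abort the current branch when the condition is false.
(define (check ok?)
  (unless ok? ((current-abort) 'failed)))

;; try each branch (a thunk) in order; on 'failed, fall through
;; to the next one. the last branch runs without an escape.
(define (attempts . branches)
  (let try ((bs branches))
    (if (null? (cdr bs))
        ((car bs))
        (let ((r (call-with-current-continuation
                  (lambda (k)
                    (parameterize ((current-abort k))
                      ((car bs)))))))
          (if (eq? r 'failed) (try (cdr bs)) r)))))

;; usage:
;; (attempts (lambda () (check #f) 'a)
;;           (lambda () 'b))          ; first branch aborts => 'b
```

(a real version would restore the saved argument stack in 'try' before moving to the next branch; that part is elided here.)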
With 'check' being a word that aborts the current branch if the top of the stack is false, using the quasiquote (see next post) this is simply:

`(,(foo check do something check more stuff)
  ,(bar check do something else)
  ,(in case everything fails)) attempts

The apology: In a compositional language, escape continuation (EC) based backtracking might take the role of a conditional expression, because it's often easier to go ahead and backtrack on failure than to perform a number of tests/asserts ahead of time which might CONSUME your arguments, so you need to SAVE them first. An EC can be used to restore the contents of the stack before taking another branch. The disadvantage of course is that words that use 'check' are only legal within an 'attempt' context, and are not referentially transparent. I guess this is ok.. same as using catch/throw. I do feel a bit like a cowboy now.. What about distinguishing 'bad' exceptions from 'good' ones? Using exceptions in CAT has always been awkward, but the 'attempts' syntax here seems nice.

Entry: quasiquote
Date: Tue Sep 25 22:12:34 CEST 2007

what about postscript style [ ] quotation to create data structures with functions? i can't use [ ] or { } since mzscheme sees them as parentheses. only angle brackets are left alone.. so either i create a syntax extension, i.e. (list: (bla) (foo) (bar)), or i use an angle bracket structure. since the latter will work, i'm using that: <* *>

what about just using the quasiquote here? i'm not using it anywhere else and i'm already using quote. it's only legal on programs, and unquote means: insert program body here.

Entry: assembler optimizations / corrections
Date: Wed Sep 26 02:05:11 CEST 2007

A) jump size optimization

currently i have none. recently i introduced at least error reporting on overflow. i think the deal is that doing it 'really right' is difficult; i'm not sure there exists an optimal algorithm.
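the start-with-everything-small-and-widen iteration (which turns out to be good enough, see the comp.compilers quote further down) can be modeled like this. the instruction representation is a toy of my own, not the actual assembler's:

```scheme
;; toy instructions: (label L) | (jump L) | (op X).
;; sizes in words: label 0, op 1, short jump 1 (reach +-63 words),
;; long jump 2.
(define reach 63)

(define (ins-size ins long?)
  (case (car ins)
    ((label) 0)
    ((jump)  (if long? 2 1))
    (else    1)))

;; relax : code -> per-instruction list of long? flags.
;; start with every jump short; widen any jump whose displacement
;; doesn't fit; recompute addresses; repeat until nothing changes.
;; sizes only ever grow, so this terminates (at most one pass per
;; branch, as the comp.compilers post argues).
(define (relax code)
  (let loop ((long? (map (lambda (_) #f) code)))
    (let* ((labels                       ; label -> address, current guess
            (let scan ((c code) (l long?) (a 0) (acc '()))
              (if (null? c)
                  acc
                  (scan (cdr c) (cdr l)
                        (+ a (ins-size (car c) (car l)))
                        (if (eq? (caar c) 'label)
                            (cons (cons (cadar c) a) acc)
                            acc)))))
           (next                         ; widen out-of-reach shorts
            (let scan ((c code) (l long?) (a 0) (acc '()))
              (if (null? c)
                  (reverse acc)
                  (let* ((ins (car c))
                         (sz  (ins-size ins (car l)))
                         (wide?
                          (or (car l)
                              (and (eq? (car ins) 'jump)
                                   (> (abs (- (cdr (assq (cadr ins) labels))
                                              (+ a sz)))
                                      reach)))))
                    (scan (cdr c) (cdr l) (+ a sz) (cons wide? acc)))))))
      (if (equal? next long?)
          long?            ; fixed point: everything fits
          (loop next)))))
```

a jump over 100 ops gets widened; a jump over 1 op stays short.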
the simplest approach is:

* convert small -> long jump
* increment/decrement jumps before/after the instruction
* update dictionary accordingly

it's probably easiest to do this on an already fully resolved buffer (after the 2nd pass). this algorithm is confusing due to the forward/backward absolute/relative distinction. also, doing this without mutation seems troublesome.

B) jump chaining

was really easy in the original badnop due to the use of side-effects. somehow this problem looks as if there's some weird control structure that might help solve it in a more direct way.

OK... finding the optimum is apparently NP-complete http://compilers.iecc.com/comparch/article/07-01-037

> [There was a paper by Tom Szymanski in the CACM in the 1970s that
> explained how to calculate branch sizes. The general problem is
> NP-complete, but as is usually the case with NP-complete problems,
> there is simple algorithm that gets you very close to the optimal
> result. -John]

or not? http://compilers.iecc.com/comparch/article/07-01-040

  If you only want to optimize relative branch sizes, this problem is
  polynomial: Just start with everything small, then make everything
  larger that does not fit, and reiterate until everything fits. Because
  in this case no size can get smaller by making another size larger,
  you have at worst as many steps as you have branches, and the cost of
  each step is at most proportional to the program size.

so, it looks like the simple approach of using short branches and expanding/adjusting + checking is good enough.

Entry: platforms
Date: Wed Sep 26 05:11:06 CEST 2007

been thinking a bit about platforms. some ideas:

* 32 bit + asm makes no sense. GCC is your friend here, and should generate reasonably good code for register machines. split the language into 2 parts: POKE for control stuff, and some kind of dataflow language for dsp stuff.
* AVR 8 bit doesn't make much sense either. there is GCC and i already spent a lot of time optimizing 8 bit opcodes..
learning the asm sounds like a waste of time.

* don't know if PIC30 makes a lot of sense. it is an interesting platform (PDIP available), and they are reasonably powerful, if a bit weird.

maybe focus on PIC18, and make a small attempt to get a basic set of words running for PIC30?

Entry: capacitance to digital
Date: Wed Sep 26 05:26:25 CEST 2007

http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=2599&param=en531579

CAPACITANCE TO DIGITAL CONVERTER

To convert the sensor’s capacitance to a digital value, three things have to happen. First, the comparators and the flip flop in the comparator module must be configured as a relaxation oscillator. Second, the desired sensor must be connected to the relaxation oscillator. Third, the frequency of the oscillation must be measured.

The configuration of the comparator and the SR latch requires configuring the comparators, the SR latch, and the appropriate analog inputs. Connecting the sensor to the oscillator requires the control software to select the appropriate analog input to the comparator module’s multiplexer. It must also select the appropriate input to any external multiplexer between the sensors and the analog inputs of the chip.

To measure the frequency of the oscillation, TMR1’s clock input must connect to the output of the relaxation oscillator, and a fixed sample period will be controlled by TMR0. To start a frequency measurement, both TMR0 and TMR1 are cleared. The TMR0 interrupt is then enabled. When the interrupt fires, TMR1 is stopped, and the 16 bit frequency value in TMR1 is retrieved. Both TMR0 and TMR1 can then be reset for the next measurement.

To keep the accuracy of the frequency measurement consistent, the interrupt response time for the TMR0 interrupt must be kept as constant as possible, so no other interrupt should fire during a measurement. If one does, then the measurement must be discarded and the frequency measurement must start over.
Once the 16-bit value is retrieved, the detector/decoder algorithms can determine whether the shift in frequency is a valid touch by the user or not. For more information on the interrupt service routine for TMR0, and the initialization of the relaxation oscillator, refer to application note AN1103 on Software Handling for Capacitive Sensing.

Entry: todo list
Date: Wed Sep 26 19:46:52 CEST 2007

URGENT:

* word reference manual. the primary goal would be to have documentation available at the command console or during emacs editing, instead of just in paper form. a tutorial can come later. now where do i specify it?
* code protect the boot sector (OK)
* interaction macros (needs syntax + minor support in prj.ss)
* readline console
* command line completion
* make it installable (-> solve library deps? : install in collects?)
* check battery / BREAK resistor
* simplify prj.ss into chunks that operate on state explicitly (OK)

NOT URGENT:

* macro cache (maybe explicit files? read about compilation)
* scheme library split (module path handling)
* bootloader: automatic boot (2ND) sector patching
* assembler changes + functional macros

Entry: boot code
Date: Thu Sep 27 15:57:28 CEST 2007

this is actually pretty important. once i start sending out kits, it's not so easy to change the bootloader. some things to note:

- monitor.state -> (dict ...) format is the only important part
- boot sector is independent of any macros: only words count
- machine model obviously needs to stay stable (hasn't changed in years)
- binary api (the monitor commands) needs to stay stable

what about making the bootstrap interpreter simpler? get rid of anything other than 'receive transmit execute', and leave the rest to a dictionary? if there's ever a problem with portability or whatever, this might be the way to go: this interface allows all functionality to be hidden in the dictionary associated with the boot kernel. right now it's still quite pic-specific. some things will become less efficient though..
also, the way the code is organised, sending commands will become more difficult. the set i have now is complete enough, and reasonably efficient. let's keep it simple and stick with the current one.

another thing: fixing the boot block. let's try that: setting 30000B to A0 does the trick (CONFIG6H : WRTB)

Entry: using the ICD2 pins.
Date: Thu Sep 27 15:53:34 CEST 2007

the last couple of days were a bit too much on the dreaming side. i need something concrete to fix. i was thinking about simplifying the programming interface, using the ICD2 pins to also do debug serial comm. but why? if my boot kernel is stable, this is entirely unnecessary, except for reset!

Entry: ramp up to purely functional macros
Date: Thu Sep 27 20:57:29 CEST 2007

the parser. STAGE 1:

- rewrite 'constant' as a macro definition
- separate macros from the body code, which is seen as a single function with multiple entry/exit points.

problem still not solved: 'variable'. currently, variable creates a constant containing a symbol, and 'code' that performs the allocation later during the assembly phase. so in fact, it's not so problematic.

Entry: prj.ss
Date: Thu Sep 27 22:59:52 CEST 2007

simplified it a bit: made state ops more explicit, and moved functionality to badnop.ss. this looks like a nice approach in general. i do wonder why i still need 'functional state' at the prj.ss level: most state updates are intermingled with microcontroller state updates which are dirty anyway. one thing: it keeps me honest. on the other hand, i'd like to move to some "image" representation. cached macros would be cool. maybe i should look at that now.

Entry: macro cache
Date: Fri Sep 28 00:04:07 CEST 2007

it looks like the bulk of the 'revert' time is spent in needlessly compiling code. there aren't so many run-time created macros: and constants are currently not 'eval'ed. maybe i should make that so, so i can snarf them out. hmm.. spaghetti. the problem is that constants are still treated separately.
i can't unify them with macros until macros are purely functional, so they can be evaluated to see if they produce constant values. solution dependencies:

file parsing to distinguish macro/code
then: purely functional macros
then: elimination of assembler constants

however, doing the first one requires elimination of assembler constants! looks like this is the reason why i can't get an overview of the problem: it's quite a big loop. anyways, i can write the parsing step and test it leaving the side-effecting macros intact, then move to side-effect free macros and change the constant parsing to translate constants to macros.

so. maybe i need an S-expression syntax first, so i can translate code to it! for macros this is easy: i'm already using one. for composite code however, it becomes more difficult due to the multiple entry-exit points. this can be left alone in a first attempt.

Entry: product vision statement
Date: Fri Sep 28 01:05:20 CEST 2007

http://www.codinghorror.com/blog/archives/000962.html

for (target customer)
who (statement of need or opportunity)
the (product name)
is a (product category)
that (key benefit, compelling reason to buy)
unlike (primary competitive alternative)
our product (statement of primary differentiation)

for embedded software developers who want to program small embedded systems, the Brood system is a tool chain that supports incremental bottom up development. unlike C, our product has integrated metaprogramming through built-in macros.

something like that.. interesting.

Entry: documentation
Date: Fri Sep 28 14:03:33 CEST 2007

write a purrr manual in tex2page by sending queries to the brood system. this should use an interface similar to snot. brood needs to be centered around services, of which snot is one. so let's try this: services with

- direct access to brood for SNOT and RL
- document generation

Does services.ss run inside the sandbox? YES. So all calls from snot.ss -> services.ss go through a sandboxed eval.
Services.ss itself does not need to take care of this, and can use direct calls. the deal is this: a CONSOLE needs to separate:

- TOPLEVEL (represented by eval)
- STATE (a data structure stored independent of the toplevel)

Entry: persistence
Date: Fri Sep 28 19:26:54 CEST 2007

i must not forget that the way i use persistence is a SOLUTION, and not the original problem. the real problem is a conflict between two paradigms:

* TRANSPARENCY as in MzScheme's module system
* image persistence and run--time self modification

as usual, my problem is rooted in ignorance. i've been jabbing about the distinction between the two above for a while, but the real problem is compiler compilation time. i need to have a look at MzScheme's unit system. it should be possible to reload units after recompiling them because they are mere interfaces.

Entry: services
Date: Fri Sep 28 23:24:59 CEST 2007

hmmm.. i didn't really get anywhere today. but at least i figured out what 'services' should be. it's just the stuff that snot has access to, but without the snot interface. i renamed it 'console.ss' and took it out of 'snot.ss', which is now just a bit of glue.

Entry: forth preprocessing
Date: Sat Sep 29 15:51:12 CEST 2007

parsing and lexing. it's divided in a somewhat unorthodox way.

LEXING

there are 2 front ends:

forth-lex :: string -> atom stream
forth-load-in-path :: file,path -> atom stream

the lexing part flattens the load tree, i.e. during lexing, the source code is made independent of the filesystem.

PARSING

this is where i have to break things, so let's commit first.

1. flat forth stream -> compositional forth stream with macros removed
2. constants -> macros

let's see if i understand: constants are bad. there is no way around the fact that 'constant' swallows a value: it's the worst case of reflection. this is not compatible with the current parser: keeping it would require lookahead. so 'constant' needs to be replaced entirely by 'macro' in source code.
looking at the previous entry [[phase-separation]], what is required is indeed a parsing step that can translate

  1 2 + constant x
-->
  macro : x 1 2 + ; forth

yes, this is of course possible, but is it really worth it? maybe it's better to clean up the Purrr language semantics now than to carry around the code that allows this. ad-hoc syntax is a nuisance. so, current path: CONSTANTS are being removed.

that was easy :) now, for variables.

  variable abc

does 2 things: it creates a macro that quotes itself as a literal address, and it adds code that tells the assembler to reserve a RAM slot. maybe i should use 'create' and 'allot'? (back to that later)

currently the parsing seems to work, except for the macro/code separation step. for this i need a stream splitter. in stream.ss i have '@split', which just splits off the head of a stream, not true splitting.

status:
- parsing step: ok
- load! step: ok (like previous load, but with macro defs separated)

next:
- remove all side-effecting macros
- change the assembler to take values from macros

remarks:
* is dasm-resolve still possible? (value -> symbol)

status:
- monitor.f -> monitor.hex gives the same code

Entry: cleanup
Date: Sat Sep 29 21:21:54 CEST 2007

core changes seem to be working. the rest is cleanup.

TODO:
- fix variable (OK)
- fix interaction constants (OK)
- fix sheepsint (OK)
- extract macros from forth file -> compositions + save as cache (OK)
- fix interaction macros that reduce to expressions
- trick macros into generating their symbol during compilation, and value during assembly. (restore disassembly constants)
- clean the assembler name resolver

Entry: storing application macros in state file
Date: Sat Sep 29 22:31:46 CEST 2007

why not? this solves a lot of problems.. and they are available in source form, so there's no problem to store them symbolically.

Entry: profiling
Date: Sun Sep 30 03:15:31 CEST 2007

on sight.. but still quite remarkable.
loading monitor.f from source to S-expressions takes a lot more time than either compiling the macros or compiling the code to a macro and running it. both of the latter are instantaneous. ha! actually, that's very good news: improving the speed of the lexer seems a lot easier to do than improving the speed of the compiler.

looking a bit further, sheepsint.f seemed to be faster. the reason is thus the constants. maybe i should just put them back to s-expressions? they don't change much after all.

Entry: upload speed
Date: Sun Sep 30 03:40:57 CEST 2007

It's quite annoying that the upload speed is so slow. I need a way to change the speed on the fly.

EDIT: baud rate: commit goes a little bit faster when the baud rate is changed from 9600 to 38400, so the limiting factor is probably the flash programming.

Entry: parsing and printing
Date: Sun Sep 30 16:17:09 CEST 2007

there are a couple of places in the brood code where (regular) parsing and printing are done in a relatively ad-hoc way using 'match'. maybe i should have a look at extending match to provide better pseudo "algebraic types".

EDIT:
http://www.cs.ucla.edu/~awarth/papers/dls07.pdf (*)
http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt (*)

looks really interesting. also, i need to have a decent look at COLA: http://piumarta.com/software/cola/

EDIT: i changed the syntax for the peephole optimizer to something more akin to algebraic types & matching.. still a bit of a hack, but there's a better adapted quoting mechanism now.

Entry: deleted from brood.tex
Date: Sun Sep 30 19:25:53 CEST 2007

Some important assumptions I'm making to support the current solution are that code updates need to be made \emph{while running}, and that the target is severely \emph{resource constrained} such that all compilation and linking needs to be done off--target. This excludes \emph{late binding} of most code.
Another assumption I'm making is that some binary code on the target will never be replaced, and will drift out of sync with the evolution of the language in which it was written. An example of this is a \emph{boot loader}. Such code needs to be viewed as a black box. This approach violates transparency.

To give this section some context, I have to make my \emph{beliefs} more explicit. I believe that a compiler is best implemented using pure functional programming, because it is in essence a \emph{function} mapping a source tree to a binary representation of it. This idea is easily extended with \emph{bottom up} programming, where part of the source tree generates a compiler to compile other parts of the source tree. In order to make this work, I believe you need \emph{transparency}. By this I mean that all \emph{reflection} (compiler compilation) is \emph{unrolled} into a directed acyclic graph representing a code dependency tree.

On the other hand, I believe that a microcontroller is best modeled as a \emph{persistent} data structure. A microcontroller is a \emph{physical object}, and should be modeled as such, \emph{independent} of the compiler that is used to create the code comprising the object state. This is what makes Forth interesting: the ability to \emph{incrementally update} without having to recompile everything. Due to limited hardware support (flash ROM is not RAM), \emph{late binding} becomes problematic, and also induces a significant performance penalty. This makes \emph{early binding} a reasonable alternative: in the end the objective is to at least provide the possibility to write efficient code at the lowest level of the target language tower.

This is the heart of the paradigm conflict. Where do I switch from a transparent language tower to \emph{dangerous} manually guided incremental updates? Maybe the question to answer would be: why does one want to have this kind of low--level control anyway?
The real answer is that at this moment, I don't really know how to create a transparent system. The real reason for that is that I've been locked in a certain paradigm. Let's explore what would happen if we lean towards either of the two extremes. If the whole system were transparent, the controller code would need to be treated as a filesystem if incremental updates were still to be used. After code changes, one could simply recompile, relink and upload only the parts that changed. This is the sanest thing to do.

Entry: misc improvements
Date: Sun Sep 30 21:35:40 CEST 2007

note that 'load' as it currently works doesn't 'commit'. actually, that's not how it's used mostly! also, automatic commit might be nice for compile mode.. on the other hand, compile mode is kind of an advanced feature also.

Entry: structures for music
Date: Mon Oct 1 05:48:46 CEST 2007

this is more of a tutorial pre. i saw aymeric was using the stack to store sequences, which is not a good idea.. i see 2 other ways: flash and ram. i kinda like the x / . approach for pattern synths. the trick is to do multiple voices, so i really need some kind of multitasking. say i have 3 patterns:

  : bd o . . . o . . . bd ;
  : sn . . . . o . . . sn ;
  : hh o . o . o . o . hh ;

what do o and . do? let's assume that recursion is not allowed in these patterns. what can we hide in a single invocation? a simple trick is to use dictionary shadowing: the words could call some fixed word, which is re-implemented later.

  : instrument do something ;
  : bd o . . o . . bd2 ;
  : bd2 . . o . . o bd ;

we could have:

  : o instrument yield ;
  : . yield ;

hmm.. it's probably better to directly use names instead of this name-capture thing. if recursion is disallowed, it should be possible to store each thread in a single byte, so a lot of threads are possible. in that case, an explicit interpretation and automatic looping might be better, using routing macros.
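the yield-based multi-voice idea above can be sketched with coroutines. this is an illustrative python sketch, not purrr code: generators stand in for the forth tasks, and all names are mine.

```python
# Three looping patterns run as cooperative threads: each step either
# triggers its instrument ('o') or rests ('.'), then yields the cpu.

def pattern(steps, instrument):
    """Loop a pattern forever; 'o' triggers the instrument, '.' rests."""
    while True:
        for s in steps:
            yield instrument if s == "o" else None   # yield = give up the cpu

def scheduler(voices, ticks):
    """Round-robin: advance every voice exactly once per tick."""
    events = []
    for _ in range(ticks):
        events.append([next(v) for v in voices])
    return events

voices = [pattern("o...o...", "bd"),
          pattern("....o...", "sn"),
          pattern("o.o.o.o.", "hh")]
for tick in scheduler(voices, 4):
    print(tick)
```

since no recursion happens inside a pattern, the only per-thread state is the position in the pattern, which matches the remark that each thread could fit in a single byte.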
Entry: purrr reference documentation
Date: Mon Oct 1 16:13:01 CEST 2007

documentation for each macro. this contains 2 things:
- stack effect (type)
- 1 line human readable doc which possibly points to more information.

so a word's meta info looks like

  (+) ((type . (a a -- a))
       (doc  . "Add two numbers"))

if i can't do types yet, i should at least put the stack effect in a form that can be used later to do types. it's also probably a good idea to add meta-data separately to not clutter the code. so, how to infer types? from the lowest level (pattern matching macros) i can infer a lot.

first some cleanups: i'm taking out the 'compiled' field in the word structure, because it's better to just save the source of macros before they're being compiled, instead of trying to recover them later.

what about word-semantics? i forgot the reason why sometimes it cannot be filled. been poking in the rpn.ss internals and i guess it's best to have the state tx take a compiler for backup. but, this doesn't work for some other reason i can't remember.. tata: spaghetti. let's see if i can hack around it now by simply providing a language name for backup.

Entry: i need closures
Date: Mon Oct 1 20:26:22 CEST 2007

yep.. too much crap going on with trying to call from prj -> base and having to pass arguments.

EDIT: when i wrote 'compose' i made sure to not allow composition between words with different semantics. however, i'm not so sure if that's a good idea.. i only want to use closures on functional words, not on state words. maybe i should let go of this control freakish behaviour since the source rep is only debug: it doesn't work reliably for all words to reconstruct from that source..

Entry: dsPIC
Date: Tue Oct 2 03:46:01 CEST 2007

maybe it's time to try it out, and gently grow it into being. some challenges:
- 3 bytes / instruction
- 16 bit datapath
- addressing modes

flash block erase size is 96 bytes, but address-wise this counts as 32 instruction words.
The dsPIC30F Flash program memory is organized into rows and panels. Each row consists of 32 instructions, or 96 bytes. Each panel consists of 128 rows, or 4K x 24-bit instructions. RTSP allows the user to erase one row (32 instructions) at a time and to program four instructions at one time. RTSP may be used to program multiple program memory panels, but the table pointer must be changed at each panel boundary.

I don't understand why it says 'four instructions at a time' and then later on talks about 32 at a time: "The instruction words loaded must always be from a group of 32 boundary." And the confusion goes on: "32 TBLWTL and four TBLWTH instructions are required to load the 32 instructions." this looks like a typo.. let's download a new version of the sheet. got DS70138C now. they're at version E. it's got the same typo.

so assume i need to write per 32 instructions + some magic every 4K instructions (updating a page pointer?). apart from the latter it's quite similar to the 18f, just a larger row size.

it looks like this thing is byte addressed, but for each 2 bytes, there's an extra 'hidden' byte! lol. ok, there is a sane way of looking at it: the architecture is 16-bit word addressed, but every odd word is only half implemented: instruction width is 3 bytes.

it looks like it's best to steer the forth away from all the special purpose DSP tricks like X/Y memory and weird addressing modes. looks like an interesting target for some functional dataflow dsl though. there are 2 kinds of instructions: PIC-like instructions that operate on WREG0 and some memory location, and DSP-like instructions that use the 16 registers.

roadmap:
- find an 8bit -> 16bit migration guide from microchip
- partially implement the assembler to PIC18 functionality

Entry: direct threaded forth
Date: Tue Oct 2 07:26:49 CEST 2007

i'm toying a bit with the vm forth. and was thinking: it's not necessary to go stand-alone. it's much better to test this vm forth as another target.
Entry: type signatures from pattern matching macros
Date: Tue Oct 2 14:38:47 CEST 2007

It should be possible to mine the 'source' field of pattern matching macros for types, or at least the stack effect, of functions. the first matching rule is always the most specific one: if that fits a certain pattern. the REAL solution here is to change the pattern matcher to REAL algebraic types instead of this hodge-podge.

moral of the story: whenever pattern matching occurs on list structure, what you really are looking for is algebraic types. yes... i'm not going to muck around in this ad-hoc syntax. i need a real solution: something on top of the current tx. i need real algebraic types. there is this:

http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt

but for my purpose it might be better to just stick with the current concrete list representation for the asm buffer. what about:

  (([qw a] [qw b] +) ([qw (+ a b)]))
->
  ((['qw a] ['qw b] +) `([qw ,(+ a b)]))

looks like 'atyped->clause' in pattern-tx.ss is working. it's indeed really simple to implement on top of the matching clauses. looks like 'asm-transforms' works too.

i ran into one difficulty though. call it polymorphism. in the original syntax:

  (([op 'POSTDEC0 0 0] ['save] opti-save) `([,op INDF0 1 0]))

this cannot be expressed in the new syntax. however, this is exceptional. it's probably a good idea to make this polymorphism explicit. EDIT: it is possible to use unquote!! a bit of abuse of notation, but ...

let's write the pic18 preprocessor on top of asm-transforms instead of compiler-patterns. ok. done. old one's gone. now it should be a lot easier to write some documentation or type inference..

i tried to tackle the 'pic18-meta-patterns' but i don't seem to get anywhere. current syntax is way too complicated. it really shouldn't be too hard by taking a more bottom up approach instead of trying to use 'callbacks' that force the preprocessing of some macro's arguments.
write a single generator macro for each kind. trying again. this is the thing i want to generate:

  (define-syntax unary
    (syntax-rules ()
      ((_ namespace (word opcode ...))
       (asm-transforms namespace
         (([movf f 0 0] word) ([opcode f 0 0])) ...
         ((word) ([opcode 'WREG 0 0])) ...))))

from this:

  (asm-meta-pattern
    (unary (word opcode))
    (([movf f 0 0] word) ([opcode f 0 0]))
    ((word) ([opcode 'WREG 0 0])))

the thing which seems problematic to me is the '...', more specifically

  (pattern template) ...
->
  (pattern template) (... ...) ...

that doesn't seem to work. it looks like the 'real' problem here is due to the fact that i'm expanding to something linear.. i'm inserting stuff. i wonder if it's possible to modify the asm syntax a bit so it will flatten expressions.

wooo.. macros like this are difficult. i'm currently doing something wrong with mixing syntax-rules with calling an expander directly. best to stick with plain syntax-case and direct expansion: that's easier to get right. the deal was: sticking with syntax-rules as a result of a first expansion worked fine, i just needed to put the higher order macro in a different file for phase separation reasons.

so.. the remaining step is to collapse the compiler-patterns-stx phase, and add the current source patterns to the word source field, which would yield decent docs. ok, done.

  > msee +
  asm-match:
  ((((qw a) (qw b) +)       ((qw `(,@(wrap a) ,@(wrap b) +))))
   (((qw a) +)              ((addlw a)))
   (((save) (movf a 0 0) +) ((addwf a 0 0)))
   ((+)                     ((addwf 'POSTDEC0 0 0))))
  >

that should be easy enough to parse :) CAR + look only at qw. the 'wrap' thing is something that needs to be cleaned up too.. i tried but started breaking things. enough for today.
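the "CAR + look only at qw" parse of the msee output above can be sketched like this. python stands in for the scheme clause data; the nested-tuple layout mirrors the printed clauses, and the helper name is mine, not brood's.

```python
# Mine a stack-effect "type" from a pattern clause: each clause is
# (pattern, template); the last element of the pattern is the word name
# itself, every other element is a matched asm cell whose CAR is its tag.

def clause_type(clause):
    pattern, template = clause
    ins  = tuple(cell[0] for cell in pattern[:-1])   # drop the word name
    outs = tuple(cell[0] for cell in template)
    return (ins, outs)

# the four '+' clauses from the msee output, templates abbreviated:
plus_clauses = [
    ((("qw", "a"), ("qw", "b"), "+"),       (("qw", "(+ a b)"),)),
    ((("qw", "a"), "+"),                    (("addlw", "a"),)),
    ((("save",), ("movf", "a", 0, 0), "+"), (("addwf", "a", 0, 0),)),
    (("+",),                                (("addwf", "POSTDEC0", 0, 0),)),
]
for c in plus_clauses:
    print(clause_type(c))
```

this reproduces the shape of the later 'print-type' output for '+': (qw qw) => (qw), (qw) => (addlw), (save movf) => (addwf), () => (addwf).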
this is what i get out for qw -> qw

  (((qw a) (qw b) --)   ((qw `(,@(wrap a) ,@(wrap b) #f))))
  (((qw a) (qw b) >>>)  ((qw `(,@(wrap a) ,@(wrap b) >>>))))
  (((qw a) (qw b) <<<)  ((qw `(,@(wrap a) ,@(wrap b) <<<))))
  (((qw a) drop)        ())
  (((qw thing) |*'|)    ((qw thing)))
  (((qw a) (qw b) ++)   ((qw `(,@(wrap a) ,@(wrap b) #f))))
  (((qw a) (qw b) swap) ((qw b) (qw a)))
  (((qw a) dup)         ((qw a) (qw a)))
  (((qw a) (qw b) or)   ((qw `(,@(wrap a) ,@(wrap b) or))))
  (((qw a) (qw b) and)  ((qw `(,@(wrap a) ,@(wrap b) and))))
  (((qw a) neg)         ((qw `(,@(wrap a) -1 *))))
  (((qw a) (qw b) xor)  ((qw `(,@(wrap a) ,@(wrap b) xor))))
  (((qw a) (qw b) /)    ((qw `(,@(wrap a) ,@(wrap b) /))))
  (((qw a) (qw b) *)    ((qw `(,@(wrap a) ,@(wrap b) *))))
  (((qw a) (qw b) -)    ((qw `(,@(wrap a) ,@(wrap b) -))))
  (((qw a) (qw b) +)    ((qw `(,@(wrap a) ,@(wrap b) +))))

i also made a 'print-type' function. for '+':

  ((qw qw)     => (qw))
  ((qw)        => (addlw))
  ((save movf) => (addwf))
  (()          => (addwf))

this might be useful.. but what's more useful is the building of a framework that enables this for all functions. it works for the assembler primitives only.

Entry: TODO
Date: Sat Oct 6 20:54:13 CEST 2007

- live command macros
- put live commands in a namespace
- add doc tags: math/control/predicates/...
- write small tutorial:
  * assembler + PIC18 architecture
  * logic, addition and 8 bit programming (hex + binary)
  * the x and r stacks
  * route (DONE)
  * predicates & conditionals
  * run time computations & ephemeral constructs
- fix macro cache re-init + initial state loading (DONE)
- fix quoting in macros
- fix hardcoded paths (rename brood/brood)
- rename compilation stack

Entry: unsigned demodulator
Date: Sat Oct 6 21:30:01 CEST 2007

the pic18 has a hardware multiplier, which is nice. however, computing signed multiplication takes quite a hit compared to unsigned. i was wondering if i can do an amplitude-only demodulator using only unsigned multiplications. the entire function is unsigned -> unsigned.
signal -> mixer -> I/Q -> I^2 + Q^2 -> LPF

[EDIT: deleted a long erroneous entry. the thinking error was about the commutation of the LPF and the squaring operation. the above expression just gives the average signal power.]

the correct formula is:

  X -> (I,Q) => LPF -> || . ||^2

that's completely symmetric wrt phase. the LPF is straightforward: a simple 1-pole will probably do if i keep the bitrate low. a 2^n-1 coefficient is easy to implement without multiplication. the I=XC and Q=XS multiplications can probably be simplified since X = x-h and C = c-h have no DC components. here h = 2^(bits-1).

  I = X C
    = (x-h)(c-h)
    = xc - hx - hc + h^2
    = xc - h(x + c - h)
    = xc - h(x - h + c - h - h)
    = xc - h(X + C + h)

since X and C have no DC component, the DC part is

  DC = xc - h^2

which is quite intuitive: take the average of xc, but remove the dc component.

Entry: frequency decoding
Date: Mon Oct 8 00:26:37 CEST 2007

for krikit, the choice to make is to either decode the whole spectrum (listen to everything at once) or listen only to a single band. this is a choice that has to be made early on.. some remarks.

* FFT + listening to all bands is probably overkill. it's not so straightforward to implement, so the benefits should be big. FFT for point-to-point makes sense only when combatting linear distortion.
* single frequency detection is really straightforward. the core routine is a MIXER followed by a complex LPF. the output is phase and amplitude.
* using a sliding window average LPF together with orthogonal frequencies allows for good channel separation. this works for steady state only, so some synchronization mechanism is necessary.
* sending out a single message on multiple frequencies: easy to do with pre-computed tables for 0 and 1. phase randomisation to avoid peaks is possible here.
* i'm afraid of linear distortion due to room acoustics.. maybe FM/FSK should be used?
* if non-linear distortion is not a problem, DTMF frequencies are not necessary.
* using exact arithmetic, it is easy to update/downdate a state vector for a rectangular window LPF. this update can be performed at the input of the mixer.
* bandwidth limitation for transmission.

http://en.wikipedia.org/wiki/Olivia_MFSK

Entry: network debugging + pic shopping
Date: Mon Oct 8 15:27:25 CEST 2007

- avoid remote reset: use WDT
- central power gives panic switch
- use a standard bus protocol for comm (I2C ...)

The 18f1220 doesn't have I2C, so it might be better to go for a different component. Lowest pin count is 28. Let's take the one with the most memory to have some room for tables and delay lines. I'm thinking about the 18f2620: 64 kbytes flash, 3968 bytes ram (maxed out). This is also a nice target for a standalone language. These are all the same chip, with some things missing:

           EEPROM (b)  FLASH (kb)
  18f2620  1024        64
  18f2610  0           64
  18f2525  1024        48
  18f2515  0           48

Entry: 8 bit unsigned mix -> complex 16/24 bit acc
Date: Mon Oct 8 15:32:30 CEST 2007

I've been toying a bit with a mixer + accumulator building block, and it seems it can be quite simple. Some remarks:

- Perform signed offset correction outside of the accumulation loop.
- Perform update/downdate for the rectangular window at the input, due to commutation with the mixer.
- As long as the result of accumulation fits in the word length, overflow is not a problem.

If a signed number X is represented by an unsigned number x, the difference is X = x - h, where h = 2^{n-1} is 'half'. Per signed multiplication there is an offset of h^2 = 2^{2n-2}. What this means is that once per 4 accumulations, the correction term disappears due to word overflow if 2 bytes are used. However, the maximal filter output occurs at full scale input, which will overflow the accumulator if more than 4 accumulations are used, so maybe it is better to use a 3 byte state. In any case, if the number of accumulations is a power of 2, removing the unsigned offset is a simple bit operation.
Entry: transmission
Date: Mon Oct 8 16:46:26 CEST 2007

Using hardware PWM with 8 bit resolution I can send out at 39kHz, assuming fosc = 40MHz. This is still well beyond the maximal signal frequency at about 3kHz, and won't pass the speaker, so an analog filter is not necessary. Differential drive (half bridge) could be used. One thing to note is that only ECCP (enhanced) can do multi-channel PWM. The normal PWM is only single output, and all 28 pin chips have just 2 x CCP. The 18 pin 18f1x20 has a single ECCP, and the 40/44 pin 18f4xxx also have one. Looks like that's quite a limitation.. On the other hand, a CMOS inverter could be used on-board.. Is that worth it? Probably not. A simple coupling condenser will do the trick.

Entry: self programming 5V
Date: Mon Oct 8 18:01:24 CEST 2007

Something i just noticed in the 2620 datasheet: self-programming works only at 5V?

Entry: no apology!
Date: Thu Oct 11 01:43:40 CEST 2007

i tried a couple of times this week to explain the "ephemeral" macro idea, but it's just insane. i need a real solution:
- macro code needs to know whether a certain word is defined or not.
- if a partial evaluation can't be computed, the error should be:
  * none, if the corresponding library code can be found.
  * "partially implemented literal construct" or something..

what i do need to explain is "Why leaky abstractions are not necessarily bad." This is a core Forth idea quite opposed to the safe language ideal. I'm using a lot of that stuff, and I guess it's good to make a list of these.

looking at the code, this 'need-literal' error only happens in 3 places: toggle, set and bit? i just took them out: they refer to code words now, up to the user to implement.

Entry: Purrr semantics
Date: Fri Oct 12 16:25:44 CEST 2007

As explained in Brood, there is only a single semantics of a Purrr program: it is a compositional, purely functional language.
A Purrr program consists of a set of (recursive) macro definitions, and a ``body'' which defines a compilable function with reduced semantics. It would be really cool if i could get rid of the explicit 'compilation' step, and make everything just declarative. What i'd like to do is to apply this approach to scheme. Maybe that's what PICBIT is doing?

Entry: train notes about syntax, semantics and metaprogramming
Date: Fri Oct 12 17:41:09 CEST 2007

I can identify 3 distinct uses of macros:
- control flow (begin ... again)
- optimization (1 2 +)
- explicit meta (using the words '>m' and 'm>')

The latter is actually the same as the first. The 'm' stack is like the 'r' stack: it is used to implement nesting constructs.

conceptual problem: jumps. this can be solved by writing all jumps as recursion and using higher order functions (combinators). together with using only a single conditional statement, the solution is to enable syntax for quoted macros. this leaves:

* conditional (IF)
* quoting operation (LAMBDA)
* dequoting operation (APPLY)

The core ideas behind the macro language are:
* purely functional (no side effects)
* everything is first class
* purely compositional (no syntax)

Then, the target language should inherit as much as possible from these properties:
* functional word subset (data stack)
* possibility of HOF (with/without closures) using byte codes?
* mostly pure compositional semantics, with a little syntax sugar

Construct a powerful metaprogramming system by starting with a pure language, and making the transition (projection) from pure/ephemeral -> non-pure/concrete explicit. In Purrr this is the decision to use macros or words to implement functionality. Is metaprogramming a form of message passing? Sending "reconfig" messages?

MERGE TODO:
- check how PLT classes solve namespace issues + use this for the macro namespace.
- fix macro quoting and nesting. maybe write a program as a list of macros instead of 1 macro now?
  it's isomorphic, but possible to manipulate.
- don't solve nesting in the source preprocessor: that is to remain regular, and the parser is to be explicit (compilation 'meta' stack). maybe this requires a real extensible descent parser?
- check how Factor implements closures
- make interaction words extensible
- check >> and 2/ simulation and partial evaluation words

Entry: notes remarks
Date: Fri Oct 12 17:43:50 CEST 2007

There are only 2 kinds of distinct primitive macros:
- partial evaluation macros (written in the pattern language)
- nested structures (written in CAT)

Composite (recursive) macros can combine both. This seems to be the way to explain how things are going + a way to clean up the code a bit and reduce the number of primitive nesting macros. Apparently, that's already accomplished.. on the other hand, these are entry points to a type inference system..

edit: just re-implemented >c and word>c as pattern matching words.

I am in trouble: I want to explain why I diverged from explicit Forth--style metaprogramming to move to compositional macro semantics with partial evaluation, and why at the same time i'm not going full length: instantiation is still limited to a subset of the full macro semantics. The thing is: having metaprogramming constructs in the language disguised as 'compatible semantics' is a good idea: explicit primitive macros can be reduced quite a lot. So what's the question??

Entry: debug bus
Date: Fri Oct 12 17:57:41 CEST 2007

- identical clients
- ad-hoc 1-wire instead of SPI/I2C/async/...
- host = master
- binary tree-like physical structure
- cables/connectors?
- multihost: just use a shared terminal

EDIT: maybe an ad-hoc network is best avoided at first.. let's get something simpler working before trying crazy stuff.

Entry: quoting macros
Date: Fri Oct 12 19:53:43 CEST 2007

this looks a bit like the final frontier. currently i can't write Forth in terms of a compositional language.
with the current pattern matching language, it would be trivial to do so if i had a representation of anonymous macros. basically i want:

  [ 1 + ] [ 1 - ] ifte

that's easy enough if '[' and ']' are part of a parser preprocessor. however, anything defined in terms of those, like 'if' 'else' 'then', needs to be implemented as parser macros also! this complicates things.. i see only 2 solutions:

- implement all nested words as parser words
- figure out a way to unify parsers and macros

what about this: allow the use of the syntax '[' and ']' as a macro quoter, but write words like 'ifte' in terms of Forth, instead of the other way around.

again: i'd like to have an explicit compilation/macro stack lying around, however, quoted macros are nice to have. this is non-orthogonal, but does it really matter? i don't know what to think about this..

Entry: Haskell
Date: Sat Oct 13 15:32:18 CEST 2007

I've been looking for an excuse to use Haskell for something non--trivial. The demodulator (and, unrelated, the iirblep filter) might be a good problem to tackle. OTOH, the real exercise is probably to write a prototype in Scheme, test it, and then write a specific compiler to translate that algorithm into C or Forth. So maybe best the demodulator in scheme (see filterproto.ss) and iirblep in Haskell?

Entry: the purrr slogan
Date: Sat Oct 13 18:45:18 CEST 2007

in order to explain what purrr actually is, it is best to set out these points:

* Purrr is a macro assembler with Forth syntax. It is implemented in a purely functional compositional macro language.
* Because of the similarity of the procedural Forth language and its metaprogramming language, most metaprogramming can be done by partial evaluation, blurring the distinction between the concrete procedural language and the ephemeral macro language. In a sense: PE is not just an optimization, but an *interface* to the metaprogramming language.
* The PE is implemented as greedy pattern matching macros (is this important?)
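the quoted-macro idea from the 'quoting macros' entry above can be sketched in plain functional terms. this is an illustrative python sketch, not the brood implementation: closures stand in for the [ ... ] quotations, and a list stands in for the data stack.

```python
# Quotations are first-class values on the stack; 'ifte' pops a flag and
# two quoted programs and runs one of them.

def ifte(stack):
    """( flag [true] [false] ifte ) -- run one of two quoted programs."""
    false_m = stack.pop()
    true_m  = stack.pop()
    flag    = stack.pop()
    (true_m if flag else false_m)(stack)

inc = lambda s: s.append(s.pop() + 1)   # plays the role of [ 1 + ]
dec = lambda s: s.append(s.pop() - 1)   # plays the role of [ 1 - ]

s = [5, True, inc, dec]                 # 5 <flag> [ 1 + ] [ 1 - ]
ifte(s)
print(s)                                # -> [6]
```

this is exactly the shape of the later pattern-macro version of ifte: two quoted values and a word, reduced by applying one of the quotations.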
Entry: removed from purrr.tex
Date: Sun Oct 14 16:52:10 CEST 2007

\section{The Big Picture}

Purrr can be used in its own right, but it is good to note that Purrr is part of the Brood system, which is an experiment to combine ideas from Forth, (PLT) Scheme and compositional functional languages into a single coherent language tower.

Purrr can be seen as an \emph{introspective boundary} in this language tower: the core of Purrr is to be the basis of this language tree, but the scope of Purrr is limited to a low--level language with Forth syntax and semantics and some meta--programming facilities disguised as Forth macros. For example, it is not possible to access the intermediate functional macro representation directly from within Purrr at this moment; this still requires extension of the compiler itself using the Scheme and CAT languages. This separation between the Purrr language and its implementation serves to keep the programmer interface to Purrr as simple as possible, while the details of the language tower are worked out to eventually lead to a more coherent whole. Purrr by itself is reasonably coherent, although it is somewhat limited in full reflective power by this language barrier. Eventually, Purrr should be just an interface (with Forth syntax) to the low level core of the compositional language tower in Brood.

Because Purrr is implemented only for the Microchip PIC18 architecture, there is no tested \emph{standard} machine layer: most functionality is fairly tied to the PIC18. I am confident however that refining the split of the current code base into a shared and a platform specific component is fairly straightforward. Due to the ease with which an impedance match can be created in a Forth like language, I am refraining from an actual specification of this standard layer until the next platform is introduced. By consequence, the border between the machine model and the library might shift a bit.
Purrr's macro system is the seed for a declarative functional language. Such a language would have no explicit macro/forth distinction as in Purrr.

Entry: new ideas from doc
Date: Sun Oct 14 16:52:21 CEST 2007

It looks like things are getting cleaner: by taking this partial evaluation thing seriously, CAT primitives can be largely eliminated. Just the words >m and m>, together with some stack juggling words like m-swap, are enough to implement the whole language. I just need to clean up a bit more so this idea can be sealed as a property: no primitives except for a stack!

For documentation purposes it might now even be a good idea to write most code in compiler.ss and pic18-compiler.ss in Purrr syntax, leaving only the true primitives in s-expr syntax. EDIT: that's a bad idea until the forth syntax can represent everything the s-expr syntax can.

The remaining cleanup brings me to the backtracking for/next implementation. With just quoted macros and a 'compile' that executes macros, this can be removed from the primitives.

Entry: writing lisp code in emacs
Date: Mon Oct 15 01:38:51 CEST 2007

watching the slime screencast:
* insert balanced paren: M-( with prefix arg

Entry: quoting macros
Date: Mon Oct 15 17:10:44 CEST 2007

Apparently, it was already implemented. I rewrote the for/next backtracking so now it's expressed as recursive macros, except for the part that tests the data structure constraint. I guess what i have now is that compositional language forth dialect. The only problem is that my Forth parser doesn't support it. I just need to write some macros to transform code that uses literal quoted macros into other constructs. Start with ifte:

;; Higher order macros.
(([qw a] [qw b] ifte) ((insert (list (macro: if 'a compile else 'b compile then))))) /me got big smile now :) Entry: practical stuff : starting a new project Date: Wed Oct 17 14:13:13 CEST 2007 I need to make my old 18F452 proto board work again, so this entry is a seed for a "getting started" doc: how to get from nothing to a working project. EDIT: i'm switching to a 18F2620, so doing it over again. Assumptions: * the project is part of (your branch of) the brood distribution * you're using darcs version control 1) Make a directory in brood/prj, and add it to darcs cd brood/prj mkdir proto darcs add proto 2) Copy the following files from another project, i.e. prj/CATkit, and add them to the darcs archive cd proto cp ../CATkit/init.ss . cp ../CATkit/monitor.f . darcs add * 3) Edit the init.ss file to reflect your project settings. skip steps 4-6 if you have a chip with a purrr bootloader 4) Edit monitor.f for your chip That file includes the support for the chip in the form of a statement: load p18f2620.f Look in the directory brood/pic18 to see if such a file exists. If it does, go to step 5). If not, you need to create one and generate a constants file from the header files provided by Microchip. I.e.: cd brood/pic18 ../bin/snarf-constants.pl \ < /usr/share/gputils/header/p18f2620.inc \ > p18f2620-const.f The .INC file can alternatively be found in the MPLAB distribution, in the MPASM directory. Now you need to create the setup file for the chip. Start from a chip that is similar: cp 18f1220.f p18f2620.f And edit the file to reflect the changes necessary for chip startup and serial port initialization. Don't forget to add the files to darcs, and send a patch! darcs add p18f2620*.f darcs record -m 'added p18f2620 configuration files' darcs send --to brood@zwizwa.be http://zwizwa.be/darcs/brood In case you can't send email from your host directly, replace the "--to brood@zwizwa.be" option with an "--output darcs.bundle" option and send the resulting darcs.bundle file. 
5) To compile the monitor in the interactive console type this: project prj/proto scrap 6) Make a backup copy of the monitor state. cp prj.ss monitor.ss And flash the microcontroller using the monitor.hex file. In case you're using the ICD2 together with piklab, the command line would be: piklab-prog -t usb -p icd2 --debug --firmware-dir \ -c program monitor.hex Here the firmware directory is the one containing the ICD2 firmware, which can be found in the Microchip MPLAB distribution. 7) The next time you start the console, go back to the project by typing: project prj/proto 8) Now you can start uploading forth files using commands like: ul file.f This will erase the previously uploaded file and replace it with the new one. If you want to upload multiple files, use the 'mark' word after upload to prevent deletion: ul file1.f mark ul file2.f Now the next 'ul' will erase file2.f before uploading a new file. To erase files manually, use the 'empty' word. --- LIVE MODE ONLY --- bin/purrr project prj/CATkit ping Entry: this is a simultaneous fix/todo log for the previous entry Date: Wed Oct 17 14:28:38 CEST 2007 - add default entries to dictionary on init - single baud rate spec? mine it from forth source, or the other way around.. - standard naming for the state file? - for chips that come with a bootloader: need to save the pristine file - fix state file rep so it is a standard s-expression tagged with 'project' - fix absolute path - add 'serial' tag to port - add a 'chip erase' or a fake one using "mark empty" the 3 different state files: - init.ss "most empty" state - monitor.ss state file of bootloader only - prj.ss current state these names are set as default, but can be overridden. ok. done. 'monitor.ss' is never written by the application, so ppl with just a monitor.ss file can revert to just that file (not implemented yet). Entry: operations on dictionaries Date: Wed Oct 17 15:34:26 CEST 2007 I'm trying to factor dictionary operations a bit. 
I already ran into 'collect' which takes a list of tagged pairs, and collects all occurrences for each unique pair. Doing this stuff purely functionally becomes difficult if performance is an issue: naive algorithms are quadratic. Hash tables could accelerate this. It seems overall that mutation is the thing to choose here.. Trying to write these hierarchical combination things i'm getting convinced that it's a bit of a mess.. (name . value) pairs are well defined, but hierarchical structures require polymorphy. To make the analogy with ordinary functions, basically you're dealing with a function that maps a value to a value OR another function.. Maybe the whole abstraction is broken? I need to think about this.. something profound seems to be hidden here. I'm going to hack around it for now. I think I get it.. and it's trivial again. A hierarchical hash table (HHT) is an implementation of a finite function which maps tag SEQUENCES to values. All operations on HHTs have the semantics of operations on finite functions. From this it follows that paths need to be created when a value is stored. It doesn't make sense to have to create the directory before storing a value. Otoh, storing a value at a tag sequence where one of the top nodes is not a hash is an error. Entry: PIC write protect Date: Wed Oct 17 20:20:23 CEST 2007 write protection works well and all, but i can't get it undone! i think it works without problem in mplab, but using the piklab programmer erasing the chip doesn't seem to work... what is needed is a full chip erase. it doesn't look like piklab is doing this correctly. on to installing mplab again.. OK i got it: memory that's protected requires a BLOCK ERASE, and such an operation needs Vdd > 4.5V Entry: macro nesting Date: Thu Oct 18 14:15:39 CEST 2007 time for the hairy problem: the syntax-rules -> syntax-case equivalent for macros. what do i need: there is only one decent way of doing this: use scheme metaprogramming. 
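A quick sketch of the HHT idea from the dictionary entry above: a nested dict where a store creates intermediate nodes on demand, but storing past a non-hash node is an error. This is hypothetical Python just to pin down the semantics, not the actual CAT/Scheme implementation:

```python
# hierarchical hash table (HHT): a finite function from tag
# sequences to values, sketched as nested dicts.

def hht_store(hht, path, value):
    """Store a value at a tag sequence, creating intermediate
    nodes on demand (no 'mkdir' needed before storing)."""
    *dirs, leaf = path
    node = hht
    for tag in dirs:
        child = node.setdefault(tag, {})
        if not isinstance(child, dict):
            # one of the top nodes is not a hash: error
            raise TypeError(f"{tag!r} is not a hash node")
        node = child
    node[leaf] = value

def hht_find(hht, path):
    """Look up a tag sequence; KeyError if unbound."""
    node = hht
    for tag in path:
        node = node[tag]
    return node

d = {}
hht_store(d, ('macro', 'dup'), '<macro dup>')
hht_store(d, ('macro', 'swap'), '<macro swap>')
```

All operations then inherit the semantics of operations on finite functions, which is the property argued for above.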
i like forth and all, but for numeric stuff, it's just easier to have variable names.. let's invent some new construct: \ load a scheme file implementing macros load-scheme filename.ss been hacking a bit, but i need a plan.. * s-expression files contain scheme expressions, not forth files with s-expression syntax. this effectively needs a scheme parser down the line, something that can convert the inline atoms to proper invocation. what about this: make it possible to load plt modules from forth. modules are stored as a single s-expression. hmm... again.. some questions: * how to store a module definition in the state file, so it can be instantiated? all macros in the '(macro) dict get evaluated using def-macro!, which does: (define (def-macro! def) (ns-set! `(macro ,(car def)) (rpn-compile (cdr def) 'macro:))) rpn-compile evaluates `(macro: ,def) so this won't work to store modules. it's probably best to represent the macros differently in the state file, so it's just scheme code, and then create a module: evaluator. that's the main problem: * how to store the source of things that generate macros, in this case a scheme module, so they can be re-instantiated from the state file. * do this without introducing ANY limit on what can be included in the scheme file. * without introducing yet another special case. in fact it's probably better to remove a special case driven by this requirement. let's go back to how macros are parsed. ok. they are included as a (def-macro: . ) expression in the atom stream. i guess this needs to change to include a (def-module: . ) form. why not change the def-macro: thing to a more general def-scheme: syntax? (def-macro: name . body) -> (def-scheme: (def-macro! name body)) or.. have def-macro! support modules. i guess that's the simplest way. ok.. changed the tag to "extend:" and changed the function that implements the extension to "extend!" i'm running into some bad behaviour.. 
need to formalize Entry: forth translation Date: Thu Oct 18 16:58:21 CEST 2007 Time to formalize the forth parsing. Some notes: - it's actually just a lexer: no nested structures are handled in this stage: all is passed to the forth macros, which use the macro stack to compile nested structures. - FILE: the first stage does only file -> stream conversion. this includes loading (flattening the file hierarchy) - PARSE: the second stage does 'lookahead' parsing: all non-compositional constructs get translated to compositional ones. this also includes macro definitions. The problem I run into is the FILE stage, which also needs to inline scheme files, but gets messed up by the forth parser. I just need to tag them differently. Entry: error reporting Date: Thu Oct 18 22:26:25 CEST 2007 using 'error' instead of 'raise' is a good idea since continuation marks are passed. the rep.ss struct marks CAT words, so something resembling a trace can be printed. the cosmetics can be done later, this is good enough for now. done. maybe want to convert some exceptions that are clear enough back to raise so they don't print a stack trace. (reserved-word time-out) Entry: hardware prototyping Date: Fri Oct 19 11:22:00 CEST 2007 TODO: - sine wave generation - debug network - connect a modulator and a demodulator the first one seems rather trivial to me, so let's do the network today. first thing is to give up on an ad-hoc bus: that's ok for uni-directional stuff, but bidir is a pain. so let's go for something standard. i got the samples in yesterday. got them running on the breadboard with intosc. if we can pull off the project on 8MHz, we can run on 2xAAA cells: the 18LF2620 needs only 2V, but it needs 4.2V @ 40MHz. i'm going to stick to intosc for now. next: I2C * 2 lines are used: RC3 = clock, RC4 = data, these need to be configured properly by the user. on the 18F2620 their only other function is digital IO. 
* registers: - SSPBUF = serial data I/O register - SSPADD = device address - SSPCON1, SSPCON2, SSPSTAT = control registers * errors: - write collision * firmware controlled master mode: seems it's just more work, so never mind.. Entry: TODO Date: Sat Oct 20 13:30:29 CEST 2007 - get I2C working between 2 18F2620 chips on breadboard at intosc, as fast as possible. - fix purrr.el : stupid broken indentation is annoying the hell out of me. clean up the file first, then automate the indentation rules generation etc.. Entry: message passing interface Date: Sat Oct 20 14:00:30 CEST 2007 Since I2C is a shared bus architecture, care needs to be taken to place operations in a sane highlevel framework. The interface i want is asynchronous message passing. Messages should either be bytes, or a sequence of bytes (in which case 'message' contains the size, and the a/f regs contain the message) message address i2c-send Let's suppose for now there is only a single process per machine, and build multiple process dispatch on top of single process. To do this bi--directionally, an event loop needs to poll for messages. Dispatching of highlevel messages (internal addresses) can be done as a layer on top of single message passing. So i need a send and a receive task, and make sure they don't collide * it's always possible to RECEIVE, so that should be the background task. this simply waits until a message arrives. * it's only possible to SEND if the bus is free, so a SEND might block. The problem is that a message might come in while waiting to send out a message. Therefore messages need to be queued. The moral of the story: A send can never block a process, only a receive can. So what is a task? It is a function that maps a single input message to zero or more output messages. The output can be zero in a meaningful way, because the task has internal state. So basically, a task is a closure, or an object. The driver routine can be a single task, since the hardware is half-duplex. 
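The task model above (a stateful function from one input message to zero or more output messages) can be sketched in a few lines. Hypothetical Python; the task and driver names are made up for illustration:

```python
# a task is a closure/object: it maps one input message to a
# list of zero or more output messages, and keeps internal state.

def make_echo_task():
    """Example task: acknowledge every other message.
    The internal counter is the task's private state."""
    count = 0
    def task(msg):
        nonlocal count
        count += 1
        if count % 2 == 0:
            return []          # zero outputs is meaningful
        return [("ack", msg)]  # one output message
    return task

def run(task, inbox):
    """Driver: feed queued messages to a single task; sends
    never block, they just append to the output queue."""
    outbox = []
    for msg in inbox:
        outbox.extend(task(msg))
    return outbox
```

Queuing the outputs instead of sending them directly is what makes "a send can never block a process, only a receive can" hold.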
See pic18/message.f for the implementation attempt. Something to think about: the ISR needs to be completely decoupled from the tasks that generate output messages. This is the whole point of buffering: if there is straight line code from RX interrupt -> computation task, the tasks that might run a long time will not be pre-empted. So: The RX ISR and the dispatch loop are distinct. what it looks like (yes i need to pick up hoare's book again..) \ Message buffering for a shared bus architecture. The topology looks \ like this: \ wire \ | \ | G \ A E v F \ wire ----> [ LRX ] ----> [ LTX ] ----> wire \ | ^ \ . . . . . . . | B . . . . . | D . . . . . . . \ v C | \ [ HRX ] ----> [ HTX ] \ \ Code above the dotted line runs with interrupts disabled, and \ pre--empts the code below the line. Communication between the two \ priority levels uses single reader - single writer buffers. The 6 \ different events are: \ \ A) PIC hardware interrupt \ B) RX buffer full condition \ C) TX buffer full condition (execute task which writes to buffer) \ D) wakeup lowlevel TX task from userspace \ E) wakeup lowlevel TX task from kernelspace \ F) PIC hardware send \ G) wakeup lowlevel TX task from bus idle event \ \ A task is an 'event converter'. The 4 different tasks are: \ \ LRX) convert interrupt (A) to rx buffer full (B) and tx wakeup (E) \ HRX) convert rx buffer full (B) to tx buffer full (C) \ HTX) convert tx buffer full (C) to tx wakeup (D) \ LTX) convert wakeup (data ready: D,E) to hardware send. \ \ The pre--emption point is A: this causes no problems for the \ low--priority task because of the decoupling provided by the receive \ buffer. The only point that needs special attention is the LTX task, \ which can be woken up by different events D, E and G, and care needs \ to be taken to properly serialize message handling. To do this, both \ D and E should invoke LTX with interrupts disabled. 
For E this is \ trivial: just call the LTX task, for G it is already ok since it's \ an isr, so D needs to explicitly disable interrupts. \ Entry: todo today Date: Sat Oct 20 15:52:49 CEST 2007 - write highlevel buffer code and try out with current serial before moving to I2C - write mini 'hierarchical time' tutorial for sheepsint - check mail just sent to technocore for details of the next couple of days. haha.. did none of them :) i suck at planning. what i did do is to write a synth tutorial that is an introduction to the hierarchical time thing + some explanation of a pattern language. what this doc is leading me to is the need for some kind of dynamic variable binding for code words: i already have 'hook.f' but something more general should be used. something which directly deals with variables. Entry: re-inventing C++ Date: Sat Oct 20 16:11:22 CEST 2007 i'm running into the need for polymorphy: i want to express generic algorithms in a sane way. because of the philosophy of purrr, this has to be done in a static way, with dynamic built on top of that later maybe. Oei this is going to lead to a whole lot of doubts about namespace management.. Let's concentrate on the practical issues first. EDIT: i'm going for name mangling.. see below. Entry: hierarchical time Date: Sat Oct 20 18:13:38 CEST 2007 One thinking error i made is: if a note word is SYNC followed by CHANGE, then you can't compose words that start at the same sync. as a result, SYNC needs to follow CHANGE, and the toplevel invocation needs to provide proper synchronization. Entry: the 'i' stack Date: Sat Oct 20 23:14:45 CEST 2007 what about this: i'm using an extra byte stack, and 'x' is a symbol that's useful in other contexts.. why not call the stack the 'i' stack, since it's already used as a loop index in for .. next loops? hmm.. great idea, but not really feasible without an automated identifier replace.. it's everywhere. 
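The "single reader - single writer buffers" between the two priority levels in the message buffering entry above deserve a sketch: because the writer only ever touches the write index and the reader only the read index, no locking is needed between the ISR side and the userspace side. A hypothetical Python model:

```python
# single reader - single writer buffer: the writer owns `w`, the
# reader owns `r`, so the two priority levels never race on the
# same variable. one slot is sacrificed to tell empty from full.

class SPSCBuffer:
    def __init__(self, size):
        self.buf = [None] * size
        self.r = 0  # owned by the reader (low priority)
        self.w = 0  # owned by the writer (ISR / high priority)

    def full(self):
        return (self.w + 1) % len(self.buf) == self.r

    def empty(self):
        return self.r == self.w

    def write(self, x):
        if self.full():
            return False  # caller decides: drop or retry
        self.buf[self.w] = x
        self.w = (self.w + 1) % len(self.buf)
        return True

    def read(self):
        if self.empty():
            return None
        x = self.buf[self.r]
        self.r = (self.r + 1) % len(self.buf)
        return x
```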
Entry: dynamic words Date: Sun Oct 21 00:50:39 CEST 2007 basically, i need to find words to properly handle execution tokens. there are 3 uses for a symbol related to dynamic code: * declare * invoke * change behaviour if it's a variable, invocation will be explicit: because i don't want the thing on the stack, an extra level of indirection should do it: 2variable BLA BLA invoke : changeit BLA -> ...... ; another possibility is to use a parser word, which i'm not so keen on using. what syntax is better depends on the usage: do invocations dominate, or do behaviour changes? i used the "->" word in ForthTV to set the display task: that's a single vector, invoked in only one place, but mutated in a lot of places. let's go for this approach. results in vector.f (hook.f is basically the same, left there for forthtv) Entry: todo Date: Sun Oct 21 16:15:32 CEST 2007 - hierarchical time - highlevel buffer code (requires some polymorphy) - tonight: fix purrr.el, clean up stuff in doc/ Entry: hierarchical time Date: Sun Oct 21 16:37:48 CEST 2007 so what's the problem? you want to have a class of words which "snap to" a timing grid, but you want to be able to call a collection of fine scale words from coarse scale words, without messing up the sync. the problem is that if you do: : foo 8 sync-tick bar bar ; : bar 7 sync-tick .... ; there are too many waits: "8 sync-tick" followed by "7 sync-tick" waits for the next 7-scale tick. somehow the sync word needs to know that the current time is already ok. either: * assume that the caller does the outer bounds, and have callees do only subdivision. this works, but is cumbersome. * find a way to see that we're running synchronized. how can a 0->1 transition in bit n be recognized in the bits < n? they're all 0. but that's not very helpful. damn i need coffee. the question to ask is: did we recently sync? this can be answered by copying the whole counter register to some place, and computing the diff. this also allows to trigger on edges. 
what about this: use some dynamic scoping for syncing. there is only one word 'sync' which will synchronize on clocks given the current time scale. for each time scale one needs: a word that can compute the current phase count. this needs a bit offset and the last sync point. bit offset might be easily stored as a bit pattern. global: the counter the last sync point \ compute time difference from last saved sync point, using mask to \ ignore fine scale. : sync-diff sync-counter @ sync-last @ - sync-mask @ and ; macro : sync-inphase? sync-diff nfdrop z? ; forth actually that doesn't solve anything.. it's quite easy to wait until a condition changes, but it's a lot less easy to determine whether the condition just happened. really, the only thing i see is to have patterns like this: _|_|_|_ which can be nested in larger scale patterns like _______|_______|_______|_______ _|_|_|_ _|_|_|_ _|_|_|_ _|_|_|_ there the first and last syncs are removed, and only the subdivision is synced to. it's then the responsibility of the caller to turn things on and off. it looks to me that this is a real pain to work with.. maybe i should just write a couple of words and see if it's actually sane to get something working.. one thing i thought about was to bind the current sync level to the word "|" : hihat [[ noise 10 for | next ]] ; where the [[ and ]] save and restore the synth config on the x stack. that's 7 bytes per level, which is a bit too much probably, so stick with manual saving/restoring. ok. there really is only one decent solution: escape continuations. in order to make proper use of synchronization, the caller needs to indicate how long a word is allowed to last. now, instead of that, think of there being only one voice at all time, which simply accepts events from a separate entity. so the synth looks like: [ CONTROL ] -> [ VIRTUAL SAMPLE PLAYER ] -> [ CORE SYNTH ] each virtual sample is a word that loops forever. this requires multitasking. 
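The sync-diff / sync-inphase? words above are just masked counter arithmetic; a quick model in hypothetical Python (counter width and mask values are made up) to check the idea:

```python
# model of sync-diff: the phase within a time scale is the masked
# difference between the free-running counter and the last sync point.

MASK8 = 0xFF  # 8-bit counter wraparound

def sync_diff(counter, last, scale_mask):
    """': sync-diff sync-counter @ sync-last @ - sync-mask @ and ;'
    modeled with 8-bit wraparound arithmetic."""
    return ((counter - last) & MASK8) & scale_mask

def in_phase(counter, last, scale_mask):
    """sync-inphase?: are we exactly on a tick of this scale?"""
    return sync_diff(counter, last, scale_mask) == 0
```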
Entry: generic functions Date: Tue Oct 23 16:09:09 CEST 2007 When trying to implement the buffer algorithm, i ran into the need for abstract objects: each buffer (queue) is going to have the following interface: read write read-ready? write-ready? (maybe.. in case buffer-full condition is used..) I have enough with a static object system: anything dynamic has to be handled explicitly on top of that using byte codes (route) or vectored words. So what is needed is simply a static (compile time) method dispatch. Should there be special syntax for messages, or do we just use a single flat namespace, with some words dedicated as messages? For example: 'read' could be such a message: always requiring a literal object. This seems simplest, let's try that first and change it if it is not appropriate. So: - WHERE is 'read' defined - HOW is 'read' defined Suppose we use a 'method' keyword for creating new methods. This probably trickles down to making the parser also generic. Let's use CLOS terminology. So what am I doing? I am providing a means for static namespace management so I can write generic algorithms (as macros). As of this point NO effort is made to implement dynamic generic algorithms: this should be built on top of the static version. My approach is going to be very direct: if more abstraction is needed i will fix it later. Currently multiple dispatch is not yet implemented. The interface should be: class BLA \ create a new object (a macro namespace) method FOO \ declare a new method object BLA method: FOO ... ; \ define a new method FOO of object BLA BLA FOO \ invoke method FOO for object BLA So, how to implement.. This was the easy part: ;; Dictionary lookup. (([qw tag] [qw dict] dict-find) ([qw (dict-find dict tag)])) Now the thing to do is to store the dictionary somewhere. This has to mesh with the macro definition part of purrr.. 
let's see (using s-expr macro definition syntax on the rhs) class BLA == (BLA '()) method FOO == (FOO 'FOO dict-find compile-message) Here 'compile-message' depends on what's exactly stored in the dictionary: macro objects or a mangled symbol. It's tempting to just go with symbol mangling: that way ordinary syntax can be used, and the interface to the rest of the language is really straightforward. Let's go for the simple symbol mangling, which doesn't even need dictionaries: A class is a collection of methods. Classes are identified by a symbol. A method is a macro which dispatches to another macro based on the symbol provided. class BLA == (BLA 'BLA) method FOO == (FOO 'FOO dispatch-method) : BLA.FOO ... ; FOO BLA == 'FOO 'BLA dispatch-method Entry: problem in macros defined in forth syntax: quote doesn't work properly Date: Tue Oct 23 17:43:02 CEST 2007 suppose i want this: : broem ' broem ; == (broem 'broem) how to do that? currently this just gives an infinite expansion because the quote is not recognised. why? because inside the 'definition' parser, the parsing words won't work.. this is probably a good thing, but quote does need to work.. let's separate parsing words from quote parsing. the lex stream should be made a bit more clear. FORTH -> [load flattener] -> [forth stuff: parsing words + definer environments] -> [quoting] -> SEXP Entry: locals for macros? Date: Tue Oct 23 20:08:46 CEST 2007 Once more than 50% of a macro's code is stack juggling words, something needs to be done about it. The macro below is a typical 'multi-access' pattern: an EXPANSION instead of a CONTRACTION. \ transfer bytes from one object to another macro : need not if exit then ; : m m-dup m> ; : transfer-once \ source dest -- swap >m >m ' ready? m msg need m-swap ' ready? m msg need m-swap ' read m msg m-swap ' write m msg m-swap m-drop m-drop ; forth What i really want is a locals syntax for macros that perform a lot of expansion: : transfer-once { src dst } ' ready? 
src msg need ' ready? dst msg need ' read src msg ' write dst msg ; The macro system already has a syntax for locals, so i just need to add this to the parser + choose the right semantics (code or data). EDIT: also, what about just . (dot) for the name binding operation? Entry: locals Date: Tue Oct 23 21:53:04 CEST 2007 Actually i did this before. I guess in brood-2 there's a syntax that takes words like this: (a b | a b +) Resembling Smalltalk's syntax for anonymous functions. i just saw Factor also uses the vertical bar. What i could do is to combine this with my special quoting syntax: (a | a) == execute (a | 'a) == identity Following the rationale that words are mostly functions, and constant functions are the exception. This kind of syntax took me a while to get used to, but it makes a lot of sense: it has led to a lot of simplified mixing of scheme and cat code. So what about combining that with destructuring? ((a . b) | 'a 'b +) Hmm.. Let's leave that as an extension. There's no reason not to however.. I think I need a dose of good old fashioned confidence to go for the quoted approach. What is more important: to stay true to the fact that symbols are functions, or to go for the lambda-calculus approach of using symbols as values + explicit application. Even though it looks strange, the issue is: do i stick with my previous realization that this is a good thing despite its strange look. So the choice is either (classic): (a b | a b +) == + (a | a execute) == execute this has the interesting property that permutations are easily expressed. or do i go with my approach (a b | 'a 'b +) == + (a | a) == execute What I could do is to use 2 forms of binding, and i guess that's what i did before. have | do the stuff above and || do the normal thing, or the other way around. (a : a) == execute (a | a) == id (a : 'a) == id using the ':' has the added benefit of reminding you of a "definition". 
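The symbol mangling scheme from the generic functions entries above is easy to model: a method invocation FOO on object BLA just resolves to a macro named BLA.FOO at compile time. A hypothetical Python model (the names and the error behaviour are made up; in Brood this lives in the macro namespace):

```python
# static method dispatch via name mangling: 'FOO BLA' resolves to
# the macro stored under 'BLA.FOO' in a flat namespace, at compile
# time. no dictionaries needed, just a naming convention.

MACROS = {}  # flat macro namespace

def define_method(class_sym, method_sym, body):
    """': BLA.FOO ... ;' stores the macro under a mangled name."""
    MACROS[f"{class_sym}.{method_sym}"] = body

def dispatch_method(method_sym, class_sym):
    """'FOO BLA == 'FOO 'BLA dispatch-method': look up the mangled
    name; an unknown combination is a compile-time error."""
    try:
        return MACROS[f"{class_sym}.{method_sym}"]
    except KeyError:
        raise NameError(f"no method {method_sym} for {class_sym}")

define_method("rx-buf", "read", "<code: read from rx-buf>")
define_method("tx-buf", "read", "<code: read from tx-buf>")
```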
Entry: lambda Date: Wed Oct 24 23:11:02 CEST 2007 Having had a night to sleep on it, i think it's going to be: (a b | a b) == id * Lambda is simply too important to gratuitously do differently. * Data parameters are used more than function parameters, which in turn are easily quoted. * It is compatible with current stack comment notation. Entry: implementing lambda Date: Thu Oct 25 13:20:50 CEST 2007 apparently i need to be careful where to introduce local variables in the syntax expansion. as long as there's a lambda expression enclosing a (xxx: a b c) macro, all lexical variables are identified properly, but in this case they are not: (define (bar? x) (eq? '\| (->datum x))) (define (represent-lambda c source) (let-values (((formals pure-source) (split-at-predicate bar? (syntax->list source)))) #`(make-word '#,((c-language-name c) c) (quote #,source) (lambda #,(if (null? formals) #'stack #`(#,@(reverse formals) . stack)) #,(fold (lambda (o e) (dispatch c o e)) #'stack pure-source))))) the 'dispatch' operation doesn't recognize lexical variables yet, because the enclosing lambda macro hasn't updated the symbols.. so lambda syntax should be introduced at a higher level. i need a shortcut, only for macros, and then work up the abstraction if necessary. the thing to extend is the 'macro:' form itself. hmm.. i'm making a bit of a mess of it.. the lexical scoping for the macros is a bit special, and is probably best handled using the pattern matching transformer stuff: the lexical variables in macros should be bound to literal arguments in the assembly buffer. (a b | a b +) -> (([qw a] [qw b] it) (insert (list (macro: 'a 'b +)))) which is really awkward in the current composition.. it's probably easiest to make a special purpose matching word as a straight lambda expression. something like: (match stack (((('qw b) ('qw a) . rasm) . rstack) (let ((a (literal a)) (b (literal b))) (apply (macro: a b +) (cons rasm rstack))))) Actually.. 
This is quite universal, except for WHERE to find the arguments.. Anyways, let's get on with it. (make-word 'macro-lex: '(a b \| a b +) (match-lambda* (((('qw b) ('qw a) . rasm) . rstack) (let ((a (macro: 'a)) (b (macro: 'b))) (apply (macro: a b +) (cons rasm rstack)))))) The first macro using lexical variables in synth-soungen.f macro : sync bit | \ -- begin yield bit tickbit low? until begin yield bit tickbit high? until ; forth Subtle ay :) Entry: theory Date: Thu Oct 25 21:07:55 CEST 2007 in order to finish brood.tex, it looks to me that type theory is not really the most important thing to brush up on: partial evaluation is. there's a lot of stuff here: http://partial-eval.org/techniques.html i need to give it some proper attention. if only to relate my intuitions to things people have spent some thought on. Entry: multiple exit points Date: Thu Oct 25 21:48:31 CEST 2007 instead of writing macros containing 'exit', which are really a loaded gun, it might be better to write a proper while abstraction that uses multiple conditions. unfortunately, an 'and' is not very easy to optimize.. macro : need not if exit then ; : m m-dup m> ; : transfer; src dst | \ -- begin ' ready? src msg need ' ready? dst msg need ' read src msg ' write dst msg again ; forth why is this complicated: because i don't want to use 'and'. what i want is a word 'break' which breaks from a loop on a condition. maybe 'transfer;' is good enough: since i already have arbitrary WORD exit points, i can use this to get any control structure exit point: it also prevents juggling of the control stack (macro stack). Entry: move Date: Thu Oct 25 22:19:09 CEST 2007 for this i need 2 pointer registers. thing is: i'd like to use the x stack's register to do this a bit efficiently, but then i can't use for .. next ! implementation detail anyway.. Entry: buffers Date: Fri Oct 26 13:04:40 CEST 2007 next on are data buffers. 
i have some code that uses 14 byte buffers together with some dirty trick of storing read/write pointers in one byte for easy modulo addressing. i could dig that up again? what is a buffer? - 2 pointers: R/W - base address of memory region (statically known) - size (statically known) suppose i represent it as 2 literal values: rw-var offset see buffer.f for draft (committing now) but.. isn't it wise to write some code for generic 2^n buffers? where a buffer consists of 2 variables, and a mask indicating its size. ok, did that but it leads to more verbose code. a different strategy could be to store the read pointer or difference at the point where W points, this saves a cell that's normally used to distinguish between empty and full. hack for later.. anyways, i stick with the current: it's probably good enough. i need to move on. nibble-buffer.f tested. Entry: 0= hack Date: Fri Oct 26 13:29:59 CEST 2007 i'd like to figure out a way to efficiently implement the 0= word, which turns a number into a condition. the problem is that 'drop' messes up the zero flag, so i used a 2-instruction movff trick before.. but using drop should be possible when using the carry flag. hmm.. nfdrop is only 2 slots.. i don't think i can do better really. Entry: I2C comm Date: Fri Oct 26 16:16:20 CEST 2007 how to get this going? the typical 'debug the debugger' problem: I2C is going to be used for the debugging network, but until that works.. master/slave: to preserve symmetry, it might be wise to use a dedicated single master node which runs debug code, so all the kriket nodes can be identical (slaves). ideally, all cricket chips are free from ICD2 and SERIAL ports, and have only power, ground, and I2C clock and data. send/receive: let's stick with the ordinary monitor protocol over I2C. the thing to do is to make a hub. 
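The generic 2^n buffer from the buffers entry above relies on masked modulo addressing. One nice property of power-of-two sizes is that the pointers can run free and the fill count is just a masked pointer difference, which avoids the cell normally sacrificed to distinguish empty from full. A hypothetical Python model (byte-sized free-running pointers assumed, so this works for sizes up to 128):

```python
# 2^n circular buffer with free-running one-byte read/write
# pointers: indexing uses 'ptr & mask', and the fill count is the
# masked pointer difference, so empty (count == 0) and full
# (count == size) are distinguishable without wasting a cell.

class PowBuffer:
    def __init__(self, size_log2):
        self.size = 1 << size_log2
        self.mask = self.size - 1
        self.mem = [0] * self.size
        self.r = 0
        self.w = 0  # both wrap at 256, like one-byte pointers

    def count(self):
        return (self.w - self.r) & 0xFF

    def write(self, x):
        assert self.count() < self.size, "buffer full"
        self.mem[self.w & self.mask] = x
        self.w = (self.w + 1) & 0xFF

    def read(self):
        assert self.count() > 0, "buffer empty"
        x = self.mem[self.r & self.mask]
        self.r = (self.r + 1) & 0xFF
        return x
```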
Entry: SD-dac Date: Fri Oct 26 22:07:00 CEST 2007 A Sigma-Delta Modulator (SDM) can be thought of as an error-accumulation DC generator: given a constant input, it will generate the correct average DC output, with a quantization error noise spectrum that is high--pass. A first-order SDM is an extremely simple circuit: it consists of an accumulator with carry flag output: at each output instance, the current output value is added to the accumulator, and the resulting carry bit is taken as the binary output and discarded from the accumulator. I had this idea of running an 'inverse interrupt' machine: instead of losing time in the ISR, just run an infinite loop, but allow at each instance one primitive to run, which needs to spend an exact amount of cycles. Probably not worth the hassle, but could be interesting for a really tight budget. Anyways, this could be an alternative to PWM for kriket sound generation. It should in theory give better quality, but probably that also needs a deeper accu. With fast interrupt it's only 3 instructions: movf OUTPUT, 0, 0 addwf ACCU, 1, 0 rlcf PORTLAT, 1, 0 assuming it's bit number 0 in the port, and the rest of the bits we don't care about (i.e. are inputs) the problem here of course is that it's not just output that counts: the output also needs to be computed. Looks like it's not really worth it. Best to use PWM interrupt with plugin generator code. At 2MHz, to get the carrier above audible frequencies would put the divider at 64, and the carrier at 31.25 kHz. (The interesting thing here is that it could also be used for bit-bang midi output at the same time :) To get this going: best to add a small modification to sheepsint to switch it into PWM mode. Entry: FM sheep Date: Fri Oct 26 23:16:24 CEST 2007 ok.. let's see what's necessary to make an FM (PM) synth in the style of Yamaha oldies. using a proper synchronous fixed time sharing approach a lot is possible: 1. 31.2 kHz 1 x SDM output 2. 7.8 kHz 4 x 8 bit synth voices 3. 
   9.7 kHz  32 x envelopes

for all this i have 64 instructions. one envelope per operator is more than enough. i've been checking out the code for table lookup, and it can be brought down to 4 instructions:

    movf  PHASE
    movwf TBLPTRL
    tblrd
    movf  TABLAT

but i doubt 8 bit phase resolution will be enough..

Entry: hub board
Date: Sat Oct 27 15:07:51 CEST 2007

make a hub board, first for serial, then for I2C. the idea is that a hub board can be placed in between a normal serial board and a PC host: its only goal is to provide control over the serial slaves. the condition is that all slaves have identical code, which means the host can indeed switch between different slaves without problems:

    [ PC ] --- [ HUB ] === [ S1 ] === [ S2 ] === ...

requirements:

* the interface that implements this should be transparent: there should be no need for calling code on the hub directly. (except for debugging the hub itself, where the host just has the hub's dictionary). to do this i suggest we use the next slot of 16 interpreter commands to pass monitor commands through to the hub.

again: if i manage to get things working this way (async serial hub) i have no need for I2C to do networking.. in fact, in order to get I2C working i'd better build a proper debug network first! and more: if i get this serial passthrough to work, moving to a synchronous 1-wire approach should be no problem. ok, i have 50 solutions now..

TODO:
- make it work for serial = standard
- use serial to bootstrap 1-wire
- MAYBE use I2C after that, probably too complicated

Entry: 1-wire revisited
Date: Sat Oct 27 15:47:27 CEST 2007

yes, why not.. it's a cheap hack but might be worth it. and i already have provisions for it on the CATkit board, so the solution should be re-usable. (CATkit: COMM is RA4). let's stick to the ordinary monitor protocol with RPC semantics: host asks a question, client responds / acknowledges. this is already half-duplex, so it fits nicely in a shared bus context.
a simple start bit, 8 data bits, stop bit could be used for comm, using the following waveform:

    1 1 X 0

with X the 'shared bus' point, we can have a bidirectional link:

* there's always power in a cycle (at least 50%)
* the bus is high when idle
* there's a sync point 0->1 for slave sync
* send/receive is software controlled

the protocol could be something like:

* master just sends (start bit, 8 data bits, stop bit)

for the CATkit board, the sync could replace the fixed TMR2. let's try the following:

* fix CATkit's no-serial cable detection. (OK)
* drive a CATkit board with a square pulse
* use TMR2 to perform a timed read or write

next: config RA4

* open drain output (needs external pullup - master side?)
* does have protection diodes to both sides

so, in theory it should be possible to feed the chip through the protection diodes.. but as far as i can see, it doesn't boot properly. after adding a diode RA4 -> VDD it boots on DC. i don't understand..

so, on to the controller. from the host side, everything is synchronous, so timing should not be an issue. driving a couple of busses in parallel poses no extra problems.

hardware: the dallas 1-wire bus apparently drives the targets through a resistor, instead of a transistor. i was wondering how to prevent hazards on the bus, and this is probably it: brief inspection shows that a faulty client can easily bring down a network by shorting during the charge phase. a resistor also limits the charging current. so i guess resistors are good. (i wonder if the weak pull-ups can perform this task.. probably better not.) the pic has quite a large maximum current sink (25 mA), which determines the minimal size of the pullup resistor: at 5V the minimum is R = 5V / 25mA = 200 ohms.

simplifications WRT dallas 1-wire:

* one slave per wire: no elaborate synchronization protocol necessary: all flow control is done in software using the purrr protocol
  (host initiates a transfer by sending a couple of bytes and waits for a reply)
* multiple slaves: they need to be addressed, so some protocol is necessary. i.e. addr = 0: broadcast, no reply. otherwise: address followed by a couple of data bytes.
* can use a 4-phase regime 10XY, where the receiver samples in between X and Y.
* in case no comm is needed, the master leaves the line high: no unnecessary drain from pulling the resistor low.

using RA4 on the 18F1220. and for sending? can probably use an 18F1220 as a hub too, if it uses just one output. which output to use? only RA4. maybe one bus is really enough? this way i could use simple RCA splitter cables to build a network.

ok, i thought i needed an open drain output. apparently not: just switching between 0 and Z is enough.

Entry: CATkit/krikit debug board
Date: Sat Oct 27 21:39:41 CEST 2007

* in debug mode: one bidirectional power/clock/data line per slave (raw byte protocol: no address). this makes it a drop-in replacement for the normal async serial i/o for the monitor. in 'midi' mode the port can easily run unidirectional shared. bidirectional shared is a software problem that can be solved later.

* using the 18f2620 for the driver. the package is small enough to be practical. it can run without a xtal at 8MHz and has on-board i2c for more elaborate networking later on. it has enough pins to add some status output (i.e. an RGB led).

* port B is used for communication. RB4:RB7 have interrupt-on-change, so they could be used for more elaborate slave comm later.

* running CATkit on a full line through 1k gives a 2V drop = 2mA: that sounds about right. since this is low-bandwidth debug comm, it should be possible to just leave the line idle = high. that means no clock is coming in. so what about this:
  - run CATkit TMR2 at a higher rate, i.e. 31.25 kHz. this would give
    * a decent timebase for SD sample tests
    * a 7.9 kHz bitrate for debug comm
    * the ability to send MIDI data from the CATkit board

I wrote the code for the network debugger.
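the 4-phase bit cell mentioned above can be sketched in Python. this is a hypothetical model of the framing only, assuming a cell of two fixed sync phases (giving the guaranteed edge) followed by the data bit twice, with the receiver sampling in between the two data phases:

```python
def encode_bit(b):
    """One 4-phase cell: fixed 0,1 sync edge, then the data bit twice."""
    return [0, 1, b, b]

def encode_byte(byte):
    """LSB-first, one cell per bit."""
    out = []
    for i in range(8):
        out += encode_bit((byte >> i) & 1)
    return out

def decode(phases):
    """Resync on the fixed edge at the start of each cell, sample the data phases."""
    bits = []
    for i in range(0, len(phases), 4):
        assert phases[i:i + 2] == [0, 1]   # sync phases
        bits.append(phases[i + 2])         # X == Y, sample either
    return bits
```

with this cell layout, a bit 0 is the phase pattern 0100 and a bit 1 is 0111, and an alternating byte like #x55 comes out as those two cells repeating.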
The 4-phase modulator and the receiver/transmitter framing words are done and tested. The remaining thing is how to switch between receiver and transmitter. Probably something like this:

- start with receiver
- receiver goes idle -> check tx buffer -> tx / rx gets data -> start rx state machine
- transmitter stops -> check tx buffer -> tx / rx data -> tx state machine

these can be folded into one loop, and activated depending on an rx/tx flag.

Entry: sheepsint urgent todo
Date: Sat Oct 27 23:06:44 CEST 2007

LIST MOVED DOWN

Entry: nasty sub bug?
Date: Sun Oct 28 14:32:32 CET 2007

the following code leads to incorrect asm:

    123 @ 124 @ - dup

    movf  123, 0, 0
    subwf 124, 0, 0

that should be subfw ?? the problem is in "123 @ -". took the - and -- words out of the 'binary' meta patterns and fixed.

Entry: rtx
Date: Sun Oct 28 16:41:44 CET 2007

looks like it's +- working, at least the transmitter. one little problem still: if a client syncs to a 0->1 transition, what happens when it picks up in the middle of a data stream? suppose #x55, which is just a bunch of:

    0100 0111 ...

syncing to the right frame is not a problem: per bit there is only one 0->1 transition to sync to. so the problem is that each client should start with an idle line. it's the same problem as async serial.

so.. receiver for sheep. let's stick with an RX state machine only. the deal is this:

- interrupt on change: detect 0 -> 1, reset TMR2 + RX state machine

all logic from hub.f can be re-used, except for the top sequencer, which should be

    route ; ; ; rx-bit ;

Entry: comm on catkit
Date: Sun Oct 28 17:38:26 CET 2007

there are 2 ports left: RA4 RA6. both are not very interesting: no interrupt-on-change or interrupt facility. interrupt pins that could be reused are:

    RB5 (INT1/TX)
    RB2 (INT2)      not without cutting traces or removing R8
    RB0 (INT0)      not without removing the last pot
    RB5-RB7 (KBI)   multiplexed with switches
    RB4 (KBI0/RX)

can it be done using polling only? i.e. manually synchronize on each start bit or something.
need to think a bit more, but it looks like manual polling is going to be problematic. the easiest thing is really RB2/INT2: it's a proper interrupt, and its functionality is not used atm. maybe i should leave catkit out of it and try to get it to work on krikit first.. catkit needs an update anyway, and this could be a nice addition.

reminder:
- ditch AUDIO- for INT2
- external rectifier diode
- serial RX 100k pull-down
- fix pot distance
- fix switch distances
- room for LED

Entry: Manchester
Date: Sun Oct 28 19:30:21 CET 2007

i'm wondering whether it's not simpler to use Manchester code (BPSK with square waves). symbols are 01 and 10. once synchronized, the signal can be locked by allowing resync on the fixed transition at half symbol. syncing can be done on an idle line, all ones (10). catch: for uni-directional with sender = master this works fine, but bidirectional is problematic.

Entry: eliminating the pullup resistor
Date: Sun Oct 28 19:40:41 CET 2007

in case there's one slave only, the pullup resistor can be eliminated by using a current-limiting resistor to prevent a short-circuit on collision.

Entry: slave on krikit
Date: Sun Oct 28 19:59:46 CET 2007

got one spewing 123, now need another one listening. the slave uses RB0 (INT0). apparently i can't pull the line all the way down.. probably on-resistance (i'm pulling down 100 ohm..)

sequencing is an interplay between INT0 and TMR2:

    INT0 -> reset timer phase + call 'rtx-next'
    TMR2 -> 'rtx-next'

the other one got 123, and some shifted version out of sync. to get better sync during debugging, bytes could be interleaved with a 10 bit idle preamble. this would guarantee resynchronization after the first faulty reception.

Entry: strong 1
Date: Mon Oct 29 05:17:06 CET 2007

       Vdd
        |
       [Ru]        /--[Rl]--o SLAVE I/O
        |          |
    MASTER o---o---o--|>|--o SLAVE Vdd
                            |
                           === C
                            |
                           GND

    0 1 2 3
    0 1 X X   phase

phase 1 is 'strong drive' directly from Vdd, not through a pullup resistor. this avoids strong sink currents and a large voltage drop.
during phases 0 and 1, MASTER is OUT; also during 2 and 3 if it's sending. when receiving, master is Z, so Ru pulls up the line. a slave can still mess up by pulling the line high, but a short circuit is prevented by Rl.

Entry: intermezzo: macro vs. return stack
Date: Mon Oct 29 15:50:14 CET 2007

actually, this is quite simple. if i change the terminology a bit, compilation of local labels for jumps and run-time control flow using execute and exit could be unified somehow.

Entry: about named macro arguments
Date: Mon Oct 29 19:46:11 CET 2007

maybe it's better to stick to prefix syntax so as not to gratuitously move away from forth syntax. after all,

    : 2@ var | var @ var 1 + @ ;

is not too much different from

    : 2@ | var | var @ var 1 + @ ;

it will also simplify the implementation.

Entry: urgent stuff
Date: Mon Oct 29 20:52:04 CET 2007

time flies. i need to get the debug network running today. it should not be more than patching the interpreter to the rtx: the hub should just be a loop that polls the serial port, and possibly executes some special purpose commands. the slave needs a new dispatch table connecting to rx, tx from the slave rtx.

so todo:
- get this debug patch-through to work: nothing fancy, just repeat
- fix some of the urgent problems

Entry: hub interface
Date: Mon Oct 29 23:25:15 CET 2007

i'd like to do this while changing as little as possible: connect to a hub just like any other project, but there should be a way to execute its application without needing knowledge about the dictionary of the hub device. let's change interpreter.f to

    \ token --
    : interpret
        #x10 min route
            ; receive ; transmit ; jsr ;
            lda ; ldf ; ack ; reset
            n@a+ ; n@f+ ; n!a+ ; n!f+ ;
            chkblk ; preply ; ferase ; fprog ;
            e-interpret ;

the last word should lead to a reset if it's not implemented, or to the interpretation of an extra set of byte codes. in any case, it is required to be filled in by specific monitor code.
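the `#x10 min route` idiom clamps the token and jumps through a table, so any token at or past 16 lands on the last entry. a rough Python model of that dispatch (the handlers themselves are stand-ins, not real monitor code):

```python
def make_interpreter(handlers, extension):
    """handlers: 16 entries for tokens 0..15; anything >= 16 hits the extension."""
    table = handlers + [extension]        # slot 16 = e-interpret
    def interpret(token):
        return table[min(token, 0x10)]()  # '#x10 min route'
    return interpret
```

this is why e-interpret is the natural hook for an extra set of byte codes: every out-of-range token funnels into it.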
now: if there's no extension implemented, should invalid commands be ignored or not? there's no proper way to react to invalid commands, since they can quote the following bytes, leading to a completely non-interpretable state... just resetting is probably good enough.

another problem: if the hub just passes through, how to control it after switching to passthrough mode? serial break is an option. need to figure out how to send that in mzscheme though.. there should be a more elegant solution, but this requires either the traffic to be quoted, or the new interpreter to actually understand (parse) the traffic to see what comes through. the latter is not so easy because of quoted bytes. a better solution is to completely override the boot interpreter. that way all traffic can be properly redirected.

i guess i'm making it too difficult. the real problem is: this hub thingy doesn't fit in my debug or run view: the cable can't determine whether a boot interpreter should be started or not. let's start there.

next: a name for the protocol.. i'm going for E2: it's the binary representation of 0 and 1:

    0100 0111 = E2, with lsb first

Entry: the big questions
Date: Tue Oct 30 00:25:54 CET 2007

probably the huge shot of caffeine i got today, but i'm in delusion / big-idea mode again.. i run into a lot of bootstrap problems. today's bootstrap problem is debugging the debugger. somehow i think bootstrapping is really the only significant problem.. it's the "getting there" that's important practically, not so much the staying: that should be obvious. i find it a fascinating subject. i need to read more about it:

* need to play with piumarta's cola stuff: objects and lisp as yin and yang (though lisp has its own yin and yang: eval and apply. i wonder if this is the case for objects? probably something with v-table lookup).
* need to read about 3-lisp and reflective towers
* i'm not so sure if writing a proper language bootstrap is valuable, but somehow it looks like yes.
brood is a bootstrap exercise really. i'd like to end up, not necessarily at scheme, but at a dynamic language to run on small machines.. maybe cola is the way to proceed? another thing i need to read up on is partial evaluation and C code parsing and refactoring, but that's secondary really.. maybe bootstrap is indeed the only real problem.

Entry: parsing again
Date: Tue Oct 30 04:07:36 CET 2007

* added the packrat parser code from Tony Garnock-Jones. this should "end all _real_ parser woes" when i switch to a different syntax frontend.

* for the forth regular parser, i just need to add proper syntax for a regular syntax stream pattern matcher: i have no real recursive parser need for the forth (really out of principle: to stick to the roots and keep the language simple to understand. there's something to say for a simply parsed language when teaching!)

* the only reason i'm using syntax streams is to be able to recover source location information and to use syntax-case. the latter is probably not the right abstraction. what i want to say is something like:

    (parser-pattern (macro forth) ((macro forth) ----))

where the '' is bound to a syntax stream. from portable-packrat.scm:

    (packrat-parser expr
      (expr ((a <- mulexp '+ b <- mulexp) (+ a b))
            ((a <- mulexp) a))
      (mulexp ((a <- simple '* b <- simple) (* a b))
              ((a <- simple) a))
      (simple ((a <- 'num) a)
              (('oparen a <- expr 'cparen) a)))

i read on the wikipedia page that a packrat parser is necessarily greedy. i'm not sure in what sense..

Entry: finite fields
Date: Tue Oct 30 07:32:21 CET 2007

http://www.lshift.net/blog/2006/11/29/gf232-5

in 8 bit, the biggest prime is 2^8-5 = 251. i'm not sure what this is useful for though.. some error checking / correcting stuff? the article talks about "a" finite field, as if it mostly doesn't matter which.. ah.. the wikipedia article on coding theory mentions subspaces of vector spaces over finite fields. a naive way would be to use i.e. a 3-space in the 4-space over GF(251).
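arithmetic in GF(251) is just integer arithmetic mod 251; since 251 is prime, every nonzero element has a multiplicative inverse. a quick sketch (inverse via Fermat's little theorem):

```python
P = 251   # largest prime that fits in 8 bits: 2^8 - 5

def add(a, b):
    return (a + b) % P

def mul(a, b):
    return (a * b) % P

def inv(a):
    # Fermat: a^(P-1) = 1 (mod P), so a^(P-2) is the multiplicative inverse
    assert a % P != 0
    return pow(a, P - 2, P)
```

the cost of using 251 instead of 256 is that 5 of the 256 byte values are unused, which is the usual trade-off these prime-field coding tricks make.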
Entry: fixing the assembler
Date: Tue Oct 30 16:35:18 CET 2007

made the dictionary into a parameter to move code from internal -> external definitions. now i need to abstract away the control flow of the assembler: eliminate assemble-next. let's see, the types of control used are:

* comma  -> expand to a list of instructions
* comma0 -> same, without updating the instruction pointer
* register -> dictionary mutation

primitive ones:

* again (retry) -> retry assembly with updated (dictionary?) state

properties:

* assembling an instruction is 1 -> 0, 1 or more

looks like the major difficulty is in assembler operations that recurse.. currently it's handled by just pushing an opcode into the input buffer and calling next. i'm going to make this recursion explicit. how should non-resolved symbols be handled? just returning the instruction seems best. i do need to fix absolute/relative addressing. what about leaving restart to the sequencer? the idea is to provide some expansion hook for plug-in assemblers (asm-find).

so the point where i need to make some changes is the way 'here' is used. chicken and egg:

* can't determine 'here' until all previous instructions are assembled.
* can't assemble an instruction until it is known how far away a forward reference is.

what about trying to solve this with backtracking? is that overkill? maybe backtracking with memoization? maybe assembly itself is cheaper than memoization :) maybe every instruction should be compiled to a thunk that takes just the absolute address? hmm.. need some time to sort it out.. it should be possible to write this in a lazy way..

roadmap:
- get it to work like it did before
- change the implementation of 'here' to a parameter
- create a graph data structure from 'label'
- figure out the control flow for some backtracking-like thing
- write some graph opti (i.e. jump chaining)

another remark: having 'labels' as pseudo instructions is bad. they should really be true graph elements: pointers to instructions. hmm..
i need a break. lap.. goes wrong 'somewhere' :) maybe i should fix the 'here' thing first before trying to get it to run, since it's somehow messed up. i need to start over:

Entry: here kitty
Date: Tue Oct 30 22:54:39 CET 2007

what about 'here'? now that it's separated out a bit more, it's easy to see it is a bit of a mess: i'm using org-push and org-pop so i can't just eliminate it.. i need to separate these concerns:

* ORG / ORG-PUSH / ORG-POP = telling where things go. it's easier to cut out the intermediate part and handle it separately. a bit of a crazy way of doing things..
* 'here' = self-reference

i need a proper way of expressing all these dependencies: once the 'org' stuff is dealt with, and the absolute/relative problem is solved, the remaining problem is one of relaxation.

Entry: relaxation problem
Date: Wed Oct 31 00:45:26 CET 2007

some choices need to be made before enough information is present, but instead of completely starting over (backtracking), the form of the problem is such that the intermediate solution can be updated. as long as a complete dependency graph is present, the solution is quite trivial: just recurse over all dependencies.

some hints for finding the right data structure:

1. instructions that do not reference code locations, either as jumps or just as literal words, are irrelevant and can be ignored.
2. labels point _in between_ instructions
3. keep the cause of events abstract: any instruction that has a reference can grow.
4. this is related to functional reactive programming

let's stick to the idea of instruction cells: each cell contains a single symbolic opcode with arbitrary length. thinking of this as cells sending messages to each other, there are 2 kinds of messages:

- tell the next cell it has moved
- tell cells that depend on a label that they need to update

looks ok at first, but for non-contiguous code that doesn't have a non-decreasing code distance between several nodes, this might not terminate..
if i make sure code never shrinks, this should be ok though.. hmm... i need to read a bit about this. i guess in general it's called "linker relaxation". the most important notes:

* downward updates from a size change can be eliminated
* to ensure termination, only expand/contract in one direction: that way it will at least stop at the case where all references are expanded.
* if a size change happens as a consequence of an update

Entry: a more traditional approach
Date: Wed Oct 31 03:43:54 CET 2007

http://compilers.iecc.com/comparch/article/07-01-038

    There is a type of assembler that does exactly the same thing on
    every assembly pass through the sourcecode. Pass 1 outputs to
    dev>nul and is full of phase errors, pass 2 has eliminated most
    (or all) phase errors (output to nowhere) and pass 3 usually does
    the job in 99%+ cases whereupon code is output. On each pass
    through the sourcecode (or p-code in your case) you check for
    branch out of range, then substitute a long branch and add 1 to
    the program counter, causing all following code to be assembled
    forward+1, then make another pass and do the same thing again
    until no more branch out of range and phase errors are found due
    to mismatched branch-target addresses.

that doesn't require my esoteric approach and seems a lot simpler really: just keep it running until the addresses stop changing. so just do as before, but:

* keep a phase error log
* use a generic branch instruction which gives short or long branches
* every pass is completely new
* split 'old' and 'new' labels, make new labels mutable?
* put 'here' in a dynamic variable
* make a quick scan for labels to find the undefined ones

NEXT: prepare the assembly code so multiple clean passes are possible:

- get rid of 'mark' for example.
- put 'here' in a parameter
- remove all dictionary manipulations
- find a way to handle var and eeprom.. maybe a separate pass/filter?

the goal is clear enough.. just some disentangling to do first..
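the scheme from the quoted article (re-assemble from scratch until label addresses stop changing, widening out-of-range branches as you go) can be sketched in Python. this is a toy model, not the brood assembler: instructions are ('op',) with size 1, or ('br', label) which takes 1 word when the target is in short range and 2 otherwise:

```python
SHORT_RANGE = 64   # toy branch reach, in words

def assemble(program):
    """program: list of ('label', name) | ('op',) | ('br', name).
    Run full passes until the label addresses reach a fixed point."""
    labels = {}
    while True:
        here = 0
        new_labels = {}
        for ins in program:
            if ins[0] == 'label':
                new_labels[ins[1]] = here           # labels sit between instructions
            elif ins[0] == 'br':
                target = labels.get(ins[1], here)   # unknown yet: assume short
                here += 1 if abs(target - here) < SHORT_RANGE else 2
            else:
                here += 1
        if new_labels == labels:                    # no phase errors left: done
            return labels, here
        labels = new_labels                         # addresses moved: another pass
```

as the relaxation entry notes, termination relies on sizes effectively only growing; in pathological layouts a smarter one-directional rule would be needed.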
a different approach:

* use the previous approach, but keep the dictionary after every pass (clean it in between)
* keep a log of the name registrations to determine phase errors.

Entry: comparators and square waves
Date: Wed Oct 31 05:00:41 CET 2007

before trying anything with sine waves, it makes sense to at least have a go at pure binary signals spanning the entire bandwidth. i'm curious how far i can go in completely eliminating amplification and using only a comparator. i do lose all signal presence detection capability, and amplify noise tremendously, but this does transform everything into a software / filtering problem. i guess with some good codes i can actually get things through..

Entry: shopping for opamps
Date: Wed Oct 31 19:59:42 CET 2007

@ maxim for low voltage rail-to-rail. i can get as low as 2.7V for:

    MAX4167  5MHz,   1.3mA   (DUAL)
    MAX494   0.5MHz, 0.15mA  (QUAD)

Entry: name spaces and objects
Date: Wed Oct 31 23:32:41 CET 2007

i'm trying to figure out how to make the name mangling work well enough to create a static metaprogramming interface which supports generic programming at the macro level:

* write algorithms in macro form
* instantiate them statically as many times as you need

what i'm really missing is higher level macros. with those, i can build anything i want really.. so why is it impossible to have those? i probably need to give up on forth syntax.. (let me finish my verbose buffer code before i try to answer..) ok. i don't know really.
let's first try to get things like this out of the way:

    : bbf.tx-empty>z  bbf.tx buffer.empty>z ;
    : bbf.rx-empty>z  bbf.rx buffer.empty>z ;
    : bbf.rx-room>z   bbf.rx buffer.room>z ;
    : bbf.tx-room>z   bbf.tx buffer.room>z ;
    : bbf.>tx         bbf.tx buffer.write ;
    : bbf.>rx         bbf.rx buffer.write ;
    : bbf.tx>         bbf.tx buffer.read ;
    : bbf.rx>         bbf.rx buffer.read ;
    : bbf.clear-tx    bbf.tx buffer.clear ;
    : bbf.clear-rx    bbf.rx buffer.clear ;

what i want is just

    ' bbf.tx- compile-buffer
    ' bbf.rx- compile-buffer

i can't even do variables since they are macros..

* yeah, i need to be able to generate macros
* and fix name clashes within a compilation unit: both words and macros.

maybe the trick is really to define 'compilation unit' properly? in my current approach, a macro can't pop up during expansion of code. i need to get the philosophy right:

* a flat namespace is nice for an application: everything is concrete. we're "among friends" and last names are not necessary.
* it sucks for writing library code

the solution in mzscheme that works for me is functions + modules. a local module namespace can be used for small specialized utility words. i'd like to have something like that in forth. the problem is: i'm taking a really static stance in which macros play a central role, not functions. this works as long as macros are sufficiently powerful, which means higher order macros.

now, let's pull the problems i'm having apart: i wrote some buffer code, which is just macros. to instantiate a buffer one doesn't simply do "bla create-buffer" or something; it is necessary to specialize a lot of functions manually. that's completely unacceptable.

Entry: higher order macros
Date: Thu Nov 1 01:06:36 CET 2007

In order to solve some particular template problems, i'd like to have higher order macros. this amounts to, instead of splitting up a source file as

    MACROS -> PROCEDURES

splitting it up as

    ... -> MACROS^2 -> MACROS -> PROCEDURES

of course, there should be no limit to the tower.
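for comparison: in a language with higher-order functions, the 'compile-buffer' pattern from the previous entry is a one-liner, since a function can generate the whole family of specialized definitions from a template. a hypothetical Python analogue (closures standing in for generated macros, names invented here):

```python
def make_buffer(size):
    """One buffer instance: the state that a '2variable' pair would hold."""
    return {'mem': [0] * size, 'r': 0, 'w': 0}

def compile_buffer(buf):
    """Specialize the generic buffer words for one instance -- the step that
    would need macro-generating macros in the forth."""
    def write(byte):
        buf['mem'][buf['w'] % len(buf['mem'])] = byte
        buf['w'] += 1
    def read():
        byte = buf['mem'][buf['r'] % len(buf['mem'])]
        buf['r'] += 1
        return byte
    def empty():
        return buf['r'] == buf['w']
    return {'write': write, 'read': read, 'empty?': empty}

tx = compile_buffer(make_buffer(16))   # one instantiation per buffer,
rx = compile_buffer(make_buffer(16))   # instead of ten hand-written words
```

this is exactly the MACROS^2 -> MACROS step: compile_buffer lives one level above the words it generates.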
The real problem is: i have no sane syntax space left! In macros i can do this:

    macro
    : make-a-123 ' a-123 *: 123 exit ;

which is already pretty ugly because of quoting issues. But what am i going to invent to make higher level expansion work?

One thing is sure: taking out the reflection (making macros side-effect free) killed the possibility of generating names at compile time, EXCEPT for function labels. But those are really just data: it's a hack that doesn't really count. So i have a GOOD THING, independent declaration instead of sequential variable mutation for creating new macro names, that causes a BAD THING: limited reflection due to improper phasing. Actually, i already knew that, but i'm starting to feel it now: artificial limits are no good. Even if they serve a higher goal.. maybe that makes them not artificial?

The limit i created is actually there for a reason: to use partial evaluation to make it possible to perform compile time operations without the need for an explicit meta language: without the need for quotation like `(+ ,a ,b) or its beefed up syntax-case / syntax-rules variant. (funny how the only 'meta' part of the language is the macro stack: it punches holes in reality somehow ;)

So let's pat myself on the back:

* the current macro / forth thing is GOOD. it is easy to use, easy to understand, and avoids most quotation issues that arise in practice by relying on partial evaluation. it gets pretty far without the need for an explicit metalanguage.

* it's NOT GOOD ENOUGH because it's the top level: it can't be metaprogrammed itself!

The metaprogramming operations i'm looking for are those that create new macro NAMES. Creating new macro BODIES should not be so terribly hard: it is in fact what should be used for the quotation based language.
So the core of the business is the question why this works in scheme:

    (define-syntax make-macro
      (syntax-rules ()
        ((_ name body)
         (define-syntax name
           (syntax-rules ()
             ((_) body))))))

    box> (make-macro bla (+ 1 2))
    box> (bla)
    3

Entry: poke
Date: Fri Nov 2 04:27:04 CET 2007

i'm taking a day off.. so technically i'm not allowed to write in this log. however, i got into PF today, and wrote a rant on the PF list about mapping and partial evaluation. maybe it's time to start writing poke, or at least nailing down the requirements.

the idea behind poke is to have a machine model for DSP-like tasks that can be set up (metaprogrammed) by, say, a scheme system. the idea behind an application is this:

1. a program is compiled for a VM.
2. a new VM is instantiated (on a separate core/machine)
3. the VM now runs in real-time: doing its own scheduling and stack based memory management, being able to communicate with its host system and other VMs

each VM is a linear stack/tree machine. i'd like to do this without writing a single line of C code: have it all generated. that's the only way to be serious about generating *some* code.

it should have an s-expression interface with which it talks to a host scheme system. this acts as message passing: no shared state allowed. this syntax should have an easy extension for binary data.

it should be 'ready' for multiprocessing. what i mean by this is: each processing core should be able to run a single machine instance, so instances should be able to talk among each other in a simple way, and there should be a scheduler available on the VMs to handle the message passing.

i was thinking about a 'binary s-expression' approach to limit inter-machine communication parsing overhead. data should still be list-structured though, and word-aligned. for a human interface, a simple front-end could be constructed. arrays can be allowed for ease of wrapping binary data. internally, cons based lists are used for all representation.
cdr coding is used to be able to represent programs linearly. memory inside the machine consists of stacks only. each machine uses a limited set of data types, making re-use of lists efficient.

aim for the highest possible gcc code generation efficiency: i see no point in targeting anything other than gcc, so all extensions are allowed. i just checked (see doc/gcc/tail.c) for tail call support and it seems to work when putting the functions in a single file. it also works putting the functions in different files apparently. that's good news.

state passed: 3 stacks: DS/RS/AS

the target language should be a pointer-free safe language. this is going to be a bit more difficult; it probably has to be split into safe / unsafe parts. the 'system' language and the 'inner loop' language are different and should be treated as such. i should probably start with the latter and build the control language as a layer on top. the former is a forth-like language extended with linear tree memory, and the latter is a multi-in-multi-out language to be combined with combinators.

1. all C code generated: need a generator.
2. message passing interface using s-expressions.
3. run-time memory (stacks/trees) is locally managed
4. other (code) memory is static/readonly, loaded by the host
5. safe target language (from a certain point up)

so poke seems like a really straightforward extension to forth. getting it compatible with PF will be quite something though.. all this is pretty low priority. the only difficulty is how to deal with pointers when optimizing the linear stack/tree data structures. 'safe poke' :)

Entry: mix
Date: Fri Nov 2 05:31:52 CET 2007

then the thing that could be used immediately in PDP, PF and PD modules: a language to describe inner loops and iterators, to yield C code that can be linked straight into the projects.
Entry: instantiating abstract objects
Date: Fri Nov 2 15:19:40 CET 2007

i'm giving myself one hour to think about how to fix the verbosity of the following code:

    macro
    : tx #x100 tx-r/w #x0F ; \ put buffers in RAM page 1
    : rx #x110 rx-r/w #x0F ;
    : tx-ready? tx-empty>z z? not ;
    : rx-ready? rx-empty>z z? not ;
    : tx-room?  tx-room>z z? ;
    : rx-room?  rx-room>z z? ;
    forth
    2variable tx-r/w
    2variable rx-r/w
    : tx-empty>z tx buffer.empty>z ;
    : rx-empty>z rx buffer.empty>z ;
    : rx-room>z  rx buffer.room>z ;
    : tx-room>z  tx buffer.room>z ;
    : >tx tx buffer.write ;
    : >rx rx buffer.write ;
    : tx> tx buffer.read ;
    : rx> rx buffer.read ;
    : clear-tx tx buffer.clear ;
    : clear-rx rx buffer.clear ;

the ONLY difficulty here is that i can't generate macros, including variables. is there another way to solve the problem? is it possible to hide everything in one single macro? yes: if tx-empty>z is never expanded as a function this is actually possible. then what remains is just:

    macro
    : tx #x100 tx-r/w #x0F ; \ put buffers in RAM page 1
    : rx #x110 rx-r/w #x0F ;
    forth
    2variable tx-r/w
    2variable rx-r/w

    tx >buf
    tx buf>

maybe i can somehow make an 'un-inline' function work? like memoization? something which gets me halfway there is a blocking read/write operation: only for dispatch loops does this become problematic.

conclusion: i guess it's ok to go for this approach. on the subject of code reuse, there are 2 options: either you write it as procedure words, or as macros. the procedure word approach will lead to smaller code size but slower speed (since run-time dispatch is probably necessary). the macro approach can lead to fast inline code which might not be optimal for code size.

Entry: e2 debugging
Date: Fri Nov 2 17:41:24 CET 2007

current setup: hub (master) connected to krikit (slave) which runs a loopback. there is communication, but somehow a start bit gets lost. there are 4 places where it can get lost:

1. hub transmit (OK: clear on scope)
2.
slave receive (OK: sending #xFF, all ones, gives reply)
3. slave transmit (OK: reply has start bit)
4. hub receive

i have no trigger scope or logic analyser so i need to construct a steady-state error condition i can sync my scope to. i can measure slave transmit if i manage to add some wait code in the hub. such code is probably necessary for other purposes anyway.

so. running a couple of experiments makes it clear that 1-3 are ok. the problem is with the hub receive that doesn't see the start bit. i don't see the problem. as far as i can isolate it, somehow the start bit gets missed because:

- the rx state machine is in the wrong state
- the rx/tx switch comes a cycle too late
- ...

i need something that's easier to test. i suspect the rx/tx switching is the cause, so maybe i can make a better switcher? i did notice a slightly borked waveform for the start bit however.. let's see if i can get a better view and see where that's coming from..

that was wrong. i start over:

- fixed timer compensation, now at least the signal is stable
- clear to see that there's a phase problem

i'm wondering if it's not just a speed problem. the timer is running every 64 clocks.. well.. it's easy to test by just running it slower really. YES! it was.. running 4x slower fixes the problem. time to do some profiling then!

Entry: e2 + interpreter
Date: Fri Nov 2 23:21:38 CET 2007

i'd like to make 'transmit' and 'receive' late bound. that way it's easy to switch the interpreter's default I/O. but i need to do it cheaply: using vector.f and execute.f requires too much boot space. wait.. that's the case for catkit. for the 2620 i have a lot more room. maybe i should go that route then, and solve the catkit problem when it poses itself.

time to make some decisions:
* allow both serial + e2 ?
* build e2 in boot loader?

actually, i do need e2 in the boot loader, working as a safety measure. hmm.. let's get it to do what it needs to do first. ok. i can ping krikit.
fixed the saving/restoring of the a reg so i can access the stack. code upload doesn't work yet. i guess it has to do with a missed 'ack' due to interrupts being disabled. maybe i should build the ack into fprog?

NOTE: about saving the a reg. if there are interrupts, the a reg needs to be saved anyway (or its use protected with cli), so maybe it's best to just always save on clobber? alternatively, always save on clobber in the isr.

i added an ack to fprog and ferase, but apparently that's not enough. one line can be written, then it messes up. some code is needed to properly resync the transceiver after programming so it picks back up at the next idle INT0. for debugging purposes, i should make a version that uses polling only, so it can be used to set up interrupts.

thinking about it, i probably need to modify all opcodes so they give a sync themselves, so no buffering is required. (uart has 1 byte). hmm.. it's not so simple really. actually, it is: all interpreter tokens have RPC semantics: they return at least one value, except '00' which is a sink, and 'reset' which can't have a return value. the 'ack' opcode can then be eliminated, and possibly replaced with 'interpret'.

nop, reset     -> no ack
receive        -> ack
transmit       -> value
jsr, lda, ldf  -> ack
n@a+, n@f+     -> stream of bytes, no ack necessary
n!a+, n!f!     -> ack
ferase, fprog  -> ack
chkblk, preply -> stream of bytes, no ack necessary

this should get rid of the requirement to have buffered io. remaining timing issues can be handled with appropriate delays. an interesting extension when 'receive' and 'transmit' are made dynamic is to have them read from memory. that way a small program could execute from ram.

Entry: boot protocol changed
Date: Sat Nov 3 02:27:28 CET 2007

* fprog and ferase now give an 'ack' themselves. this is necessary for receivers that suffer when interrupts are disabled.
* the #x00000000 password is eliminated: with boot code protection this isn't necessary.
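the token -> reply table above can be written down as a host-side sketch (Python; the token and reply names are taken from the table, the code itself is illustrative and not the tethered.ss implementation):

```python
# sketch of the RPC reply semantics: every interpreter token produces
# at least one reply, except the two sinks. names follow the table in
# the log entry above; the mechanism is illustrative only.
ACK, VALUE, STREAM, NONE = "ack", "value", "stream", "none"

REPLY = {
    "nop": NONE, "reset": NONE,
    "receive": ACK,
    "transmit": VALUE,
    "jsr": ACK, "lda": ACK, "ldf": ACK,
    "n@a+": STREAM, "n@f+": STREAM,
    "n!a+": ACK, "n!f!": ACK,
    "ferase": ACK, "fprog": ACK,
    "chkblk": STREAM, "preply": STREAM,
}

def expects_reply(token):
    """true for every token that sends something back, so the host
    knows when to wait instead of relying on buffered io."""
    return REPLY[token] != NONE
```

with this table the host can drive the wire strictly ping-pong: send a token, then block on the expected reply kind before sending the next one.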
Entry: separate compilation + name spaces
Date: Sat Nov 3 15:14:27 CET 2007

as a consequence of the way compilation works, it is possible to rely on the fact that, per compilation unit, names can be overwritten. what i mean is that it is possible to 'load' the same file twice, but with different words/macros bound in its environment. this comes close enough to the 'dictionary shadows' paradigm i'm used to in PF, and which actually works pretty well: it avoids the need for a name space mechanism fairly effectively. an extension to this could be to allow for exports: provide only those macros and words necessary.

then another extension: why not install the macro source in the target dictionary? there's no real reason not to, and it makes 'mark' work for macros (given that i delete and re-instantiate the macro cache). or.. i could use this as an indicator for using the macro cache or not.

one thing that has been bugging me: if i define a word or a macro, i do want it to override the previous word or macro. i should make a list of the name space trade-offs for writing a forth really.

Entry: roadmap
Date: Sat Nov 3 15:51:02 CET 2007

- get programming to work over e2 (restart receiver after fprog: add a macro hook to interpreter.f) (done)
- fix acks in interpreter.f and tethered.ss (done)
- make it work without interrupts and put it in the boot loader
- figure out the 'strong power 1' phase, and test with slave.
- test over longer twisted pair cable.

Entry: no middle road
Date: Sun Nov 4 01:04:57 CET 2007

some thoughts about 'accumulative' code, for lack of a better word. in light of the recent remarks about higher order macros, i have the impression i am mixing 2 paradigms in a not so elegant manner:

1. functional programming, mzscheme's independent compilation with 'unrolled' metaprogramming dependencies.
2. the accumulative image model, where a language grows by accumulating more power, which can then immediately be used to define new constructs.
i knowingly took out a part of 2. to get purely functional macros that could be safely evaluated for their value at interaction time. however, the interactive compilation does work in an incremental way. it looks as if i am forced into some middle road compromise.

Entry: embedded programming in 2007
Date: Sun Nov 4 15:53:35 CET 2007

the question i would really like to answer: without too much bias (toward the tool i wrote), what is the point of writing static, early bound code in 2007, even if we're talking about microcontrollers?

* is there really a 'complexity barrier' below which one HAS TO move to quasi manual compilation and allocation?
* will this barrier remain in existence, or will better tools make a more high-level approach possible?

EDIT: some things i was thinking about yesterday:

* leaky abstractions are hard to work with. starting from assembly and "thinking up", using purrr to help you write the application, is the right approach. starting from some high-level understanding of the language and having to learn all its limitations doesn't really work. the problem seems to be the manual resource management: time, space, and synchronization between global variables and hardware devices.
* it seems i lose most of my time in low-level configuration issues which give little feedback on error, and in dealing with situations that are hard to debug due to dependence on external events. low level design really is a debugging problem: setting up experiments to try to isolate errors. hence the loads of specialized (hardware) tools used in professional environments.

Entry: concatenative introduction email
Date: Mon Nov 5 18:44:15 CET 2007

Dear All,

Allow me to introduce myself. My name is Tom Schouten. I live in Leuven, Belgium and I'm 32 now, if that helps paint a picture. I've been interested in concatenative programming for a while and lurking here and there..
To educate myself, I wrote quite a lot of code in the last couple of years, and I'd like to share some of the results, but maybe even more the resulting questions. (warning: long post, story of my life :)

My background is in electrical engineering. My heart lies in music DSP. I've been working up the ladder from electronics, to machine language and C/C++, through Pure Data (a data flow language) to Perl & Python, to finally end up at Scheme and functional programming. I'm flirting a bit with Haskell, but really just reading, because most recent interesting functional programming texts use that language.

http://en.wikipedia.org/wiki/Pure_Data

The problem I'm trying to solve, to guide me a bit, is "Build tools to write DSP code, mostly for sound and video processing, in a high level language." I ran into limits of expressiveness writing video extensions in C for Pure Data, about 4-5 years ago. Apparently there are no freely available tools that solve this problem, so I take that to be my mission.

About 3-4 years ago I started writing Packet Forth (PF) as an attempt to grow out of my C shoes. It was at the time I discovered colorForth, and I was wondering if I could create some kind of cross breed between Pure Data and Forth. PF now looks a bit like Factor on the outside, though it is less powerful. PF uses linear memory management (data is a tree), with some unmanaged pointers for data and code variables. PF's point is to be a scripting language which tosses around some DSP operations written in C. It doesn't aim to be a general purpose language.

The darcs archive is here: http://zwizwa.be/darcs/packetforth/
Some more high-level docs aimed at media artists are here: http://packets.goto10.org

I got a bit frustrated with the internals of PF, mostly because there is still too much verbose C code, and a lot of C preprocessor macro tricks that would best be done with a _real_ C code generator.
So I dived a bit deeper into Forth, and early 2005 I started at the bottom again: I wrote an indirect threaded forth for Pure Data (mole), and started BADNOP (now dubbed BROOD 1), an interactive cross compiler for the Forth dialect Purrr, an 8-bit stack machine model for Flash based PIC microcontrollers.

http://zwizwa.be/darcs/mole
http://zwizwa.be/darcs/brood-1

Mole made me 'get' Forth finally: the first versions of PF were mostly blind hackery to get to know the problems before the solution. For mole, I actually followed tradition a bit more (Brad Rodriguez' "Moving Forth"). This led to a more decent PF interpreter.

The forth I wrote to write the cross-compiler for the PIC Forth was a mess. I was experimenting with some quotation syntax but realized that what I was really looking for was lisp, or a lisp-like concatenative language. At that time, early 2006, I discovered Joy, so I ditched the compiler and rewrote it in CAT (not Christopher's Cat) which was written in scheme (BROOD-2). After some refactoring and rewriting due to beginner mistakes I am now at BROOD-4, with the CAT core written as a set of MzScheme macros. This CAT is a dynamically typed concatenative language with Scheme semantics. I consider it an intermediate language. Currently it is only used to implement the Purrr compiler (a set of macros) and the interaction system.

http://zwizwa.be/darcs/brood

Purrr is as far as I know somewhat novel. All metaprogramming tricks one would perform in Forth using the [ and ] words are done using partial evaluation only. I've tested this in practice and it seems to work surprisingly well. I am still struggling a bit with the high-level Purrr semantics though. Concretely, it is a fairly standard macro assembler with peephole optimization. Nothing special there. On the other hand, its macro language is a purely functional concatenative language which is 'projected' onto a real machine architecture after being partially evaluated.
I tried to explain these concepts in the following papers:

http://zwizwa.be/darcs/brood/tex/purrr.pdf
http://zwizwa.be/darcs/brood/tex/brood.pdf

(for the latest versions it's always better to use the .tex from the darcs archive)

The latter needs some terminology cleanup, but it contains an explanation of the basic ideas, and an attempt to clarify the macro semantics in a more formal way. I'm interested to learn what I need to read in order to frame these concepts in proper CS speak... It looks like I'm either terribly ignorant of something that already exists (I went through a couple of stages of that already), or I found a clean way of giving cross-compiled Forth a proper semantics.

On a lighter note, I'm using Purrr to build The Sheep, a retro synth inspired by the 1980's approach to sound generation. It runs on CATkit, and has been used successfully many times in beginner "physical computing" workshops, as electronics is called in non-engineering circles :)

http://zwizwa.be/darcs/brood/tex/synth.pdf
http://packets.goto10.org/packets/wiki/CATkit

(the scary guy in the picture is not me :)

Entry: krikit board design decisions
Date: Mon Nov 5 17:52:11 CET 2007

- 4 x AAA -> need at least 5V. alternatively, use a 9V cell and a transistor for speaker output.
- RGB led onboard
- debug connector = battery connector (RCA plug)

Entry: TODO
Date: Mon Nov 5 21:40:23 CET 2007

list has moved to TODO file.

Entry: polling E2 interpreter
Date: Mon Nov 5 23:15:31 CET 2007

choosing a polling interpreter for E2 in the boot code is not entirely without trade-offs.

PRO: independent of interrupt routines, which is useful for debugging application isrs.
CON: completely synchronous and non-buffered. this requires some careful coding in order not to miss any data.

Maybe the boot code should contain both versions? This leads to objects really: a vtable is a dynamic route word.

2variable stdout
: do-stdout stdout invoke ;
: e2-stdout stdout -> route rx ; tx ; on ; off ;

hmm..
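the 'vtable as dynamic route word' idea can be sketched outside forth too (Python; the rx/tx/on/off names come from the forth snippet above, everything else is illustrative):

```python
# a vtable as a late-bound dispatch record: 'stdout' points at a table
# of operations, and do_stdout routes through whatever table is
# currently installed. names follow the e2-stdout forth snippet;
# the bodies are placeholders.
e2_stdout = {"rx": lambda: "e2-rx",
             "tx": lambda: "e2-tx",
             "on": lambda: "e2-on",
             "off": lambda: "e2-off"}

stdout = e2_stdout            # the '2variable stdout', late bound

def do_stdout(op):
    """': do-stdout stdout invoke ;' -- dispatch op through the
    currently installed vtable."""
    return stdout[op]()
```

swapping the serial implementation in for e2 is then just rebinding `stdout` to a different table, which is exactly what makes the boot code able to carry both versions.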
i messed up slave.f: diff tomorrow..

Entry: macros and procedure dictionary
Date: Tue Nov 6 06:03:54 CET 2007

maybe the trick is to just get rid of the distinction between procedure words and macros: a single namespace, with procedure words being equivalent to

: bla 123 compile ;

combined with a preprocessing step that identifies all labels in the source text. a single namespace is easier to understand. separate compilation units give shadowing, while inside a single compilation unit circular references are possible. what i want this to move toward is a more and more static declarative structure. maybe i should re-implement namespaces and build them on top of the mzscheme module system. i doubt the solution i can eventually live with will be significantly different from mzscheme's.. maybe a bit more liberal? or is that just because of the current implementation? maybe i should make the compiled macros into a real cache, and store a master version as an s-exp tree..

Entry: redefine
Date: Tue Nov 6 15:59:00 CET 2007

i need to

* make it illegal to redefine macros: they use a caching mechanism which replaces names with values (procedures).
* make it illegal to define a label that is already a macro

the real problem is that redefines need a proper semantics in CAT. for the forth, i think shadowing redefines are best: 'empty' is practical, and it should work for macros too. CAT is currently designed so redefines are illegal: this allows the use of values instead of boxed values. some possible routes out of the mess:

- prohibit redefinitions
- use shadowing + a proper cache
- use boxed values (reset the code inside the 'word' struct)

a deeper question is: why not use mzscheme namespaces for all macros? answer: because i rely on late linking. is there a way around this? it probably makes things too complicated, since i need to figure out a way to map it to BOTH modules and units..
let's stick to the current hash table name space, and go for the boxed approach: mutate the words themselves, instead of their hash table entries. OK. that seems to work.

remaining problem: defining words that are macros. a way to solve this is to define each word in the dictionary as a macro, compiling [cw ] let's not..

i've added a warning, which made me realize that i do use this: macros can call words with the same name as a fallback. that mechanism might be worth more than a safety net.. no, a safety net is more important: i can fix the delegation using a symbol prefix. what about doing this automatically? the last matching pattern is always mapped to a runtime call? i do need to fix dangling macros though. let's see if i can run into that case again.. ok, it's clear: a dangling macro can be disastrous.

this is a mess.. assume there are 2 classes of macros: CORE and PRJ. PRJ needs to be flushed whenever the project changes. i am not sure whether macros from CORE will actually bind to those defined in PRJ. there is no such plugin behaviour as far as i can tell, but nevertheless it is possible to go wrong, so i should do this:

flush cache =
- invalidate all prj macros (make them raise an exception)
- detach them from the namespace

looks like i got it now: ns-flush-dynamic-words! + support

Entry: asm rewrite
Date: Wed Nov 7 00:58:54 CET 2007

found another asm bug: variables get allocated on each pass now. this doesn't seem to be fatal though, just inefficient. sheepsint works, so it can't be the weird hub.f bug i'm chasing..

[EDIT] several things might change here, but it could be a good idea to keep the current operation until i have time to clean it up a bit. cleanup would be:

- move 'here' to a separate dynamic variable
- handle different dictionaries better. the problem now is that 'allot' gets called multiple times without reset.. it's probably best to filter it out in a preprocessing step.
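the boxed approach from the redefine entry — mutate the word itself so old references pick up the redefinition — comes down to this minimal sketch (Python; `Word` and `define` are hypothetical names, not the CAT implementation):

```python
class Word:
    """a boxed word: callers hold a reference to the box, not to the
    code, so redefinition (mutating .code) reaches old call sites."""
    def __init__(self, code):
        self.code = code
    def __call__(self, *args):
        return self.code(*args)

dictionary = {}

def define(name, code):
    """redefine by mutating the existing box instead of replacing the
    hash table entry, so cached references stay valid."""
    if name in dictionary:
        dictionary[name].code = code
    else:
        dictionary[name] = Word(code)

define("double", lambda x: x + x)
caller = dictionary["double"]          # an 'old' cached reference
define("double", lambda x: 2 * x + 1)  # shadowing redefine
```

after the redefine, `caller(3)` goes through the new code: the cache never holds a stale procedure, which is exactly what the unboxed-value design couldn't give.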
Entry: sheep transients
Date: Wed Nov 7 04:45:41 CET 2007

'sound' needs to be a stack: a circular one, initialized with valid sounds, or a delimited one so a sound can end in 'done' to fill the rest of a control slice with another sound. the point of this is to create a concatenation at run time. it is of course possible to do this at compile time, but the fun would be in *mixing* sounds.. i think i have the solution there: each pattern tick a 'program' is erased, and filled with instruments that are played after each other, with the last tone = silence.

Entry: low impedance signal source
Date: Wed Nov 7 07:00:27 CET 2007

i'm trying to understand the difference between these 2 statements:

* for a low-impedance source you best measure current, while for a high-impedance source you best measure voltage.
* a current source has high impedance, and a voltage source has low impedance.

the deal is that these are 2 different kinds of "measurement" because of the entirely different scale of energy involved: for sensors, you want maximum energy transfer, but to "measure" a current or voltage source, you want minimal energy transfer. looking at a sensor as a voltage or current source, you want to "max it out".

implementation: so, doing this with an opamp is really trivial. bias (+) at Vdd/2, feed back from (out) to (-) using Ra, and connect the current source between the virtual ground (-) and (+).

              R
       /---/\/\/\/\--\
       |    __       |
       |   |  \      |
 /--||--o--| -  \____|___o Analog -> uC
|\ @      _o_| +  /
| ||@     |  |__/
|/ @      |
 \----o   |
      |   |
  o--/\/\/\/--o---/\/\/\/--o
 GND         2.5V          5V

then:

* connect the speaker to 2 analog inputs, so they can be switched in analog high Z mode: not good to bias digital ins at Vdd/2
* run the opamp and bias network off of a digital output

let's see if analog Z1/Z2 can be PWM outputs. no such luck.. maybe use a transistor to shield the detached (Z) driver pin from the Vdd/2 bias voltage. or just not use pwm...
remaining question is whether the opamp, when powered down, can take a large differential input voltage.

EDIT: the circuit doesn't work without a capacitor: the coil is a short circuit at DC, connecting (-) and (+). due to nonzero offset voltage, this saturates the amp.

EDIT: i understand now why measuring current is not a good idea. the impedance of the device is dependent on frequency: 0 at DC, rising linearly. if you measure current, the signal will have a strong low frequency content. however, if you measure voltage through a resistor that's say 10x larger than the stated impedance, the response is flattened out since the resistor dominates. so the classic one works better:

 SPK          Rg
  o     /---/\/\/\/\--\
  |     |     __      |
  |  Rs |    |  \     |
  o--/\/\/\--o-| - \__|___o Analog -> uC
 |\ @      ____| + /
 | ||@     |  |__/
 |/ @      |
  |        |
 === Cs    |
  |        |
  |   Cn   |
  o---o---||--o
  |           |
  o--/\/\/\/--o---/\/\/\/--o
 GND         2.5V          5V

Here Cn reduces noise by lowering the AC impedance wrt the high DC impedance point at (+). Rs = 47R and Rg = 100k give decent results: about 2000x or 66dB. The value of Rs should be as low as possible to reduce noise. I'm comfortable now that i understand the trade-off.

EDIT: i switched to using closed loop current measurement again, this time limiting the overall gain to about 100x (using Rg=1K), followed by a second stage with 100x gain. this seems to work better. i suppose my original problem was just due to too high gain, running into opamp limitations.

EDIT: going back to the circuit with Rs: in that one the opamp's input could be decoupled from what goes to the speaker by switching the (-) input and the top of the bias network to ground.

Entry: e2 hub
Date: Wed Nov 7 17:37:52 CET 2007

now i need to find a way to program the e2 hub. one problem in the boot protocol is that i have no way to packetize the stream.. what i want is a hub which is mostly in repeater mode (for the commands 0->15) but responds to other commands itself. there are 2 alternatives.
either write a 'fake' interpreter which simulates the state machine that parses the debug input stream, or change the protocol so it is delimited. the former is a stupid short-term-thinking hack.. let's packetize. hmm... this is quite a change again: thinking about optimizing the problem. oops, bad word :)

a way to do this quickly is to just prepend every message with the size. that way the core interpreter can ignore it, but the repeater can transfer without being able to interpret. let's try that first.

notes
- should put 'ack' at 1, so a stream of ones gives ack messages

hmm.. i chickened out. it's a lot of changes at once. lots of places to go wrong. will cost me some frustration.. let's find another way, go for the stupid hack. if i can make the message length not dependent on context, meaning a previous message length, i can probably derive the lengths manually. the only problems here are the block transfer words; stuff that comes back from the uC can be echoed without problems. (i'm thinking about ping reply..)

ok, made the protocol context-free in the host -> target direction. so, next:

* make hub understand protocol (OK)
* add hub commands
* move to polling implementation in bootloader

hub commands: these should be an abstract interface for things one would like to do with a hub.

arbitrarily set client (0=hub)
on client
off client

hmm.. i don't see it so clearly. what am i trying to accomplish? by default i should be in 'hub application' mode, but it should be possible to switch to hub debug mode too. the latter can be a permanent switch (requiring access to the dictionary to switch back). i got it sort of figured out now. TODO:

* make hub switch between hub-interpreter and interpreter using an external resistor. do this when the hub is finished.
* until then, find a way to start the repeater without having the hub dictionary loaded.

Entry: how much amplification?
Date: Thu Nov 8 01:22:04 CET 2007

i have 12 bit at my disposal.
amplification is mainly determined by the ratio of distances. i'm measuring current, which should be proportional to sound pressure, which goes as 1/r^2. so, suppose i use a gain factor of 64 = 8^2: this gives a range ratio of 8. say 1 - 8 meters: don't put them closer than a meter, and not further than 8.

Entry: poke & precompiled loops
Date: Thu Nov 8 06:14:42 CET 2007

i think i need to separate out the c code generator so i can start generating code for PF and Pure Data, which will not be anything forth-like, so it doesn't really belong in brood. in fact, the way i think of it now (something akin to functional reactive programming) it will be quite the opposite.

Entry: RS next order
Date: Thu Nov 8 17:42:19 CET 2007

* linear regulators
* 9V clips / battery holders
* transistors?
* high ohm resistors
* xtals + caps / resonators
* blue bell wire
* schottky diodes
* small signal diodes

Entry: more modem design decisions
Date: Fri Nov 9 02:44:47 CET 2007

- modulation or not? some modulation is necessary since i can't transfer DC. but the frequency response looks really too non-flat to go for a wideband approach. i need to experiment.
- FIR or IIR? what is needed is a decimating filter. i can probably get much further with a crude windowed FIR than an IIR.

Entry: demodulator
Date: Sat Nov 10 15:12:58 CET 2007

i'm not going to waste time on trying out a pure square wave modulation. let's stick with some simple demodulator, and have a look at the numbers. i currently have the debug net running at 8kHz. this should also host the filter tick, which consists of:

- read adc + update filter state
- once every x samples, wake up the detector tasklet

what if i start out using FSK, because it requires no synchronization, and use a square window where the 2 frequencies are placed at each other's zeros. so.. a square window does have perfect rejection for the harmonic frequencies. it's only the stuff that lies in between that is problematic. ok, this is obvious.
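the perfect-rejection claim is easy to check numerically: a rectangular window correlator has exactly zero response for any tone that completes a different whole number of cycles per window, while frequencies in between leak. a small Python check (pure stdlib, names illustrative):

```python
import cmath

def window_response(k_sig, k_mix, n=64):
    """correlate a tone at k_sig cycles/window against a mixer at
    k_mix cycles/window over a rectangular (square) window of n
    samples; returns normalized response magnitude."""
    acc = 0j
    for i in range(n):
        s = cmath.exp(2j * cmath.pi * k_sig * i / n)   # signal tone
        m = cmath.exp(-2j * cmath.pi * k_mix * i / n)  # mixer
        acc += s * m
    return abs(acc) / n
```

`window_response(5, 3)` is exactly zero (harmonic spacing, rejected), `window_response(3, 3)` is 1, and something like `window_response(3.5, 3)` is clearly nonzero: the in-between frequencies are what leak through.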
the problem can be entirely moved to synchronization and linear distortion due to transitions. if the receiver listens during a steady state part of the signal, perfect rejection is possible. so the main questions are:

- how to synchronize?
- how to limit transitions?

which brings me back to PSK.. maybe it is just simpler to use? as long as the start of a symbol can be detected (threshold) and the phase can be corrected (preamble), the rest seems not so hard really.

again, from a different angle: demodulating PSK is a synchronous mixer followed by a low pass filter. i assume that a rectangular window is going to be good enough as an LPF, which just leaves the problems of signal detection and synchronization. if i leave the non-synchronized receiver on constantly, it outputs a 24 bit complex number. during synchronization this needs to move toward zero. the phasor will rotate once per window length. which direction? if the direction is known, it's possible to detect a crossing. the direction is determined by the rotation direction of the mixer phasor.

i need to up the frequency: 2 MIPS is not enough. maybe i should do that first.. then the output stage, then the receiver, then a decoder. next actions:

* have a look at PSK31 demod code
* build the output stage (either PWM or SD)
* build board + move to 40 MHz (monday: can't find xtals, maybe test on the dude?)

Entry: PSK31
Date: Sat Nov 10 18:56:01 CET 2007

PSK31: Peter G3PLX
http://det.bi.ehu.es/~jtpjatae/pdf/p31g3plx.pdf

some ideas from the paper:

* this is a protocol for live communication. Error correcting codes introduce delays.
* use relaxed bandwidth for the filter for smaller delays and lower cost.
* take advantage of the high frequency stability of modern HF radios
* demodulation by using 1 bit symbol delay and comparison. ??? i don't get this one.
* synchronize using the amplitude modulation component!
* viterbi decoder for convolutional code

Entry: a single port for debugging
Date: Sat Nov 10 20:37:18 CET 2007

wait a minute. if i manage to plug the E2 protocol through to the icd port, i could standardize on a single set of connectors. however, the connection is not standard, but it is 4-wire (can run over telephone cable), is synchronous, and has a clock too. what this would solve is the bootstrap upload problem, which is a nasty one..

Entry: transmission bandwidth
Date: Sat Nov 10 20:51:22 CET 2007

something i never really understood: Fig. 4 in the PSK31 paper shows the bandwidth for random data. why is this so wide? why is reversal not the highest bandwidth? other questions: try to explain what this 'bit delay' demod is + how the amplitude demodulation sync works.

Entry: BPSK synchronization
Date: Sat Nov 10 21:25:37 CET 2007

there are 2 kinds of synchronization necessary: carrier synchronization, and bit clock synchronization. the former can use a PLL, the latter can use the 1->0 transition. suppose the following bit encoding: 8N1, with 1 = idle, and 0 = start bit. during idle the phasor needs to be predictable. this is either a fixed value, or an oscillation between 2 signal states. picking the former, this gives

1 = carrier
0 = inverse carrier

during idle, the synchronizer works: this is a PLL state machine which turns a single phase increment left or right depending on which quadrant the phasor is in. there are 3 bits determining the quadrant. there needs to be an AGC which reduces the 24 bit phasor to an 8 bit phasor for easier demodulation and synchronization.

Entry: so why not use AM?
Date: Sat Nov 10 21:50:43 CET 2007

somehow both FSK and PSK seem too complicated. maybe i should start with AM, then later (never) continue down the road and try FSK (double AM) and PSK (with synchronization). the most important interference we're going to find is bursts.
these should be possible to eliminate using stop bits: 1 = on, 0 = off, which means a burst will probably lack a stop bit.

algo: a continuous square window filter with signal detector feeds into a simple sampler. if the sampler is not active, every 1->0 transition will wake it up.

before starting with AM, i can just use some noise modulated protocol. hell, anything that can get a 1 across.

Entry: roadmap
Date: Sat Nov 10 22:16:08 CET 2007

EDIT:
* try strong phase and run it off E2 (OK)
* level detector, use the RGB led.

Entry: E2 next
Date: Sun Nov 11 01:02:47 CET 2007

apparently the E2 signal interferes quite a bit with the amplifier, which is not such a big surprise. so i guess it's time to mature the debug network a bit:

* switch to idle mode (keep high) when there's no host -> target comm.
* find out what the initial 'missed ping' is all about.

i'm going to add stop bit checks to at least eliminate 1->0 bus glitches as a source of errors. that wasn't the problem... something is wrong with bus startup. maybe i need to make sure 'off' will actually switch the power state? looks like the error is with the slave init.. i get a predictable reply to a ping after bus-reset:

> 13 >tx
> rx-size p
3
> rx> px
2D
> rx> px
AD
> rx> px
F7
> rx-size p

i made a little progress here:

> hub-init
ERROR: time-out: 1
> rx-size p
0
> 5 >tx 8 >tx 0 >tx
> rx-size p
1
> rx> p
131
> 9 >tx 2 >tx
> rx-size p
2
> rx> px
FF
> rx> px
D0
>

this sends the commands for fetching the 2 bytes at rom address #x0008, which indeed should be #xFF, #xD0. this is reproducible. so i can conclude that the bytes get received properly. something goes wrong in either the slave transmitter or the host receiver.. i give up.. i can't find it. a workaround which seems to be stable is to send an 'ack' which will send back a garbled byte. apparently, unplugging and replugging the E2 connector gives the same behaviour: the first byte coming from the slave is corrupted. so it can't be host side..
Entry: amp notes
Date: Mon Nov 12 15:29:42 CET 2007

i changed the circuit back to 1K input impedance, 100K feedback in the first stage. the second stage has 1K + 100nF, and 100K feedback, and i have no idea why this works: less noise, and it seems to have a good response in the intended range.. maybe because most sounds have a 1/f response? i don't know... it responds well to whistles, which is nice. this is a 10 kHz pole... so it's basically set up as a differentiator? maybe because i have GBW rolloff this works? i'm puzzled. i tried with an LM358N and it gives a lot more noise. i tried a TL072CN and it gives too much bandwidth! so, i use a compensated integrator with 1K/100n in the source and 220K/4.7n in the feedback section. looks like this is final enough.. maybe beef up the amp just a tiny bit more..

PARTS LIST:
2 x 220K
1 x 100K
2 x 10K
3 x 1K
2 x 15pF
1 x 4.7nF
2 x 100nF
2 x 10uF
1 x 10MHz
1 x 18F2620
1 x TL072CN
1 x LED(red)
2 x 6 PIN HEADER

                                  C2 4.7nF
                              /-----||----\
                              |           |
 SPK     Rg 220K              |  R2 220K  |
  o    /-/\/\/\--\            o--/\/\/\---o
  |    |   __    |            |    __     |
  | Rs 1K |  \   |   R1 1K    |   |  \    |
  o--/\/\/\--o-| - \___o--||--/\/\/\--o-| - \_____o LINE
 |\ @     ____| + /  C1 100nF          __| + /
 | ||@    |  |__/                      |  |__/
 |/ @     o---------------------------/
  |       |
 === Cs   |
  | 10uF  |
  |   Cn  |
  o---o---||--o
  |  10uF     |
  o--/\/\/\/--o---/\/\/\/--o
 GND         2.5V          5V

First stage gives 220 x amplification. TL072 (TI version, i'm using the ST version) has a GBW of 3 MHz; with 220 x amplification this gives rolloff at 13 kHz. so for the first stage i'm good. Second stage is a band pass filter with 22 x amplification:

G . . . . ._________
         /.         .\
        / .         . \
       /  .         .  \
         1/t2      1/t1

t1 = R1 C1 = 100us -> 10kHz
t2 = R2 C2 = 1000us -> 1kHz

because f1 > f2 the gain is not R2/R1 but R2/R1 * f2/f1 (= C1/C2). a bit quirky, but it works.. maybe i should try exchanging the time constants so f2 > f1.
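the quirky gain can be checked numerically by evaluating the stage's transfer magnitude |Z2/Z1|, with Z1 = R1 in series with C1 and Z2 = R2 in parallel with C2 (a Python sketch using the component values above; ideal opamp assumed):

```python
import math

R1, C1 = 1e3, 100e-9     # input branch:    t1 = R1 C1 = 100us
R2, C2 = 220e3, 4.7e-9   # feedback branch: t2 = R2 C2 ~ 1000us

def gain(f):
    """magnitude of the second stage's transfer Z2/Z1 at frequency f,
    for an ideal inverting opamp stage."""
    s = 2j * math.pi * f
    z1 = R1 + 1 / (s * C1)           # R1 in series with C1
    z2 = 1 / (1 / R2 + s * C2)       # R2 in parallel with C2
    return abs(z2 / z1)
```

in the band between the two corners the plateau lands around C1/C2 ~ 21, matching the '22 x amplification' above and sitting well below the naive R2/R1 = 220; below and above the band the gain falls off, confirming the band-pass sketch.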
looks like these changes keep the transfer function the same, with a = sqrt(10):

R2 -> 1/a R2
C2 -> 1/a C2
R1 -> a R1
C1 -> a C1

so, there's a reason to do it like i did! the capacitors are smaller. so where's the trade-off? maybe noise due to large resistors? however, when f2 > f1 the gain is independent of the capacitors.

let's make this a bit more intuitive. what happens when C1 is made 10x larger, so f1 = 1kHz, and C2 is made 10x smaller, so f2 = 10kHz? the gain is now 10x more, so then the gain can be reduced by making R2 10x smaller, which again requires C2 to be 10x larger. so the net effect is:

R1 -> R1
C1 -> 10 C1
C2 -> C2
R2 -> 1/10 R2

this gives a 1uF capacitor. so alternatively R1 can be made 10x larger, which requires C2 to be made 10x smaller, giving 10K and 470pF respectively. (EDIT: this is what i did. works fine). makes more sense now. so is there a circuit that has independent frequency and gain?

hmm.. i just tried the LM358N again, and it gives good results also. guess the TL022 was just too low bandwidth? yup. 0.5 MHz. hmm the LM358N is only 1MHz ?

Entry: building the first krikit boards
Date: Mon Nov 12 15:41:18 CET 2007

- not using E2, use serial + icd2 instead
- xtal 40 MHz operation

pins to determine:
* opamp + bias power
* analog input (maybe first stage also)
* speaker out

and figure out if the opamp can take 5V when it's powered down, otherwise it needs an extra pin to pull the (+) input to ground also.

Entry: new sheepsint default app
Date: Thu Nov 15 20:00:15 CET 2007

something like this:

buttons:
* noise on/off
* xmod
* reso
* reset = silence

take button state from ram, uninitialized, so it survives reset. xmod control uses 2 x 2Hz - 20kHz log. reso needs robustness for reso freq < main osc freq. 3 x frequency knobs, 2 knobs left.. maybe some modulation? osc 2 frequency + modulation index.
(formant / noise frequency)

Entry: fake chaos
Date: Sat Nov 17 02:27:51 CET 2007

following the same line as the formant oscillator, a fake chaotic filter could be made. such an oscillator (some / all?) contains unstable oscillations that are 'squelched'. the points where such squelching happens are randomly distributed, but the bursts themselves are quite stable, leading to an approximation as randomly spaced fixed bursts. does this work with the current setup? no.. it uses a random period. that's different. so... using the reso algo, it boils down to randomizing p0 with fixed p1 and p2. randomizing could be fixed + variable. the question is when to update the period. the easiest is a fixed rate.. continuous updates seem to work. now p0 is modulated with a uniformly distributed value, taking care not to over-modulate so p0 wraps around. everything is moved to prj/CATkit/demo.f

Entry: amplifier noise
Date: Mon Nov 19 21:25:32 CET 2007

for the next iteration of the krikit board, it might be a good idea to improve the amp a bit. there are 2 things to consider:
* input stage noise (impedance)
* filter/amp stage capacitor values vs. noise and power consumption

Entry: shopping
Date: Mon Nov 19 23:57:44 CET 2007

AITEC:
- perfboard?
VOTI:
- perfboard
RS:
- perfboard (RS: 206-8648, manuf: RE 200 HP)
- oscillator
- 9V battery holders + linear regulators
- 8pin sockets
- small signal diodes
- high ohm resistors
- blue,black bell wire

Entry: krikit todo
Date: Tue Nov 20 17:58:26 CET 2007

- determine pins: analog in, opamp enable,
- output transistor: speaker out pin.
- debug net: E2 / serial: minimal slave complexity solution

also for catkit: it might be best to connect the E2 bus to the serial TX pin, which is multiplexed with an INT pin. (also for 2620? no, but can be connected externally.)

Entry: ditch E2 ?
Date: Tue Nov 20 18:16:27 CET 2007

simple TTL serial with a bit of careful programming to ensure enough 'on' time (basically, a large enough cap or some extra '1' bits in the data) might be a better approach, since it doesn't require a special decoder in the target chip. i could use a 'standard' here: the stereo minijack used in some A/V equipment. ftdi sells them too apparently:

http://www.ftdichip.com/Images/TTL-232R-AJ%20pinout.jpg

or: leave the choice between E2 and serial open. given a bit of delay on the client side when sending, and a proper 'listen' phase on the host, a serial protocol can be used using the same hardware as the E2 bus:

1K TX o--/\/\/\/\----\ | RX o----------o---o---o BUS | /--|<|--/ | VDD o--o | === 10uF | GND o--o---------------o GND

another thing to do would be to make the E2 protocol compatible with the hardware uart. the problem here is the factor 5...

(+) (-)
SERIAL    client + hw simple   4 wires
SER+POW   client simple        3 wires
E2        client complex       2 wires

hmm... the thing which i find most attractive is the possibility to have a POWER socket that can be used as a comm. the rectification diode also acts as a protection diode this way, and diode drop is not really a problem when powered from 3V-5V. so the main question is: how to make SERIAL run over 2 wires, given the setup above? this is a software problem: how to synchronize. the question is whether this extra synchronization effort will lead to more slave complexity, with the bound being the E2 rx/tx.

POW: works as long as cap is large enough
RX: always works
TX: works as long as host leaves room on the cable..

so the problem is for the host to create a window. this shouldn't be too big, for POW reasons.. to solve the timing issues here, it looks to me that complexity will be a lot higher. i guess it's best to stick to E2, but keep open the possibility of unidirectional serial comm.
conclusions:

(4) serial + separate power over 4 lead telephone wire
(3) serial, power from data, using stereo audio cable
(2) E2: 2 wire, power connector

which one for krikit?

Entry: chaotic oscillators
Date: Wed Nov 21 01:08:37 CET 2007

( analog intermezzo.. )

making chaos with a mixer usually involves an EQ, feedback and gain. the nonlinear element is a saturation. i've never seen this particular circuit realized as a special purpose chaotic oscillator. lots of comparator based stuff, but no saturation.. the simplest i can find is this one:

http://www.ecse.rpi.edu/~khaled/papers/iscas00.pdf

which uses 3 integrators, a comparator and a summing amp. i'd like to get the parts count down to 4 amps, so i can use a quad. this means the summing amp needs to be incorporated somewhere else. i'd like to try the following. on the plane i made a (faulty) circuit with an svf with positive feedback (time constant t = 1 for simplicity):

x'' = -x + a x'

if a > 0 this circuit is unstable. for a < 0 this is a standard biquad. now, if the integrators can be biased to some voltage, this bias can be derived from a schmitt trigger acting on one of the state variables x or x', switching between two unstable points. the schmitt trigger is stateful, so chaos is possible on two planes: R^2 x {1,-1}

( what i did wrong is to apply the bias to the (+) input, which can't be correct since the capacitor voltages are relative to this point, so changing bias would also change the state variables. )

so.. what about using the saturation of the opamp? by measuring the voltage at the noninverting inputs, saturation can be detected. in the phase plane of an svf, there are 4 points where saturation can occur.
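The stability claim for x'' = -x + a x' is easy to check numerically; a minimal Euler sketch (step size and parameter values picked arbitrarily, not derived from the circuit):

```python
# Numeric check: x'' = -x + a x' spirals out for a > 0 (positive feedback)
# and decays like an ordinary damped biquad for a < 0.
def peak_after(a, t_end=50.0, dt=1e-4):
    x, v = 1.0, 0.0
    tail_peak = 0.0
    steps = int(t_end / dt)
    for i in range(steps):
        acc = -x + a * v          # the svf with feedback term a
        v += acc * dt
        x += v * dt
        if i > 3 * steps // 4:    # look only at the tail of the run
            tail_peak = max(tail_peak, abs(x))
    return tail_peak

print(peak_after(+0.1))   # grows: unstable
print(peak_after(-0.1))   # decays: standard biquad
```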
using the general timer principle:
* detect a certain condition (one of the state variables saturated)
* discharge one of the state variables

see: http://www.scholarpedia.org/article/Chaotic_Spiking_Oscillators

switched 2nd order: a classic unstable biquad with state variables x and y, where x is decremented with the output of a schmitt trigger before it's fed into the y integrator and the integrator chain input summer. see:

http://www.cs.rmit.edu.au/~jiankun/Sample_Publication/IECON.pdf

which basically contains the circuit i just (re)invented..

EDIT: some refs from johan suykens: M.E. Yalcin, J.A.K. Suykens, J.P.L. Vandewalle, Cellular Neural Networks, Multi-Scroll Chaos and Synchronization, World Scientific Series on Nonlinear Science, Series A - Vol. 50, Singapore, 2005

Entry: problem chips
Date: Wed Nov 21 20:40:14 CET 2007

18LF2610-I/SP doesn't seem to want to program.. ok. that was stupid. they're not self-writable.

Entry: krikit pins
Date: Wed Nov 21 22:31:09 CET 2007

input works seemingly without problems. output is going to be a bit more problematic.. i think it's not a good idea to drive the speaker with the pin directly.. for 2 reasons: 8 ohms is too high a load, and the drive point needs to be tolerant of analog voltages (a CMOS input is not, and i'd like to use the PWM). with the current setup, a PNP switch is probably best. so, design variables:

* PNP / NPN (cap to ground or Vdd)
* suppression diode?
* feed from battery (9V) or Vdd (5v regulated)

EDIT: 1K with PNP on 13/RC2/CCP1
EDIT: ok. make sure the speaker is not full-on, the transistor gets really hot.
EDIT: running into a problem: i'm using high = off, which apparently the PWM doesn't like: it gives a single spike. so i need to explicitly turn off PWM.
EDIT: the chip resets unexpectedly. trying now with ICD2 attached: seems to be stable. so something's wrong with my reset circuit probably. could be power supply stuff. some spikes..
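A back-of-envelope on the PNP switch above (all values assumed, not measured: Vdd = 5V, Vbe ~ 0.7V, beta ~ 100):

```python
# Back-of-envelope for the 1K-base PNP speaker switch. Assumed values:
# Vdd = 5V, Vbe ~ 0.7V, beta ~ 100, 8 ohm speaker -- not measured.
vdd, vbe, rb, beta, rspk = 5.0, 0.7, 1e3, 100, 8.0

ib = (vdd - vbe) / rb      # base drive: ~4.3 mA
ic_max = beta * ib         # collector current this drive can saturate: ~0.43 A
i_spk = vdd / rspk         # full-on speaker demand: ~0.63 A

# full-on demand exceeds beta * ib, so the transistor comes out of
# saturation and dissipates -- consistent with it running hot when
# the speaker is left full-on.
print(round(ib, 4), round(ic_max, 2), round(i_spk, 3))
```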
Entry: standard 16 bit forth
Date: Thu Nov 22 04:53:21 CET 2007

i keep coming back to a standard forth for sheepsint. purrr18 is there to stay as a low level metaprogrammed machine layer, but teaching it is a real pain.. maybe the time is there.. maybe a safe language is not the way to go. maybe a simple forth is more important? maybe standard is important after all? i have a lot of design choices to make, like building the interpreter on top of a unified memory model or not.. [ mostly triggered by ending up at the taygeta site (from e-forth) ]

more questions. if i want to make a standard forth platform, wouldn't it be better to go for the 18f2620 with a resonator and a linear regulator, and add a keyboard in and video out while i'm at it? why not the dspic then? ( because i didn't port to it yet, of course! )

so possible projects for January:
- portable forth on top of purrr18
- linear safe language on top of purrr18
- dspic assembler + compiler
- a home computer based on 18f2620

strategically, portable forth seems to be the best option, since this solves most of the documentation issues.. dspic and the home computer are more of a lab thing. the linear safe language is something i need to figure out how to do first in PF context. portable forth could use 'code' words to switch to purrr? maybe it's a good exercise all in itself to try to write a standard forth, and not care too much about optimization etc.. i have my non-standard forth now, so it's good to aim for the average.

Entry: the circuit again
Date: Thu Nov 22 18:59:33 CET 2007

because the input impedance is so low, the 10uF cap is really not negligible! in fact, this gives a 10ms time constant with a 1K resistor. that's 100Hz, but the filter cuts off at 1kHz, so it's ok.. it's not ok for what i wanted to do, which is to use only a single transistor to drive the speaker without a capacitor.
this might reverse polarize the cap: it's probably best to keep the switching frequency high enough so this doesn't need to happen.. i wanted to replace it with a ceramic one, but that needs at least 1uF. wait a sec.. maybe it's just not possible to drive the cap to a negative voltage? yup.. the + side of the cap will be at saturation voltage.

Rg 220K /-/\/\/\--\ | __ | Rs 1K | | \ | /--/\/\/\--o-| - \___o o Vdd | _| + / | === Cs | |__/ >| | 10uF | |---/\/\/\---o SPK | o Vbias /| | | o--------------------------/ | |\ @ | ||@ |/ @ | 0 GND

i wonder if it's possible to make the circuit such that the transistor doesn't blow up if SPK is driven low for too long. check this:

http://www.winbond-usa.com/products/isd_products/chipcorder/applicationbriefs/apbr21a.pdf

somehow the DC path needs to be blocked or at least limited. hmm.. i think there's really no better way than to use switching: that produces the least amount of heat in the transistor. just be careful to not drive it too long, and use minimal DC: start the wave 'touching' ground instead of symmetric around 2.5V. i'm ordering some BC640, which are TO92(A), 1A drop-in for A1015. i had a BC516 PNP darlington on my list, but the BC639 is 1A. for the switching loads i care about, i don't need high beta.

Entry: crap.. transistor won't switch off
Date: Sat Nov 24 20:50:04 CET 2007

using a 78L05 regulator for the chip, but wanted to drive the speaker straight from the 9V battery. problem is, i can't switch off the transistor if i'm not using an open drain output and a pull-up resistor.. so i guess just stick with connecting the speaker to the regulator output, and up the regulator a bit... though it's 100mA, maybe it can take a bit of peak current? do i have everything now? i guess.. maybe an on/off switch, but that's easy to do later. also, if possible, add a connector for the led, so it can be brought out to the box.

Entry: got the first carrier on the mic amp
Date: Sat Nov 24 13:03:50 CET 2007

sort of a little milestone..
but no time for celebration yet. it's at 600Hz, which seems low.. putting it higher gives less response. and it's quite distorted. higher frequency, more distortion. ok.. so the resonance frequency of the speaker, measured by moving it over the table, is about 625 Hz. which is of course the reason why i get such a good response at 610Hz :) this will be really hard to get out, so why not use it? use either fm or pm at that frequency, and adapt the filter / amp accordingly. so what about this: make the amp go from 450Hz to 1kHz, and use the resonance frequency of the speaker as carrier wave.

22K 100nF (450Hz) - 1M 1nF (1kHz) - gain = 45

init-analog pwm-on 1 freq ! 2 att ! 0 nwave

half of that seems to work fine too.. 305Hz. what about a golden ratio FSK modulation scheme? that way the harmonics of the lower one won't interfere with the higher one..

EDIT: sticking to one carrier seems best in light of this resonant peak.

Entry: direct threaded forth
Date: Thu Nov 29 11:02:39 CET 2007

a couple of days of rest doing admin stuff.. going to amsterdam today for the final sprint. some things that crossed my mind:

DTC FORTH VM
* purely concatenative virtual machine code: implement literal as i.e. 8 bit literal, and 8 bit shift + literal: code never accesses IP. same for jumps.
* unified memory model is probably more important than speed: it allows for other memory mapped tricks, since memory has a single access point. the real problem is that instruction fetch is built on top of the memory model. maybe this can be optimized somehow?

DEMODULATOR
* bitrate ~ bandwidth^(-1): this can easily be seen in the response of a sharp filter: it rings a lot, so it can't accommodate much time variation.
* AM (data=power) -> PM (signal + data=phase). i'd like to stick to a single carrier. the reason is that the resonant peak of the speaker is something that best can be used instead of fought. i didn't measure it, but it at least looks and sounds quite sharp.
comparing waveforms on the scope, i would guess 12dB through both sending and receiving speakers.

* sampling rate is only dependent on bit rate, not on carrier frequency: aliasing sampling can be used. this means that in order to accommodate more processing power on the same chip, the bitrate can just be lowered.

and the combination of both: since this is going to be quite math-intensive, it's probably best to choose a bit more highlevel approach: construct a couple of decent abstractions, maybe some easy to use fixed point math routines, instead of perfectly optimal ones.. the chip runs at 10MIPS. if i aim at 100bps, that's 100k instructions per bit, if i aim at 10bps, that's 1M.. the idea is to get it to work first.

Entry: math routines
Date: Fri Nov 30 12:21:49 CET 2007

time for math routines. some design decisions:
* signed/unsigned
* bit size
* saturated/overflow

it would be nice to be able to reuse these later in the DTC standard forth as math routines. i do have a special need here, in the sense that the input is only 8 bit. the main problem is the multiplication routine. the standard has a 16x16 -> 32 signed multiplication.

2 approaches for the filter:
* simple: 2nd order IIR bandpass
* matched FIR filter as in PSK31

i have enough memory to perform FIR filtering. let's focus on trying to understand the PSK31 demodulator. till now i only found code examples, no highlevel pseudocode or diagrams. here Peter G3PLX talks about AFC (automatic frequency correction):

http://www.ka7oei.com/fsk_transmitter.html#FSK31_Explained

with PSK apparently the frequency correction doesn't need to know anything about the data, since the spectrum is symmetric around the carrier. i'm not sure whether AFC is necessary in my scheme: all receivers and transmitters are stationary, and there's no wind. on the scope however i did see some slight variation in period, but this was probably due to motion of the speaker/mic (just sticking up by its pair of connecting wires).

"To get in sync.
the PSK31 receiver derives its timing from the 31Hz amplitude modulation on the signal. The Varicode alphabet has been specially designed to make sure there's always enough AM to keep the receiver in sync. Notice that we can extract the AM from the incoming signal even if it's not quite on tune. In PSK31 therefore, the AFC and the synchronisation are completely independent of each other."

So it's not completely true that the AFC doesn't need to know anything about the data: the data needs to be 'rich enough'. But the trick of getting the AM straight from the signal is interesting. This means i can probably proceed nicely from AM -> PM.

Some alarming notions here:

http://www.nonstopsystems.com/radio/frank_radio_psk31.htm

"Like the two-tone and unlike FSK, however, if we pass this through a transmitter, we get intermodulation products if it is not linear, so we DO need to be careful not to overdrive the audio. However, even the worst linears will give third-order products of 25dB at +/-47Hz (3 times the baudrate wide) and fifth-order products of 35dB at +/-78Hz (5 times the baudrate wide), a considerable improvement over the hard-keying case. If we infinitely overdrive the linear, we are back to the same levels as the hard-keyed system."

What i saw on my scope is a strong 2nd harmonic, probably due to non-linearity caused by the DC bias in the speaker. Using some kind of feedforward correction based on a measurement it is probably possible to correct this when it becomes a problem: the transmitter is simple enough so all kinds of wave shaping corrections could be introduced there.

"The PSK31 receiver overcomes this (ED: side lobes due to square window) by filtering the receive signal, or by what amounts to the same thing, shaping the envelope of the received bit.
The shape is more complex than the cosine shape used in the transmitter: if we used a cosine in the receiver we end up with some signal from one received bit "spreading" into the next bit, an inevitable result of cascading two filters which are each already "spread" by one bit. The more complex shape in the receiver overcomes this by shaping 4 bits at a time and compensating for this intersymbol interference, but the end result is a passband that is at least 64dB down at +/-31Hz and beyond, and doesn't introduce any inter-symbol-interference when receiving a cosine-shaped transmission."

"PSK31 is therefore ideally suited to HF use, and would not be expected to show any advantage over the hard-keyed integrate-and-dump method in areas where the only thing we are fighting is white noise and we don't need to worry about interference."

So maybe it's not necessary yet? Since we're using a single frequency in the first attempt, a demodulator that rejects nearby signals might not be required. Anyway. Conclusion: i need to have a look at the exact algorithm used for matching + synchronization.

Entry: demodulator.f
Date: Fri Nov 30 21:07:31 CET 2007

i had some bottom up code (what can be done efficiently) using 8x8->16 unsigned multiplication and 24bit accumulation. this works well for rectangular windows, but not so much for non-rectangular. maybe rectangular is enough since we don't have interfering signals? anyway, it might be wise to look at how to do a windowed one.. i guess the idea is like this: make the window obey some kind of average property that can be removed using maybe a separate accumulation of the signal. it doesn't look that hard: (** is inner product)

[ s(t) + s_0 ] ** [ w(t) + w_0 ]

so there are 3 correction terms:

s(t) ** w_0 == 0
w(t) ** s_0 == 0
w_0 ** s_0

which requires the average signal s_0 as the only variable component, which needs to be scaled with the window DC component (can be 2^...) and a fixed offset.
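The correction-term algebra can be verified with a quick sketch (hypothetical 8-bit values, both signal and coefficients offset to be centered at #x80):

```python
# Numeric check of the offset-correction trick: run the MAC on the offset
# (unsigned) signal and window, then remove the correction terms to recover
# the true signed inner product. Values are made up for illustration.
import random

random.seed(1)
n = 32
s0, w0 = 128, 128                                      # offsets (#x80)
s = [random.randrange(-128, 128) for _ in range(n)]    # true signed signal
w = [random.randrange(-128, 128) for _ in range(n)]    # true signed window

signed = sum(a * b for a, b in zip(s, w))

# what the chip computes: unsigned 8x8 MAC on the offset values
unsigned_mac = sum((a + s0) * (b + w0) for a, b in zip(s, w))

# corrections: w0 * sum(s) is the only runtime-variable term; s0 * sum(w)
# and n * s0 * w0 are fixed per filter and can be folded into a constant.
corrected = unsigned_mac - w0 * sum(s) - s0 * sum(w) - n * s0 * w0

print(signed == corrected)   # -> True
```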
so i can basically use the same unsigned core routine for general complex FIR filters: renamed the macros to mac-u8xu8.f, and added complex-fir.f

Entry: drop dup
Date: Fri Nov 30 23:02:59 CET 2007

optimization: drop dup -> movf INDF0

Entry: implementing the filter loop : complex-fir.f
Date: Sat Dec 1 10:27:37 CET 2007

i have it down to about 31 instructions for an unsigned multiply accumulate operation 8x8 -> 24, and an accumulation. both can be combined after the loop to correct the offset. offset compensation is implemented now, and all sharable code has been moved to macros in mac-u8xu8.f. routines are tested and seem to work just fine. so: filter coefficients are centered at #x80, but the accumulator will shift one position to the left to compensate, so the filter coefficients behave as s.7 bit fixed point with inverted sign bit if the accumulator is seen as s.15.8. this means that:

11111111 -> +0.1111111
00000000 -> -1.0000000

Entry: -!
Date: Sat Dec 1 11:49:04 CET 2007

\ value addr --
: -! >r negate r> +! ;

this subtracts the number on the stack from the variable, not the other way around. note that this has the argument order of '!' not of '-'. the reason for doing it like this is that this occurs the most: subtracting a value from an accumulator variable.

Entry: subsampling
Date: Sat Dec 1 13:43:35 CET 2007

A baud rate that sounds like 16th notes would be nice, which is about 8 Hz. With a carrier of 600 Hz, this gives a ratio of 75. The sampling rate needs to be > 16Hz, so let's take the one in the neighbourhood of 32Hz. Care needs to be taken though when using aliasing: the frequency error will amplify. Let's see.. using 10 MHz, the subdivisions become:

2^20 -> 9.5 Hz     baud rate
2^18 -> 38.1 Hz    sampling frequency
2^14 -> 610.4 Hz   carrier frequency
2^12 -> 2.44 kHz   4 x carrier
2^7  -> 78.1 kHz   PWM frequency

the carrier/baud rate here is 2^6 = 64. going from carrier -> sampling frequency is a subdivision of 16. what's the error of the oscillators?
CSTLS10M0G53 has 0.5 % precision. times 16 that becomes 8.0 %, which is quite a lot.. so probably it does need continuous phase compensation?? another reason to not subsample is to get better noise performance and better frequency rejection due to longer integration time. using 4 x carrier frequency is still only 2.4 kHz, which is at 2^12 subdivision, or 4k instructions per sample, which is absolutely no problem. this gives 2^8 samples per symbol. another thing to think about: synchronization. this can be implemented using time shifts or phasor rotation. the latter is probably not a good idea due to problems with filter matching. so actually, the carrier needs to be significantly oversampled, or at least mixed.. i think i need to make a new table with variables..

Entry: synchronization
Date: Sat Dec 1 14:34:11 CET 2007

if there are enough symbol alternations present this causes significant AM modulation which makes synchronization easy: sync to the zero crossings. this means the preamble needs to be 01 transitions.. probably best to use simple async with 1 = idle = transition and 0 = no transition.

next:
* AM send
* AM receive

Entry: multiplication again
Date: Sat Dec 1 15:16:13 CET 2007

funny how this is starting to be an exercise in multiplication routines :) since i'm using unsigned multiplication for the filter for efficiency reasons, i have to implement signed multiplication in cases where the correction can't be moved outside of a loop, which is the generic case. the 16bit multiplication performs correction using conditional subtraction. for 8 bit it's probably easier to use conditional negation, since that doesn't require extra storage. this sucks.. -1 * -1 overflows to -1.. maybe it's better to use a representation that can actually encode 1, even if this means giving up one bit of precision?
s1.6

Entry: userfriendly
Date: Sun Dec 2 10:37:42 CET 2007

i need to weed a bit in the user-friendliness and SIMPLIFY the way some things are used, because it seems as if some combinations cannot be made. i wanted to create scheme code that generates forth macros, but it looks like this is not so easy! another thing is 'splitting' the host and target, so the host can run some kind of query program in cat or scheme.. maybe the 'current-io' parameter should be set back again in prj and scheme modes?

Entry: clicks
Date: Sun Dec 2 13:03:06 CET 2007

need ramp-up and ramp-down to prevent clicks. ramp-up time should be in the order of 20ms = (50Hz)^(-1). after ramp-up, carrier fade in should be used. this can use the 'attenuation' variable. OK. using 25ms ramp to bias. now, how to initialize the phase?

OOK: can start envelope with -128 (amp = -1), carrier with -64 (amp = 0)
BPSK: needs carrier fade-in.

looks like BPSK sounds smooth enough without envelope fade-in when starting the carrier at phase = -PI/2. doing the same now for AM, so there's no problem with envelope frequency = 0.

Entry: transmitter
Date: Sun Dec 2 15:08:19 CET 2007

time to get the transmitter sorted out, so i can make a standalone device that sends out a known data sequence:

* combine the framed rx/tx with sending/receiving
* figure out OOK and BPSK transition based send modes

return to zero, i don't see the point in that, so transition based seems good enough. let's say 1 = trans, 0 = notrans. this has the advantage that an idle line is the richest signal, good for sync purposes. transition based is easiest to implement using the current code. in case transition based is not desired (i.e. because it accumulates error), this can still be pre-coded as long as a transmission starts with a known oscillator phase.

Entry: signal rates revisited
Date: Sun Dec 2 18:50:57 CET 2007

4 independent frequencies:

* PWM TX rate: determines high-frequency + aliasing noise
* carrier: only important for path (i.e.
speaker reso)
* baud rate: bandwidth -> noise sensitivity
* RX rate: selectivity (related to FIR length)

(EDIT: carrier and baud rate are not independent wrt data filter quality. see below)

important for the receiver are:
- baud rate, which limits the maximal integration time (dependent on symbol length).
- RX rate: enables longer filter lengths, which gives more selectivity and noise immunity.

it doesn't make sense, for constant baud rate, to up the RX frequency but keep the FIR length constant, so:

FIR = k . (RX / BAUD)
RX  = Fosc / OPS

where k is the number of symbols the FIR spreads over, probably 1 or 2, and OPS is the amortized number of operations per sample (processing and acquisition). the filter is 32 in the current implementation. this gives about 300kHz at 10MIPS. looks like we have some headroom.. anything more than 8kHz is probably not going to make much sense.

( I was thinking about noise and dithering, and that at this high frequency, because of the absence of noise, there will be no 'extra' sensitivity due to dithering at levels close to the quantisation step, but there probably will be extra due to pwm effects. So it looks like all small bits help.. )

EDIT: another variable i forgot to mention is symbol rate vs. carrier. using a mixer, it is desirable to have large separation between the two so a simple data filter can be used.

Entry: matched filter
Date: Sun Dec 2 21:22:33 CET 2007

differential BPSK data stream:

. .___. . .___. \ / \ / \ / X X X ./ \.___./ \./ \.___. 1 0 1 1 0

Using cosine crossfading as implemented in modulator.f is effectively the same as using symbols 2 baud periods wide with a 1 + cos envelope. this wavelet is the output filter which maps a binary +1,-1 PCM signal to the shaped BPSK signal. This output filter needs to be matched in the receiver. Now, about matched filters..
A matched filter in the presence of additive white gaussian noise is just the time-reverse of the wavelet: one projects the observed signal vector onto the 1D space spanned by the wavelet's vector in signal space. This gets rid of all the disturbances orthogonal to the wavelet's subspace. When the noise is not white, the noise statistics are used to compute an optimal subspace to project onto, such that most of the noise will still vanish. I don't have noise statistics, and I'm not going to use any online estimation, which leaves me with plain and simple convolution with the time-reversed signal. I do wonder what all this talk is about 'designing' matched filters for PSK31...

Entry: phase synchronization
Date: Sun Dec 2 22:27:44 CET 2007

I'm confusing 2 things:
* bit symbol synchronization
* carrier phase synchronization

Using a complex matched filter, phase synchronization can be done entirely by using an extra phase rotation operation: it doesn't really matter what comes out, as long as:
- the matched filter's envelope is synchronized
- we're using I and Q filters

It's clear that mismatching the symbol clock has a lot less effect than mismatching the carrier phase. When compensating the carrier phase, we compensate what's aliased down after subsampling at symbol rate. This still needs 2 separate synchronization methods: bit symbol synchronization (which sample point to start filtering) and carrier frequency/phase compensation.

Entry: recording sound
Date: Mon Dec 3 10:35:52 CET 2007

let's go for this: a symbol is 256 samples. This allows easy buffer management. This fixes the sample rate at 4.88 kHz.

recording seems to work ok: it's at 8x the reso frequency of the speaker: scratching it over a newspaper gives a nice saturated wave with period about 8. maybe it's time to add gplot in the debug loop.
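The time-reversed-wavelet recipe from the matched filter discussion above can be sketched end to end. A synthetic baseband example with made-up parameters (not the purrr18 code; carrier omitted, so this only shows the envelope matching):

```python
# Sketch: shape +1/-1 symbols with a 1+cos wavelet 2 symbols wide, add
# white gaussian noise, then correlate with the time-reversed wavelet
# and sample once per symbol. Parameters are hypothetical.
import math, random

random.seed(0)
SPS = 64                               # samples per symbol
W = [0.5 * (1 - math.cos(math.pi * i / SPS)) for i in range(2 * SPS)]

bits = [random.randrange(2) for _ in range(20)]
sym = [2 * b - 1 for b in bits]        # map bits to +1/-1 PCM

# transmitter: superpose the shaped, overlapping pulses
tx = [0.0] * ((len(sym) + 2) * SPS)
for k, s in enumerate(sym):
    for i, w in enumerate(W):
        tx[k * SPS + i] += s * w

# channel: additive white gaussian noise
rx = [x + random.gauss(0, 0.2) for x in tx]

# receiver: correlate with the time-reversed wavelet (the matched filter)
def matched(pos):
    return sum(rx[pos + i] * W[len(W) - 1 - i] for i in range(len(W)))

# sample once per symbol where the pulse is centered; sign -> bit
decoded = [1 if matched(k * SPS) > 0 else 0 for k in range(len(sym))]
print(decoded == bits)
```

Note the overlap of neighbouring pulses does leak into each decision (the intersymbol interference the PSK31 text talks about), but for this pulse shape the leakage is small enough that the sign is still decided by the symbol itself.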
Entry: IIR or FIR
Date: Mon Dec 3 15:39:49 CET 2007

IIR:
* mixer + lowpass
* sync mixer to carrier
* start bit detection = zero crossing
* no messing with blocks
* only approximate matching

FIR:
* possible to construct optimal matching filter
* no phase distortion
* synchronization more complicated (filter freq is fixed?)

It looks like IIR + PLL is really simpler to implement (same at every sample block, no buffers necessary). NEXT: have a look at how to implement a PLL.

( Actually... It should be possible to mix in the asymmetric tail of a stable IIR filter in the transmitter! Though not simple due to rounding.. Something like that can probably not be computed exactly, so this would require a bit more expensive transmitter.. )

Using a PLL it's probably best to first try to synchronize to a clean carrier. Since a mixer is necessary as part of the processing chain, that can be used to perform the correction.

-> [ MIX ] -> [ LPF ] ----> I --> [ AGC/HIST ] -> bits ^ --o-> Q | | \-------[ OSC ]<--/

The quadrature component can be used as an error feedback. This always works, since it's not present in the signal. reading this:

http://rfdesign.com/mag/radio_practical_costas_loop/

two things are mentioned to perform carrier recovery:
* squaring + division
* costas loop

note that's about an analog implementation. so not all roses in the IIR world.. what about taking the best of both? perform carrier recovery using a mixer + PLL and use a similar approach for the data sampling recovery. a nice place to go back to, this paper:

http://www.argreenhouse.com/society/TacCom/papers98/21_07i.pdf

where the signal is sampled using a 1-bit adc, and the mixer has values {-1,0,1}. after integration, an adaptive rotation is performed.
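The mixer + LPF front end in the diagram above can be sketched in a few lines; a hedged Python sketch using the rates from the earlier divider table and a one-pole shift filter (x += 2^-p (u - x)), all values assumed:

```python
# Sketch of the I/Q mixer + lowpass receiver: multiply the incoming carrier
# by a local I/Q oscillator and lowpass each branch. sqrt(I^2 + Q^2) gives
# a tuning-insensitive amplitude; the (I,Q) phase rotates on mistuning.
import math

fs, fc = 4880.0, 610.0     # sample rate and carrier (hypothetical, per table)
p = 8                      # LPF: x += 2^-p * (u - x)

def demod(f_local, n=20000):
    xi = xq = 0.0
    for t in range(n):
        s = math.cos(2 * math.pi * fc * t / fs)     # clean incoming carrier
        lo = 2 * math.pi * f_local * t / fs         # local oscillator phase
        xi += (s * math.cos(lo) - xi) / 2 ** p      # I branch: mix + lowpass
        xq += (s * -math.sin(lo) - xq) / 2 ** p     # Q branch: mix + lowpass
    return math.hypot(xi, xq)

print(demod(610.0))        # on tune: ~half the carrier amplitude survives
print(demod(650.0))        # 40 Hz off: mix products fall outside the LPF
```

With a clean carrier and a tuned oscillator the Q branch settles near zero, which is what makes it usable as the PLL error feedback.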
Entry: simplified
Date: Mon Dec 3 16:56:50 CET 2007

* AGC: use the absolute value of a symbol buffer + compute shift count
* INTEGRATE: sum the entire buffer (no sideband rejection)
* sample at say 8 points per symbol

since there's no filter other than the analog 450-2kHz this should perform pretty bad. but i guess it's time for a fail-safe.. use noise modulation first :)

NM -> AM (async) -> PM (sync)

a genuine problem doing this all experimentally is the program sequencing.. there's a huge difference between being able to do something per sample and having to store some for later..

EDIT: it doesn't make sense to write an AM demodulator without thinking about the BPSK that will follow, so i need to do AM with a separate mixer + LPF. mixer seems straightforward. the remaining problem is the LPF. if i can make that work with simple shifts, we're as good as there..

Entry: triangular window
Date: Mon Dec 3 17:33:12 CET 2007

however.. it is probably possible to use triangle windows and 'recompose' things later, since a triangle window is self-similar! given a number of sample points, from these construct 2 numbers: one weighted with a ramp up, one with a ramp down. these can be easily combined, so one could shift the center of the window and recompute easily.

Entry: interrupts
Date: Mon Dec 3 17:46:08 CET 2007

looks like the real question is whether or not to use interrupts. doing this as state machines leaves too little room for block-based FIR techniques. i'm also not very convinced about trying the AM first, because i'm already trying to optimize the layout for that algo: i need to go for mixer + IIR LPF, and implement AM in that framework.

Entry: data filter coefficient
Date: Mon Dec 3 18:31:07 CET 2007

The constraint is: we don't care about the delay, but attenuation shouldn't be too big. What about this: pick the pole at half the bit rate, and round up to the next power of 2. at t samples the response should have decayed to 1/sqrt(2):

(1 - 2^(-p)) ^ t = 1/sqrt(2)

EDIT: how to pick p ?
it's easier to use this approach, where we require the decay time to
be such that the response will drop below the 1/2 threshold in one
symbol time:

  (1 - 2^(-p)) ^ t < 1/2

where t is the number of samples in a symbol. this is equivalent,
since the t in the previous formula is related to half the baud rate.
if t is large (in our case it's 64), the linear term dominates the
lhs, so the above can be approximated by

  1 - t 2^(-p) < 1/2

which gives the expression:

  p = log_2 (2t)

Entry: AM vs PM
Date: Tue Dec 4 11:38:18 CET 2007

something i missed yesterday: demodulating AM with an improperly
tuned mixer might give trouble. no, this is not the case as long as
both the I and Q components are computed: it only gives a problem for
PHASE (which will rotate on mismatch), not AMPLITUDE.

Entry: data filter implementation
Date: Tue Dec 4 12:03:34 CET 2007

the easiest way to keep precision is to never lose any bits. the data
filter has the form:

  x <- a x + (1 - a) u

where x is state and u is input, and a = 1 - 2^(-p). the current
settings give t = 512 (5kHz sample rate and 9Hz symbol rate), which
means p = log_2 (1024) = 10 as the approximation of the bound.
speeding up the filter by a factor of 2 gives p = 9. it might be worth
relaxing it even further to 8, so shifts are eliminated.

( so, just out of curiosity.. is it possible to use unsigned
multiplication? just doing this without thinking introduces a scaled
copy of the original modulated signal in the output. if the lowpass
filter allows, this might not be a problem: requirements are just 2x
as strict. )

problem with signs: it might be simpler to work completely with
unsigned values, since signs make multi-byte arithmetic more
complicated (need sign extension). a simple solution is to run the
multiplication as signed (to get rid of the component at the carrier
frequency) but run the filter accumulation as unsigned. the DC
component in the result is completely predictable and can be
subtracted later.
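The pole-picking rule above can be sanity-checked numerically. A small sketch (the function name is mine) showing that p = log_2(2t) puts the decay after one symbol time in the neighborhood of 1/2 — the linear approximation overestimates the decay a little, so the exact value lands slightly above 1/2, while p one smaller drops well below it:

```python
import math

def decay_after(t, p):
    """Gain left after t samples of a one-pole filter with pole 1 - 2**-p."""
    return (1 - 2.0 ** -p) ** t

# rule of thumb from above
t = 64
p = math.log2(2 * t)     # = 7 for t = 64
r = decay_after(t, p)    # exact value, near (slightly above) 1/2
```

So the rule is a first-order estimate of the bound, good enough for choosing a shift count.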
first experiment. i measure something: noise is at around 5 and the
maximal measured signal is around 150. that's a significant
difference. now it's time to map the 24-bit range to something more
manageable. now to be careful not to overflow the filter input: it
seems reasonable to ignore the lower byte.

looks like i have a bug in the signed 16bit multiplication routine.
EDIT: yep.. a typo: TOSL where it should be TOSH

Entry: better debugging tools
Date: Tue Dec 4 15:07:04 CET 2007

i need a way to print reports from ram.. before this can be done in a
straightforward way, the interaction language (which will need to be
cat or scheme) needs to be defined properly + some way of adding code
like this to the project needs to be defined. what i need now is a way
to inspect 24 bit numbers.. what about adding inspectors to the code?
these are forth words that send out data in the form of a buffer. i
could then make inspectors for any kind of thing.

EDIT: yes.. i really have a good excuse to make proper debugging
tools. just fixed the prj console to be able to connect to the target.
was thinking about properly specifying interactive commands as an
'escaped' layer over the target interaction.. basically every possible
'island' in the code needs to be extensible.. most importantly:
macros, interactive words, prj words, scheme code, ...

EDIT: considering the amount of time i'm losing to get this thing
going, it might be wise to standardize on a method.. i.e. all 16bit
signed fixed point or something.

Entry: double notation
Date: Tue Dec 4 16:15:02 CET 2007

there are some things to distinguish:

  1 x 1 -> 1   word    (standard words)
  2 x 2 -> 2   _word   (16bit variants of standard words used in DTC)
  1 x 1 -> 2   2word, 3word etc.. nonstandard, any combination
  1 x 2 -> 2   that makes sense
  1 x 3 -> 3
  1 -> 2
  ...

Entry: costas loop
Date: Tue Dec 4 23:09:21 CET 2007

Have a look at the HSP50210 datasheet.
It gives a nice general idea about how a PSK receiver would work: 3
tracking loops (AGC, carrier, symbol), user selectable threshold,
matched filter (RRC or I&D), soft decisions.

Entry: saturation
Date: Wed Dec 5 12:27:13 CET 2007

It's important to be able to prevent wrap-around distortion. Some kind
of saturation mechanism might make this easiest: it's easier than
carrying around high precision data. So where to saturate? Most
straightforward is the LPF, but at first glance it's better done at
the point where power is calculated, since the LPF seems to have
enough dynamic range.

The properties of a non-saturated word are:
  * the sign word is #x0000 or #xFFFF
  * both words have the same sign bit

This can be reduced to: sign word + lower sign byte == 0

OK

Entry: weird LPF output
Date: Wed Dec 5 15:42:12 CET 2007

i think i need to focus on building some more debug tools today..
something's going wrong and i can't find the cause. the problem is
amplitude modulation in the LPF power output, going from 100 -> 400,
with a period of about 35 = 140Hz, and a component at 4 x that
frequency, not locked, which is probably the carrier. i measure this
with the modulated signal, and with an unmodulated carrier.

go one by one: it's probably best to try to eliminate the DC offset,
so at least that is not drowning the signal component, which is a lot
smaller.. EDIT: this is already happening: the sample is converted to
signed, then multiplied.

ok.. questions:

  * why is there a 1/8 Hz component in the power output? i would
    expect the power to be smooth.. not modulated -> this is just
    noise. the level is really low, so it's the accumulation of
    (2^(-8) * u).

  * why is there a 1/64 Hz component in the power output? EDIT: the
    frequency is a mixer mismatch = 1/8 - 1/8', where 8' is the
    not-quite-8 measured carrier frequency.

  * what does the filter input look like? did some input signal
    measurement, and the first thing i notice is that the carrier
    frequency is quite off.
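The "non-saturated word" test above — high word equal to pure sign extension of the low word — can be modeled like this (a sketch; function names and the 32-bit container are my assumptions, the target code works on byte pairs):

```python
def fits_in_16(x):
    """True if the high word of a signed 32-bit value is just the sign
    extension of its low 16 bits, i.e. the value survives truncation."""
    hi = (x >> 16) & 0xFFFF
    sign = (x >> 15) & 1
    return hi == (0xFFFF if sign else 0x0000)

def saturate_16(x):
    """Clamp a wider signed value into the signed 16-bit range,
    replacing wrap-around distortion with clipping."""
    if x > 0x7FFF:
        return 0x7FFF
    if x < -0x8000:
        return -0x8000
    return x
```

The clamp is what you substitute when the fit test fails: a clipped peak is benign, a wrapped one jumps to the opposite extreme.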
i get 28/4 = 7, so T=7 instead of T=8, which would give a beat at
1/56. that might explain a lot...

ok.. i get it. the convolution of these 2 spectra:

  |    .    |    cos(w1 t)
  |    .    |    cos(w2 t)
       0

gives:

  |   |.|   |    cos(wd t) cos(ws t)
       0

with wd = w2-w1 and ws = w2+w1. the sinewave that gets folded near 0
will interfere with the signal data! so this approach just doesn't
work without synchronization! it looks like the only way to do this is
to either have proper synchronization, or use a band pass filter, not
a mixer.

AM: first order lowpass with complex coefficient, followed by output
power computation.

PM: requires AGC or cartesian->polar conversion for a properly scaled
Q -> phase feedback.

the quick and dirty way is to just filter the absolute value of the
input, then add a more selective filter. hmm.. i still need to kick
out DC, so better go for the frequency-selective filter.

Entry: 1/8 or 1/4 frequency filter?
Date: Thu Dec 6 17:12:32 CET 2007

it's probably easier to separate the problem into 2 parts:
(1 -1 ; -1 1) with sqrt(2) amplitude, compensated by a single
arbitrary multiplication to get the gain to 1-2^(-8). this requires at
least 16bit. incorporating the scaling factor in the matrix seems to
lead to the same precision problem, but requires 4 multiplications
instead of one.

so what about a 1/4 filter? that's even simpler, and doesn't require a
sqrt(2) scaling factor, so the (1-2^-8) scaling can be done without a
multiplier. so.. the lpf filter i had before can be re-used. the only
thing to add is to cross-add the filter states, and add in the input
signal. rotating the signal can be done using a 4-state state machine,
which will add/subtract the signal to/from one of the states:

  +x +y -x -y

given this approach it's probably also possible to reduce the LPF
state from 24 to 16 bit. check this. in a stable regime, using 2
bytes, the high byte will have the amplitude of the input at the
frequency, so at least for strong signals it would be stable
(gain = 1).
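The spectra-convolution argument is just the product-to-sum identity, and the folded difference frequency is exactly the 1/56 beat computed above. A quick numerical check (the function is mine, purely illustrative):

```python
import math

def mixer_identity_error(w1, w2, t):
    """Pointwise error of: cos(w1 t) * cos(w2 t)
    = 1/2 (cos((w2-w1) t) + cos((w2+w1) t))."""
    lhs = math.cos(w1 * t) * math.cos(w2 * t)
    rhs = 0.5 * (math.cos((w2 - w1) * t) + math.cos((w2 + w1) * t))
    return abs(lhs - rhs)

# mixing a T=7 carrier with a T=8 oscillator: the difference term sits
# at 1/7 - 1/8 = 1/56 of the sample rate, right in the signal band
err = max(mixer_identity_error(2 * math.pi / 8, 2 * math.pi / 7, t)
          for t in range(100))
```

The sum term at ws is easy to lowpass away; the difference term at wd = 1/56 is the one the LPF cannot reject without synchronization.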
looks like it only has an effect on noise and rejection.

Entry: too much carrier drift
Date: Thu Dec 6 19:16:16 CET 2007

so, to get a bit of full-circle understanding: why not mix a signal to
DC and filter its absolute value? looks like the thing i did wrong was
not the mixer, but the place where the smoothing is going on. or:

  * mix + filter: isolates a frequency region
  * full-wave rectify + filter

2 filter operations are essential here, so it's probably easier to do
only one, and instead of full-wave rectification use the
amplitude/power of a filter.

But but but... maybe the filter is actually too sharp? I measured the
carrier at 1/7Hz, expecting it at 1/8Hz.. i'm missing a parameter:
bandwidth and time decay are related, but increasing the sample rate..
look: this is just a shifted one-pole filter: it's the equivalent of
passing the difference signal 1/7-1/8 to the lowpass filter. that
probably won't survive..

so i'm stuck with the same problem: the carrier shift is much more
than the bandwidth of the signal! it's about 80Hz at 600Hz, while the
signalling frequency is around 9Hz. this means i have to do something
about it... it's either going to be manual tuning, or adaptive
tuning.. synchronous demod is starting to look like the only solution.
or i should just use the 2 filter approach from above:

  * wide filter to eliminate noise: it should be wide enough to
    capture the carrier tolerance.
  * narrow filter to perform demodulation after full-wave
    rectification.

it's starting to look like synchronous is going to be a lot less
hassle.. again.. what do i need? an AGC to normalize the Q output such
that i can use it as feedback to the phase offset. go over this
again.. something's wrong.

Entry: cordic
Date: Fri Dec 7 09:36:31 CET 2007

the most elegant solution seems to be to use a cordic I,Q->A,P
transform, so both the AGC and PLL have proper data to work on. For
use in the demodulator, the constant scaling factor is not a problem.
What I would like to do is to perform sequential updates: use Q to
update I and then use the updated I to update Q. With a = s 2^(-n)
this amounts to:

  | 1  a |   | 1  0 |   | 1+a^2  a |
  |      | * |      | = |          |
  | 0  1 |   | a  1 |   |   a    1 |

Which is no longer a scaled rotation. Correcting this looks like more
hassle than just performing the update in parallel. I don't need a lot
of phase resolution. 8 bit is definitely enough. Hmm.. This is going
to be a lot of work..

Entry: simplified PLL
Date: Fri Dec 7 11:21:40 CET 2007

What about using a 2 bit phase detector which just detects the
quadrant and adjusts the frequency accordingly?

  -2 | -1
  ---+---
  +2 | +1

With + meaning counterclockwise. Since we're not using the Q
component, both directions of I should be allowed, so a better
approach is:

  +1 | -1
  ---+---
  -1 | +1

Filtering this signal and using it to increment the frequency gives
the right amount of feedback near the lock. In the phase diagram, what
needs to be done is to slow down the oscillator. The design parameters
here are:

  - smoothing of the phase error
  - gain of the phase error

I'm not too sure about oscillations though.. Maybe a linear error
response is an essential element? I guess i'm missing some experience
here. Gut feeling says it should be possible to design a PLL by
filtering a 2 bit phase detector. Gut feeling also says that this will
lead to oscillations.

I'm off track again. These are the choices to make:

  - go for a CART->POLAR transform with high resolution (i.e. 8 bit)
  - use AGC and the Q component for feedback.

The latter seems simpler. Maybe i should try that first. Cordic isn't
as straightforward as i thought since it needs a barrel shifter. Which
could be implemented using the multiplier, but then why not use proper
coefficients?

So.. AGC. Stick to the mixer algo, but figure out how to perform
variable gain so the error signal used to drive the phase adjustment
is properly scaled. Estimate the gain using a filtered sum of absolute
values of the I and Q components.
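The second quadrant table reduces to a sign product. A minimal sketch (my naming; the overall sign convention depends on which way the loop is closed, so treat the sign as an assumption):

```python
def phase_detector(i, q):
    """Two-bit quadrant phase detector, per the second table above:
    output is -sign(I) * sign(Q), so both I polarities are allowed."""
    si = 1 if i >= 0 else -1
    sq = 1 if q >= 0 else -1
    return -si * sq
```

This is the cheapest possible detector: no multiply, no magnitude information, which is exactly why its gain varies with signal level and why the log keeps coming back to AGC.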
Entry: PLL analysis
Date: Fri Dec 7 13:32:07 CET 2007

Using linear system theory: around the error=0 point, the system is
linear and behaves like a controlled integrator. We control frequency
(velocity) and out comes phase (position), which is the integral of
frequency. Such a system with a proportional controller is stable
because it is first order with negative feedback. It can be sped up by
increasing the gain. However, faster also means more susceptible to
noise in the control signal (in the PLL case the Q signal).

This is in the absence of a disturbance signal. That can be modeled by
a signal d which drives the integrator directly. In the PLL case this
is the frequency mismatch. It will result in some permanent error. The
ratio between the 2 is determined by the error amplification.

Questions:
  * add or subtract from rx-carrier-inc? -> depends on whether one
    wants to sync to +I or -I
  * how to prevent mixer drift? -> looks like the DC component of the
    error should not have any influence?

Entry: discrete control systems
Date: Fri Dec 7 14:00:14 CET 2007

Looks like the thing i'm confused about is the difference between
analog control systems and digital ones. An analog 1st order
proportional control system can never overshoot, but a naive
discretization of it can! The problem here is instability of
integration methods.

Entry: the problem with the frequency offset
Date: Fri Dec 7 14:48:07 CET 2007

i think i found it: really stupid.. first i thought it was an
oscillator problem. it didn't occur to me to try with 2 different
boards to see if that's actually the case. anyway, after trying, i got
exactly the same result. looking at the code i find this:

  : sample> 16 for wait-sample next 0 ad@ ;

which, if the processing takes longer than 1/16 of the clock period,
is wrong of course! the solution is to solve this using the timer, or
perform the sampling in an isr. let's try the postscaler.

OK. need a break..
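The controlled-integrator analysis above can be put in one line of recursion. A toy model (all names and numbers are mine): near lock the detector output is proportional to the phase, so a P controller gives phase' = d - gain * phase, which is first order, and a constant frequency mismatch d leaves a permanent error of d / gain:

```python
def run_pll(freq_offset, gain, steps=500):
    """Proportional control of a phase integrator.
    Per step: phase += freq_offset - gain * phase.
    Stable first-order loop for 0 < gain < 1; the naive discretization
    overshoots for gain > 1 and diverges for gain > 2 (the 'discrete
    control systems' pitfall above)."""
    phase = 1.0
    for _ in range(steps):
        phase += freq_offset - gain * phase
    return phase

# no disturbance: phase error decays to zero, no overshoot
locked = run_pll(0.0, 0.1)
# constant frequency mismatch d: permanent error settles at d / gain
biased = run_pll(0.02, 0.1)
```

The `biased` case is why the log later moves to a PI controller: only an integrator in the controller can drive the steady-state error to zero under a frequency offset.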
what i'm doing wrong is using the integral of the error to compute the
frequency.. frequency should be just F_0 - e.

after the break.. looks like i'm still making too many mistakes: of
course, if i just restart the tracker at a random point of carrier
phase, chances are there are going to be some transient phenomena. i
probably just need to run it longer.

OK: sync works to a plain carrier.

Entry: synchronization to modulated carrier
Date: Sat Dec 8 11:30:05 CET 2007

i tried the following: use the sign of I to steer the direction in
which the feedback works. works ok for a clean carrier, but on full
reversal this leads to problems. looks like a conceptual problem.
maybe the synchronizer should be slowed down? in the sense that a
symbol transition, which moves through a zero feedback point (in which
the carrier is effectively not controlled), has no noticeable effect
on the setting of the tuner, but when this transition is complete,
full feedback is in effect to pull the oscillator in sync again.

using just Q feedback, the PLL seems to stabilize around Q = -120,
with an amplitude of about 30. say -128, that's

  -#x0080 #2000 -> #1F80

it's 1/64th of the frequency, which is exactly the difference between
symbol rate and carrier: the PLL locks to another attractor.. a simple
solution seems to be to limit the PLL frequency correction. anyway,
the sign stuff is necessary.

Entry: symbol synchronization
Date: Sat Dec 8 12:07:52 CET 2007

because i'm using locked synthesis and no non-synced downmixing, the
symbol synchronization can be derived from the carrier
synchronization. so maybe i should forget about syncing to the
modulated carrier? pulling the oscillator in sync using a plain
carrier might actually help a lot. let's try a 7/1 test tone.

Entry: first packets: pll and reversals
Date: Sat Dec 8 13:25:20 CET 2007

apart from some problems related to gain (probably too much drive,
which kicks the PLL out of sync: moving the things apart gives better
results)
it seems to work just fine. looking at an I,Q plot i suspect the slow
rise of the I signal is not due to filtering, but due to loss of sync:
Q gets thrown off, and the PLL needs to re-sync. maybe it's more
important to filter the error feedback..

aha! it seems as if the PLL switches to the negative frequency
attractor. indeed. with widely spaced reversals it is clear that Q
moves from around -13 to +13. the problem is that suddenly moving from
subtract to add changes the frequency of the oscillator from bias+corr
to bias-corr. how to solve this?

aha.. maybe it's not necessary to flip the sign? since a phase
reversal in the I plane doesn't change the Q component? actually, it
does. switching off the sign compensation resynchronizes the
oscillator on a transition to I = +A.

it looks like i need a controller with zero error, which effectively
means a PI instead of a P controller. note that i already had an I
controller, but that's unstable. i'm measuring an error of about
#x10 / #x2000 = 0.2 % -- the spec sheet says 0.5 % max. looks normal.

thinking about this PI controller: P + lowpass can't work, because
there is no zero-error. so i need an integrator. the problem is the
time constant / gain factor.

the error (Q) does seem to go to zero now. however, there is still a
transition at the reversal. now that i have a zero error, maybe it's
best to multiply I and Q to obtain the error signal?

for after dinner: i'm stuck with yet another scale problem.. fixed
point without a barrel shifter is madness.. it might have been better
to just implement the tools necessary, even if they are inefficient:
it is definitely doable (which is what i wanted to prove really..) but
it's difficult.

NEXT:
  - I * Q
  - AGC

preferably combined such that I * Q and error feedback become simple.

i just saturated the error output to +-127.. i get nice results for I
amplitude around 100-150. but still: the Q component wiggles on phase
transitions.
reading the costas-loop paper mentioned above: the 3rd multiplier is
called a phase doubler. its only point is to make +-180deg both stable
lock points. so, i'll write up the problem below.

Entry: more questions
Date: Sat Dec 8 15:04:31 CET 2007

why does the PLL response oscillate? the analysis by linear
approximation i made above showed it was first order.. something's
wrong there.

Entry: generic lowpass filters
Date: Sat Dec 8 16:24:02 CET 2007

it's no longer manageable to have these special 1 * 2^(-8) filters..
i need a special purpose 16bit lowpass filter, with saturation,
operating on proper 16 bit signed values, with possibly 8 bit
coefficients in a decent range. it looks like there's plenty of room
to do it in a proper object-oriented fashion.

not doing it in proper object-oriented fashion, but as a macro
operating on 4-byte state: 3 byte signed filter state, and 1 byte
unsigned filter coefficient: .00AA

Entry: AGC
Date: Sat Dec 8 22:24:17 CET 2007

it's not so straightforward, since it needs a division operation..
currently, with the multiplication doubler (also with the sign
doubler) locking seems to work fine around 100-150 amplitude.

Entry: lock problems on transition
Date: Sat Dec 8 22:40:44 CET 2007

i still get the same problem: on transition, the phase is messed up
again. maybe the oscillator phase should rotate too? i'm confused...
at the point where the I component goes into transition, the Q
component gets kicked off. the integrating controller works well:
error goes to zero eventually. i just need to figure out why the phase
bumps..

something strange though.. the Q spike only happens on a +1 -> -1
transition. the -1 -> +1 transition is clean. this smells like some
kind of wrap-around bug.. sending #x01 bytes instead of #x11 bytes
seems to contradict this: spike on every transition.
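The generic saturating lowpass described above can be modeled in a few lines. This is a hypothetical integer sketch (names, the /256 scaling, and the 16-bit state are my assumptions; the target version keeps 3 bytes of state and works on byte pairs):

```python
def sat16(x):
    """Clamp to the signed 16-bit range instead of wrapping."""
    return max(-0x8000, min(0x7FFF, x))

def lpf_step(state, u, coef):
    """One step of a saturating one-pole lowpass.
    state, u: signed 16-bit; coef: unsigned 8-bit, pole = 1 - coef/256,
    so coef plays the role of the 2^(-p) coefficients used earlier but
    with a full 8 bits of range."""
    return sat16(state + (coef * (u - state)) // 256)

# a large constant input settles near the input value without wrapping
s = 0
for _ in range(500):
    s = lpf_step(s, 30000, 8)
```

The integer floor division makes the filter settle slightly below the input for positive signals; that small dead zone is the price of shift-style arithmetic without rounding.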
Entry: emergency solution: AM
Date: Sun Dec 9 00:02:13 CET 2007

tomorrow it looks like the best thing to start with is gain control,
to find an optimal feedback coefficient for the PLL. once this works i
can try to find a bitrate that works with the phase error still
happening. then i could hand it over and try to fix the
sync/transition error.

normalization:
  * agc (division + filtering)
  * arctangent

the previous conclusion about cordic arctangent was that it's hard to
do without a barrel shifter.. i can probably unroll most of this by
using the multiplier and double buffering. the good thing is that this
can be used for AM also, without the need for quadratics. EDIT:
actually, i should really just do AM by measuring the power. the
previous error (large carrier mismatch) is solved.

Entry: articles
Date: Sun Dec 9 09:15:49 CET 2007

R De Buda, "Coherent Demodulation of Frequency-Shift Keying with Low
Deviation Ratio" -- IEEE Transactions, 1972, COM-20, pp 429-435.

S Pasupathy, "Minimum Shift Keying: A Spectrally Efficient Modulation"
-- IEEE Communications Society Magazine, July 1979, Vol 17, pp 14-22.

Entry: AM
Date: Sun Dec 9 10:13:57 CET 2007

it looks like i got very nice reception. what about the following
algorithm:

  * set threshold to an estimate of the noise threshold (say 50)
  * wait until something comes in: interpret it as a start bit
  * find the max amplitude during the start bit
  * start sampling 9 bits, by waiting for half a symbol length, and
    compare to half the dynamic threshold

looking at some sampled data of #x55 + start and stop bits, which is
01010101, with 0 = ON, 1 = OFF, it seems that putting the threshold at
half is not a good idea.. also, the time it takes to get from going
above the noise threshold (50) to the peak of the start bit is exactly
the symbol length. maybe it should be compared with a lowpass
envelope? tried this, but it looks like LPF delay is going to be a
problem.
however, it should be possible to keep the same filter, but perform
the comparison with delayed versions? another possibility is to just
save the sample points, and perform the filtering at a later stage.
or.. it could just be compared to the previous sample point? if lower
it's the reverse? that will probably work just fine: this might give a
problem for stable 0 or 1..

next algo:
  * start sampling at s - s_0 after detecting a start bit.
    s_0 = rise time to threshold level.
  * collect 10 samples.
  * postprocess

what i'm doing: s_0 = 0, and watching the output of the sampling with
a #x55 byte. it looks pretty decent. now trying the number station.

next approach:
  * compare with previous (differentiate)
  * maybe hysteresis?

differentiate is no good. i'm probably fighting something else.. maybe
the data rate is just too fast? i had to move from 512 samples to
256.. so it looks like something else is going on.. what about this:
change the special purpose lowpass filter so it takes 16 bit
coefficients, and then reduce the filter pole a bit.

Entry: confused
Date: Sun Dec 9 16:22:43 CET 2007

let's see.. there's something wrong with my symbol rate. i thought it
was 512 samples, but it's 256. corrected for this, i can receive
signals. however, it seems the bandwidth is mismatched. so i have 2
calculations that are probably erroneous:

  * necessary bandwidth -> filter coeff
  * symbol rate at the transmitter

doing some manual experiments, i got the filter pole fixed at #x0100,
which gives very nice waveforms. making it bigger only increases the
noise, but doesn't seem to influence the shape too much. so everything
looks pretty good, but it seems there is too much inter-symbol
interference due to the asymmetry of the receive filter. i could try
to hack around this by doubling each bit, but keeping the envelope
constant. got now:

  * halved symbol rate (transition + stable)
  * 3 x bandwidth (100 -> 300)

now at least the filter makes the round trip from 0 to max amplitude.
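The start-bit + dynamic-threshold scheme sketched across these entries can be modeled end to end. A hypothetical sketch (names, the idealized rectangular envelope, and s_0 = 0 are my assumptions; the real envelope has the filter rise time discussed above):

```python
def receive_byte(envelope, symbol_len, noise_threshold=50):
    """Detect a start bit on an amplitude envelope, measure its peak,
    then slice 8 data bits against half that peak at symbol centers."""
    # start bit: first sample above the noise floor (s_0 = 0 here)
    start = next(n for n, a in enumerate(envelope) if a > noise_threshold)
    peak = max(envelope[start:start + symbol_len])
    thresh = peak // 2   # half the dynamic threshold
    bits = []
    for k in range(1, 9):  # 8 data bits following the start bit
        pos = start + k * symbol_len + symbol_len // 2
        bits.append(1 if envelope[pos] > thresh else 0)
    return bits

# synthetic #x55-style envelope: start bit then alternating symbols,
# signal amplitude 150 over a noise floor of 5, 8 samples per symbol
sym = 8
env = [0] * 20 + [150] * sym
for b in (1, 0, 1, 0, 1, 0, 1, 0):
    env += [150 if b else 5] * sym
env += [0] * sym
bits = receive_byte(env, sym)
```

The per-packet `peak // 2` threshold is the "dynamic threshold" idea: it adapts to signal strength, unlike the fixed noise threshold used only for start-bit detection.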
now try to subtract the start bit from each bit, then use the sign.
this works! reception seems quite robust. at least when it's not
receiving bogus. so i need a way to eliminate the worst kinds of
noise, which are transients that trigger a start bit. these could then
be used as human input. it's not very robust though.. probably i need
to compute the maximum, and use half of that (or less..) as a
threshold. looks like this is relatively robust: threshold = 1/4 of
maximal power; translated to amplitude this is 1/2.

Entry: next
Date: Sun Dec 9 18:32:53 CET 2007

i think it's ok to forget about synchronous stuff for a while... also
speeding it up is for later maybe. what i need first is:

  * an extra stop bit to eliminate transients
  * cleanup code for blocking send & receive.

Entry: krikit -> reflections on Forth and DSP
Date: Sun Dec 9 22:52:16 CET 2007

looking at the code i write, it is full of global variables (temporary
storage for multiple fanout) and inlined early-bound math ops,
operating directly on memory instead of the stack. also, macros that
unfold to criss-cross variable access are much more useful here than
compositional forth. the problem with DSP is that speed matters, and
it's easy to get order of magnitude savings by early binding. so
macros are important. algorithms are often not terribly complicated;
the stress is more on mapping things to hardware.

now, i realize i'm stretching it trying to do DSP on a PIC18. it
misses essential elements like a barrel shifter, a large accumulator,
and rich addressing modes. these things REALLY make a huge difference.
but keeping data in memory (registers) makes things relatively fast on
a PIC18. if the specs are clear (if the algorithm doesn't change)
implementation can be straightforward, though a manual endeavor. but,
what i've learned: experimentation REQUIRES more highlevel constructs.
i lost too much time and energy mapping to hardware before things
actually worked.
which leads me to the following strategy: if experimentation on the
hardware is essential, experiment on hardware that is 10x faster, or
use data rates 10x slower, such that high level abstractions can be
used. for purrr on the PIC18 and 16/24 bit DSP operations this means:
USE A DTC FORTH! when it's done, core routines can be redone in
purrr18 or in machine language. what i need is:

  * confidence that a 10x speedup is possible
  * confidence that slowing down ACTUALLY WORKS
  * patience and discipline to get it to work FIRST and THEN speed it
    up

what i missed in this project is the availability of an easier to use
16bit forth, and a policy for doing fixed point math. the former would
have made the latter easier to use. and second, it is probably a good
idea to start looking for a dataflow language: one that

  * automates the allocation of temporary buffers (variables).
  * enables abstract boxes (made of networks of abstract boxes)
  * automates iterated boxes (+ possible 'folding')
  * separates registers from functions (all feedback = explicit)
  * frp vs. static sequencing ?

so i'm not so sure anymore whether forth is really useful for the
dsPIC. maybe in the sense that it should map to the 16bit arch just
like purrr maps to the 8bit arch, but leave the dsp-ishness alone:
provide only an assembler.

Entry: local names
Date: Sun Dec 9 23:15:27 CET 2007

which brings me to macros and local variables.. i'm using the wrong
tool for the job: i can't bind new names to old ones, like in scheme.
for example:

  : bla state | state 3 + ...

the 'state 3 +' can't be bound to a single new name. this really
screams for a new language syntax and semantics. or at least enable
local macro definitions (there's no real reason why not..)

  : bla state |
    : foo state 1 + ;
    : bar state 3 + ;
    foo @ bar ! ;

but.. that's getting ugly. what i want here is some form of
pre-scheme. downward closures.

  (bla : state |
    (foo : state 1 +)
    (bar : state 3 +)
    foo @ bar !)
another thing: when allowing local variables, it makes more sense to
put them in front of the name, to correspond better to how they are
used.

  (square | dup *)
  (x square | x x *)

and i need to figure out how to solve the anonymous function problem..
i.e. 'define' vs 'lambda'... these are conflicting.. what about using

  :  in a context that requires a named definition, i.e. a global
     definition or a local let
  |  in a context that requires an anonymous definition, i.e. the
     argument to ifte.

(| ...) is then equivalent to (...)

  (x square : x x *)  vs  (x | x x *)

a function definition can then be something like

  (a b c superword :
    (e : a 1 +)
    (f : b 1 +)
    a b + e +)

where local definitions are possible at the beginning of a definition

Entry: dtc forth
Date: Mon Dec 10 14:21:35 CET 2007

a unified memory model is not so hard to implement efficiently. but a
point that could make a huge difference is to use this memory model
inside the interpreter: a trade-off between speed and flexibility. i
can imagine it being interesting to be able to test code in ram before
flashing it.. at the least, the option should be kept open.

Entry: RGB led
Date: Mon Dec 10 17:57:57 CET 2007

trying to figure out where to put the LED.

  * all connected to analog ports
  * one extra digital connector with a 220R resistor
  * all connected to pins, so they can be reverse biased for light
    detection

pinouts: (common anode)

  | || ||||  4321

  4   3   2   1
  | R | B |   |
  o--|<|--o--|>|--o
  |   |   |
  o--|>|-------o
      G

the region that's free is between pins 21 and 26. the anode needs to
be connected to a pin that can be switched to analog. on the board the
best option here is 23/AN8. leaving pin RB0/INT0 free for a debug net
might be a good idea. AN9-10-11 are then all digital to control the
LED cathodes; AN8 is analog to tolerate the analog voltage. this also
won't conflict with the necessary digital outputs already on the
board. the anode resistor could go to.
the RGB led is connected like this:

  26 RB5      o--[220]---o
                         |
  25 RB4      o-----o    |
                    |    |
  24 RB3      o--o  G    |
                 |  B    |
  23 RB2/AN8  o--o--o----o  R
  22 RB1      o--o

Entry: dsp language
Date: Tue Dec 11 09:33:07 CET 2007

what is necessary? i could take the PD sound processing as a model.

  * box = primitive | composite
  * composite = box + interconnect
  * things should be parametrizable in grids (from which an iteration
    structure is defined)
  * can we have lexical scope?
  * don't force serialization
  * don't force naming of intermediates, but don't restrict it either.
    (box combinators)
  * allow scheme (expression trees) to be a subset of the language.
    the extension is no more than a way to abstract 'parallel scheme'.

it would be nice not to go too far away from lambda abstractions. the
problem is multiple outputs. these could be multiple functions. so
what about common subexpressions? keep it manual for now.. maybe use a
scheme-like syntax based on 'values', but called 'output'. the latter
will be more general than values: it can be re-arranged in time. it's
an essential observation.

not forcing the naming of intermediates can be problematic, since it's
the whole point: dsp code is very graph-like, and naming is more
efficient for this.. it looks like naming IS essential. which brings
me to composition: a new box consists of 'node' sections which name
nodes. 'lambda' could be replaced with 'in', since it will name the
external inputs. all other nodes have to be named. 'not forcing
naming' can be implemented by special purpose box combinators. nodes
are different from locally created 'specialized' boxes. names can be
replaced by box expressions if they are tree-like (return a single
value), otherwise they need to be named in a 'node'. similarly, 'out'
can be discarded in a definition. this allows the use and mixing of
scheme functions.
  (in (a b c)                 ;; 'in' is the parallel equiv of 'lambda'
    (box (mula (x) (* a x))   ;; create local specialized box (like 'define')
      (box (mulb (x) (* b x))
        (nodes                ;; naming intermediates
          ((q r) (div/mod a b))
          (out (+ (mula c) (mulb c))
               (- (mula c) (mulb c)))))))

so, concretely:

  'in'    is like 'lambda' but it has parallel outputs
  'nodes' is like 'let-values'
  'box'   is like a local 'define'
  'out'   is like 'values' but defines parallel outputs

so the principles:

  1. the ONLY point of the language is to extend the many->one lambda
     calculus, which can create expression TREES, to something that
     can create expression GRAPHS.
  2. it is important that the lambda calculus is a subset which uses
     its original lisp tree notation.

  * 'out' is redundant for single outputs
  * intermediates from single output boxes do not need to be named

i'd like to extend this to grid processing: systolic arrays etc: box
compositions that connect boxes in several dimensions, such that
iterators can be derived from a highlevel description.

Entry: driving led
Date: Tue Dec 11 11:17:14 CET 2007

driving the led during reception is going to happen at 5kHz, which
when using PWM is probably too little: say 256 steps gives about 20Hz.
so what about using SD modulation? i've wanted to try this for a
while, maybe now is the time.

yup. works like a charm. since red is less bright, i give it a double
time slot, which leads to a 4 phase state machine. at the receive
sample rate there's some noticeable flicker at low intensity, at about
5Hz. it's easy to avoid by introducing a minimum of 5 or 6 for the
color values.

Entry: more state machines
Date: Tue Dec 11 12:27:35 CET 2007

the send and receive functionality should also be implemented as state
machines. or.. stick to a single application thread, and run the other
state machines from the blocking operations? maybe that's easiest.
sending and receiving are mutually exclusive. currently there's only
the LED that works in parallel.
Entry: rx/tx interference
Date: Tue Dec 11 16:50:53 CET 2007

there seems to be interference between driving the led and
reception. i added "red blink" in the demo app whenever there is a
bad reception, however, this seems to completely throw it off..
(edit: not the led but tx) so i need to add some pauses probably.
which brings me to: there is no generic pause word, so i'm going to
use just a double 0 for loop.

the interference seemed to be due to the absence of 'ramp-off':
before switching to rx-mode the speaker was still being driven. i
added those and some pause, now it seems to work.

Entry: project scheme extensions
Date: Wed Dec 12 09:38:21 CET 2007

i need to move away from loading scheme extensions as individual
macros, and towards associating them with a project. they are
different. the distinction to make is:

* macros from forth code: incremental, can be redefined
* brood extensions: fixed per project

this of course leaves brood extensions as libraries in the dark..
it's a hodge podge. what i could try is to keep the target namespace
management intact: typical forth style shadowing for both words and
macros, and allow it to call scheme code.

what about a unified dictionary:

* macros stored as symbolic code
* ram addresses stored as macros
+ macros are allowed to postpone expansion if they reduce to single
  constants?

it looks like the seed of the plan is there: it's simple and i can't
see any problems. the main difficulty lies in the difference between
the way the cat namespace works (declarative: no re-definition, all
names defined at once) and the purrr one (shadowing, incremental).

Entry: TODO list cleanup
Date: Wed Dec 12 09:55:06 CET 2007

DONE:

* fix the assembler: i'm running into word overflows, code is getting
  too big. maybe use a trick: whenever a word overflows, just add
  some new code after the code chunk, jump to there, and have that
  chunk jump to the original word with a far jump.
  as a quick fix: at least print the names of offending symbols so
  they can be manually patched to long jumps.

* switch the assembler to a mutating algo so proper jump graph opti
  can be performed easily. i see no point for pure algos there.. asm
  is a black box anyway.

IMPOSSIBLE:

* if 'invoke' is a macro anyway, why not combine it with execute/b?
  ANSWER: it's awkward to set the return stack to the word after
  invoke without using a call. that call might as well be execute/b

* nibble buffer is not interrupt-safe: the R/W thing is shared..
  probably need separate R/W pointers! (FIXED)

REMARKS:

* make it possible for a macro to create a variable. more
  specifically: make it possible to create any couple of words and
  variables together. (this means a macro can create a macro..
  probably means re-introducing some reflection). if the macro
  dictionary is merely a cache of a linear dictionary, with the
  linear dictionary containing macros, this kind of reflection should
  be possible to introduce without the disadvantage there was before:
  mutation in the dictionary hash.. there would only be shadowing,
  and 'mark' could handle macros too. syncing the cache means
  (lazily) recompiling the macro cache.

Entry: mzscheme slow text
Date: Wed Dec 12 10:34:26 CET 2007

i just tried:

  (define (all)
    (define stuff '())
    (let next ()
      (let ((c (read-char)))
        (if (eof-object? c)
            (reverse! stuff)
            (begin
              (set! stuff (cons c stuff))
              (next))))))
  (printf "~s" (length (all)))

tom@del:~$ time bash -c 'cat ~/brood/doc/ramblings.txt | mzscheme -r /tmp/text.ss'
606700
real    0m0.332s
user    0m0.319s
sys     0m0.012s

so it's at least not read-char.. maybe i need to write a fast
tokenizer for forth using just read-char instead of the yacc clone
from mzscheme? probably the same goes for sweb.

the tokenizer has 3 states:

* whitespace
* comment
* word

easy enough to just do manually. it could be implemented as a
'read-syntax' word which adds source location information to the
symbols and comments read.
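the 3-state tokenizer described above can be sketched as follows
(python rather than scheme; the '\' line-comment syntax and the tuple
shape are assumptions for illustration). it keeps comments and source
locations, and skips whitespace:

```python
def tokenize(text):
    """3-state tokenizer: whitespace / comment / word.
    Returns (kind, text, line, col) tuples; comments are kept,
    whitespace is skipped.  '\\' line comments are an assumption."""
    toks, buf = [], []
    state, start = 'white', (1, 0)
    line, col = 1, 0

    def flush(kind, pos):
        if buf:
            toks.append((kind, ''.join(buf), pos[0], pos[1]))
            buf.clear()

    for ch in text + '\n':              # sentinel flushes the last token
        if state == 'white':
            if ch == '\\':
                state, start = 'comment', (line, col)
                buf.append(ch)
            elif not ch.isspace():
                state, start = 'word', (line, col)
                buf.append(ch)
        elif state == 'word':
            if ch.isspace():
                flush('word', start)
                state = 'white'
            else:
                buf.append(ch)
        elif state == 'comment':
            if ch == '\n':
                flush('comment', start)
                state = 'white'
            else:
                buf.append(ch)
        if ch == '\n':
            line, col = line + 1, 0
        else:
            col += 1
    return toks
```

the scheme version would do the same with read-char and emit syntax
objects instead of tuples.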
a syntax-reader is essential since it can be plugged into the module
loader system.

Entry: incremental static binding
Date: Wed Dec 12 12:40:22 CET 2007

about static binding, redefine and linear dictionaries: it's better
to have something that is predictable but a bit rigid, than something
that's flexible but harder to use. what i mean is redefining lowlevel
words: it's possible to do so, but dependency management then becomes
manual. the rule is: later code can never change bindings in earlier
code, but it can redefine behaviour for future code. this is dirty,
but the simplicity is very manageable and it allows for predictable
hacks. the only real alternative is a proper dependency management
system and name space isolation. david and goliath.

Entry: sane conditionals
Date: Wed Dec 12 16:15:42 CET 2007

time to give up on the crappy >? constructs.

Entry: conditional optimization
Date: Wed Dec 12 16:28:22 CET 2007

what i need is a way to optimize away a conversion from flags ->
number -> flags, but without hindering the construction of proper
flag bytes. macros like '=?' can still be used as optimizations that
need to combine with 'if' immediately, but the others should
definitely produce flag bytes.

Entry: >z
Date: Wed Dec 12 17:29:29 CET 2007

i wonder why i'm not just using flag>c instead of >z.. since the
carry flag is unaffected by drop. maybe to save the carry flag in
some places?

  0         -> carry = 0
  any other -> carry = 1

that's just "255 + drop". well.. it doesn't matter so much in that
it's never inlined.

Entry: then opti
Date: Wed Dec 12 17:46:02 CET 2007

it looks like this is mostly broken, maybe since the introduction of
'drop save' elimination. i see that "z? if drop 123 ; then" doesn't
eliminate to one instruction.. see 'swapbra' and extend it to other
conditional execution macros.

Entry: dtc primitives
Date: Thu Dec 13 10:54:01 CET 2007

towards a standard forth.

1. get it to crosscompile
2.
   write a kernel in itself

the important thing to note about the implementation is that it is
concatenative: there are no 'parsing codes', meaning there is no
lookahead.

* every word is an instruction
* 'return' is marked by a bit

as a consequence, each word has only 14 bits of payload: two bits are
reserved to distinguish between data and code, and to implement the
return instruction.

now the criticism: maybe it's best to ditch the return bit, since it
limits the addressable memory. with 14 bits only 16k words can be
addressed. the trade-off needs some thought.

i think it's best to ditch the return bit, since it prevents easy
access to primitives by just reading them from code. i'm not sure
where this can bite, but using the LSB as tag bit (0=data, 1=code)
and making execute ignore the tag bit allows the use of 15 bit
numbers, which can represent addresses.

maybe it's not such a good idea.. i'm a bit uncomfortable with not
having 16 bit width. statistics: a return bit only makes sense if
the words are expected to be short. padding is an option, but
awkward, since every label needs to be prepended by a nop if it's
not aligned.

rebuttal: tail recursion. this is the thing that's handled with the
return bit.. i forget a lot of thought already went into this thing.
tail recursion justifies the inconvenience of handling the extra
bit.

remark: a tagged data system can be built on top of this forth. i'm
not comfortable with giving up a 16 bit data/return stack in favour
of a 14 or 15 bit tagged system.

Entry: signed/unsigned comparisons
Date: Thu Dec 13 12:45:45 CET 2007

two issues: are they the same or not, and what should the default
be? they are not the same:

  pos neg >   * always true in signed
              * always false in unsigned

  unsigned: carry
  signed:   sign of result (might overflow)

it's a bit silly, but i think it's time i admit i don't fully
understand it.. carry in addition is simple. carry in subtraction is
also not so difficult, since subtraction is addition with a
negative.
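to make the subtraction flags concrete, a quick python sketch
(hypothetical helper, not brood code): carry here follows the
'no borrow' convention used on the PIC (carry = 1 when the
subtraction does not borrow), sign is the top bit of the result, and
overflow is the xor of the carry into and the carry out of the sign
bit.

```python
def sub_flags(a, b, bits=8):
    """a - b in `bits`-bit two's complement.
    Returns (signed result, carry, sign, overflow).
    carry = 1 means 'no borrow' (PIC convention);
    overflow = carry into sign bit XOR carry out of sign bit."""
    mask = (1 << bits) - 1
    low = mask >> 1                       # the bits below the sign bit
    raw = (a & mask) + ((~b) & mask) + 1  # subtract = add the complement
    res = raw & mask
    carry = (raw >> bits) & 1             # carry out of the sign bit
    sign = (res >> (bits - 1)) & 1
    c_in = ((a & low) + ((~b) & low) + 1) >> (bits - 1) & 1
    overflow = c_in ^ carry
    signed = res - (1 << bits) if sign else res
    return signed, carry, sign, overflow
```

this reproduces the flag columns of the case table that follows.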
a carry on addition means overflow: the word's not big enough.
simple. but what is a carry on subtraction? let's isolate some
cases.

              result  carry  sign  overflow
    10    3 -    7      1     0      0
     3   10 -   -7      0     1      0
   100 -100 -  -53      0     1      1
  -100  100 -   53      1     0      1

http://en.wikipedia.org/wiki/Overflow_flag

The overflow flag is usually computed as the xor of the carry into
the sign bit and the carry out of the sign bit.

In other words: addition adds one extra bit to the representation.
In order to not have overflow, for unsigned addition/subtraction
this bit needs to be 0, and for signed addition/subtraction it needs
to be the same as the sign bit.

So, for a signed comparison, take the sign bit of the result, and
assume there is no overflow. For unsigned take the carry bit.

Entry: dtc remarks
Date: Thu Dec 13 14:34:51 CET 2007

* size or speed? in the end it should run on CATkit, which has
  little flash memory, so i should really go for size.
* FOR..NEXT is not standard, so i can just make something up?

can't get for..next going.. debugging return stack stuff is hard.
wanted to have a quiet simple puzzle day, but it requires 'real
work' :)

about size vs speed: the primitives need to be fast, so they can be
used in STC code with the VM eliminated, but the VM needs to be
SIMPLE. the return stack really should contain the same stuff as can
be found in straight line code. i'm going to eliminate some macros.

hmm.. too much thinking because it's already too much optimized.. i
find it difficult to throw this kind of stuff away.

what to optimize:

* inner interpreter loop
* maybe math primitives (used elsewhere)

not so important:

* enter/leave + RS (once per highlevel word)

Entry: eForth / tail recursion + concatenative VM
Date: Thu Dec 13 16:05:24 CET 2007

why is not optimizing so difficult? i see factors of ten
everywhere.. the vm-core.f i have is nice, but i'm still quite stuck
at trying to solve multiple problems at the same time:

* interoperability between STC and DTC: both primitives and brood.
* tail recursion

it needs to be simplified a lot.. in the same way that PF needs to
be simplified to get to a proper VM architecture: it's the same
problem. i can do with primitives what i want, but all CONTROL FLOW
needs to be based on 2 simple instructions: _run and _?run - the
duals _execute and _?execute are only for primitives. so what's the
definition of _run, such that it can be turned into a jump..

IMPORTANT: conditional run is not the same as conditional branch..
this points to an inconsistency: things that JUMP are incompatible
with the exit bit.

another problem is that 'immediate' won't work: no compile time
execution: a simplified forth. can i have a macro mode?

before i can implement these i really need to take a look at putting
back incremental extension in the language, this time without
implementing it using mutation.. (it starts to look like this
cutting of the reflective wire was a really bad idea..)

Entry: macro code concatenation
Date: Thu Dec 13 19:20:09 CET 2007

what i'd like is to postpone expansion of constants until assembly.
but i can't influence the meta functions from forth code.. this is
another one of those arbitrary complications. what about:

- putting macros in the project dictionary
- by default, they are expanded
- when present in data positions, they are evaluated

i can't see a reason why this wouldn't work. the only concern is
stability: each invocation needs to reduce. i.e. '+' in the meta
dict is special because it's different from the '+' in macros (the
latter can expand to symbolic code containing '+').

the problem i'm trying to solve is to get a minimal symbolic
representation of things that are constants, by delaying their
evaluation, or by somehow recombining? i.e.: if there is a macro

  : foo 1 + ;

i want the code "123 foo foo" to expand to the machine code:

  (qw (123 foo foo))

instead of

  (qw (123 1 + 1 +))

the thing that decides what to do here is '+', but can this decision
somehow be transformed to the point where 'foo' executes?
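the folding behaviour being asked for can be sketched in a few lines
(python, all names hypothetical): a binary macro computes when both
quoted operands are numbers, and otherwise keeps a symbolic
expression to be evaluated at assembly time.

```python
def qw_fold(a, b, op_name, fn):
    """Fold two quoted values through a binary macro: compute when
    both are numeric, otherwise keep a symbolic expression."""
    if isinstance(a, int) and isinstance(b, int):
        return ('qw', fn(a, b))
    return ('qw', (a, b, op_name))

def plus(a, b):
    # the '+' macro as a literal-folding operation
    return qw_fold(a, b, '+', lambda x, y: x + y)
```

so "1 2 +" reduces to a number, while "abc 1 +" stays a symbolic
expression that can be computed once 'abc' is resolved.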
if every macro inspects its result, and if the result is ONLY a
combination of constants->constants, this combination can be made
symbolic, since it can be re-computed at assembly time. i.e.

  (qw a) (qw b) foo -> (qw (c d e f))

can be replaced by

  (qw (a b foo))

because "c d e f" is probably not going to be very helpful to
understand where the constant came from. this would enable the
unification of:

* constants
* variables
* macros
* meta words
* host code

does the subset of these macros need to be explicitly defined?
probably not. they are just macros, and qualify if they map qw's to
qw's.

Entry: partial reduction
Date: Fri Dec 14 10:16:22 CET 2007

maybe macros should be made greedy, such that when completely
expanded they reduce. what i mean is that "1 2 +" -> (movlw 3) but
"abc 1 +" -> (movlw (abc 1 +)). combined with the mechanism
described above, this could be the key to unification. as a result,
macros will be the only evaluation mechanism, which just need to be
provided with a symbol lookup.

there are 2 phases of macro execution:

- phase 1: compile to literals + instructions, names symbolic
- phase 2: compute literal values using resolved names

it looks like making the effect of 'meta' into a local effect is the
way to go. it would be nice to find a way to fix the 'postponing'
operation first, so at least the generated assembly code looks nice.

Entry: meshy finished?
Date: Fri Dec 14 15:59:31 CET 2007

looks like we're at the end: got 8 devices talking to each other. so
time for a "what did i learn?" section..

* for DSP, use a dsPIC instead of a PIC chip, OR write a highlevel
  (but slow) set of primitives on PIC. i spent too much time writing
  "fast" code that eventually didn't get used, or was extensively
  modified, destroying the optimizations. DSP apps have the property
  that a lot of the code volume needs to be fast, which screams for
  a SEPARATE algorithm design and implementation/optimization phase.
  the problem here is on-target debugging.
  as long as the app scales time-wise (rate reduction without
  changing other variables), optimization can be postponed.

* get it to work FAST, and start with the most difficult part, even
  if it means a dirty hacked up proof of concept, then incrementally
  improve while keeping it working. don't spend time on things that
  solve needs that are not immediate if there are other immediate
  needs.

  - debug network: eventually didn't get used
  - the hardware layer: it delayed everything else

  the mistakes had quite severe consequences in the end. i could
  have gained 2 weeks by not making the debug network. the cause of
  the mistakes seems to be:

  - mismatch in skill (no analog electronics hands-on experience,
    and dusty theoretical understanding), but mostly misplaced
    confidence in non-tested skill.
  - underestimation of the importance of debugging.

* debugging deserves its own bullet. ironically, i lost a lot of
  time building a debugging tool. building that tool was a good
  idea, but i forgot a couple of steps:

  - i underestimated the difficulty of getting the debug net working
    properly. this actually required an intermediate debugging phase
    to monitor the behaviour of both send and receive. i didn't
    anticipate these problems, which was a mistake. the lesson to
    learn is to never underestimate the problems that can arise,
    even if the application seems really trivial.

  - doing high-bandwidth work (DSP) requires high-bandwidth
    debugging tools, or at least a large storage space on chip for
    traces and logs. a solution here would be to make a separate
    circuit only for logging, or use a high-bandwidth host
    connection. an example could be a circuit that records to a
    flash card, or a USB connection to the host.

  - need a better host side software extension system for
    special-purpose debugging tools. it should be the same as the
    way the host system is written, so that tools can be moved into
    the main distro when polished.
    to make this easier, the number of extension points needs to be
    limited such that they are better accessible. i.e. the consoles
    need to be programmable.

so, to summarize:

DESIGN then IMPLEMENT

  don't optimize and design at the same time if there is a lot of
  opportunity for optimization (i.e. a DSP app on PIC18 where an
  order of magnitude of speed gain is easy to find). as long as
  time-critical cores are small, this is ok, but when the core is
  all there is, you need to get it to work first using a highlevel
  approach, and ONLY THEN make it fast.

ELECTRONICS is DEBUGGING

  do not underestimate the difficulty of getting something right in
  reality, even if the logical model is trivial. programming
  problems seem to be about managing complexity, while electronics
  problems are about managing external influences, non-ideal
  behaviour, and tons of exceptions and hacks. these are entirely
  different. programming = abstraction, electronics = debugging.

Entry: meshy presentation -- technical
Date: Fri Dec 14 16:54:00 CET 2007

hardware

  goal = as simple as possible
  - 40mm speaker used as mic
  - input: 2 opamp mic bandpass amplifier + 8 bit A/D
  - output: switching transistor (PWM)
  - PIC18 @ 10 MIPS
  - prototype uses large chip (64kb - 4kb - 28 pin PDIP)
  - possible to downscale a lot (8kb - 256b - 18 pins SMD)
  - RGB led (single resistor, S/D alternated pulsed)

lowlevel software

  - purrr - Forth dialect
    - simple but powerful
    - bare metal vs. abstraction mechanisms
    - interactive (debugging!)
    - bottom up programming
    - metaprogramming (scheme)
    - emphasis on debugging

  - sound modulation:
    - OOK (on-off keying)
    - BPSK (binary phase shift keying)
    - 10 baud framed bytes: 1 start, 8 data, 2 stop
    - 610Hz carrier (speaker resonance)
    - speaker driven with 7 bit PWM @ 78kHz

  - demodulator
    - input sampled at 5kHz
    - downmixer (cross modulator) + lowpass filter
    - OOK: asynchronous, power detect
    - BPSK: synchronous Costas loop

Entry: simplex LEDs
Date: Sat Dec 15 11:24:21 CET 2007

the most efficient way (wire-wise) to connect a bunch of LEDs is to
place them on the midpoints of simplexes, where you connect the
simplex points to +/- drive points: this makes it possible to switch
on 1 hop vertices, but 2 or more hop vertices stay off since they
will not reach threshold voltage. this structure is also called a
"complete graph".

http://mathworld.wolfram.com/Simplex.html
http://mathworld.wolfram.com/CompleteGraph.html

mapping this to a 2D or 3D structure in a nice symmetric way is not
that trivial. however, the most symmetrical planar arrangement is:
place the points in a circle. if the number of points N is odd, you
get (N-1)/2 concentric circles each containing N points, with a
criss-cross network below it. even N works similarly, only one of
the circles has half the elements. this structure can be wrapped
around half a sphere. wrapping it around a full sphere gives easy
access to the control points, and gives a spherical or cylindrical
structure.

the coverage grows ~ n^2, so taking more points is relatively more
efficient. however, the overall connection might get too
complicated. a different approach is to take some kind of 'primitive
circle' which can be unfolded into a line, for example the pentagram
with 10 LEDs. transporting then could be done using a bus, i.e. a
ribbon cable. maybe it's possible to use a ribbon cable with pins?

using a linear solution, it might be possible to make something that
is composable. i.e.
take an N solution, add a wire and some N primitives and make an N+1
solution. this turns out to be just cyclic permutations. for
example, starting with the 2-terminal primitive L2, it can be
extended to a 3-terminal primitive L3 by means of the primitive
3-permutation P3, and adding an extra wire to P2, so:

  L3 = L2 P3 L2 P3 L2 P3 = (L2 P3)^3
  L4 = (L3 P4)^4

in general:

  L_N = (L_{N-1} P_N) ^ N

this is probably a lot easier to do than networking, since it's
basically braiding. a linear projection is easy to control, but i'm
not sure if it's really a good approach for construction.. if i find
an easy way to solve the permutation problem, then yes, it's a good
thing. simplification: it's probably ok to leave out the last
permutation and compensate for it in software.

now, permutations and braids: they are not the same. transpositions
have no direction, and are self-inverting. a twist on the other hand
has a sign, and is not self-inverting. braids can implement
permutations while giving structural integrity. for example the most
typical 3-strand braid:

  [3-strand braid diagram: a right crossing followed by a left
   crossing]

implements a 3-element cyclic permutation as a right crossing
followed by a left crossing (nomenclature: rotate the image 90
degrees counterclockwise and progress upward: the direction is the
strand that passes over the other one). compare this to a double
right crossing:

  [3-strand braid diagram: two right crossings]

this is a simple twist and provides no structural integrity, but
implements the same permutation. can this somehow be used as a
building block for the other cyclic permutations? sure.. as long as
you work with twists from left to right, and make sure the twist
pattern gives you structural integrity, the same logic applies: the
result is just a cyclic permutation.
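the permutation claim above can be checked with a tiny python sketch
(representation hypothetical): a crossing of adjacent strands acts as
a transposition, the over/under sign only matters for the braid's
structure, and two adjacent crossings compose to a 3-element cycle
whose third power is the identity - which is what makes the
(L_{N-1} P_N)^N construction close up.

```python
def cross(perm, i, sign=+1):
    """Cross adjacent strands i and i+1.  The sign (over/under)
    matters for the braid's structure, not for the permutation."""
    p = list(perm)
    p[i], p[i + 1] = p[i + 1], p[i]
    return tuple(p)

def braid(n, crossings):
    """Apply (position, sign) crossings left to right to n strands."""
    p = tuple(range(n))
    for i, s in crossings:
        p = cross(p, i, s)
    return p

# right-then-left crossing vs. a double right crossing:
# same crossing positions, opposite over/under, same permutation.
right_left   = braid(3, [(0, +1), (1, -1)])
double_right = braid(3, [(0, +1), (1, +1)])
```

both yield the same cyclic permutation; repeating it 3 times is the
identity.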
Entry: interactive mode
Date: Sun Dec 16 10:13:47 CET 2007

from interactive.ss:

  The end goal of Purrr is to have only 'live' and 'macro'
  interactions: the system should be powerful enough that excursions
  to the underlying prj: code are not necessary. This gives a
  separation between 'tool development' and 'tool usage'.

I've come to believe that this is not a good idea in general. It is
OK to be able to access the most basic host code, such as
compilation, upload and inspection, but for real work you'd want to
automate those and have a 'real' programming language behind it. In
other words: access to prj or scheme code is necessary.

* it's ok to have a small collection of host words in interaction
  mode which are hidden using prefix parsing.
* this set of mappings (parsing words) should be extensible: prefix
  parsing needs a simpler definition form.
* the functionality behind those words should be extensible.

Concretely this requires interactive.ss to be adjusted so it can
accommodate parsing code in a different way. Maybe it can be made
extensible together with the other parsing words.. The problem right
now is that it is a single method, and the way it's defined is
difficult to make dynamic (it's a scheme macro).

Actually, compile mode forth parsers are already registered in the
global namespace tree, so making them extensible can be done
incrementally by adding some more name spaces.

Entry: extensible interactive parsers
Date: Sun Dec 16 10:52:30 CET 2007

two conflicting views here:

* currently interactive parsers are isolated functions, which is
  nice and clean.
* what is required is extensibility and re-use.

the solution seems to be to put the components in a global name
space, which is used as the unified extension mechanism, and replace
the function with a stateful one that refers to the name space. key
elements here are 'with-member-predicates' and
'predicates->parsers'. these form a construct that needs to be
attached to the global namespace tree.
the former creates a collection of membership predicates. the latter
creates a map (finite function) from atom -> parser.

the problem with the current approach is the generality of the
parsers: they don't just map names to functions, but also create
'classes' with similar behaviour, so there is a level of indirection
that needs to be captured. the live parser map is

* symbol -> parser (parser primitive)
* symbol -> symbol (parser class)

if they are stored in this way, interpretation is quite
straightforward. the approach is:

* provide alternatives for 'with-member-predicates' and
  'predicates->parsers' so they postpone their behaviour and store
  it in the global namespace.
* provide an interpreter.

OK. implemented + tested. Some further cleanup. Maybe it's best to
not store symbols in the dictionary, but parsers: use cloning
instead of delegation? This way the dictionary IS the finite
function. The real problem is that macros have a delegation method
(function composition) but parsers (and assemblers for that matter)
have not. so:

  Forth syntax parsers (lookahead) have no composition mechanism.
  Therefore cloning is used to give some form of code reuse. It used
  to be delegation, but this gives dynamic behaviour which contrasts
  with the static, declarative intent of the global name space,
  regardless of its implementation as a hash table.

and about ns:

  The global namespace is used as:
  * declarative symbol table (single assignment, mutual refs)
  * cache (forth macros should eventually be defined in state file)

Maybe forth.ss should be separated into generic forth style parser
macros and functions, and the definitions of the parser words.

Entry: static composition and extension
Date: Sun Dec 16 11:19:16 CET 2007

i chose a hierarchical dictionary as the main means of program
extension. the way it is used is not dynamic binding, but 1.
postponed static binding and 2. a cache of a linear dictionary.
as a consequence, it can probably be completely replaced by
mzscheme's module composition approach, together with some means
(units?) to solve circular dependencies and plugin behaviour.
however, i see no point in changing this until the dependency on the
method that implements this linking part can be abstracted away.
currently that seems problematic, because the name store is
everywhere: it is the backbone of the system.

i find it very difficult to see what is the right thing to do here.

1. i'm not using the abstraction mechanisms provided by mzscheme to
   do namespace management, which makes me miss some static/dynamic
   checks, and is in general just a bad idea.
2. my approach is more lowlevel, so it's flexible to shuffle around
   to find the right abstraction. the thing is i'm not sure yet if i
   need this flexibility (over the built in functionality).

the only way to really resolve the ignorance is to implement a toy
project which doesn't use the global namespace, and only uses
mzscheme units and modules.

Entry: future dev
Date: Sun Dec 16 15:48:03 CET 2007

* fix problems in TODO (mostly peval)
* finish 16bit DTC
* dsPIC forth
* lisp-like dsp functional dataflow language for PDP/PF/dsPIC
* CATkit 2
* sheepsint 8-bit synth engine (envelopes + FM)
* E2 debugging
* CATkit midi
* USB

Entry: chaos and clipping
Date: Sun Dec 16 18:03:49 CET 2007

( analog intermezzo )

it's not so hard to understand the mechanisms of chaos in 3D
switched systems. however, for linear distorted (clipped) systems,
it is less obvious.. after some drawings it seems to me that the
reason for this is rather straightforward: clipping can be written
as a switched system + a switched system is easier to understand: it
just switches between linear systems. therefore they also seem
easier to design:

* one mode performs unstable oscillation in a plane + 1st order
  attraction towards that plane.
* the other mode drives the motion away from the oscillator plane,
  such that when the mode switches again, the energy in the
  oscillator is 'discharged'.

the trick is in re-using the circuit as much as possible in the 2
modes. anyways, back to distorted systems: can clipping be used to
make chaotic circuits simpler?

Entry: inspecting macro output
Date: Mon Dec 17 10:07:27 CET 2007

finding a common tail in 2 lists is necessarily quadratic. but i
probably don't need that, since i'm looking only for common subtails
in substacks. i'm still looking for a good description of the
problem.. the problem of finding the common tail seems to be the one
to give insight. what about this:

1. split input and output 'qw' atoms off
2. check if the remaining tail is the same

this is the only behaviour that's valid. once this data is obtained,
it could be peeled to isolate the behaviour of a macro, at which
point it could be decided to 'unevaluate' it. now, what does
unevaluate mean?

  ... (qw 1) (qw 2) + -> ... (qw 3)

this could be replaced by

  (qw (1 2 +))

this is always the case, since the evaluation can be performed again
later. the only information that is extracted at this point is
whether the macro does anything else.

the change in macro code seems to be here:

  (([qw a] [qw b] word) ([qw (wrap: a b 'metafn)]))

the 'wrap:' form needs to be replaced by something that might return
a value if the variables contain numbers. running into a small
namespace problem.. trying to use scheme names, but it might be
better to leave the meta dict in there to do this kind of stuff, but
only call it from the macros. basically, the stuff after wrap:
should be symbolic if the parameters are symbols, and computed if
both are numeric.

Entry: benchmarking
Date: Tue Dec 18 16:21:16 CET 2007

the current reader is problematic.. it's slow, and i don't
understand the reason. i don't think it's the usage of streams,
since it was slow before, and it's not read-char, since i tried
that.. so...

1.
   make a test for the current reader
2. replace it with a new reader
3. build 'read-syntax'

first test: the problem seems to be somewhere else..

  (define f (forth-load-in-path "monitor.f" '("prj/CATkit" "pic18")))

is virtually instantaneous like it should be.. so where did i get
the idea that this is slow? indeed:

  '(file monitor) prjfile prj-path forth-load-in-path

is instantaneous also. otoh, 'forth->code/macro' isn't instantaneous
at all.. compiling the code with 'code/macro!' is instantaneous
also.

i think i got it. why is the code/macro splitter so slow? tracking
down to forth.ss: forth->macro.code, which uses @forth->macro/code,
which uses @moses. it can't be @moses since that's just a filter..
so it's probably down the stream in the macro processor. need to
test that separately.

running into some inconsistencies.. probably best to switch
everything to syntax objects, including a syntax-reader.

Entry: read-syntax
Date: Tue Dec 18 17:54:50 CET 2007

from http://download.plt-scheme.org/doc/371/html/mzscheme/mzscheme-Z-H-12.html#node_chap_12

  (datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx])

  converts the S-expression v to a syntax object, using syntax
  objects already in v in the result. Converted objects in v are
  given the lexical context information of ctxt-stx and the
  source-location information of src-stx-or-list. If v is not
  already a syntax object, then the resulting immediate syntax
  object is given the properties (see section 12.6.2) of prop-stx
  and the inactive certificates (see section 12.6.3) of cert-stx.
  Any of ctxt-stx, src-stx-or-list, prop-stx, or cert-stx can be #f,
  in which case the resulting syntax has no lexical context, source
  information, new properties, and/or certificates.
  If src-stx-or-list is not #f or a syntax object, it must be a list
  of five elements:

    (list source-name-v line-k column-k position-k span-k)

  where source-name-v is an arbitrary value for the source name;
  line-k is a positive, exact integer for the source line, or #f;
  column-k is a non-negative, exact integer for the source column,
  or #f; position-k is a positive, exact integer for the source
  position, or #f; and span-k is a non-negative, exact integer for
  the source span, or #f. The line-k and column-k values must both
  be numbers or both be #f, otherwise the exn:fail exception is
  raised.

so:

  (datum->syntax-object
   #f word
   (list source-name line column position span)
   #f #f)

EDIT: why do i run into the need to have a port object that can put
back a character? scheme needs this too, so maybe the port objects
need to support putback? it's the other way around: scheme ports
support a peek operation.

looks like it works now, and the code looks clean. next: create
syntax objects. this seems to be rather straightforward by using
'port-count-lines-enabled' and 'port-next-location'.

ok. seems to work now.

Entry: syntax cleanups
Date: Sun Dec 23 13:57:33 CET 2007

what about the '|' character for lexical variables? things to be
aware of: don't break code, or break it verbosely.

again, i want to write a state machine.. i need to think a bit about
the abstractions used in forth.ss. 'parser-rules' works well. the
rest is hard to read. the problem seems to be parsers that segment
data, instead of taking a fixed amount of data from the stream.
these need state machines. let's rewrite the def: parser as an
example. basically this is forth-lex.ss, but recursive.

OK. i've got a definition parser working which produces name,
formals list and body. now this needs to be passed upstream somehow.
looks like that is the next part to clean up: macros can have
formals, and they need a symbolic representation for this, i.e. in
the state file.
now the question is: should this be the (a b | a b +) syntax, which requires another lexing step, or should it be an s-expression with explicit formals list? what about this: make lexing steps easier, and just use more lexing steps. forth handles parsing (recursive) at a later stage than lexing.

Entry: regular grammar
Date: Sun Dec 23 22:00:27 CET 2007

the essential property of a regular grammar is that each production rule produces at most one non-terminal. intuitively, this means there is no "recursive" tree structure, only a sequential one: there is no "replication gain". so it looks like i need a way to express some of the state machine parsers as simple regular expressions based on membership functions, instead of the more specialized character classes.

(vaguely related: note that the Y combinator is essentially a copy operation)

Entry: regular expressions
Date: Tue Jan 1 15:48:20 CET 2008

the data is a stream of tokens, so regular expressions can be constructed in terms of membership functions and modifiers like '*' or '+'. symbols can be converted to membership functions. that should be enough? not really. need some form of abstraction: a pattern can be a composition of patterns. so maybe it is better to stick with the lexer language in mzscheme? since what i am going to re-invent is ultimately going to be a generic regexp tool. EDIT: looks like it's really character-oriented.

maybe it is a good exercise to try to write a lexer generator? can't be that hard.. also, i run into this problem so many times with low-level bit protocols that it might be a good idea to take a closer look: white space is essentially the 'stop bit' in async comm. which brings me to the question: i think i read on wikipedia (i'm offline now) that regular expressions and FSMs are somehow equivalent. how is this?

how about forth-lex.ss: a specification not as production rules of a regular language, but as regular matching patterns? what is the problem i am trying to solve?
find a function (or macro) that maps

  lex-compiler : language-spec -> token-reader

  stream = token stream | EOF
  token  = word | comment | white

at the same time, i am trying to stay true to the forth syntax: simple read-ahead (keyword + fixed number of tokens) or delimited read (keyword + tokens + terminator).

note: there seems to be a difference between reading UPTO a token, or reading UPTO AND INCLUDING a token. is standard forth always of the latter form? to answer the question partially: the current forth-lex.ss performs segmentation, and thus is not of that form: it cuts INBETWEEN tokens. but forth is. can i learn something from this? yes: cutting AT a token makes the automaton simpler, since it doesn't require peek. let's call that 'delimited' until i know the technical term.

i think the important lesson is that:

1) forth should be delimited: this simplifies on-target lexing
2) exception: first stage tokenizer in brood = segmentation

the latter is an extension to make source processing in an editor (like emacs) easier by preserving whitespace and delimiting characters. BUT, it should not introduce structures that 1) can't interpret.

it looks to me that before fixing the higher level compiler and macro stuff, the lexer should be fixed such that it can be replaced by a simple, reflective state machine (true parsing words). looking at forth, there are 2 reading modes:

- read upto and excluding character
- read next word (= upto and excluding whitespace)

by fixing some of the syntax (comments and strings) editor tools can be made exact: a list of DELIMITED words will read upto and including a delimiter.

Entry: rethinking forth-lex.ss
Date: Tue Jan 1 18:00:25 CET 2008

a proper markup language is necessary. one that will not throw away information, but gives perfect parsing of source code. note that in order to transform source code to markup, a tokenizer is necessary.
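the two reading modes listed above (read up to a character / read the next word) can be sketched directly over a character port. a minimal sketch, not the actual forth-lex.ss code; note that the delimited mode needs no peek, only word reading does (to skip leading whitespace):

```scheme
;; sketch of the two forth reading modes.
;; read-upto: read up to and including a delimiter, returning the
;; characters before it -- the 'delimited' mode, no lookahead needed.
(define (read-upto delimiter? port)
  (let loop ((acc '()))
    (let ((c (read-char port)))
      (if (or (eof-object? c) (delimiter? c))
          (list->string (reverse acc))
          (loop (cons c acc))))))

;; read-next-word: skip whitespace, then read up to whitespace/eof.
(define (read-next-word port)
  (let skip ()
    (let ((c (peek-char port)))
      (when (and (char? c) (char-whitespace? c))
        (read-char port)
        (skip))))
  (if (eof-object? (peek-char port))
      (peek-char port)
      (read-upto char-whitespace? port)))
```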
the tokenizer is a form of 'unrolled' parser: it describes a segmentation that CAN be parsed by a reflective delimited parser. ('reflective' means words have access to the input stream and can thus influence the grammar).

in order to make the right decision, it is necessary to have a look at the standard word ." which quotes a string up to but excluding the " character and prints it: this word interprets the first whitespace as the delimiter, and any subsequent whitespace is part of the string. in order to properly segment code, this behaviour needs to be respected. instead of (pre word post) a different segmentation is necessary which can properly encode eof. what about a word/white distinction?

  (word pos string delimiter)
  (comment pos string delimiter)
  (white pos string)

another question: is EOF an error or not, when it follows a word? i think the answer should be YES: otherwise it violates concatenation of files = file.

got forth-lex.ss simplified now.. it looks really familiar ;) i need to give it the standard names, but this looks like it. NEXT: add delimited parsing to parser-rules. this should capture all parsing needs, since there are no more non-delimited constructs. i.e.

  (parser-rules ()
    ((_ macro : name words ; forth) ---------))

Entry: declaration mode
Date: Wed Jan 2 19:15:54 CET 2008

embedded in standard Forth syntax is a "declaration mode" where all definitions are interpreted as macro definitions instead of instantiations of words. i'd like to express the state machine that implements this mode using an extension of the 'parser-rules' syntax, one that implements (a limited set of) regular expressions.

let's start with a summary of current constructs (-> means "depends on")

  parser-rules -> @syntax-case -> @unroll-stx + syntax-case

where 'parser-rules' creates a function with parser prototype (stream -> stream,stream) and @syntax-case is like 'syntax-case' but applicable to the head of streams.
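the (stream -> stream,stream) parser prototype can be illustrated with the simplest delimited parser: one that collects tokens up to a terminator. a hypothetical sketch using plain lists for streams and multiple values for the two results, not the actual parser-rules machinery:

```scheme
;; sketch: collect tokens up to (and consuming) a delimiter,
;; returning both the collected prefix and the rest of the stream.
(define (collect-upto delimiter? stream)
  (let loop ((acc '()) (s stream))
    (cond ((null? s)
           (error 'collect-upto "eof before delimiter"))
          ((delimiter? (car s))
           (values (reverse acc) (cdr s)))
          (else
           (loop (cons (car s) acc) (cdr s))))))

;; e.g. collecting a definition body up to a terminator word:
;;   (collect-upto (lambda (t) (eq? t 'semi)) '(a b + semi more))
;;   returns (a b +) and (more)
```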
most of the real action is in forth.ss, where i'd like to eliminate a number of constructs. the current way to collect a number of definitions is using 'def-parser' which creates a definition parser parameterized by a type tag. recently i wrote this as a straight state machine. this i'd like to replace now with some regexp based matching approach. the key elements in a def parser are:

* a definition is of the form : (optional | ... |) ... ;
* a list of definitions is terminated by the word 'forth'

previously i came to the conclusion to only allow delimited constructs, which are clearly marked with a start and stop marker. these constructs require no lookahead, and thus have a simpler automaton implementation. i'd like to use the '...' construct to indicate zero or more, just like the syntax-case macro, but necessarily limited by a fixed marker symbol. a '...' at the end of a match means pattern recursion. optional constructs can be handled by multiple match rules. this makes a def parser look like:

  (parser-rules (: ; | forth)
    ((: name | formal ... | word ... ; ...)
     ((def name (formal ...) (word ...))))
    ((: name word ... ; ...)
     ((def name () (word ...))))
    ((forth) (())))

can this form of ellipsis be mapped to the default meaning of multiple occurrences? this looks like an important question: a core difference between tree and sequence matching. question: what is better?

* special meaning of '...' at the end of a sequence (self-recursion)
* explicit recursion?

the def parser could be constructed as a 2-phase machine: one that dispatches between staying in the mode and calling a single def parser or exit the mode, and the def parser itself. '...' could vaguely mean "multiple times", but there's a difference between: multiple times upto XXX, or infinitely many. it looks like explicit recursion is better than looping, so i'm going to drop the special meaning. this brings a single def parser to:

  (parser-rules (: ; |)
    ((: name | formal ... | word ...
;)
     ((def name (formal ...) (word ...))))
    ((: name word ... ;)
     ((def name () (word ...)))))

now, what i can use is this:

  (syntax-case #'(a b c end bla) (end)
    ((stuff ... end r) #'(r stuff ...)))
  => (bla a b c)

yep.. it looks like there's a fundamental difference between the tree matching and sequence matching problem. maybe i need to give it a special symbol. let's take *** to mean: collect upto following terminator, so ... can still be used for tree matching.

  (parser-rules (: ; |)
    ((: name | formal *** | word *** ;)
     ((def name (formal ***) (word ***))))
    ((: name word *** ;)
     ((def name () (word ***)))))

what about a simpler approach? the only thing that needs to be done is to collect syntax objects between marks into lists. these lists are easy to process with a @syntax-case parser later on. so the thing that's necessary is a way to construct a stream parser that collects up to a certain predicate. sounds familiar?

ok.. this leads to simpler code. i could use the current 'def-parser' as a template for a more general delimited parser expression. i think i can ditch '@split' now. it leads to convoluted code.

ok. 'mode-parser' is written as an explicit recursion now. this probably means i can start throwing out some stream processing code. wait.. need to check the macros-with-arguments thing.. OK. fixed. commented out a lot of code from stream.ss that was related to chunking/splitting.

so.. the lesson:

* linear streams: use explicit delimiters for embedded sequences: simplifies parsing: no lookahead necessary.
* convert delimited sequences to lists + use scheme's tree matchers

Entry: next?
Date: Thu Jan 3 00:48:28 CET 2008

* connect the syntax reader to the parsing/loading code.
* unify all evaluation to execution of macros + manage evaluation time

Entry: moving to stx objects
Date: Thu Jan 31 12:49:59 CET 2008

what needs to be done now is to:

* replace all compile words so they accept syntax objects in addition to lists.
* convert all generators to syntax generators
* add print routines for them

so.. start in badnop.ss: string->code/macro (for compile mode, which i can test now). i'm replacing forth-string->list with forth-string->syntax. got string->syntax stuff working. now trying the path/file loader. this needs @syntax-case instead of @match. except for the weird problem below which i worked around, it seems to work now. printing works out of the box (snot).

Entry: weird @syntax-case problem
Date: Thu Jan 31 13:54:43 CET 2008

the 'load' symbol in this doesn't want to work. if i replace it with a different name, it does.. what's that about?

  (@syntax-case stream tail (load-ss load)
    ;; Inline forth file
    ((load name)
     (begin
       (printf "load\n")
       (@append (@flatten (f->atoms (stx->string #'name)))
                (@flatten tail))))
    ....

Entry: possible cleanups
Date: Thu Jan 31 15:56:06 CET 2008

* asm buffer from tagged list -> abstract type? there's a lot of room for improvement in that department. it would allow some kind of instruction annotation that's not possible right now. i think were i to start from scratch, i would build it around this..

* macro unification (from the TODO) unify dictionaries: put macros in the main dict as lists, store ram addresses as variables, and find a way to postpone compilation of macros to their corresponding values if they reduce to values (are constants/variables/labels...)

the former is cosmetics (atm), the latter is a tough problem, but can lead to a gigantic simplification.

Entry: target name space unification
Date: Thu Jan 31 16:01:12 CET 2008

name space unification would mean that the dictionary stored in the .state file contains not only addresses, but also macros (in a form that's specific enough to recompile). this form needs to include lexical variables. so a dictionary item is either a number, or a macro. target words are then just macros:

  ((abc 123)         ;; literal / constant / ram variable / ...
   (go 3235 execute)  ;; code
   (bla abc def))     ;; any macro code

taking into account lexical variables, this can be simplified to a single format:

  ((abc () (123))
   (go () (3235 execute))
   (bla () (abc def))
   (arg (a b) (a b +)))

where the first parens are the macro lexical variables. code that has no lexical variables is purely concatenative. this requires quite a deep cut, but should lead to great simplification. fork point is here.

Entry: declarative namespace + cached linear dictionary
Date: Thu Jan 31 16:53:39 CET 2008

make dictionary abstract? maybe the most important point to ensure is cache consistency. on one end, there is a symbolic representation of a dictionary, on the other end there is a compiled version, which resides in the NS (macro) part. how to ensure these are never out of sync?

so the next step is to define what the NS object actually is. it is a collection of namespaces, where each element is STATIC. the IMPLEMENTATION allows mutation, but the use should be restricted to single assignment. otherwise the cache is invalid. the main function the NS object provides is PLUGIN behaviour: late binding of some identifiers to allow the system to be composed of several individual pieces, without needing the strict tree-based structure of mzscheme's module system. maybe units are the right way out, but right now i'm stuck with this more low-level model.

what's necessary is to define some proper interfaces to this:

1) NS as graph binding (single assignment)
2) NS as cache object for target macros

i made this remark before. the first access pattern is easily enforced: never overwrite anything. the second one is more difficult. need to google a bit, looks like a popular pattern: cache association list with a hash table.

Entry: caching an association list
Date: Thu Jan 31 16:54:05 CET 2008

the problem can be solved by making the operations abstract.
association list:

* push
* pop
* find

as long as the access pattern contains no pops, the caching mechanism is quite simple. on pop, one could re-generate. this is effectively what i'm already doing, however, it's not guaranteed synchronized.

so.. the elements: 2 dictionaries:

  (macro)        ;; defined in core, and untouched by prj
  (macro-cache)  ;; cache of prj macros

for this to work, the code in (macro) should NOT depend on the code in (macro-cache). this means the core macros are not allowed to have pluggable code. this is only allowed in the static load part. let's rephrase: macros are subdivided in 2 parts:

1) declarative with cross-resolve (pluggable components)
2) linear dictionary extension on top of this

does this in any way interfere with local name re-definitions? i think i just need to try it out.. re-iterate the model from the forth side: each compilation unit has a name space that can shadow/extend the previous one. all extensions in one unit need to be unique. this model resembles incremental compilation per word (strict early binding), but allows for cross-reference within one unit.

path:

* get rid of constants.
* get rid of ram dictionary.
* move macros to target dictionary.

constants already were eliminated. they can still occur in rewrite macros that generate asm code though. the ram dictionary is more problematic. it's probably best now to move to abstract access methods for the dictionary. it does look like that's the way out. pulling those changes through the assembler will shuffle things quite a bit. macros can follow quite easily from there. maybe it looks like this: in assembler.ss -> 'label 'word 'allot represent the points where the dictionary is augmented. what will happen here is that macros can be defined also, no? there seems to be a conflict between allowing the definition of labels (ram or flash) and allowing those of macros, when they are all unified..
there is a difference however: as long as the thing which creates a new macro definition only dumps it in the assembler buffer, there is no problem.. the entire buffer will be assembled with the current macro definitions.. wait, there's something warped about that.

pushing through some changes, i arrive at the assembler. it might be best to turn the running variables (rom and ram top pointers) into real variables, and use the dictionary as a stack. going to try to do some things at once:

- allot needs to be rewritten in terms of ptr@ ptr!
- adding new dictionaries won't work any more

fading out.. next = (code . 0) (data . 0) etc.. data is missing. ok, cleaned that up a bit.. also made the running pointers mutable.

Entry: macros in dictionary
Date: Fri Feb 1 12:28:01 CET 2008

that's the next step. now i need to think hard about where this can go wrong, with the semi-separation i have. basically, the preprocessing step SORTS all names, to make sure macros are active before the rest of the code is compiled. this shouldn't give any trouble. the thing to look at next is the path macro definitions travel. probably it's best to parse everything in one go: formal list (empty for concatenative macros). forth.ss is again the place to be. looks like make-def-parser is the function to modify.

that modification seems to work. now adjusting badnop.ss and macro-lambda-tx.ss to build a compiler function that uses the parsed representation to build a macro. the problem here is that it doesn't really fit in the rpn-compile framework. so.. i made it fit. the "body" for macro-lex: compilers consists of 2 elements: a list of formals and a body. this is the standard format used in the state file. md5 sum still checks.

NEXT: move the 'macro dict into the normal dict. ouch.. can't have "123 execute" as macro.. or can we? maybe that's one that should be delayed.. i need sleep. this smells like the beginning of something new.. a proper way to organize the code.
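the (formals body) format mentioned above can be given a toy reading: a dictionary entry becomes a function from arguments to body code by simple substitution. 'compile-entry' is hypothetical and ignores the real rpn-compile framework; it only shows the shape of the data:

```scheme
;; sketch: interpret a unified dictionary entry (name formals body)
;; by substituting arguments for formals in the body.
(define (compile-entry entry)
  (let ((formals (cadr entry))
        (body (caddr entry)))
    (lambda args
      (let ((env (map cons formals args)))
        (map (lambda (word)
               (cond ((assq word env) => cdr)
                     (else word)))
             body)))))

;; ((compile-entry '(arg (a b) (a b +))) 1 2)  => (1 2 +)
```

an entry with an empty formals list just reproduces its body: the purely concatenative case.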
a question to answer: why did i violate source concatenation by introducing locals? the answer is of course out of convenience, but is there a real disadvantage? the macros themselves are still compositional.. this is just about source.

Entry: name change
Date: Sat Feb 2 12:12:30 CET 2008

it's time to start thinking about a name change for the cat language.. problem is of course cat-language.com

i have 2 alternatives: KAT and SCAT. the problem with KAT is that it sounds the same as CAT. the problem with SCAT is the same as the problem with SNOT.. do i really care though? programming in scat could then become scatology. i still think that's humor ;)

Entry: reflection
Date: Sun Feb 3 10:43:19 CET 2008

i was thinking yesterday about macro unification, and wondered whether it might be better to go back to the accumulative model for name resolution / redefinition. the main problem before was that compilation of code had side-effects (definition of new macros in the NS hash), which made it impossible to evaluate code for its value only. however, there is probably a way to put this accumulative behaviour back, by taking the assembler into the loop: let the asm 'register' the macros.

the REAL problem i'm trying to solve is still macro generating macros and the generation of parsing words. both are opposed to a declarative code model, but in the end, the model isn't declarative at all.. it's a bit of a mess in my head now.

GOAL: i need macro generating macros: limiting the reflective tower in any way will always feel artificial. how to do that?

* accumulative (image model) is the simplest, and the original way of dealing with this problem. however, it doesn't give a static language.
* declarative (language layer model) is the cleanest way of doing this, but requires some overhead that might look like overkill.

can we have both? the declarative approach needs s-expr syntax to be manageable. it won't be Forth any more.. let's see..
image model: simplest, highly reflective forth paradigm.
declarative: cleanest for metaprogramming purposes.

i guess i need to isolate the exact location of the paradigm conflict. what do i want, really?

GOALS:

* generating new names (macros) should be possible within forth code. currently, the only ways are the words ':' and 'variable'.
* cross reference should be possible. this currently works for macros, because they use a two-pass algorithm (gather macros first, then compile the code) and works for procedure words, also because of a two-pass algorithm (ordinary assembler).
* linearity in chunks should be possible, which is the current model.

questions from this:

- is it possible to unify the 2 different ways of employing a 2-pass algorithm for cross-references?
- how to move from a fixed 2-layer architecture (macros + words) to an n-layer architecture. is this doable without a language tower? is it desirable? (is reflection really that bad? does it conflict with automatic cross-reference?)

the more i let this roll around, the more a certain light goes to this solution: split the problem in 2 languages. use a reflective forth which 'unrolls' into a layered language description, and a static layered s-expression based language that uses the same macro core. this gives the convenience to use forth syntax and the reflective paradigm, and at the same time the flexibility to use the language tower when reflection is too difficult to get right, or the automatic layering doesn't work..

so, the current question becomes: can the GOALS be kept by moving back to a completely reflective machine (including parser!) which unrolls automatically?

remark: it looks as if i really need the equivalent of 'define' which would be really 'let'.. it all seems to boil down to scope (Scope is everything!). a forth file should be transformable into a collection of definitions and macro definitions.
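the scope remark can be made concrete with scheme's own binding forms: linear dictionary extension (each definition sees only what came before, and may shadow it) behaves like let*, while cross-referencing definitions within one compilation unit behave like letrec. a toy illustration:

```scheme
;; linear extension: b sees a, a does not see b (cf. let*)
(let* ((a 1)
       (b (+ a 1)))
  (+ a b))  ; => 3

;; cross-reference within one unit: mutually recursive (cf. letrec)
(letrec ((even? (lambda (n) (if (zero? n) #t (odd? (- n 1)))))
         (odd?  (lambda (n) (if (zero? n) #f (even? (- n 1))))))
  (even? 10))  ; => #t
```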
it probably makes a lot more sense to see the dictionary as an environment which implements the name . value map of a nested lambda expression. let's see.. the current model (macros are compositional functions) is really good. the remaining problem is scope: when to nest (let*) and when to cross-ref (letrec).

another idea.. instead of looking from the leaf nodes and building a dependency tree, what about starting from the root (kernel) node, and building an inverse dependency tree? the linear model is the intersection between the two.

Entry: future CATkit
Date: Wed Feb 6 13:43:32 CET 2008

some possible roads to travel with CATkit, and associated problems:

* boot loader programmer: instead of going with the USB TTL cable, it might be more interesting to create a complete solution for programming with brood: one that can program any of the target chips straight from the factory. it's pretty clear to me now that freezing the bootloader spec is going to be really problematic: they are project-specific. building a single all-in-one programmer/debugger solution is the way to go. maybe the E2 ideas can be unified with this too?

* to make the programmer doable, it might be wise to start using available Microchip C code: which means being able to link Purrr code to an MPLAB or Piklab project. also for ethernet based pics this might be wise. time to get a bit less radical if i want to get things done..

* a fairly standard 16bit Forth language. i'm far removed from this if i first want to fix the internal representation back to a more reflective approach with automatic unrolling into nested namespaces, and integrated parsing.. (EDIT: not true.. since the Purrr18 language should remain fairly stable, writing the Forth while doing the macro changes might work out just fine.)

* pre-assembled kits for Forth-only workshops. what is necessary there is to work for minimal cost: basically shrink and eliminate through-hole components. however..
the big cost is really not the board if it has pots on it. the deal is: there's no point in competing with arduino.

Entry: overall design changes
Date: Fri Feb 8 11:45:31 CET 2008

assembler

it's been fun, but it might be good to start outsourcing code assembly. especially regarding the future use of different architectures, and interfacing with object code formats. it fits better in C code generation too.

interaction

this needs some thought, but at this point an abstract interface between the compiler and the target system is necessary. the road towards this consists of writing a double backend: one for PIC18, and one for ARM (philips) or MIPS (microchip 32bit). i'm thinking about moving most of it back to scheme, and phase out the cat code in prj.ss and badnop.ss

forth language

i'm a bit in a ditch here.. the current attempt to unify the namespaces into a single nested macro name space brings up questions about maybe unifying the parser too.. however, looking at radical forth changes like colorForth, a move towards a rather fixed parser can be observed. in my approach, the parser takes out a lot of dirty forth-isms while at the same time keeping the syntactic convenience they bring, at the price of not being so extensible..

the core idea is still: the current functional macro approach is good, i just need to figure out how to organize the name space and keep everything as declarative as possible (relationships, not state changes).

Entry: CATkit 2
Date: Fri Feb 8 16:53:48 CET 2008

Keeping the current code in Purrr18 as the implementation language, moving to an on-target interpreter seems like the only sane way to decouple the CATkit community project from the evolution of BROOD. CATkit/Sheep core could still be done in Purrr18, but the availability of a straight no-hassle Forth would make things a lot simpler. Clear separation of kernel / user also serves as a good psychological barrier. This has huge implications for the architecture. The 18F1320 won't be enough.
Probably a move to 18F2620 is necessary because of memory requirements. Using the current architecture though, there is a possibility to take the following path:

* create a different debug bus over the ICD2 connector
* use the serial port for Forth console

Actually, that's not really necessary.. All this can be multiplexed over serial. Another question is: does it make sense to have an intermediate dtc layer like i have now, which essentially uses a double implementation of the compiler (macros): one in brood and one on the target? Really, the only thing to do is to replace machine code with Purrr18 and for the rest build a standard console based Forth machine.

Entry: stand-alone Forth
Date: Fri Feb 8 17:14:43 CET 2008

rationale:

* more standard (documentation)
* no dependency on Brood (decoupled from scheme + emacs)
* no double implementation of compiler (host + target)

roadmap:

- look at Flashforth and Retro Forth.
- start building dictionary -> interpret mode -> compile mode
- possible on 18f1320 ?
- macro/immediate?
- tail recursion?

Entry: goals
Date: Sat Feb 9 10:24:59 CET 2008

to prevent ending up in a random walk, it's time to clearly state some goals on the PIC18 front.

BROOD core + PURRR18: target audience is mostly myself, or people with assembler/electronics background. most important features are flexibility (focus on macros and code generation), speed and code size. BROOD is a tool for the "kleine zelfstandige" (the small independent business).

stand-alone PURRR: target audience is much broader. less emphasis on absolute control, more on simplicity, language stability and compatibility across platforms. it's the "configuration language". i'm thinking ANS + tail recursion + concatenative VM.

non-PIC18 things are quite open still. core needs more modularity (see entry://20080208-114531)

Entry: pragmatics of macro namespaces
Date: Sat Feb 9 15:25:56 CET 2008

what about this:

* design an s-expression syntax that has all the desired properties.
* make the name-value binding explicit and unique: this gives problems with multiple entry and exit points.
* write a translator from forth syntax
* regenerate the macro cache, each time the language nesting level changes.

  (language )
  (language ((a () 1 2 3)
             (b () 4 5 6))
            ((help a b)
             (broem b b b)))

nested syntax: at each point the current language sees the enclosing macros. a compilation step compiles code into macros containing the addresses. -> each macro block begins a new language layer.

time is not right yet. maybe i should do the forth first? no.. i need to start breaking things and building them back up to get more insight on how to disentangle before changing the current code.

Entry: breaking macro storage
Date: Sat Feb 9 17:40:04 CET 2008

simply replacing '(macro) with '(dict) now.. secondary: prj.ss is really hard to understand. maybe more of the cat code should be moved to scheme? or at least to a more functional approach.. the state management is still difficult to understand.

looks like this just works for the monitor. now why is that? i expected it to break somewhere.. it indeed breaks somewhere: interactive mode. looking up words doesn't work. time to move that to a more abstract implementation in target.ss

next thing that broke is 'mark'. prj.ss is so dirty because there's a lot of mutation going on, and the naming of words is really inconsistent. this really needs cleanup. another hidden assumption about "org" in bin->chunk. the problem seems to be that absence of 'org' leads to problematic asm blocks. what about structured asm? i read something about this in olin shivers' comments about a summer job he did implementing a scheme compiler.. maybe that's what i need to go to?

anyways.. there's a lot lot lot of work cleaning up data representations. the whole ifte/s and run/s business is a bit ridiculous.. it doesn't feel natural, and requires deep thought each time.
i think it's time to ditch the way state access works, and move most code to functional programming with prj.ss doing nothing but state management (no control logic!)

Entry: state management / the point of prj>
Date: Sun Feb 10 14:50:08 CET 2008

something really smelly about it. i think i'm better off with true mutation in the scheme sense, instead of working around it the way it is done in prj.ss

the base line is: this prj> mode should be usable for DRIVING THE TARGET. the whole functional state business is overkill: most code can really be made functional, and possibly more understandably written in scheme. whenever state recovery is necessary, it can be moved to the functional domain (i.e. assemble and compile as they are now..)

the problem i'm trying to solve is discipline: not gratuitously using global state. maybe i should read some haskell tips, since this is the way haskell programs seem to be written: a bulk of pure functions and a central state management module through monads.

let's see some important properties:

- the interactive forth layer translates to prj scat code
- the macro code is purely functional code with a threaded asm state
- staying close to scheme keeps things simple

other remarks:

- base and prj are different. this is clumsy.
- there are 2 namespaces: NS and the prj state namespace.
- prj already behaves as true mutable state. is permanence necessary?
- atomic failures

preliminary conclusion is: scat code is important as intermediate layer between scheme and forth, both for interactive and compile time use. the compile time part needs to be functional because it makes computations easier: compilations should be really just functions. the interactive part however is intrinsically stateful: ultimately it manages the state of the target and the current view (debug UI). the only place where the current scat/state approach is useful is atomic state updates.
these, however, can be replaced by purely functional code and a transaction based approach: each command is a state transaction and either fails or succeeds. compositions of transactions should maintain that property. aha, holy grail identified: COMPOSABLE TRANSACTIONS

maybe i just need to start reading again. this is very related to COLA (combined object lambda architecture) and the recent transactional memory stuff in haskell.

Entry: transactions
Date: Mon Feb 11 09:35:00 CET 2008

the way it works now: every console command that updates the state store in snot.ss is a transaction. if it fails, the previous state is maintained. something like that can be implemented differently. what i'd like to avoid is to have to copy NS in the current implementation. a possibility is to transparently replace part of the NS tree with an association list. then parameters can be used to make a copy.

it looks like the let* / letrec problem wants to propagate deep into the structure of the entire program.. why is that? maybe i should start using a persistent object model for the store?

ok.. this is shaking up the roadmap again. TODO:

- fix the problems with macro unification
- implement reverse macro lookup properly
- think about making evaluation time concrete (entry://20071217-100727)
- work towards a cleaner state representation

about haskell and monads: looking at state management, monads somehow solve the bookkeeping of 'current' data. this can take many forms, but two crystallized constructs are: global and dynamic environments, which in scheme would solve most problems involving the passing of data outside of function arguments. thanks to the type system in haskell, the red tape can be hidden, and all is implemented using just functions.

EDIT: being able to use state restore on failure on the command line level is really nice. this should not be given up. however, once the target is being modified, errors can't be fully recovered.
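the composable-transaction idea from this entry can be sketched with a boxed state and pure update functions: a transaction is a function from state to state that may raise an exception, and nothing is committed unless the whole computation succeeds. 'transact!' and 'seq' are hypothetical names, a sketch rather than the actual snot.ss implementation:

```scheme
;; sketch: composable transactions over a purely functional state.
(define *state* (box '()))

;; run a transaction: the old state is only replaced on success,
;; so a raised exception leaves *state* untouched.
(define (transact! tx)
  (let ((new (tx (unbox *state*))))
    (set-box! *state* new)))

;; sequencing transactions yields a transaction, so composition
;; preserves the fail-or-succeed-atomically property.
(define (seq . txs)
  (lambda (state)
    (let loop ((s state) (ts txs))
      (if (null? ts)
          s
          (loop ((car ts) s) (cdr ts))))))
```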
Entry: variables
Date: Mon Feb 11 10:17:08 CET 2008

running into trouble with recursive variable expansion. the problem is
that a variable is this:

#`((extension: name () 'name)  ;; macro quotes name
   'name #,n buffer)

which uses:

(([qw name] [qw size] buffer)
 ([variable name] [allot 'data size]))

and this in the assembler:

(define (variable symbol value)
  ;; FIXME: no phase error logging?
  (dict-shadow-data (dict) symbol value))

so eventually, the name will get shadowed. the problem now seems to be
that there's some recursive lookup that messes things up? let's try a
test case.

variable broem
broem     \ <- infinite loop

ok.. conceptual error or just a small bug? just a small bug: i forgot
the parens around 'name in (extension: name () ('name)), which gave
(quote name) -> recursive call

Entry: intermezzo -> snot + interrupt
Date: Mon Feb 11 10:22:48 CET 2008

this is getting on my nerves. it was fixed a while ago in mzscheme
cvs, but maybe i should just go for 3.99 atm? see if it breaks
things.. went pretty well. had to replace some reverse! by reverse,
and use mutable pairs in the decoder.ss

another thing that changed is manual expansion of user paths (tilde).
this is a bit more problematic.

another thing that gets on my nerves is the absence of stack traces..
what am i supposed to do with this:

ERROR: car: expects argument of type ; given {#f . #}

ok.. it is pretty deep: the srfi-45 promise uses mutable pairs. fixed
+ fixed the plt sandbox code and sent mail to the plt-scheme list.
fixed break stuff in brood + snot. breaks work now.

Entry: more fixes
Date: Mon Feb 11 16:14:53 CET 2008

the 'empty' needs to be fixed. something wrong there. doing reverse
asm would be an interesting next step + moving some code to hex
printing.

Entry: moving more code to scheme in tethered.ss
Date: Tue Feb 12 13:57:00 CET 2008

* mzscheme with modules is quite a nice namespace management tool for
  writing nontrivial programs. the big flat namespace with
  late-binding plugin behaviour in brood is a bit messy.
  maybe i do need the extra bit of mz handholding, and move plugins to
  parameterized code?

* i really miss closures when writing cat code. names and nested
  scopes are important, and trading in a bit of conciseness for names
  (and the absence of stack juggling!) is a good idea. with closures
  and macros, scheme is malleable enough to reduce red tape where
  necessary. my personal preference is moving: cat is not a good
  implementation language compared to scheme.

* the cat intermediate language is interesting to simulate interactive
  forth: translation is really straightforward. for gluing scheme and
  forth together, this layer serves well: adding scheme functionality
  to cat is straightforward + translating forth to cat is too.

this leaves me with the following problem to fix: ts-stack is a word
that is used to plug in the target stack bottom + pointer location. do
i keep it like that? it looks like these things are best solved using
parameters: that way the scheme code will work too.

maybe i should make a list of candidates:
* connection (lazy-connect.ss)
* stack location
* flash program/erase size

Entry: porting to mz v4
Date: Fri Feb 15 10:27:21 CET 2008

yeah, reading docs can bring clarity ;)

doc/release-notes/mzscheme/MzScheme_4.txt

i got a bit confused about the whole scheme and scheme/base thing
while reading some web server docs. the biggest change seems to be the
use of optional and keyword arguments in lambda expressions.

do i make a full port? probably best not to keep too much legacy in
the brood core.. i need the upgrade for the sandbox.ss fixes, so maybe
it's time to jump to 4 completely. as expressed in the release notes,
the keyword arguments can be problematic for legacy code..

Entry: big changes
Date: Fri Feb 15 10:52:12 CET 2008

OK.. i think i know what i need to do, but it's a big job: i need to
get rid of the NS namespace, and split the code into:
* purely functional
* parameterized

the line between the two isn't clear-cut.
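as a sketch of what the parameterized side could look like with
mzscheme parameters (the names here are hypothetical, not actual brood
code):

```scheme
;; "mostly constant" project data as parameters instead of NS entries
;; (hypothetical names, just to show the shape)
(define comm-port  (make-parameter "/dev/ttyUSB0"))
(define stack-addr (make-parameter #x80))

;; code reads the current value by calling the parameter
(define (describe-target)
  (format "port ~a, stack at ~a" (comm-port) (stack-addr)))

(describe-target)                      ; uses the defaults
(parameterize ((stack-addr #xA0))      ; a test / other project rebinds
  (describe-target))                   ; sees #xA0, dynamically scoped
```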
parameters are things that are "mostly constant", i.e. communication
ports, file paths, ... to me it looks like this is the most important
line of name space management in scheme code. (in haskell, the problem
of code parameterization as automatic threading of data is solved
using monads)

the problem with parameters is that they break referential
transparency, which is a great property for testing.. i think in most
cases, a transparent function can be wrapped in a parameterized one. i
just need some moderation here: every use of a parameter deep in the
code (like 'here' in the assembler) makes things more specific, but
might be the right thing to do.

so, basically, code can be dynamically layered: the assembler, for
instance, doesn't USE the target dictionary as a parameter, but gets
it passed as an argument by the interaction system (which does have it
as a parameter). in contrast, the assembler, internally, might use the
dictionary as a parameter, but the code outside of the assembler
doesn't need to know that.

getting rid of the NS namespace, and moving to module name management
instead means:
* more code is static (tree dependencies)
* plugin behaviour (graph dependencies) needs to be solved explicitly
* simpler: map everything straight to scheme compilation, with names:
  - lexical
  - module-local (with prefix to separate from scheme)
  - toplevel (might be used for plugin behaviour / units?)

that looks significant enough to call it brood-5

Entry: eliminating the state dialect
Date: Fri Feb 15 11:20:19 CET 2008

anything that can be done on brood-4 before making the jump to
abolishing NS? yes: moving to parameterized project data while keeping
the transaction-like workflow intact + solving the transaction thing
with target memory maps.

ROADMAP:
- move more compiler code to scheme.
- eliminate the prj <- state implementation, but make sure transaction
  behaviour is maintained (association lists or hash tables?)
- move assembler and parser to separate dictionaries (or keep them in
  NS till later?)
- move CAT code to a module based namespace.

Entry: plt scheme study
Date: Tue Feb 19 16:44:25 CET 2008

maybe it's best to look a bit closer at the plt scheme language now
that V4 is coming out. some things i'd like to know more about are:

* mixin class system
* delimited control

mentioned on http://en.wikipedia.org/wiki/Plt_scheme

in addition, it would be nice to get more of the drscheme
functionality in snot, such as proper stack traces, module browser,
syntax-level refactoring. i'll take http://zwizwa.be/darcs/sweb as the
case study for this. brood's a bit too hairy atm.

trying to make sense of:

http://www.cs.utah.edu/plt/delim-cont/

it looks like understanding this will bring me closer to understanding
the problem in brood with "undo" at the console, and the transaction
based model i'm chasing after. yeah, vague.. reading the paper.
chapter 2: the operators: shift, control, reset. hmm.. i'm missing a
lot of muscle to read that one.. ltu to the rescue:

http://lambda-the-ultimate.org/node/606
http://lambda-the-ultimate.org/node/297

"Good stuff! But keep in mind that, as the cartoon in the slide says,
control operators can make your head hurt..."

no shit.. to summarize vaguely what the 2 points are about:

- delimited control: partial continuations: don't jump outside of
  context.
- mixins: somewhat related to generic functions.

about the delimited continuations, it might be best to read the plt
doc on "prompt" and some related things on continuation marks and
stack traces. for mixins, i'm reading this:

http://www.cs.utah.edu/plt/publications/aplas06-fff.pdf

from a quick skim i don't see how it's related to generic functions
though.. mixins seem interesting, though i don't see the difference
with multiple inheritance. maybe that inheritance is linear instead of
tree-structured?
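if i understand it right, a plt mixin is literally a function from
class to class, which makes the superclass a parameter instead of
something fixed at definition time. a minimal sketch:

```scheme
(require scheme/class)

;; a mixin: a function from class to class. the superclass is left
;; open, so the same extension applies to any class fed to it.
(define (counting-mixin %)
  (class %
    (super-new)
    (define count 0)
    (define/public (bump!)
      (set! count (+ count 1))
      count)))

(define base%    (class object% (super-new)))
(define counted% (counting-mixin base%))   ; superclass chosen here
(send (new counted%) bump!)                ; => 1
```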
Entry: expression problem
Date: Thu Feb 21 23:23:46 CET 2008

http://groups.google.com/group/plt-scheme/browse_thread/thread/3aaacdc5169e5889

Mark's reply was pretty clear, and this:

  The PLT folks have used the expression problem as a springboard for
  thinking about big issues like, what does it mean to be a software
  component, and what are appropriate ways for reusing and extending a
  software component.

the answer is then modules/units/classes/mixins.. Swindle might indeed
be a good thing to have a look at next. the whole deal of multiple
dispatch, so central to PF, is in the end something i need to
understand better.

about multimethods: cecil is mentioned here:

http://tunes.org/~eihrul/ecoop.pdf
http://citeseer.ist.psu.edu/219067.html

compression of dispatch tables? (about PF: there's probably a way out
using a small number of types or compile time type inference..)

I'm reading ``Modular Object-Oriented Programming with Units and
Mixins'' now. The slogans make a lot of sense:

* UNITS: Separate a module's linking specification from its
  encapsulated definitions.
* MIXINS: Separate a class's superclass definition from its extending
  definitions.

Maybe i should give it a try?

Entry: units
Date: Fri Feb 22 01:06:40 CET 2008

looks like units + modules are going to be enough to organize brood
without the need for a NS hash table. how to exactly chop it up is
still a bit of a mystery. maybe start with the plain CAT code, then
organize the macros in a similar way, then find a way to translate
forth code straight to s-expressions.

what if i start with separating out the assembler as a unit? in the
end i'd like to be able to use externally provided assemblers / C
compilers. in doing so, abstracting the data types that are passed
between assembler and linker might be necessary. these are assembly
opcodes, dictionary and compiled target words + linker data.
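a first sketch of what such an assembler unit could look like (the
signature and all names are mine, not actual brood code):

```scheme
(require scheme/unit)

;; UNITS slogan in practice: the signature is the linking
;; specification, separate from any particular implementation.
(define-signature assembler^ (assemble))

;; one implementation; an externally provided assembler would just be
;; another unit exporting assembler^ (sketch, hypothetical names)
(define-unit dummy-assembler@
  (import)
  (export assembler^)
  (define (assemble opcodes)
    (map (lambda (op) (list 'insn op)) opcodes)))

;; linking is explicit, replacing NS-style late binding
(define-values/invoke-unit dummy-assembler@
  (import) (export assembler^))

(assemble '(dup drop))  ; => ((insn dup) (insn drop))
```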
Entry: call by need
Date: Fri Feb 22 12:18:56 CET 2008

was trying to quickly hack up a solution in scheme that emulates
makefiles, and i realized it's actually call-by-need, which is again
the same as the dataflow serialization problem (pd), which can be
extended to early reuse by transforming it into a linear language
(i.e. forth).

Entry: delimited continuations
Date: Tue Feb 26 13:26:36 CET 2008

best to start here:

http://pre.plt-scheme.org/docs/html/reference/Evaluation_Model.html#(part~20prompt-model)

i think i sort of get it.. the analogy of stack frames, but more
general since they can be tree-structured (just like environments).
all the operations on continuations are then compositions of these
trees, with restrictions on how far back in the tree continuations can
be captured, and rules on composition that make sense in light of
these restrictions.

Accessing a tree as if it were a stream and ``updating'' in-place
without mutation..

http://lambda-the-ultimate.org/node/969

Entry: errortrace
Date: Thu Feb 28 11:50:17 CET 2008

http://pre.plt-scheme.org/docs/html/errortrace/installing-errortrace.html

this works when using it like this:

Welcome to MzScheme v3.99.0.13 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
> (require errortrace)
> (enter! (file "/tmp/test.ss"))
[loading /tmp/test.ss]
[loading /usr/local/mz-3.99.0.13/collects/scheme/base/lang/compiled/reader_ss.zo]
[loading /usr/local/mz-3.99.0.13/collects/syntax/compiled/module-reader_ss.zo]
> (a)
error: bla
/tmp/test.ss:8:12: (error (quote bla))
/tmp/test.ss:6:12: (b)
=== context ===
/tmp/test.ss:7:0: b
/tmp/test.ss:6:0: a
/usr/local/mz-3.99.0.13/collects/scheme/private/misc.ss:63:7

the file /tmp/test.ss is:

#lang scheme/base
(provide a)
(define (x) #f)
(define (a) (b) (x))
(define (b) (c) (x))
(define (c) (error 'bla))

now, to incorporate it in snot, it looks like a combination with
prompts is needed. indeed.. the error printing works fine when wrapped
in 'prompt', and execution continues thereafter.
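a minimal version of that, using the core primitive directly (i
believe this is essentially what the repl does):

```scheme
;; the default exception handler prints the error message and then
;; aborts to the default prompt tag. so wrapping an evaluation in a
;; prompt turns an uncaught error into: print + resume after the prompt.
(call-with-continuation-prompt
 (lambda () (error 'bla)))
(printf "still running\n")
```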
http://pre.plt-scheme.org/docs/html/reference/cont.html#(mod-path~20scheme~2fcontrol)

first thing to note: 'prompt' and 'abort'. i can add those in sweb
instead of the current combination of parameters and call/ec. second:
prompt is readily applied in the repl in brood, at run/error in
host/base.ss. it works for host/purrr.ss by replacing the toplevel
error printer by a prompt. probably can do the same in snot.

hmm.. it's not in snot that the prompt should be. i did add some
marking to the code that prints 'language-rep-error' in case the
underlying rep (provided by the program!) doesn't print the error
itself. so in brood the error should be printed, and preferably INSIDE
the box context.

"console.ss" is loaded in the snot context from "snot.ss". the latter
file registers the different languages using the 'register-language'
snot function present in snot's toplevel. ("snot.ss" is not 'require'd
but 'load'ed) what i'm interested in is frames that run up to the
sandboxed evaluator, so maybe it should be implemented in snot/box.ss?
see snot ramblings for more..

Entry: continuation marks
Date: Thu Feb 28 17:20:17 CET 2008

http://www.cs.utah.edu/plt/publications/icfp07-fyff.pdf

currently continuation marks are used to make some kind of scat
language trace through the code. basically, i can put anything there i
want. it's reassuring that the basic mechanism is available. (also,
this idea is very related to some dynamic variable hack i tried in
PF.. don't remember if it's still there..)

something strange that i didn't know about exceptions: apparently the
handler is executed in the context of the 'raise' call! that explains
a lot. no.. this is not the case:

(define param (make-parameter 123))
(with-handlers
    (((lambda (ex) #t)
      (lambda (ex) (printf "E: param = ~s\n" (param)))))
  (parameterize ((param 456))
    (begin
      (printf "B: param = ~s\n" (param))
      (raise 'boo))))

gives:

B: param = 456
E: param = 123

ok: i'm confusing the lowlevel 'handle' with the highlevel 'catch'.
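the difference shows up with the lowlevel handler:
call-with-exception-handler runs the handler in the dynamic extent of
the 'raise' itself, before unwinding, so it does see the
parameterization:

```scheme
(define param (make-parameter 123))

;; the lowlevel 'handle': the handler runs in the context of the raise
;; (the stack is not yet unwound), so the parameterize is still active.
(call/ec
 (lambda (k)
   (call-with-exception-handler
    (lambda (ex)
      (printf "H: param = ~s\n" (param))  ; prints H: param = 456
      (k ex))                             ; escape; the handler must not return
    (lambda ()
      (parameterize ((param 456))
        (raise 'boo))))))
```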
the paper mentions how to implement 'catch' on top of 'abort', but
also talks about interference of prompts, and the use of tagged
prompts to work around that. so the bottom line: exceptions and
prompts do not collide, because the prompt tag used to implement
exceptions is not accessible. this does mean that an exception can
jump past any arbitrary prompt.

question: how does this work in sandbox? apparently sandbox re-raises
exceptions: see the internal function 'user-eval' in 'make-evaluator*'
in scheme/sandbox.ss : the value that comes from the channel is raised
if it's an exception.

something i still don't understand about the mixing of prompts and
exceptions: if i don't wrap a prompt around the evaluation in
host/purrr.ss, exceptions will terminate the program, so the prompt
seems to terminate propagation and trigger the printing of the error.
however, doing this down the chain in snot doesn't work like that.. a
prompt with the default tag wraps the toplevel, so the whole
continuation is also a partial continuation (up to that prompt).
hmm.. then i read this:

  "The default prompt tag is also part of the built-in protocol for
  exception handling, in that the default exception handler aborts to
  the default tag after printing an error message."

note this says 'default exception handler'. so if there's one
installed above the prompt, that one will be called instead of the
default handler.

Entry: roadmap
Date: Thu Feb 28 14:45:25 CET 2008

adjusted roadmap:

* get the base language working without NS + put it in a separate
  module.
* figure out how to use units for plugin behaviour

then follow up with entry://20080215-112019

it looks like the key to the namespace issue is to first move the core
component to a more native namespace management system. the rest
should then be mere disentanglement.

TODO:
- separate SCAT as a different project
- separate it from NS

Entry: eval vs. require
Date: Sat Mar 1 19:56:18 CET 2008

the key insight (finally) seems to be that the current 'eval' based
approach needs to be replaced by 'require', or an underlying mechanism
that allows module based namespace management. everything that now
goes through the NS hash can be done with module namespaces.

Entry: module namespaces
Date: Mon Mar 3 00:27:20 CET 2008

everything reduces to scheme code in modules, which makes things
easier to extend. (also for parsers?)

(define increment
  (lambda s (apply base.+ (cons 1 s))))

the idea is that 'increment' can be imported as 'base.increment', or
anything else, using prefix imports. there's no need to specify the
target namespace unless there are clashes between scheme and the
functions defined in the module, which can be avoided by not importing
scheme bindings, and by separating the definition of base. primitives
(which have scheme available) from the definition of composites.
composite modules then only contain definitions which map some
namespace -> (un)prefixed. to this end, a similar approach can be used
as with the 'find' plugin in the rpn syntax currently used for NS
linking: the 3 elements (syntax, source ns and dest ns) can be
specified like before. (just make a namespace translator?)

problem solved? probably only units left: plugin behaviour needs to be
handled explicitly.

Entry: language tower
Date: Mon Mar 3 00:38:37 CET 2008

scheme
base         snarfed scheme functional rpn
state
macro        primitives
macro forth  machine wrappers
forth

why so many? they all solve a single problem in a very straightforward
way. base snarfs functionality from scheme, state is a lifted base +
threaded state, and macro implements the greedy machine map + peephole
optimizer using a threaded state model.

misc hacks from plane notes:

- auto snarf through contracts
- use #lang scat/base for base->base maps (is a purely declarative
  language possible?)
- decouple module as unit to speed up compilation during incremental
  dev.
  (fake image based dev)
- get rid of @ stx for streams (scribble) / find a standard streams
  lib / use lazy scheme. (brood is pure FP so why not)
- use parameters for the compiler object (also for NS stuff?)

Entry: parameterized transformer
Date: Wed Mar 5 14:30:34 CET 2008

instead of using a compilation object, it might be more convenient to
use parameters in the transformer environment to define functionality
for the basic syntax operations. maybe best to write the rpn code from
scratch in scat/rpn/

Entry: scat ready
Date: Thu Mar 20 08:38:42 EDT 2008

looks like the lowest layer of rpn code + namespace management is
done. made a nice extension that allows parsers to be written as
syntax transformers (as it should be!). until the representation part
is finished and ready to be ported to brood, the process is documented
in the dev log at http://zwizwa.be/ramblings/scat

Entry: cps forth
Date: Fri Apr 18 11:31:43 EDT 2008

is there any meat in cps forth? or is this just a way of interpreting?
probably.. cps replaces "CALL" and "RETURN" with "GOTO with
parameters". it does need first class functions though.

Entry: parsing C
Date: Fri Apr 18 12:50:15 EDT 2008

http://eli.thegreenplace.net/2007/11/24/the-context-sensitivity-of-cs-grammar/

of things to do.. i need to have a look at piumarta's packrat parser.
that would be a very interesting addition to brood.

Entry: scat progress
Date: Fri Apr 18 14:07:35 EDT 2008

it's going really well. i'm as good as done, except for the
interactive part, which needs a bit of re-org. the name space
management is a lot better now. making things a bit more static didn't
really hurt.

Entry: flashforth
Date: Thu Apr 24 16:08:58 EDT 2008

going through the flashforth tutorial, and it seems mikael has been
busy, with some optimizations here and there. it's nice to have an
example like that. this does bring me to the optimization
vs. simplicity trade-off. it seems difficult to stay at either
extreme.
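coming back to the cps forth entry above: each word takes the stack
and an explicit continuation (the rest of the program), so call/return
really do become goto-with-parameters. a sketch, all names made up:

```scheme
;; each word : stack x continuation -> answer. no return stack;
;; a composite word just threads the continuation along.
(define (dup s k) (k (cons (car s) s)))
(define (add s k) (k (cons (+ (car s) (cadr s)) (cddr s))))

;; : double dup + ;  in cps:
(define (double s k)
  (dup s (lambda (s) (add s k))))

(double '(7) (lambda (s) s))  ; => (14)
```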
Entry: brood-4 end of line
Date: Tue May 13 00:56:56 CEST 2008

this branch is dead, and no longer compiles because its dependency
scat is now stand-alone. i'm trying to roll this branch back to where
it was last working without scat, for demo purposes. the scat split is
tagged 'pre-scat-split'. the only problem is re-synchronizing the
zwizwa lib to get it running again.

the idea is this:
- branch from the 'pre-scat-split' point
- re-add code from the zwizwa lib to brood
- publish that as brood-4 for demo purposes
- see if snot still works

in the future: tag all projects that depend on each other at the same
time!!