Staapl dev log. This is a day-to-day development log & soundboard.

You probably want:
  home         http://zwizwa.be/staapl
  edited blog  http://zwizwa.be/ramblings/staapl-blog

Related project logs:
  entry://../meta
  entry://../libprim

---------------------------------------------------------------------

TODO

Trivial to fix:
- list all commands
- snot forth console

PIC18 features:
- I2C bus monitor
- live access to multiple microcontrollers
- usb driver: basic abstractions

PIC18 apps:
- interrupt based CRT display controller
- serial keyboard interface for KBsheep
- PC reset logic connected to WRT54

Possible extensions / Non-trivial things to fix:
- build System09 / rekonstrukt using xilinx tools
- reflective forth bootstrapping on top of staapl (reusing primitives!)
- figure out why it compiles so slowly
- write more documentation
- assembler addressing modes
- dsPIC
- 6809
- assembler text output
- external emulator hooks

Entry: introduction
Date: Sun Jan 28 12:00:00 GMT 2007

[ this used to be the blog header ]

Staapl consists of:

* Scat: a set of macros for PLT Scheme implementing a family of
  dynamically typed concatenative languages usable inside scheme
  code.

* Coma: a COmpositional MAcro language: an extension of the Scat
  language with data types representing target code, and a
  specification syntax for defining target code pattern matching
  primitives.

* Control: an extension of the Coma language with Forth-style control
  primitives based on conditional and unconditional jumps, useful for
  low level programming.

* Comp: a compiler that instantiates Coma+Control macros to produce a
  code graph structure, and performs a-posteriori optimizations.

* Asm: a straightforward multipass relaxation assembler with
  arbitrary expression evaluation in terms of target addresses, and a
  high level opcode definition language.

* Forth: parsing extensions for representing classic Forth syntax,
  plus PLT Scheme language layers.

* Pic18: uses the Purrr template specialized to the Microchip PIC18
  architecture to implement a PLT Scheme #lang.

The idiomatic forth language and the ideas behind the peephole
optimizing compiler have remained fairly constant since the 1.x
version. The Forth dialect is implemented as a collection of macros:
functions which operate on a stack of assembly instructions and a
stack of nested control structures. What changed throughout the
versions is the use of better (higher) abstractions to implement the
basic ideas, and the factoring of the design into simpler components.

Note that this project is much more about functional programming and
PLT Scheme than about Forth. The Forth dialect is used mostly as a
practical macro assembler on top of the clean Coma architecture. The
end result is nice for playing with PIC18 chips though, and it is
used for real-world stuff.

Entry: monads
Date: Sun Jan 28 14:43:30 GMT 2007

EDIT: i clearly didn't get it here.. much of the monad stuff i talk
about is not what you find in Haskell. the thing is: a stack that's
threaded through a computation behaves a bit like a monad. it solves
some of the same practical issues, especially if you start to use
syntax transformations that can convert pure functional code to code
that passes a hidden top element. but it doesn't have the power of a
generic bind operation. i talk about this later though.

i think i finally start to get the whole monad thing. in layman's
terms: it is centered around splicing together (using 'bind')
functions that take a simple object to a container.
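[EDIT: a minimal sketch of the stack-threading analogy, in scheme.
'lift' and 'compose2' are my names here, nothing standard:

(define (lift fn)                ;; (a -> b) -> (stack -> stack)
  (lambda (stack)
    (cons (fn (car stack)) (cdr stack))))

(define (compose2 f g)           ;; compose two stack functions
  (lambda (stack) (g (f stack))))

;; ((compose2 (lift add1) (lift add1)) '(1 2)) => (3 2)

the plumbing (the rest of the stack) is threaded behind the scenes,
which is the practical issue bind also solves; but note there is no
generic bind operation here, just function composition.]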
in what i'm trying to accomplish, this is just compilation: take in
some code and output an updated state. maybe i should give up on the
whole CAT thing after all, and concentrate on using scheme and some
special structures to actually create a proper language and macro
language. i already have a way to write concatenative code in scheme
without too many problems (see macro.ss). the layer is probably just
not necessary.. scheme is more powerful, and everything i now do in
CAT i could try to move over to a virtual forth: write everything
from the perspective of the forth itself.. something like: 'spawn
this process on the host'. another thing that's wrong with CAT is the
lack of decent data structures. it's overall too simple for what i'm
trying to do: proper code generation on top of a simple basic
language. let's go back to my recently restated goals for BROOD:

* basis = forth VM on a microcontroller for simplicity
* write the cross compiler in a language similar to forth
* use a FP approach for the compiler language

the middle one can take on different forms. i still think it is very
beneficial to have an intermediate language to express forthisms. but
this language can just be embedded in scheme stuff. so let's just
start to build the thing, right?

Entry: CAT design
Date: Sun Jan 28 17:28:56 GMT 2007

[EDIT: Sun Oct 7 00:20:32 CEST 2007
- the pattern matching grew to a 'quasi algebraic types' construction
- from forth -> machine code there are now a lot more passes
- the shared forth elimination is made machine specific.]

the design is quite classic forth, but it might be simplified a
bit. CAT consists of the following pipeline:

            (1)                              (2)
  forth --> peephole optimized assembler --> absolute bound machine code

currently (1) is the compiler while (2) is the assembler. it might be
more interesting to actually split it up in two parts: introduce a
peephole optimizer that can separate out the forth compiler to a
higher level and the assembler to a lower, machine specific level,
making it a bit more like a frontend/backend multiplatform
compiler. also, several things could be made declarative. peephole
optimization is basically pattern matching. currently CAT implements
it as a pretty much imperative process: if the last instruction is
dup, then drop undoes dup, etc.. given the target we are using, it is
possible to completely write the assembly language in forth style.

so, summary: split the peephole optimization in 2 parts:

- shared forth elimination (as a result of macro expansion)
- machine specific assembler optimization

Entry: declarative peephole optimizations
Date: Sun Jan 28 17:38:31 GMT 2007

basically, this is a rewriting system. currently i use a tree
structure for this (ifte). this is a list of transformations:

( [(dup drop)  ()]
  [(dup save)  ()]
  [(save drop) ()]
  [(1 +)       (inc)]
  [(1 -)       (dec)]
  [((lit?) +)  ()] )

what to do with things that do not fit this? for example literals.. i
really do need predicates. actually, i should make a list of all
optimizations to make it a bit more clear. currently the code is way
too dense.
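[EDIT: to make that concrete, a minimal sketch of the naive list
rewriter, literal patterns only; the (lit?) predicate and the ifte
tree are left out, and all names are made up:

(define rules
  '(((dup drop) . ())
    ((1 +)      . (inc))
    ((1 -)      . (dec))))

(define (prefix? p l)
  (or (null? p)
      (and (pair? l)
           (equal? (car p) (car l))
           (prefix? (cdr p) (cdr l)))))

(define (rewrite code)
  (let try ((rs rules))
    (cond ((null? rs)
           (if (null? code)
               '()
               (cons (car code) (rewrite (cdr code)))))
          ((prefix? (caar rs) code)
           ;; rule fired: re-examine starting from the rewritten
           ;; head. note an expanding rule could loop here, which is
           ;; exactly the worry in the next entry.
           (rewrite (append (cdar rs)
                            (list-tail code (length (caar rs))))))
          (else (try (cdr rs))))))

;; (rewrite '(1 + dup drop)) => (inc)
]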
Entry: rewriting
Date: Mon Jan 29 00:26:40 GMT 2007

funny.. a google search on rewriting led me to the pragmatic
programmer. maybe i shouldn't read joel on software. especially not
his rant about never scrapping a whole project and starting over.

anyways.. there are some serious things wrong with the way i'm trying
to solve the compilation / optimization problem. i'm using a massive
tool and am still writing in the guerilla hack style most forths are
written in. i have proper data structures now, so why not use them?
why not make some minilanguages that do special tasks. it would be
interesting to start moving functionality over to the lamb core as
soon as possible. most of the code is optimization though, so.. i'm
curious about this rewriting business.. looks like there's something
to learn there. i know it works in a naive way, since that's what i
already have. but i'm curious if this can be taken further. with
faster static code it might be possible to do a whole lot more
(non-deterministic stuff). some problems to face are literals and
such.. one thing that worries me is 'how to prevent loops'. i know,
if things get smaller, there can be no loops, but i can imagine some
more fancy expand/collapse rules that might start looping with a
naive approach.

looking at the optimizations, most are about reducing stack juggling
and moving it to register transfers. this is almost universal on all
machines. i do need to think a bit about a sort of 'baseline forth'
that will be the end point of the optimizations, such that eventual
compilation is straightforward. this seems like an elegant solution.

Entry: purely compositional approach (joy)
Date: Fri Feb 2 12:35:56 GMT 2007

whenever program text is read, it is immediately compiled. each
symbol is replaced by its particular function, and each constant is
replaced by a function that pushes the constant:

  sym  -> (lookup sym)
  data -> (lambda stack (cons data stack))

to get back the value of a data atom (practical issue), you pass any
list, apply the function, and pop off the data. this can be done at
compile time. so, can we do with data structures composed entirely of
functions? probably yes.. probably this isn't even such a bad idea..
it looks like it is not a good idea to map composition to single
lambda expressions, but to have an interpreter for it instead, so we
can implement things like CAR efficiently: it is possible to
implement CAR on an abstract function which represents a list by
'testing' it on a stack, however, this is a lot worse than just
getting the left element of the first pair.. then, why not represent
constants by constants instead of their wrapping functions? back to
square one..

Entry: jit compiler + parser
Date: Fri Feb 2 15:24:47 GMT 2007

if i'm absolutely sure that function names are static, it's possible
to use a jit compiler without sacrificing this semantic property:
leave them as symbols until they are encountered, then compile
them. this would also eliminate the problem of forward/backward
declarations etc. this seems to work very well in the simple first
experiments.. the parser seems to work too: parse code = list of
things. if one of the things is a list, parse it and wrap it in a
lambda. so what about closures?

Entry: rewriting
Date: Sat Feb 3 10:52:54 GMT 2007

i'm working on the rewriting, and it looks like this is ideal to use
a compositional mini language for.. so i've quickly extended the
'run' function to take a 'compiler' argument which will resolve
symbols to functionality, still using the jit compiler. the
'compiler' term could be used to give context (lexical) dependent
information about symbols. however, it should really be tied to the
stored code then.. the idea is to represent pattern matchers as
ordinary composed code, but using a special compiler (macro) to
instantiate them.
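[EDIT: a quick sketch of how the run/compiler/jit combination could
look — symbols stay symbols until first executed, then get bound once
through the resolver. everything here is a stand-in, not the real
core.ss code:

(define (jit resolve code)
  (map (lambda (x)
         (cond
          ((symbol? x)
           (let ((f #f))                  ; late binding, cached
             (lambda (stack)
               (if (not f) (set! f (resolve x)))
               (f stack))))
          ((pair? x)
           (let ((p (jit resolve x)))     ; sublist: quoted program
             (lambda (stack) (cons p stack))))
          (else
           (lambda (stack) (cons x stack))))) ; constant pushes itself
       code))

(define (run prog stack)
  (if (null? prog)
      stack
      (run (cdr prog) ((car prog) stack))))

;; a toy resolver, just for illustration:
(define (resolve sym)
  (case sym
    ((+) (lambda (stack)
           (cons (+ (car stack) (cadr stack)) (cddr stack))))
    (else (error "undefined word" sym))))

;; (run (jit resolve '(1 2 +)) '()) => (3)
]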
so this gives the list of problems for today:

- solve lexical compilation issues
- how to 'execute' from within a primitive

lol. this is again exactly the same thing as i already had: each
forth word is a macro :) so the problem reduces to the lexical thing
(namespaces) and how to compile a generic pattern matcher into a
macro.

Entry: do i really need lambda?
Date: Sat Feb 3 11:24:04 GMT 2007

what i need is local names, just for the sake of code organization
and different sublanguages. i don't need lambda really. i don't need
runtime binding of symbols to names. the whole idea of
combinatory/compositional/concatenative languages is to eliminate
variable names...

Entry: macro semantics
Date: Sat Feb 3 12:37:36 GMT 2007

i have something like this now:

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup
  (lambda (asm . stack)
    (pack (pack `(movwf POSTINC) asm) stack)))

which can be executed as:

> (run (parse macros '(dup dup)) '(()))
(((movwf POSTINC) (movwf POSTINC)))
>

so in this, compilation is the execution of one program to produce
another program. let's stay in the forth syntax as long as possible,
and rewrite this to:

(define-syntax forth
  (syntax-rules ()
    ((_ output () (rwords ...))
     (pack rwords ... output))
    ((_ output (word words ...) (rwords ...))
     (forth output (words ...) ('word rwords ...)))
    ((_ output (words ...))
     (forth output (words ...) ()))))

(define (macros name) name)
(define-resolver register-macro macros)

(register-macro 'nop (lambda stack stack))
(register-macro 'dup
  (lambda (out . stack)
    (pack (forth out (POSTINC1 movwf)) stack)))

> (run (parse macros '(dup dup)) '(()))
((movwf POSTINC1 movwf POSTINC1))
>

Entry: dynamic code
Date: Sat Feb 3 12:45:23 GMT 2007

note that once something has run, it is compiled in place and can
never be accessed as data again. it is important to make it
impossible to do things like

  (1 2 3 + +) dup run

and then use the 2nd copy. this is easily solved by first creating a
copy, but that sort of defeats the current way the JIT compiler
works. maybe i can make sure that the 'run' word, which is the
interface to the internals, always makes a copy of a list whenever it
encounters dynamic code? to summarize:

- parsed lists are safe. they are always pure code and can never be
  interpreted as pure data.
- anything that's 'run' at run time is not, so here a copy needs to
  be made.

the solution here is that 'parse' is necessary to run symbolic code:

  (1 2 3 + +) parse run

and 'parse' already makes a copy of the list, since it is functional.

NOTE: (define (list-copy x) (append x '()))

so i defined interpret <- (parse run)

Entry: bind and stuff
Date: Sat Feb 3 13:17:03 GMT 2007

i think it's starting to dawn on me.. the disadvantage of functions
like the above is that state accumulation is explicit: there is this
chunk of accumulated state on the top of the stack while none of the
functions actually need it to be there. enter the 'bind' concept. in
order to get rid of these arguments, you define a macro according to
( -- code ), but automatically lift it to ( code -- code ). ok.. this
seems to work. what i have now is a way to generate simple
substitution macros.
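[EDIT: the lifting in a nutshell, assuming macros produce instruction
lists and the hidden operation is 'append'; names are made up:

(define (lift-producer macro)             ; macro:  ( -- code )
  (lambda (code)                          ; lifted: ( code -- code )
    (append code (macro))))

(define m:dup
  (lift-producer (lambda () '((movwf POSTINC1)))))

;; (m:dup '((movlw 123))) => ((movlw 123) (movwf POSTINC1))

this shape comes back later as the Writer Monad.]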
Entry: rewrite macros
Date: Sat Feb 3 15:54:08 GMT 2007

the next step is rewrite macros. this should be done in two
steps. in order to make a single 'intelligent' macro, different
patterns need to be combined into one function, and one function
needs to have information about different patterns. a sort of
'transpose'.

- make a list of rewrite patterns
- compile it into code

rewrite macros are easier understood as operating on output forth
code. i don't know. ok.. time to be stupid then. state the previous
solution, then abstract it. the previous solution was explicit:

  (dup drop) -> ()

drop is a function that discards anything that comes before it and
produces a value (without other side effects: it's important to write
macros so that the last operation is a mutation), so drop needs to be
intelligent. ok. it's easy enough to implement this in exactly the
same way as in CAT/BADNOP. however, there should be a more highlevel
construct that eliminates the explicit if-then things.

Entry: compositional languages suck
Date: Sat Feb 3 17:07:00 GMT 2007

it's a feast to use them to glue things together, but more
complicated things are easier expressed using lambdas.. i think the
approach of writing the core algorithms in scheme with full fire
power, and keeping the language itself mainly for interaction, is a
valid one. compositional languages are cool because they lift from
you the burden of having to name things, and allow you to think in
terms of structure (more geometrically) vs. random connections in
parametrized things.

Entry: more lifting
Date: Sat Feb 3 17:44:56 GMT 2007

i ran into a new class of functions. i already had

  . -> code

which are just constants. now i have

  code -> code

which are code transformers that need to look at the current
generated code state (never the source code!). i added the default
resolver for macros to be a quote to the forth output stack. i do
need to change the way other types are handled though. it looks like
this is better solved earlier in the process. to keep the JIT
compiler like it is, the parser could be adapted to already compile
constants to quoting procedures. this works nicely.

Entry: now for the meta stuff
Date: Sat Feb 3 21:16:12 GMT 2007

some questions remain. how to generate more boilerplate for some
kinds of peephole optimizations, and how to check if it is actually
possible to optimize towards a 'core forth' that can be straight
compiled to assembly. let's find out by systematically porting some
macros. the main question is: what about arguments?

  123 ldl

means load a literal into the top-of-stack register.

Entry: cat snarf
Date: Sun Feb 4 00:16:56 GMT 2007

porting stuff from old cat to new cat. seems to work really well. not
having state on the stack to deal with makes things a lot easier..
but for badnop this means the database needs to be designed in a
proper way. maybe for the assembler we use some kind of dictionary as
state? the thing is.. i'd like to keep as much of the 'functional OO'
that was present in CAT. this makes it possible to do parallel stuff
and backtracking in a very easy way, especially now that it's kind of
fast.

Entry: intermediate language
Date: Sun Feb 4 10:51:20 GMT 2007

since i am optimizing for a register machine, it might be best to
write the rewriter in terms of the register machine primitives i used
in BADNOP. the main thing to decide is: is it easier to optimize code
like this:

  1 2 +

or this:

  (dup) (lda 1) (dup) (lda 2) (call +)

it's definitely easier to do the former.. maybe i should implement
the assembler now, so i can see this a bit clearer. the problem i'm
trying to solve is: rewrite forth in such a way that assembly becomes
trivial. things which make this problematic are folded constants:
constants that are already bound to a machine operation as a literal.
maybe i should just write them as 'pseudo forth code' but group them,
like this:

  1 2 + -> dup (drop 1) dup (drop 2)

here every grouped instruction is meant to be replaced later by one
machine opcode. the advantage of this approach is that there are no
'self-quoting' things in the code after a first pass. considering the
targets i'm using don't have an instruction to do dup+ldl in one go,
i guess this idiomatic approach is a valid one. it is probably better
to do this in more phases:

1. forth based semantic substitution (rewrite)
2. conversion to idiomatic representation (compile)
3. direct mapping from idiomatic cells to assembly code (assemble)

because 3 can be made invertible, it's possible to easily decompile,
flatten and semantically optimize back!

Entry: pattern matching
Date: Sun Feb 4 13:41:49 GMT 2007

having a look at the plt pattern matching code. i really need this
kind of stuff :) the basic thing is

  (match x (pat expr) ...)

when x matches one of the pat, the corresponding expr is evaluated
with symbols of pat bound to values in x. ok.. seems to work pretty
well. but i still need to find out how to reverse a pattern.

Entry: next
Date: Sun Feb 4 16:06:45 GMT 2007

* find out how to reverse patterns
* lift rewriters above 'rest' => compile patterns
* assembler
* state

Entry: compiling patterns
Date: Sun Feb 4 16:09:43 GMT 2007

i need to make my own pattern language to compile substitutors from a
more highlevel definition like

  ((dup drop) ())
  ((a b +)    (,(+ a b)))
  ((a b xor)  (,(bitwise-xor a b)))
  ((a not)    (,(bitwise-xor a -1)))
  ((a negate) (,(* a -1)))
  ((dup dup (drop a) !) ((,a sta)))
  ((dup (drop a) !)     ((,a sta) drop))

using syntax-case. hence the latter 2 expressions will be merged into
one '!' rewriter macro. as a preparation i can already try to see if
all macros fit in this category. yes, they do. but i need to solve
the problem of type matching first, since the arithmetic above only
works if the numbers are immediate. type matching is part of the
match.ss language, but i didn't figure out yet how to also bind a
matched item to a name.. i think i solved the rewriter problem for
the pattern language above. just need to sort out some macro issues,
probably best to use syntax-case with some explicit rewriting.

Entry: syntax transformation
Date: Sun Feb 4 22:48:32 GMT 2007

1. one pattern i've been trying to solve is this:

(definitions
  (some (special) structure here)
  (same (special) structure there))

it's easy enough to write the first transformation, but how do you do
the next one without having to explicitly recurse using names etc..?
in other words: "now just accept more of the same". this seems to be
the answer:

(define-syntax shift-to
  (syntax-rules ()
    ((shift-to (from0 from ...) (to0 to ...))
     (let ((tmp from0))
       (set! to from) ...
       (set! to0 tmp)))))

one ellipsis in the pattern for every ellipsis on the same level. or
something like that.. need to explain better.

2. what's the real significance of the <literals> argument of
syntax-rules? "Identifiers that appear in <literals> are interpreted
as literal identifiers to be matched against corresponding subforms
of the input."

3. how to get plain and ordinary s-exp macro transformers using
define-syntax? i was thinking about something like this:

(define-syntax nofuss
  (syntax-rules ()
    ((_ (pat ...) expr)
     (lambda (stx)
       (match (syntax-object->datum stx)
         ((_ pat ...)
          (datum->syntax-object stx expr)))))))

(define-syntax snarf-lambda
  (nofuss (args fn)
    `(lambda (,@(reverse args) . stack)
       (cons (,fn ,@args) stack))))
but that doesn't work, since nofuss is not defined at expansion
time.. but it should work. there has to be some way of doing this.

Entry: pattern language
Date: Mon Feb 5 11:30:59 GMT 2007

seems to work. some problems remaining though:

- default clause
- literal/parameter
- fix 'rest'
- type specific match

done

Entry: remaining problems
Date: Mon Feb 5 21:26:33 GMT 2007

two big problems remaining: the assembler and state storage. the
assembler is a bit nasty. lots of tiny rules to obey.. i wonder if i
can make something coherent out of this. the state store is tightly
coupled to the assembler. here i can probably do another trick of
accumulating the dictionaries using some binding functions.

what are the tasks of the assembler?

-> creating instruction codes from symbolic rep
-> resolving all names to addresses (2 pass?)
-> making sure all jumps have the correct size

the result of assembly is a vector, and some updated symbol
tables. the input is optimized and idiomized forth code that can be
straight translated. it would be nice to use nondeterministic
programming for choosing jump sizes, but that's probably overkill:
moving code down is probably easier. some observations:

* if i only shrink jumps instead of expanding them, the effects are
  always local: no bounds are violated, but the solution might be
  sub-optimal. for backward jumps, the correct size is known; only
  forward jumps need to be allocated.

* a proper topological representation which indicates where jumps go
  and where they come from is a good thing to have. a single cell has:
  - list of cells that go from here: max 2
  - list of cells that come here
  - instruction + arguments

* maybe that's overkill, since it can always be generated if analysis
  is necessary.. what about this:
  - assume incremental code: old code is not going to be jumping to
    new code.
  - within a code block under compilation, forward jumps are
    possible, so they need to be allocated: use maximum size.
    however, they should be rare: function definitions could be
    sorted beforehand?
  - recursively work down from the current compilation point, and
    adjust all jumps. backtrack if necessary. this can be done in a
    list.

* yep.. it's probably simplest to just perform the 2 ordinary steps
  of backward then forward address resolution, and add as many passes
  as necessary to resolve the shrinking.

* there are 2 x 2 x 2 types of jumps wrt. a single cell collapse:
  abs/rel x start:before/after x finish:before/after
  -> relative: adjust if they cross the border
  -> absolute: adjust if they end past the border

Entry: tired
Date: Mon Feb 5 21:44:26 GMT 2007

probably all a bit over my head atm. feeling a bit sleepy. maybe do a
bit of cleaning. like writing fold in terms of match instead of if
etc..

(define (fold fn init lst)
  (match lst
    (()       init)
    ((x . xs) (fn x (fold fn init xs)))))

(fold + 0 '(1 2 3))

good coffee :) i'm feeling a bit ambitious.. instead of writing a
flashforth based standard forth, it might be feasible to try to write
a functional programming language for the micro.

Entry: ((dup cons) dup cons)
Date: Tue Feb 6 03:43:13 GMT 2007

just added 'compose', then i found out the quine in the title doesn't
work any more. it does work in joy, so what's up? the problem here of
course is that consing 2 quoted programs does not give a quoted
program: these are abstract data types and not lists.. manfred must
be using some explicit definition of cons on quoted programs
somewhere, or i don't really understand his list semantics.
the problem with mine is that quoted programs are in fact lists, but
they have a header containing a link to the symbol resolver. so, am i
missing something about Joy? is the "quoted program is a list"
property necessary? have to check that. in the mean time, i can get
to quines by defining 'qons'. there is a possibility of embedding lex
information inside the list, so instead of

  (lex a b c) -> ((lex a) (lex b) (lex c))

which might even be better, since it allows for mixing of dicts. this
also makes it possible to use a simpler interpreter, since the lex
state doesn't need to be maintained. hmm.. tried, but too tired.

but something is rather important. having lists and programs on the
same level as in joy is nice, but requires a single semantics. since
i already almost automatically introduced 3 kinds of semantics for
symbolic code (cat, forth rewriter and forth compiler) this is not
really feasible. so lists and programs should probably be separate
entities, where programs are abstract and not dissectable, but
compose and qons are defined to do run-time composition:

  list     program
  ------   -------
  concat   compose (qoncat)
  cons     qons

where qons will take 2 quoted programs, and concatenate the quotation
of the first with the second, or

  (a) (b) -> ((a) b)

however, if i change the interpreter as mentioned above, the two
columns will be identical, and programs can be manipulated just as
lists, without giving up any other functionality. so, good to learn:
CAT is not really Joy because

- i'm using fully quoted lists instead of quoted programs to
  represent data lists: there is a clear separation between code and
  data.
- joy probably uses numbers directly in the lists, which i can't do
  due to different number semantics for cat and purrr. for me,
  numbers have to be encapsulated.
- parsing from symbolic -> code is explicit because of different
  semantics: this allows reuse of the interpreter for different mini
  languages.

Entry: interpreter cleanup
Date: Tue Feb 6 11:33:15 GMT 2007

instead of using set-car!, it might be better to use delayed
evaluation for the instruction type: everything evaluates to a
procedure. that way it stays functional. ok. this seems to work: ()
and nil are the same now. plus there is a structural equivalence:
since compilation from symbol -> procedure/promise is 1-1, the size
of a compiled program (list size) is equal to the original code list
size.

ok.. now ((dup cons) dup cons) still doesn't work!! the reason is
that nested lists in code get executed.. is this still valid?
something smells here.. let's first change the other
parser/compilers.. ok, if i swap around the quoting such that new
definitions are always wrapped in a closure that will call 'run', i
should be safe. this works, but i run into a difficulty: i cannot
unquote a program wrapped in a closure (interpret quoted stack). the
solution to this is to change back the semantics of 'parse' -> it
will return a quoted program instead of an executable one, and put
the unquoting in 'def'. i had to change this:

(define-word run (code . stack)
  (run code stack))

to this:

(define-word i (code . stack)
  (run-quoted code stack))
(define-word execute (code . stack)
  (run-unquoted code stack))

to only take quoted programs, not pure closures. 'execute' is only
there for completeness, since it is rarely needed (it is equivalent
to (nil cons i)). maybe that's again one of the key points? meaning,
to make a very clear distinction between quoted programs =
aggregations of primitives, and primitives. ok.. this looks like it's
working.
this makes the language a bit more introspective. it looks like the
quine works too now. this was quite surprisingly non-trivial! so what
do i learn?

-------------------------------------------------------------------
it is necessary to explicitly distinguish QUOTED PROGRAMS from
PRIMITIVES. the latter is a black box, but the former is a list of
primitives. this structure is NOT recursive!
-------------------------------------------------------------------

having quoted programs obey the list interface adds very flexible
introspection. this probably means that the difference from Joy is
purely syntactic now. tiens.. ((dup cons) dup cons) broke again...
ok. the reason is that after constructing a program from another
program, you need to 'compile' it before it can be run with 'i', so
the quine is relative to 'interpret' and not to 'i'. i'm going to
switch back to my previous notation and use 'run' instead of 'i'. the
conclusion here is:

--------------------------------------------
{ QUOTED programs } is a subset of { LISTS }
--------------------------------------------

i think this is not the case in Joy. whenever you operate on a quoted
program using list construction, you need to 'compile' it to a
program again. this is a projection from the set of lists to the
subset of quoted programs. so 'compile' really needs to be a
projection, meaning (compile compile) and (compile) are
equivalent. it is possible to change this simply by having
run-unquoted cons everything to the stack that's not
executable. however, keeping this explicit allows more wiggle room in
the semantics of different sublanguages. or maybe better: it is
cleaner, since there is no 'default' behaviour (the way the 'cond' is
set up in run-unquoted allows overriding.. that's not so clean). so,
final word. the interpreter implements:

* interpretation of a list of primitives as NOP,TC,RC
* lazy evaluation of primitives (JIT: delayed compilation)

all the rest needs to be implemented in the source transformers.

NOTE: [dup cons] is the Y combinator, sort of..

Entry: assembler
Date: Tue Feb 6 16:48:07 GMT 2007

alright. the assembler. finding instructions is the trivial part. the
hard part is finding the addresses of jumps.

1. resolve backward references (+ find instructions)
2. resolve forward references

if there are multiple-size jump instructions, and there are relative
and absolute jumps, extra passes can be added that resolve efficient
allocation of these. allocating all forward references with maximum
range, and adjusting them one by one, seems to be the best approach:

3-N. shrink forward references if necessary.

things get complicated if forward small-offset relative jumps are
used in compilation, since constructs to work around this are
necessary. i need to find a way to abstract this kind of
behaviour. basically mapping

  A   O-O-O-O-O-O
       \_______/

to

         ___
        /   \
  B   O-O-O-O-O-O
       \_____/

or the other way around. for PIC18 going from B to A reduces the jump
distance by 2, since the long jump is eliminated. it can probably
just be kept at 1 & 2 for now, with all jumps equal size, and fix
that later.

Entry: bookkeeping
Date: Wed Feb 7 09:49:20 GMT 2007

the other problem is the bookkeeping. for the assembler i basically
need a symbol table, which means the dictionary object from cat can
be reused, and some binding operations need to be devised. there are
things to separate:

- labels (write accumulate, read random)
- 'here' (read/write random)
- asm output (write only)

it is probably easier to do most of this in scheme, together with
some syntax. let's see. maybe best to do everything in 3 steps:

1. assembly to polish notation, keeping symbol names
2. forward symbol resolve
3. backward symbol resolve

the last 2 are stateful, the first one is just pattern matching.
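[EDIT: a minimal sketch of just the two label passes, without any
jump-size shrinking. assumptions: instructions are lists, labels are
(label name) markers that occupy no address, and every real
instruction is one word. the naive substitution maps any symbol that
happens to be a label name, so it's a sketch only:

(define (resolve-labels code)
  ;; pass 1: record the address of each (label name) marker
  (define labels
    (let loop ((c code) (addr 0) (d '()))
      (cond ((null? c) d)
            ((eq? 'label (caar c))
             (loop (cdr c) addr (cons (cons (cadar c) addr) d)))
            (else (loop (cdr c) (+ addr 1) d)))))
  ;; pass 2: drop the markers, substitute addresses for names
  (let loop ((c code))
    (cond ((null? c) '())
          ((eq? 'label (caar c)) (loop (cdr c)))
          (else
           (cons (map (lambda (x)
                        (cond ((assq x labels) => cdr)
                              (else x)))
                      (car c))
                 (loop (cdr c)))))))

;; (resolve-labels '((label loop) (decf x) (bra loop)))
;; => ((decf x) (bra 0))
]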
Entry: lifting problem
Date: Wed Feb 7 10:25:31 GMT 2007

how to call a generic prototype function within the body of a
to-be-lifted prototype? this is still one of the bigger problems i
had when writing old cat: cannot 'execute' from within stack macros!

Entry: screen scraping
Date: Wed Feb 7 12:41:41 GMT 2007

ok, that was fun. using emacs macros to convert text pasted from the
pdf datasheet into lisp code :) but it doesn't work very well
though. i think i should just get the data from gpasm, hoping it's a
bit more structured. (in the end i just typed it in).

Entry: great success!
Date: Wed Feb 7 14:51:36 GMT 2007

writing the assembler, and i'm realizing something. scheme is really
cool :) but i'm not sure if scheme is the core of what i'm finding
cool. i think it's pattern matching. since a compiler is mainly an
expression rewriter, this comes as no surprise in hindsight. the
biggest mistake in the previous brood system was to try the problem
without pattern matching constructs. brood's approach (and the
previous badnop) is really too lowlevel. for expression rewriting,
lexical bindings are a must. since the permutations involved are
mostly nontrivial, performing them with combinators instead of random
access parameters is a royal pain, and the resulting code is
completely unreadable.

i think this can be distilled into yet another "why forth?" answer,
but in the negative. if the task you are programming involves the
encoding of a very tangled data structure, then a combinator language
is a bad idea, since you have to factor the permutation manually. so
it's about this: forth is bad at encoding fairly random or ad-hoc
permutation patterns like you would find in a language
compiler/translator. and, don't forget: match & fold are your
friends!

Entry: assembler working
Date: Wed Feb 7 21:14:25 GMT 2007

at least the part that's not doing symbol resolution. now for the
interesting part: the assembler has some state associated with it:

- dictionary
- current assembly point

which has to be dragged along. i was wondering how hard it might be
to solve this with some closure tricks in scheme..

Entry: lambda again
Date: Thu Feb 8 09:47:23 GMT 2007

trying to get my head around this lambda thingy.. there are a couple
of problems, the most important one being the decision of whether
lambda should be a form or a function.

* form: everything is compiled at compile time. this means lambda has
  to be a parser macro, and the only way to do that consistently is
  to have it be a prefix macro. this would break the semantic
  simplicity of the language by introducing syntax.

* function: lambda does runtime compilation, in which case the
  lexical environment has to be bound to the compiled representation
  of the lambda call. it also introduces runtime
  compilation. speed-wise this is no problem, since all dictionary
  lookups are postponed till later anyway, but conceptually it is
  different again.

maybe the latter is the lesser of the 2 evils. 'lambda' still needs
to be a parser exception since it needs to capture the parse
environment. so lambda is really delayed parsing. maybe that makes
more sense. ok, following this:

- the argument needs to be symbolic, not a quoted program. (raw source)
- nested lambdas will work
- the run time part is called 'apply'

now, what does a compiled lambda expression look like?
  '(A B C) '(foo B bar) lambda -> (bind-C bind-B bind-A foo B bar)

that's the easy part. now, where is the storage? clearly, storage is
a runtime thing, so we can change the code to:

  (alloc bind-C bind-B bind-A foo B bar)

now 'B', for which code is generated at compile time, needs to know
where to find this storage. what about just putting it on the top of
the stack, and modifying all code that's not accessing the parameters
to ignore the bindings? some problems here with passing the lexical
state to subprograms.. wait: this is always done by 'parse'. it's ok
to think about lexical scope as dynamic scope of the parser. but...
passing stuff on the data stack is kind of dangerous, since all
subforms which have lambdas will do the same, so how do the inner
forms find the values of the outer variables? the only real solution
is probably to have the interpreter pass around an environment
pointer.. maybe that's a good point to just stop, and leave out
lambda entirely.

Entry: monads
Date: Thu Feb 8 19:18:49 GMT 2007

i guess it's safe to say that 'bind' really is 'lift' as i defined
it: take a function that maps values from outside into the monad, and
turn it into a function that can be composed.

Entry: lambda again
Date: Fri Feb 9 10:37:59 GMT 2007

let's see.. what does lambda do? actually two things:

* functions as values (delayed evaluation)
* locally (lexically) defined names

i already have the first one as quoted programs. so the problem i
should be solving is not the lambda problem comprising both
subproblems, but only the latter subproblem: lexical variable
binding. this is forth's "locals".

some more ideas: write the interpreter in Y-combinator form
(CPS?). this would allow the interception of invocations, basically
allowing any kind of binding of the state that's passed
around. maybe this is the interesting problem for today?

btw. i ordered friedman's "Essentials of Programming Languages". got
the first edition very cheap on amazon. Now reading "The Role of the
Study of Programming Languages in the Education of a
Programmer." Done. Gives me a bit of good faith that i'm on the right
track. I just need to study and experiment more.. and learn to smell
ad-hoc solutions. One of the things the paper mentions is that it is
a good thing to learn to implement your own abstractions / language
extensions / ... and to invest some time into learning the general
abstract ideas behind language patterns, mainly (automatic)
correctness preserving transformations. It looks like the approach
Friedman suggests is kind of radical. I've been doing this from a
Forth and Lisp perspective for quite a while now, but it looks like i
am getting stuck in certain simple paradigms. Rewriting BROOD kicked
me out of that and made me think about better approaches, adopting
pattern matching, a static language and lazy compilation. The idea
with PF as one of the BROOD targets is probably a good one. It's
going to be a hell of a problem to tackle though.

things to try:

- convert the dynamically bound code in BADNOP to something i can run
  on the new core.ss: this approach seems like a nice one and i can't
  really say why.. there's the idea that dynamic binding is bad, but
  it's quite handy from time to time (i use it in PF C code all over
  the place). why is this? and what should be the proper construct?

- see what CPS can bring. for one, it should make control structures
  a lot easier to implement. so THAT is what i was looking
  for. obvious in hindsight. but how to do this practically?
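[EDIT: a minimal sketch of what "practically" could look like: thread
an explicit continuation through the interpreter, so primitives get
to see it. names are stand-ins, not the real core.ss:

(define (run code stack k)
  (if (null? code)
      (k stack)
      ((car code) stack
                  (lambda (s) (run (cdr code) s k)))))

;; every primitive now has the shape (stack k) -> result:
(define (p:dup  stack k) (k (cons (car stack) stack)))
(define (p:drop stack k) (k (cdr stack)))

;; (run (list p:dup p:drop) '(1) (lambda (s) s)) => (1)

a control word can now do something with k other than just calling
it: save it for backtracking, drop it for an exit, etc.]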
Entry: re re re
Date: Fri Feb 9 16:20:40 GMT 2007

so, next actions.

1. is scoping important / feasible / desirable?
2. should i solve the assembler purely monadically?

one great advantage of NOT using static (or dynamic) scoping is the
independence of context. it does make a whole lot of sense to
actually just write the components as simple functions, and combine
them later. what i have already is the core of the assembler: simple
n-argument functions generated from an instruction set table. these
functions return a list of opcodes generated from this
instruction. currently this is executed as:

(define (assemble lst)
  (map
   (match-lambda
     ;; delay assembly
     (('delay . rest) rest)
     ;; assemble
     (((and opcode (= symbol? #t)) . arguments)
      (apply (find-asm opcode) arguments))
     ;; already assembled
     (((and n (= number? #t)) . rest) `(,n ,@rest))
     ;; error
     (other (raise `(invalid-instruction ,other))))
   lst))

instead of writing this as a map which is independent, i should write
it as a for-each (an interpreter which accumulates state changes). ok,
that was easy enough: the interpreter is split into 2 parts: one that
does pure assemblers (independent of state), which are the ones
generated from the instruction set table, and one that does impure
ones. now for the disassembler. it's probably easiest to organize
this as a binary tree decoder. the argument decoding could be done
working on the binary representation string.

Entry: values
Date: Fri Feb 9 20:23:22 GMT 2007

i never understood why 'values' would be useful. well, i think i
understand now.. to compose 2 functions A and B

  A (x y z) -> (x y z)
  B (x y z) -> (x y z)

one would need to write (apply B (A 1 2 3)), with A returning a
list. using values this becomes something like:

;; values
(call-with-values
    (lambda ()
      (call-with-values
          (lambda () (values 1 2 3))
        (lambda (x y z) (values (+ x 1) (+ y 1) (+ z 1)))))
  (lambda (x y z) (values z y x)))

;; lists
(apply (lambda (x y z) (list z y x))
       (apply (lambda (x y z) (list (+ x 1) (+ y 1) (+ z 1)))
              (list 1 2 3)))

i'm not convinced about the values thing.. lists are easier for
debug: they don't require a special call. i think what's easier to
read is a straight composition, where every function passes a list to
the next one, which is then appended to a list of arguments, like
this:

  (chain `(,ins ()) (dasm 1) (dasm 2))

maps to

  (apply dasm (append '(4) (apply dasm (append '(4) `(,257 ())))))

(define-syntax chain
  (syntax-rules ()
    ((_ input (fn args ...))
     (apply fn (append (list args ...) input)))
    ((_ input (fn args ...) more ...)
     (chain (chain input (fn args ...)) more ...))))

(chain `(,257 ()) (dasm 4) (dasm 4))

ok. i got the disassembler body working. now i still need to do the
search.. this binary tree search looks fancy but is it really
necessary? might even be simpler actually. ok. i need some binary
trees for that.. just made some code, but it's kind of clumsy: the
tree is created on the fly if some nodes do not exist. less
efficient, but easier to do, is probably to generate a full tree, and
then just use set to pinch off a subtree somewhere. ok.. dasm seems
to work. some minor issues with parsing multiple word instructions
though.. will have to change the prototype. so the next step is to
move some code to runtime, and to unify the dasm and asm: basically
they do the same thing: convert between bit strings and lists. the
real 'problem' is the permutation of the formal symbolic parameters
into the order they occur in the bit string.
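[EDIT: a sketch of how one declarative field spec could drive both
directions, handling the permutation by naming the fields. the spec
format here is made up; the movwf encoding (0110 111a ffff ffff) is
from the PIC18 datasheet, if i read it right:

(define movwf-spec '((#b0110111 7) (a 1) (f 8)))

(define (pack spec env)                    ;; asm: names -> word
  (let loop ((s spec) (word 0))
    (if (null? s)
        word
        (let* ((field (car s))
               (v (if (number? (car field))
                      (car field)                  ; fixed opcode bits
                      (cdr (assq (car field) env)))))
          (loop (cdr s)
                (+ (* word (expt 2 (cadr field))) v))))))

(define (unpack spec word)                 ;; dasm: word -> names
  (if (null? spec)
      '()
      (let* ((rest (apply + (map cadr (cdr spec))))
             (v (quotient word (expt 2 rest)))
             (w (remainder word (expt 2 rest))))
        (if (symbol? (caar spec))
            (cons (cons (caar spec) v) (unpack (cdr spec) w))
            (unpack (cdr spec) w)))))

;; (pack movwf-spec '((a . 0) (f . #x25))) => #x6E25
;; (unpack movwf-spec #x6E25)              => ((a . 0) (f . #x25))
]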
Entry: asm/dasm cleanup
Date: Sat Feb 10 09:34:48 GMT 2007

fix the multiple instruction problem: it's probably easier and
cleaner to have one symbolic instruction correspond to exactly one
binary word. all the targets i have in mind are
risc-like. multiword instructions are then handled as multiword
opcodes. once this is done, the asm and dasm pack/unpack could be
combined into one single 'interpreter'. ok. maybe it's best to stop
here. it's not 'perfectly clean' but i guess what's left of the
dirtiness can easily be cleaned up when i encounter another
instruction set that's not compatible with this approach.

another thing i need to consider, or at least need a 'reason for
ignorance' for, is: "why am i not generating pic assembly code?". the
reasons are:

1. full control
2. have dasm available in the core for debug
3. easier incremental assembly & linking

adding support for text .asm output is rather trivial.

ok... next: branches. the two passes, fairly simple.

1. backward branches can be immediately resolved.
2. forward branches need to be postponed.

this is a combination of the directives 'relative', 'absolute' and
'delay'.

Entry: PIC18 compiler
Date: Sat Feb 10 12:44:07 GMT 2007

time for the crown jewel :) but first, i need to clean up the core.ss
register code to accept an abstract store with a default. ok,
done. i don't like the way i've got the generic register compiler and
the PIC18 compiler completely separated. it is good to share code,
but in this case the sharing can probably be done better by just
copy/pasting the patterns, or at least inserting them from a common
include. what about keeping the register compiler as a general
purpose example and figuring out how to do proper sharing once i have
different architectures running? yep.. i think it's best to keep that
idiomatic compiler for other experiments, and go straight for a
proper pattern matching peephole optimizer.

Entry: more PIC18 compiler
Date: Sun Feb 11 09:10:13 GMT 2007

i think i made a mistake by writing it as just a pattern compiler..
this thing should be a proper language with recursion, otherwise i
can't implement recursive substitution macros and other language
patterns: one machine that maps forth straight to asm. the only
preprocessing stage should be the reducer, which folds expressions
like '1 2 +'. even better, this reducer should be part of the
compiler too, so that expanded macros benefit directly from it.
summarized: separate reduction and expansion phases might lead to
suboptimal performance: it's probably best to condense all this into
a single phase, and make an extensible pattern matcher. this would be
the same design as before. there are more of these: the little
interpreters for macro mode etc.. it was pretty good already it
seems. just the global variable thing was a mistake. ok.. it probably
pays to make the pattern matcher programmable. add a minilanguage
there too.

NEXT:
* control structures
* extend the pattern language

the latter is not so trivial in the current implementation since a
nice thing to have would be a 1 -> many mapping. i could use a
special 'splice' word for this though. maybe it's best to work around
this though. another thing i'm thinking is: now that i'm no longer
afraid of this pattern matching business, why don't i write my own?
this would make it possible to do some of this at runtime, making it
a bit more flexible for additions etc.. time to take a break.

ok.. what i have is 2 conflicting operations: a pattern replacement
and a reverse. this needs to be sorted out properly: what exactly do
i want the programmable part to do? ok.. it seems to work now. needs
cleanup. i'm really curious about runtime though.. probably these are
all written in terms of the syntax expander, and need to be syntax?
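[EDIT: for reference, the reducer as a separate phase — the version
called suboptimal above — is just a couple of match rules. a minimal
sketch using plt's match.ss, with only the '+' rule:

(require (lib "match.ss"))

(define (reduce code)
  (match code
    (() '())
    (((? number? a) (? number? b) '+ . rest)
     (reduce `(,(+ a b) ,@rest)))      ; fold, then re-examine
    ((ins . rest) (cons ins (reduce rest)))))

;; (reduce '(1 2 + 3 +)) => (6)

folding it into the compiler itself means running rules like this one
interleaved with macro expansion instead of as a preprocessing pass.]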
Entry: merging dictionaries
Date: Sun Feb 11 14:10:36 GMT 2007

i'm trying to port the intelligent macros now. long standing
problem.. should you merge macro and bottom language dictionaries, or
keep them separate? i think the best way to go is to manually import
or link what you need.

about variables and allocation: i think it's easier to just use
variable names for this, and shadow them when they are changed. then
after a compilation is done, the whole dictionary can be
filtered. the other option is to use a functional store like before,
which might be a good idea anyway.

NEXT:
- functional store (it's cleaner, and might come in handy later)
- conditionals + optimization
- for loops + loop massage

Entry: stateful macros
Date: Mon Feb 12 01:25:53 GMT 2007

let's see if i actually learned something.. basically, i have two
options now: write all the macros as explicitly handling the asm
buffer, or have them spit out just a list of instructions. i don't
think there is any code that has to look back at past asm state: all
words that do that are written as pattern matching partial
evaluators. so, let's write all control structs as producers, just
like the other macros. so. i think i sort of disentangled the
problem:

------------------------------------------------------------------
If there is a lot of state that has to be dragged along, split all
operations into classes that operate only on substates, or have a
simple, consistent way of operating on state, like concatenation.
Then, lift all these subclasses to a common interface that can be
composed.
------------------------------------------------------------------

The thing i'm using is really the Writer Monad.

Entry: monads
Date: Mon Feb 12 00:53:53 GMT 2007

about a year ago i made a decision to use a functional dynamic store
to solve the problem of state, because i didn't understand the idea
behind monads. this was a mistake, but i guess a necessary one. i
probably wasn't ready for the ideas at that time. now i think i sort
of get it. monads (haskell style) are about dragging along state
implicitly. the irony is, i implemented that! what i did was to have
an implicit state object being dragged along as a top of stack
element, invisible to some computations. this is the 'State
Monad'. the mistake is: this is too general. it's better to use a
smaller gun to solve the problem at hand on a more local scale,
instead of basically using a state machine model (albeit one without
destructive mutation). the small gun is mostly related to the 'Writer
Monad'. the operation that's made implicit is 'append'. i call this
'lift-stateful'. this, together with some other state dragging (if
the data stack is not used, it can be dragged: some operations, like
the pattern matching peephole optimizer, work on the produced code as
a stack.)

the thing that's really interesting though is this: if you start to
think about forth as a compositional language, then this whole monad
thing is nothing more than a way to 'lift' words so they can be
composed in linear code. basically, if the things you want to compose
are operations A x B -> A x B, but what you have is operations like

  A -> A
  B -> B
  A -> B
  A -> A x B
  B -> A x B
  A x B -> A
  ...
together with a higher order function (hof) that will correctly lift
them to A x B -> A x B, then what you're doing is abstracting away
the trivial parts of such a map in this hof. for the writer monad,
the trivial part is 'append'. replace 'trivial work' with 'hard work'
and you get this:

http://lambda-the-ultimate.org/node/1276#comment-14113

"By using a monad, a simplified interface to the necessary
functionality can be provided, while the hard work of maintaining and
passing the context is handled behind the scenes."

so, what i need to do is to work out some abstractions so i can
perform this kind of magic in straight cat without having to resort
to scheme code.

Entry: backtracking
Date: Mon Feb 12 08:35:51 GMT 2007

in 2.x there are the for .. next macros that perform an optimisation
for which a decision has to be made early on. does it make sense to
use 'amb' for this? probably yes, because explicit undo is going to
be more expensive than just going back to a previous point and
re-running the compilation.. the tricky part is to keep it under
control :) in an interactive interpreter, where state can be
accumulated on the stack, having lingering continuations in the
backtracking stack might be dangerous, since 'fail' effectively
erases all changes made since the last success. i've provided 2
lowlevel words:

  kill-amb!  reset the backtracking engine
  amb        make a nondeterministic choice from a list

the code in amb.ss supports (possibly infinite) lazy lists in case i
ever need them. so. let's make 'amb' binary. this way it's easier to
implement lazy amb by embedding another call to amb in one (or both)
of the alternatives. yep. this looks like a better idea.

haha. keep it under control! i've just been chasing a 'bug' where amb
apparently didn't return properly; however, it was just waiting for
input: the continuation had a 'read' in it, and the fail depended on
a previous read, so it just wanted that read again. so, conclusion:

-------------------------------------------
be careful with amb and non-functional code
-------------------------------------------

i fixed the 'cpa' "compile print assembler" loop to read lines
instead of words, so at least the backtracking is ok on a per-line
basis.

Entry: commutation
Date: Mon Feb 12 16:44:30 GMT 2007

there are a lot of places where just swapping the order of
instructions might be beneficial. i ran into a bug where it is not
possible, although on first sight the operations seem independent:

  ((['movlw f] 1-!) `([decf ,f 0 1] [drop]))

because 'decf' has an effect on the flag that's used in the macro for
'next', this is not always correct! drop, being movf, sets the Z,N
flags. however, decf sets the carry flag, so this could still be
used. for now, i've disabled the optimization..

Entry: next actions
Date: Mon Feb 12 16:52:47 GMT 2007

- conditions
- variables
- constants in assembler

a variable allocation is just a dictionary operation, so it really
should be an assembler step. i need to think about that a
bit. something's wrong...

Entry: bored
Date: Mon Feb 12 23:18:29 GMT 2007

let's play a bit. generators.. a generator is easiest understood as
something which, when activated, returns a generator and a value. in
other words: a generator is a lazy list.

  (((3) 2) 1)

is a finite generator. manfred von thun has an interesting page about
using reproducing programs as generators:

http://www.latrobe.edu.au/philosophy/phimvt/joy/jp-reprod.html

i wonder how to do this in lisp?
suppose fn is a state update function, so (fn init) -> generator:

(define (gen fn init)
  (lambda ()
    (cons init (gen fn (fn init)))))

in cat it's quite simple too:

  (gen (2dup run swap gen) qons qons)  ;; (init fn -- gen)

as mentioned by manfred, this is related to the Y-combinator:

http://www.latrobe.edu.au/philosophy/phimvt/joy/j05cmp.html

basically, a generator or lazy list is a delayed recursion. so in
cat, applying 'run' to a lazy list has the same result as applying
'uncons' to a list.

Entry: misc ramblings
Date: Tue Feb 13 12:06:14 GMT 2007

i'm going to change terminology a bit so it's more Joy like, if only
for the reason that it makes joy code easier to read.

  duck -> dip

http://www.nsl.com/papers/interview.htm

"There is a ... combinator for binary (tree-) recursion that makes
quicksort a one-liner:

  [small] [] [uncons [>] split] [swapd cons concat] binrec"

then for-each: i need to find the more abstract pattern, which is
'fold'. what about a fold over a lazy list?

Entry: lazy lists
Date: Tue Feb 13 12:22:52 GMT 2007

right now i use them in (amb-lazy value thunk), where 'value' is
returned immediately, and thunk will be evaluated later. the question
remaining is that of interfaces. if i say "a lazy list", do i mean
thunk or (val . thunk)? (there is another question about using
'force' and 'delay' instead of explicit thunks. for functional
programs there is no difference, but for imperative programs there
is. maybe stick to thunks because they are more general.) i think
'amb-lazy' should be seen as a 'cons' which contains only forcing,
and leaves the delay operation to the user. i provide 'amb' to
construct a full list from this. unrolled it gives:

(amb-lazy first
  (lambda ()
    (amb-lazy second
      (lambda ()
        (amb-lazy third
          (lambda ()
            (fail)))))))

for generic lazy lists: maybe using 'force' and 'delay' is better,
since it allows 'car' and 'cdr' to trigger the evaluation. this
enables the definition of lazy-car and lazy-cdr without fear of
multiple evaluations that have different results, and it still allows
for non-functional lists. ok.. cleaned it up a bit, and moved most of
it to lazy.ss. lazy operations have a '@' prepended to the name of
the associated strict operations. i have @map, but @fold doesn't make
sense since it has a dependency in the wrong direction. i should also
change ifold to something else.. there has to be a proper lisp name
for it. i renamed it to 'accumulate'. makes more sense.

  (accumulate + 0 '(1 2 3))

the corresponding lazy processor makes sense, but only if it returns
the accumulator on each call. so it's more like 'accumulate-step'.
it's better to just create the @integral constructor, which gives a
new list from an old one.

Entry: had this idea
Date: Tue Feb 13 21:28:02 GMT 2007

can you do something like:

1. resolve label
2. oops, can't do. save 'cons' but continue
3. run all pending conses with the obtained info.

now, this isn't much different than storing all unresolved symbols in
a table and fixing them later, only this stores actions. (don't set a
flag, set the action!) more specifically, suppose there's the input

  x x x y z z z

where y is not resolvable. the way to solve this is to have y run
z z z and then try to resolve y and concatenate the results. basically
just swapping the order of execution.. something that could be done
is to make the assembler essentially 2-pass, where the first pass
performs normal assembly, but on the fly creates its reverse pass
which just resolves the necessary items and works its way
backwards. talking about overengineering :) a simple 2-pass is
probably good enough. but still.. this is more efficient, since the
reversing which would happen in an explicit 2-pass is not necessary,
plus the scanning of things already compiled can be avoided. so:

  x1 x2 x3 lx y1 y2 ly z1 -> (... z1 (ly y2 y1 (lx x3 x2 x1 (...))))
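[EDIT: the "set the action" part in its smallest form, with global
state just to keep the sketch short; all names are made up:

(define pending '())

(define (forward-ref! patch)        ;; can't resolve: store the action
  (set! pending (cons patch pending)))

(define (label-resolved! addr)      ;; label found: run pending actions
  (for-each (lambda (patch) (patch addr)) pending)
  (set! pending '()))

;; example: leave a hole in the output, patch it later
(define code (vector 'nop 'hole 'nop))
(forward-ref! (lambda (addr) (vector-set! code 1 addr)))
(label-resolved! 42)
;; code => #(nop 42 nop)
]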
Entry: backtracking -> an argument against dictionaries as sets
Date: Wed Feb 14 10:50:57 GMT 2007

another thing i didn't think about.. what's the actual cost of the
continuations? i don't think it's much, because the data is mostly
shared: asm is just appended to until it's completely finished, and
the code list is just run sequentially. there's no rampant data
copying going on: the garbage is created only at the compile end. so,
it might actually be better to NOT keep dictionaries stored as sets,
but just as shadowed association lists, to make backtracking memory
efficient (in case i want to create lots of choice points). the
redefining of 'current allocation pointers' tends to re-arrange and
copy things on functional stores..

Entry: bit instructions
Date: Wed Feb 14 15:58:04 GMT 2007

there are a lot of bit instructions that are better handled in a
consistent way. one of the problems with the assembler is that bit
set and bit clear have different opcodes. i think it makes more sense
to handle them as one opcode + argument. all bit instructions are
polar: they take another 'p' argument, so they can be easily flipped
as part of the rewriting process. the extra argument is placed first,
to make composition easier.

  bcf bsf     -> bpf
  btfsc btfss -> btfsp
  bc bnc      -> bpc
  ...

ok, it seems to be solved with a set of pattern matching macros, and
a factored-out label pusher :)

;; two sets of conditionals
((l: f b p bit?) `([bit? ,f ,p ,p]))  ;; generic -> arguments for btfsp
((l: p pz?)      `([flag? bpz ,p]))   ;; flag -> conditional jump opcode
((l: p pc?)      `([flag? bpc ,p]))

;; 'cbra' recombines the pseudo ops from above into jump constructs
((['flag? opc p] cbra) `([r ,opc ,(invert p) ,(make-label)]))
((['bit? f b p] cbra)  `([btfsp ,(invert p) ,f ,b]
                         [r bra ,(make-label)]))

then we have the recursive macro

  (if == cbra label)

and the pure cat macro

  (label == dup car car swap)

a lot more elegant than the previous solution. i like this pattern
matching approach.

NEXT:
* variable and other namespace stuff
* forth lexer
* parsing words
* intel hex format

Entry: forth lexer + parsing words
Date: Wed Feb 14 21:40:13 GMT 2007

which is of course really trivial. see lex.ss. i'm not doing '(' and
')' comments again, just line comments '\'. i think i know why i
always had problems with my ad-hoc parsers and word termination etc..
splitting into lexing and parsing makes sense, because the first one
is purely flat, while the second one can be recursive. it helps when
in the 2nd phase there are no more stupid problems with word
boundaries.. parsing words are, well, extensions of the parser :)
since these will make things move away from straight 1->1 parsing,
the parser needs to be rewritten as a recursive process / fold. ok,
the scaffolding is there: written in terms of
reverse/accumulate. now i need to really think about how to solve the
'variable' problem.

-> how to solve parsing words?
-> where to do the actual allocation?

Entry: ihex
Date: Thu Feb 15 00:26:02 GMT 2007

this used to be written in CAT, but was a mess. it's one of those
simple things that are hard to express in a combinator language
because they drag along so much state if you want to do them in one
pass. again, they are merely about re-arranging data! maybe i should
just try it again, but using a multipass algo, just to see if i
learned something.. on the other hand, this would be nice to have as
scheme code, so i can use it outside of the project. ok.. it seems to
work fine. got some binary operations for free that can be used in
the loader too.
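[EDIT: for reference, one intel hex data record (type 00) really is
just re-arranged data plus a checksum. a minimal sketch; 'hex2' and
'ihex-record' are made-up names, and the digits come out lowercase
here:

(define (hex2 n)                           ;; byte -> 2 hex digits
  (let ((s (number->string n 16)))
    (if (< n 16) (string-append "0" s) s)))

(define (ihex-record addr bytes)
  (let* ((fields `(,(length bytes)
                   ,(quotient addr 256)
                   ,(remainder addr 256)
                   0                        ;; record type 00 = data
                   ,@bytes))
         (sum (apply + fields))
         (check (modulo (- sum) 256)))      ;; two's complement checksum
    (apply string-append
           ":" (map hex2 (append fields (list check))))))

;; (ihex-record 0 '(#x02 #x33)) => ":020000000233c9"

the end-of-file marker is the fixed record ":00000001FF".]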
again, these things are about merely re-arranging data! maybe i
should just try it again, but using a multipass algorithm, just to
see if i learned something.. on the other hand, it would be nice to
have this as scheme code, so i can use it outside of the project.

ok.. it seems to work fine. got some binary operations for free that
can be used in the loader too.

Entry: parsing words
Date: Thu Feb 15 09:55:27 GMT 2007

so, i need:

  : variable 2variable constant 2constant

what's different from the previous implementation is that i have a
separate compile (parse) and execute phase, so parsing words cannot
be compilation macros. on the other hand, parsing words are always
about quoting things, mostly names, so probably a simple table
mapping names to a number of quoted symbols is enough. limiting the
number of symbols to one makes it even easier.

sort of got something going here with variables and constants, but
there's another problem:

Entry: dictionaries
Date: Thu Feb 15 12:01:35 GMT 2007

i'm using a hash table to store 'core' macros: those that are fixed.
however, a forth program can create macros, so these need to be
defined somehow.. maybe make that a bit more strict? the same goes
for constants: i'm using fixed machine constants in a hash table, and
some user defined stuff in other places. this needs some serious
thinking..

constants can be implemented as macros which inline literals. so the
only remaining question is: how to handle macros?

macros are really compiler extensions. they are a property of the
host, not of the target code. it would be really inconvenient to have
to split a project into two parts, so i should aim for macro defs
inside source files. however, a clear distinction needs to be made
between host and target things:

  target properties are related to on-chip storage == addresses
  host properties are related to code generation only

the result is that there are 2 possible actions on a source file:

  - reload macros + constants
  - recompile = realloc code and data

to track the state of a project, the only thing that needs to be
saved is the source code + a dictionary of target addresses. all the
rest (macros) can be obtained directly from the source code.
actually, this is a lot better than the old approach, where macros
are stored in a project state file.

Entry: new badnop control flow
Date: Thu Feb 15 12:28:56 GMT 2007

in  = project source code
out = compiled target code + dictionary

1. PARSE EXTENSIONS

   Read all source files and extend the compiler to include the
   macros and constants defined in the source files. This effectively
   builds a new special purpose compiler for the code in the project.

2. COMPILE CODE

   Convert all code definitions and data allocations to a form that
   is executable by the CAT VM, and run this code. This generates
   optimized symbolic assembly.

3. ASSEMBLE CODE

   In a two-pass algorithm, convert the symbolic assembly to binary
   opcodes, allocating and resolving memory addresses. This process
   uses the current dictionary, reflecting the state of the target,
   and produces a new dictionary and a list of binary code.

Entry: parse extensions (borked)
Date: Thu Feb 15 13:13:17 GMT 2007

and there i go, writing a parser state machine again! amazing what a
not-so-good night's sleep does.. let's do this a bit more
intelligently, using my favourite one-size-fits-all hammer: pattern
matching! seriously, the syntax is really simple, so i shouldn't be
writing a state machine, just a set of match rules.

one thing though: how to extend it?
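for concreteness, a sketch of the kind of extension table i have in
mind -- hypothetical names, not the actual brood code:

  ;; a parse word maps a name to a handler that consumes tokens from
  ;; the input stream and returns cat code + the remaining stream.
  (define parse-words (make-hash-table))

  (define (define-parse-word! name handler)
    (hash-table-put! parse-words name handler))

  ;; 'variable' quotes the next token: the name to allocate.
  (define-parse-word! 'variable
    (lambda (in)
      (values `((quote ,(car in)) allot-variable)
              (cdr in))))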
previous brood needed parse words to be written explicitly. i should
do that now too: just a dictionary of parse words that output a chunk
of cat code and the remainder of the input stream.

Entry: forth parser - different pattern
Date: Thu Feb 15 18:45:52 GMT 2007

ok. got some sleep. the thing is that this is a different pattern
than all the other things i've been doing. the previous pattern
matching code for the assembler is basically a partial evaluator,
which looks backwards instead of forwards. so this needs new code! in
short, i need a different kind of parser, or a preprocessor to map
forth -> composite code.

let's try to arrange the thoughts a bit, since i feel i'm not seeing
something really obvious.. i have an urge to write the parser as a
state machine, or as a pattern matcher. both of them seem to lead to
code with a similar kind of complexity, but with some obvious
redundancy. i can't see the higher level construct.

ok.. what i'm missing here is elements from SRFI-1. it's quite clear
what i want to do: generic list pattern substitution. so basically,
the prototype is

  (in) -> (in+ out)

with (out) being concatenated. let's call this the 'parser' pattern,
and write an iterator for it. ok. it needs a bit of polish, but the
idea is there i think..

Entry: ditching interpret mode
Date: Thu Feb 15 21:33:13 GMT 2007

what about ditching interpret mode and relying fully on partial
evaluation? i can use the following trick: the partial evaluator does
NOT truncate results to 8 bit during evaluation, only after. so in
principle, there is a complete calculator available with the full
numeric tower.

maybe it's good to create some highlevel constructs for the partial
evaluator. literals are still encoded as symbolic assembly, which is
ok, only somehow a bit dirty. this is effectively a second parameter
stack.. to make this more explicit, the macros 'unlit' and '2unlit'
are defined. these will reap literal values from the asm buffer and
move them to the parameter stack. the implementation of these macros
is split into two parts: a pattern matching part, and a generic macro
part '_unlit'.

Entry: more parsing
Date: Fri Feb 16 10:23:00 GMT 2007

so the basic infrastructure is there, now i just need to figure out
how to put the pieces together. this host/target separation needs
some more thought. the problem i'm facing atm is 'constant'. this
should define a constant as soon as it's parsed, but the value comes
from partial evaluation, which happens at macro execution time!

maybe i shouldn't really care about this 2-pass stuff.. i can just
compile code for its side effects, being the definition of macros..

another thing, related to the comment about the asm buffer being a
second parameter stack: why not compile quoting words as literals
instead of loading them on the data stack? this way a simple pattern
matching macro can be used to implement the behaviour of parsing
words.. i have to be careful though, since this arbitrary freedom
must have some hidden constraint somewhere.. the hidden constraint is
of course: literal stack encoding is machine-dependent! it's actual
assembler, dude!

maybe keep it the way it is. however, 'forth-quoted' feels wrong.
also the combination of literals coming from the asm buffer and the
symbol coming from the stack feels awkward. but it does seem to be
the right thing.. anyways.. it seems to work now.

Entry: dictionary
Date: Fri Feb 16 14:07:09 GMT 2007

so the only thing that's remaining is the runtime dictionary stuff:
variables (ram allocation) and associated things.
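the allocation part is just bookkeeping. a minimal sketch, assuming a
single linear ram space -- 'allot-variable' and the globals are
made-up names:

  ;; the ram dictionary maps names to addresses; 'allot' just bumps
  ;; an allocation pointer.
  (define ram-here 0)
  (define ram-dict '())

  (define (allot-variable name size)
    (set! ram-dict (cons (cons name ram-here) ram-dict))
    (set! ram-here (+ ram-here size)))

  ;; (allot-variable 'x 1) (allot-variable 'y 2)
  ;; => x at address 0, y at address 1, ram-here = 3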
mark variable names as literals during parsing. done.

i'm still not sure whether the mutating operations are such a good
idea.. maybe a separate macro parsing stage is better after all.. as
far as i understand, the thing which makes this difficult is the way
'constant' works: it depends on runtime data (the partial evaluator),
so the definition needs to be postponed.. what about using some
delayed evaluation here? or i can use the same trick: reserve the
name so it can be treated as a literal, but fill in the value later?

so, on to the fun stuff.. dictionaries. basic functionality seems to
work using the 'allot' assembler directive.

Entry: parse time macro definition
Date: Fri Feb 16 14:53:23 GMT 2007

what if i can:

  - define all macros
  - reserve all constant/variable names (which are just literal
    macros)

during parsing only, and fill them in whenever the data is there? the
problem is how i'm handling 2constant now.. this can be fixed with a
gensym. ok. this looks doable, but not essential. something for
later.

Entry: forth loading and machine memory buffer
Date: Fri Feb 16 17:53:05 GMT 2007

two things i just did:

  - added a function to load symbolic forth code
  - draft for memory stuff

need to figure out where to do 'load'. load is a quoting parser, then
just executes..

Entry: optimizations - need explicit unlit
Date: Fri Feb 16 23:40:27 GMT 2007

i'm running into several conflicting eager optimizations, which is
normal of course.. i was thinking about making this a bit more
manageable by prefixing operations that have a lot of different
combinations with virtual ops that will just re-arrange things for
the better.. the most frequent mistake is to combine a dup with a
previous instruction so the lit doesn't show any more. i think in 2.x
there is an explicit 'unlit' that puts the drop back..

ok. this pattern matching is definitely an improvement for writing
readable code, but it does pose some problems here and there..

TODO:
  - intelligent then
  - better literal opti (unlit)
  - port the monitor
  - device specific stuff
  - code memory buffer
  - host side interpreter

Entry: optimization choices
Date: Sat Feb 17 10:37:12 GMT 2007

instead of having 'stupid' backtracking, it might be easier to do
'intelligent' backtracking. this means: at some point a choice is
made, but if at a later time it is realised this choice is the wrong
one, then this particular choice needs to be changed. the pattern i
encounter is this:

  1. do eager optimization
  2. realize later this optimization was not optimal
  3. undo previous optimization to perform a better one

every time there is an 'undo', this could be solved by an automated
backtracking step. what about a sort of 'electric save'?

(it would also be interesting to somehow 'cache' the choices that
have been made in the mean time, so when a whole subtree is executed
again, the right choices are made first..)

interesting stuff :) it looks like the search space is not really a
tree, but more like a snake line: 10010011001, where at some point
one of the choices is deemed wrong, for example 10010x11001. the
remaining part 11001 then needs to be re-done, but starting from the
same pattern might be an interesting optimization.

another thing is the storage of choices. backtracking needs a stack
to operate. well, i already have one! the asm buffer serves that
purpose quite adequately. this also solves the problem of the
backtracking using mutable state.
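with a functional asm list, a choice point does not need any copying.
a hedged sketch of the idea (names are made up):

  ;; a choice point pairs the eager result with a thunk that restarts
  ;; from the saved asm list and takes the conservative branch
  ;; instead.  nothing is copied or mutated: the thunk just retains
  ;; the old list.
  (define (choice-point optimize fallback asm)
    (cons (optimize asm)
          (lambda () (fallback asm))))

  ;; undo = force the retry thunk of the offending choice point
  (define (retry cp) ((cdr cp)))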
on the other hand, working purely algebraically does have the
advantage of simplicity, but it requires the explicit construction of
inverse operators.

Entry: literal opti
Date: Sat Feb 17 11:33:49 GMT 2007

instead of making pe operate on DUP MOVLW, let's make it work on
MOVLW only, so the extra SAVE is not necessary.

hmm.. i'm going in loops. the thing is that i'm using the literals in
the asm buffer really as a compile time stack. simply making the
partial evaluation respect 'save' would keep that paradigm working.
otherwise the DUP in front of MOVLW (DUP MOVLW) needs to be handled
explicitly every time, by a recombining DROP operation, which is
really no different from handling SAVE properly... so back to the
original solution.

to keep everything as pattern matching macros, i could also run an
explicit recombination after the literal operations.. quick and
dirty. wait a minute. i can just dump code in the asm buffer, and add
a bit to the pattern macro to check for this and execute it. then the
only problem is: quoted code or primitives? probably primitives are
best, since they are already packed into one item and don't need
'run'. ok. that seems to work just fine :)

Entry: monitor
Date: Sat Feb 17 16:22:13 GMT 2007

ok.. seems i'm almost at the point where i can compile the full
monitor code. some things are missing, like the chip specific
configs, but i can see that the partial evaluator is going to help
quite a lot to keep things simple: more things can be configured in
the toplevel forth source code file instead of a lisp style project
file.

something that needs to change though is support for 'org'. this
probably means that assembly code needs to be tagged somehow. ok. org
is simply solved by embedding (org ) in binary code.

Entry: intelligent then
Date: Sat Feb 17 19:59:12 GMT 2007

since i don't exactly remember what the code does, and i can't read
the old 2.x code just like that, let's decipher it. the problem is
something like this:

  l4: btfsp 1 TXSTA 1 0
      r bra l5
      r bra l4
  l5:

which comes from

  begin tx-ready? until

which expands to

  begin tx-ready? not if _swap_ again then

the important part is the 'then', which should decide to flip the
polarity of the skip and the order of the two jumps IF the first one
corresponds to the symbol on the stack. this works not only for
branches, but for any single instruction following the forward
branch.

ok. implementation. this doesn't fit the pattern matching shoe, since
the label on top of the stack needs to be incorporated in the check.
however, it is possible to just compile the 'then' and perform the
optimization afterwards, which is possible using a pattern matcher.
ok. this works. i don't check the label though.. should do that, or
prove that it can't be anything else..

Entry: reverse accumulate
Date: Sat Feb 17 20:27:38 GMT 2007

now, something that has been getting on my nerves is the reverse tree
stuff.. there is absolutely no reason for it. the original reason was
to split code into chunks of forth idioms, but i sort of lost that...
this whole reverse tree stuff makes things too complicated, so it has
to go. temporarily i will take out the reverse-tree function.

ok. this seems to have worked. a lot of code is a lot simpler now..
no, there's still a bug. fixed.

Entry: tip
Date: Sat Feb 17 23:44:33 GMT 2007

  (require (lib "1.ss" "srfi"))

yep. sometimes it takes a while to figure out the small things..
another thing: srfi 42 is about list comprehensions (loops &
generators). seems worth a look.
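for reference, the kind of thing these give you -- assuming the plt
srfi collection ships 42.ss; the list-ec form is from the srfi-42
spec, i haven't used it here yet:

  (require (lib "1.ss" "srfi"))    ;; list library
  (fold + 0 '(1 2 3))              ;; => 6, like 'accumulate' above

  (require (lib "42.ss" "srfi"))   ;; eager comprehensions
  (list-ec (: i 5) (* i i))        ;; => (0 1 4 9 16)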
Entry: time to upload
Date: Sun Feb 18 08:37:04 GMT 2007

looks like stuff is in place to start dumping out hex files. so i
need to make an effort not to fall into the same trap as before: it
would be nice to have cat completely in the background, and do
everything from the perspective of the target system. the easiest way
to do this is to use the current debug interpreter, and plug in a
proper 'interpret' mode for interaction.

yes, here there is some confusion. what about interpret mode? do i
switch to compile mode explicitly? i kind of like the colorForth
approach where there is only editing and commands, no command line
editing. everything between : ... ; is always compile mode. the
tricky stuff is what's before that, because i completely rely on
conditional compilation for constants etc.. but constants are really
the only exception. if i make an interpret-mode equivalent of
constant, then i could fake that. on the other hand, a proper compile
vs interpret mode might be a better solution. it is definitely
cleaner. so we converge on this?

  -> compile mode     = exactly the same as what's in files
  -> interaction mode = all the rest

implemented as 2 coroutines.

Entry: state
Date: Sun Feb 18 08:59:04 GMT 2007

at this time, it becomes rather difficult to maintain all the state
on the stack, so i probably need to move to a more general state
monad. basically what i had before in 2.x, but without executable
code. first, let's see what state needs to be accumulated:

  - assembler buffer
  - target dictionary
  - forth code log?

data necessary in different modes:

  compile:   asm buffer
  assemble:  asm buffer + target dictionary
  interpret: target dictionary

i can probably avoid explicit monads (i don't know how to really do
that: i'd have to lift a lot of code!), and just use a main driver
loop that runs the applications with the dictionary dipped.

what i have is a proper class based system:

  - classes are cat dictionaries (implemented as hash tables)
  - inheritance is based on chaining these dictionaries
  - objects are association lists

so that's for later. i'm in no need of objects with encapsulated
behaviour. the only thing i need is a local scope, so it's really
just used as a data structure. this means i can start writing the
main loop of the program, which is basically written as a method
bound to state. the thing i need to be careful about though is tail
recursion. this works with 'invoke'.

now that i'm here.. looks like this is an interesting way to
implement the assembler too: write an object that's a list, and use a
'comma' opcode to compile instructions. thinking about this, there
are really 3 major ways of symbol binding:

  - method: aggregate
  - lexically nested
  - dynamically nested

ok. brace for impact. going to do the asm 'object' thing. ok...
unresolved yet. this is too convoluted, precisely because of the
recursive calls. i'm still thinking dynamic binding here.. but
there's something to say for the idea.. trying again..

TODO
  - i need a better way to create a compiler for compositions:
    (register, parse, name)
  - should have a state base class with just: self self@ invoke

ok. done.

Entry: passing state to quotations
Date: Sun Feb 18 15:32:47 GMT 2007

now for code quotations. how to recurse? the problem is that if
quotations are executed using 'run', they will not obtain the state,
so they need to somehow be wrapped such that running them passes
along the state. is that at all possible? yes: using some kind of
continuation passing,
instead of simply wrapping the code in 'dip'. so:

  - quoted programs need to be parsed recursively
  - they need to be modified such that running them results in the
    object being loaded on the stack
  - it is not possible to override every word that performs 'run' to
    incorporate this behaviour
  - this trick is only LEXICAL: no dynamic binding of words, only
    dynamic passing of state

same old same old.. this goes way back :) the problem is of course in
the shielding. as long as every primitive is really shielded from the
state, there is absolutely no way to access it. so (blabla) dip is
not a good approach. it should be hidden but accessible, not
shielded.

let's do this manually for now: when you want to use quoted code in a
method definition, you have to explicitly parse, compile and invoke
it. the default will be globally bound code only, and shielded
execution, for simplicity. the alternative is to compile quoted code
as a method (recursive parse-state). this is kind of strange since
the invocation has to happen manually: no 'ifte' for example. so
unless i find a way to solve the 'ifte' problem and other implicit
'run' calls, there is no way to do this automatically: this is really
a modification to the core of the interpreter.

so i am going to let go of the scary bits, and conclude:

  * only flat composition is done automatically
  * recursive composition is possible using 'invoke'
  * quoting method code is done manually using a special
    parser/compiler

so it all remains pretty much a state monad. some special functions
can be thrown into the composition to act on the state through some
interface, while the rest is 'lifted' automatically.

Entry: fixing amb
Date: Sun Feb 18 16:15:23 GMT 2007

postponing the real work, i can try to fix amb to make it operate
only on the assembler store. what i need to do to make this work is
to return the continuation explicitly. so amb will do:

  amb-choose ( c1 c2 handle -- c1/c2 ) + effect of handle

here handle will store the continuation on a stack somewhere if c1 is
chosen. if this continuation is called, c2 will be chosen without a
handler.

ok. looks like it's working. still need to strip out the
continuations in the assembler though. done.

Entry: the app
Date: Sun Feb 18 18:14:22 GMT 2007

time to write the main loop:

  - based on the store monad containing:
    - asm buffer
    - forth input stream (per line)
    - state memory
  - written from the target perspective
  - compile mode / interpret mode

ok.. seems i'm at least somewhere. now i need to think about the
design a bit more.. the state stuff is encapsulated in a small driver
loop, the rest is still functional.

Entry: byte continuations
Date: Sun Feb 18 19:07:52 GMT 2007

i was thinking about a way to use more highlevel functions in the
8bit forth. obviously, a jump table can be used to encode jump points
as bytes. but why stop there? the return points can be mapped also,
giving the possibility of encoding the return stack in bytes too, as
long as code complexity is small enough. the compiler could do most
of the bookkeeping. this would make sense in a setting where the code
is simple but the number of tasks is big, since that needs a ram
return stack, which is better implemented as a byte stack anyway.

Entry: application
Date: Mon Feb 19 09:30:44 GMT 2007

some remarks. is bin needed? probably not.. just keeping the
assembler around and generating assembly on the fly is probably best.
the basic editing step is:

  - switch to compile mode, enter/load forth code
  - switch to interpret mode -> code is uploaded

cat should only be for debugging. ok, so CPA = forth compile mode.
this is to edit the asm buffer using forth commands. the asm buffer
is stored in the 'asm file. in CPA mode it is possible to test the
assembly by issuing 'pb'. however, this doesn't use the stored
dictionary.. need to fix that.

ok, what i have now are 2 modes, switched using ctrl-D:

  * compile mode = compiled forth semantics ONLY, not even special
    escape codes for printing asm etc.
  * interpret mode = simulated target console. the target is seen as
    what it actually is + some interface to a server. the language
    used is forth lexed, but piggybacks on cat words.

looks like it's working fine this way. let's keep it.

Entry: literals again
Date: Mon Feb 19 12:05:39 GMT 2007

ok, i need to do this properly. back to the unlit strategy.
basically: try to recover literals one by one, instead of using
massive combined patterns. let's try this:

  lits asm>lit asm>lit

ok. seems to work. still needs some explicit code that might be
optimized, i.e. the literal patching. but i can live with it like it
is now..

Entry: inference
Date: Mon Feb 19 14:27:02 GMT 2007

it should be possible to infer more about the state of the stack,
given there are no jumps from arbitrary places, which is a sane
assumption.

Entry: another day over
Date: Mon Feb 19 18:01:19 GMT 2007

and i'm running it in the MPLAB simulator. it generates correct code
at first sight. so, time to hook it up :) still some features
missing: one of them is proper byte/bit allocation. so TODO:

  - host side monitor
  - state save/load

ok, i'm getting bytes back from the monitor running on the chip. time
to start writing the monitor code.

Entry: dynamic code
Date: Tue Feb 20 00:41:11 GMT 2007

cleaning up a bit now. funny, what i need now is dynamic code :)
anyways, it's easy enough now that i have a general purpose store.
all kinds of hooks can be added here, which can be saved later. they
all go in symbolic form. to make them full circle (symbolic words in
symbolic words) i probably need to add some kind of explicit
interpreter..

Entry: parse
Date: Tue Feb 20 09:55:04 GMT 2007

to wake up today, i'm going to change all the 'parse' stuff to
'compile', since that's what it really does: parse+compile. 'bind'
would maybe be better. thesaurus. well, 'compile' is really quite
understandable.. so let's keep that. maybe i better make compile =
(bind + parse), and turn 'bind' into a proper CAT function? this way
the whole semantics and parsing thing can be handled in CAT code.

the other thing to think about is CPS. does it make sense to use
that? i'm still thinking about run vs invoke. maybe it's better to
just keep it explicit until my current approach takes more shape and
patterns fall out..

change 'unquoted' to 'primitive'

  parse: ( source binder -- compiled )
  find:  ( symbol -- delayed/primitive )

i changed names to the following protos: a couple of syntaxes:

  cat-parse state-parse

and a lot of namespaces:

  cat-find -find

ok, i need to clean up this stuff later.. maybe tonight.

TODO:
  - fix the toplevel interpreter stuff + reload
  - on reload, macros should be reloaded from source files also.
    means compile + ignore asm.
  - fix proto of binder (+ parser?)
  - CPS with dynamic variables?

Entry: duality
Date: Tue Feb 20 13:54:52 GMT 2007

something interesting happened here..
'state-parse' is now implemented as a delayed parse operation, which
exposes the semantics:

  parse: list of things -> list of primitives
  find:  thing -> primitive

generalizing find's symbol -> primitive semantics. i could probably
find a better name, but let's stick to this since it's all over the
place. from now on, 'find' means: map a "thing" to primitive
behaviour, and 'parse' means: map a collection of "things" to a LIST
of primitive behaviours, representing the functional composition of
these primitives.

in case of a 1-1 relationship between source syntax and compiled code
in list form, parse is really just (map find source). this is one of
the properties of CAT source code. so there is something very simple
hidden in all this..

---------------------------------------------------------------------

* PARSE: handle the structure or SYNTAX of source code. this will
  translate source code to a very basic COMPOSITE CODE
  representation, which is a list of primitive code elements,
  effectively reducing any form of syntax to a simplified one. in
  doing so, parse can use 'find' recursively to translate primitive
  source objects to primitive machine code.

* FIND: handle the meaning or SEMANTICS of source code. this will
  take a source code atom and translate it to PRIMITIVE CODE,
  possibly using 'parse' recursively to translate atoms comprised of
  structured source code.

this is the source code / compiled code duality.

  parse  code collection        <->  interpret primitive code list
  find   semantics of code atom <->  run primitive machine code

---------------------------------------------------------------------

here 'machine code' is the code representation of the underlying
machine, which in this case is scheme, with primitives represented as
functions operating on a stack of values.

this is just eval/apply in disguise. the difference being that for
lisp, the functionality is represented by the first element in a
list, while here it is a composition.

  eval: (head more ...) == (apply (eval head) (list (eval more) ...))

Entry: next actions
Date: Tue Feb 20 17:13:23 GMT 2007

run time state? or: where to store the file handler? do this
non-functionally, since it's I/O anyway.. why not? that seems to
work. got ping working too. and @p+

the next couple of things should be really straightforward, but i am
missing one very important part: I CAN'T USE QUOTED PROGRAMS!!! so i
need to do something about that.. again, as far as i understand, the
problem is in 'run'. if you hide information by 'dipping' the top of
the stack, there is no way to get it back, unless you can bypass this
mechanism somehow. the thing that has to change is the interpreter.
ok. it should be possible by doing something like

  '(some app code) compile-app (for-each) invoke

making sure that the dict gets properly tucked away. the nasty thing
is that this is dependent on the number of arguments the combinator
takes:

  (invoke-1 swap run)
  (invoke-2 -rot run)

invoke is bad for the same reason... something is terribly wrong with
the way i'm approaching this.. no solution. too many conflicting
ideas:

  1. i need combinators to "just work"
  2. i need to be able to run non-state code properly

possibilities:

  - patch all quoted code -> parsed as state code
  - do not patch combinators

maybe i should just try? this is crazy... i just don't get it.
heeeelp! i don't know how to solve it..
but i can work around it :)

basically, the problem i have is that i can't use higher order
functions in combination with the state abstraction, because the
abstraction effectively uses a different kind of VM. to solve it, i
need to either accept that i have to change the VM, or just make the
data i'm using persistent. there are several options:

  * turn the n@a+ and n@p+ into target interpreter instructions. this
    just makes them static, so i do not have to use references to
    dynamic state in the core routines. might be the sanest practical
    solution.

  * just forget about the functional approach to the dynamic state
    and store it in a global variable. a bit drastic, and i will
    probably regret it later, since it feels like giving up on a good
    idea at the first sight of real difficulty...

i will go for the first one so i can at least finish the interaction
code.. this has the advantage of making the monitor itself a bit more
robust, since it will provide full memory access. one thing i didn't
think about though: making ferase and fprog primitives will make them
a bit less safe (they might end up being fed random data). i should
add a safety measure. ok, that seems to work.

Entry: monitor update
Date: Tue Feb 20 21:52:02 GMT 2007

triggered by some unresolved conflict between hidden dynamic state
and the interpreter, i made most of the functions in the monitor
available as interpreter bytecode. this makes it a bit more robust,
and apparently a whole lot faster also. still to fix is some kind of
safety measure to prevent the erase from being triggered accidentally
by some unlucky combination of input data. a password if you want :)

Entry: monitor progress
Date: Wed Feb 21 11:46:43 GMT 2007

got most of it working this morning. next actions:

  - variable/bit alloc
  - save/restore state
  - sheepsint core compile + macros

i do rely a bit on parsing macros in the original sheepsint 3.x code.
that's not so good. time to think about working out some abstractions
a bit better. for isr:

  flag high? if flag low handler retfies then

now variables/bits. ok, no bits.. do that later, sheepsint doesn't
use them: explicit allocation. next:

  - state loading on startup
  - interrupt handlers

Entry: getting tired
Date: Wed Feb 21 23:55:34 GMT 2007

yes, time to get it done.. overall, i'm quite happy with the result.
it's a lot better than the previous two. i can't really see much
further from here, other than elaborating towards higher abstractions
(a different language), and fixing some simple jump related
optimizations. the bad guy is quoted method code, which has a strange
conflict of concepts. more on that later.

another thing i miss is inline cat code, i.e. for generating tables.
i think i better do this in a different file, and only in scheme: no
more intermediate cat-only files.

  1 1.1 16 table-geom

then, the lack of proper run-time semantics is kind of weird. the
partial evaluator replaces this, but in an implicit manner: not
everything is accessible, and the bit depth is different.

about literal opti: still not completely happy, since the patterns
should do the literal preprocessing automatically.

looking at pic18.ss gives me a warm fuzzy feeling :) most of the
knowledge is encoded in 2 patterns: assembly substitution patterns
and recursive macros. language support is encoded in 2 more: an asm
state monad and a writer monad. the thing which would help a bit is
reducing the redundancy in the rewriter macro specification. the way
it is right now is very readable, but maybe a bit too much clutter.
on the other hand, it might be a bit of overengineering.

Entry: monads again
Date: Thu Feb 22 00:57:22 GMT 2007

http://en.wikipedia.org/wiki/Monads_in_functional_programming

  Alternate formulation

  Although Haskell defines monads in terms of the "return" and
  "bind" functions, it is also possible to define a monad in terms
  of "return" and two other operations, "join" and "map". This
  formulation fits more closely with the definition of monads in
  category theory. The map operation, with type
  (t -> u) -> (M t -> M u), takes a function between two types and
  produces a function that does the "same thing" to values in the
  monad. The join operation, with type M (M t) -> M t, "flattens"
  two layers of monadic information into one.

  The two formulations are related as follows:

  (map f) m  ≡  m >>= (\x -> return (f x))
  join m     ≡  m >>= (\x -> x)
  m >>= f    ≡  join ((map f) m)

-- isn't that what i'm doing? 'map' is my 'lift': it lifts a function
operating on only a stack to one operating on a stack + state
information. 'join' is e.g. the concatenation of lists in the writer
monad i'm using for assembly. 'return' i don't use? yes i do: it's
how i initialize state, e.g. by loading an empty assembly list on the
stack, and how some functions return a packet of assembly code.

http://citeseer.ist.psu.edu/wadler92essence.html

the basic idea in monadic programming is this: a function of type
a -> b is converted to one of type a -> M b (monadic form). i.e. for
assemblers: a function '(movlw 123) is converted to '((movlw 123)).
'bind' is there to compose 2 functions in monadic form. in the
example of assemblers, 'bind' does the concatenation of the assembly.

Entry: higher order pattern matching
Date: Thu Feb 22 09:56:59 GMT 2007

meaning: match pattern generation based on templates. it seems to
work, but involves double quoting, which is a bit hard to wrap your
head around.. there's one thing i've been trying to understand for a
while: how to do this

  `((['dup] ['movf a 0 0] ['lit] ,word)
    (,'quasiquote ([,opcode ,',a 0 0])))

without having to use the "quasiquote" symbol. maybe i should have a
look at paul graham's "on lisp" again... ok, i think i got it:

  ;; ORIGINAL: explicit quoting of the quasiquote symbol
  `((['dup] ['movf a 0 0] ['lit] ,word)
    (,'quasiquote ([,opcode ,',a 0 0])))

  ;; WORKS: using a name binding to avoid double quoting
  `((['dup] ['movf a 0 0] ['lit] ,word)
    (let ((opc ',opcode)) `([,,'opc ,,'a 0 0])))

  ;; MAYBE WORKS: pattern generated is (quasiquote (unquote (quote
  ;; thing))) instead of (quasiquote thing)
  `((['dup] ['movf a 0 0] ['lit] ,word)
    `([,',opcode ,,'a 0 0]))

yep, it works..

----------------------------------------------------------------------

the trick is to generate this:

  (quasiquote (... (unquote (quote thing)) ... ))

instead of attempting to generate:

  (quasiquote (... thing ...))

----------------------------------------------------------------------

to really understand this, it might be interesting to implement
quasiquote.

  http://paste.lisp.org/display/26298

another thing to note: this merging of quoted/unquoted stuff is what
the syntax macros actually do a lot better, automatically..

Entry: interpret mode
Date: Sat Feb 24 10:15:38 GMT 2007

i got the synth core to run. next actions:

  - interpret mode
  - setting interrupt vectors
  - note table
  - figure out line voltage + impedance
  - identify

interpret mode seems to +- work. i'm using overriding: if a word is
not in the target dictionary, it is executed on the host. maybe this
will lead to some obscure problems?
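the overriding rule itself is a one-liner. a hedged sketch, with
'target-execute' and the dictionary format hypothetical:

  ;; look the word up in the target dictionary first; if it is not
  ;; there, fall back to host execution.
  (define (console-word word target-dict host-exec)
    (cond ((assq word target-dict)
           => (lambda (entry) (target-execute (cdr entry))))
          (else (host-exec word))))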
maybe i really need to separate the 2 a bit better, and use an
explicit debug mode.

then state association. i should have an 'identify' command, so a
connected target chip can tell the host which state file to load. but
how to implement this? i could reserve some space in the boot sector
for this, actually.

so, applications.. i was thinking about keeping the monitor
independent. i don't think this is a good idea, since the boot code
is really application dependent. so an application is everything,
including the monitor.

Entry: syntax macros
Date: Sat Feb 24 21:45:46 GMT 2007

been playing a bit with macros.. i don't really understand them fully
though.. especially the use of local syntax in syntax expansion etc..
also, it's really better to move the preprocessor hook out of
pattern.ss -- DONE.

Entry: 16bit code
Date: Sat Feb 24 23:19:13 GMT 2007

looking at the sheepsint controller code.. there is no real reason
not to make it 16-bit. all computation i'm doing is on 16bit numbers,
and the overhead of switching everything to 16bit is probably minor.
todo:

  - 16bit interpret mode
  - a way to map symbols

Entry: toplevel workflow
Date: Sun Feb 25 08:41:42 GMT 2007

mainly about program organisation. a program consists of these parts:

  1. boot block (first 64 bytes)
  2. monitor
  3. fixed application code
  4. variable application code

a project is a directory. 2. should be made as standard as possible,
and i really shouldn't care about the size, since that only matters
for code protection. 3. should, if possible, stay on the target too
(mark). 4. should have a 'scratch' character. empty = erase until the
previous 'mark', but no further than the monitor code:

  -> replace dictionary with saved dictionary
  -> round 'here' up to the next 64 block
  -> erase from there

DONE

also, the reset vector should jump to #x40 -- DONE

and i need to find a way to update the monitor code on the fly:

  -> either copy the monitor as a whole (since it's
     place-independent)
  -> or copy a minimal copy routine.

as good as DONE

reloading the core with minimal effect on state = 'core'

setting interrupt vectors: should save 'here' etc.. -> interesting,
since it really involves a run-time assembler stack. maybe it does
make sense to de-scheme the assembler..

Entry: state monad
Date: Sun Feb 25 10:46:05 GMT 2007

http://www.ccs.neu.edu/home/dherman/research/tutorials/monads-for-schemers.txt

let's see. the problem i had was using 'for-each' in state code.
because of the way the state needs to be passed, all higher order
functions need to be aware of it. i just need a special 'for-each'.
the way around this is to use the stateful functions ONLY to access
the data, and use pure functions to do the manipulation, i.e. 'logic'
vs 'memory'. currently, in the interpreter, this works fine. it can
seem like a drag, but in fact it is a good thing: functions are not
unnecessarily infected by state.

so, monads are about order of execution, and really central to a
compositional language, where there is only order! this in contrast
to lambda languages, where there is an intrinsic parallelism in the
order of evaluation. about interpretation: if you see a concatenative
language as a series of sequential operations, it is 100% serial (the
way it is implemented). however, if you see it as composition of
functions, there is no evaluation order, because there is no
evaluation, only composition.

i need to look into list comprehensions etc...

Entry: call conventions : de-scheming
Date: Sun Feb 25 10:56:48 GMT 2007

instead of doing some real work today, i'm going to have some fun.
make the interpreter more reflective, meaning: convert all important
routines to operate on a stack. actually, that might not be a good
thing.. i'm using non-stack functions for convenience, so primitives
are simpler to code, and i can use the lambda abstraction instead of
combinators.

Entry: time to do some work
Date: Sun Feb 25 13:36:34 GMT 2007

  - interrupt vectors DONE
  - a/d converter for board
  - 16 bit interpreter
  - constants and variables

Entry: alan kay oopsla lecture and stuff
Date: Sun Feb 25 21:36:56 GMT 2007

  - 'core' should not destroy any DYNAMIC state AT ALL, only static
    background.
  - every restart is a failure.
  - need better debugging.

i need to be more observant of things that are annoying, and fix them
immediately instead of chasing some short-term goal. the thing i'm
trying to do is to build a better tool, not to finish some product. i
need to always try to distill the important core idea, instead of
bashing away to 'just make it work'..

Entry: variables
Date: Mon Feb 26 09:52:27 GMT 2007

i've got a problem with variable names: the dictionary does not make
a distinction between flash and ram names, but the interpreter does
need to treat them differently. this is solved properly by using two
dictionaries, or a nested dictionary. probably 2 dictionaries is
better.. requires a little rewrite though.

ok. i started to rewrite the assembler to separate assembly from
dictionary operations, so it's easier to make the dictionary an
abstract object. then i need to change to recursive operations:
'(ram here) instead of 'here etc..

seems to be fixed. the implementation is abstracted, and currently
solved as a simple sub-dictionary. maybe move this to 2 separate
dicts..

Entry: analog -> digital
Date: Tue Feb 27 08:25:45 GMT 2007

i don't know what's wrong, but it doesn't work properly. but first,
documentation. let's copy and paste the previous one.

Entry: reasons
Date: Tue Feb 27 12:12:44 GMT 2007

all i want is lisp, but:

  - cat is terse
  - cat is more editable
  - forth works on small things
  - forth with linear data is predictable

the first two are from the point of view of writing software and
interacting with a system; the last two are practical solutions to
needing a lot of programming power under constraints (small or RT).

something i learned though: it's bad to waste time writing
combinators. in BROOD 3.x i solved this by writing some core things
in scheme. basically they are combinators: interpreters for certain
kinds of code propagating state. in PF and previous forth experiments
i ran into this problem several times: trying to express something
which really needs 'hidden state'. mostly i solve this using global
variables, which is ok as long as there is no pre-emptive
multitasking going on.

  SCRIPT    = toplevel organization: large amount of trivial code
  ALGORITHM = small amount of nontrivial code

the idea is to use a scripting language to glue together algorithms,
while the nontriviality of the algorithms is hidden, and the
connectivity between them is made manageable by the features of the
scripting language.

Entry: linear lists -> PF
Date: Tue Feb 27 12:18:46 GMT 2007

yes, it does make sense to rewrite malloc. malloc is not what i need
if i'm using linear data structures. i don't need free, only free
lists. and yes, it would be cool to have access to the page
translation table too :)

Entry: compilation is caching
Date: Wed Feb 28 01:41:21 GMT 2007

compilation is really caching.. maybe i should find a way to add
dynamic loading of code without a full image reload, by using a
custom made 'promise'.
one that can be un-cached whenever a new word (or group of words) is
defined, so code can be re-bound.

more about the caching.. this means that symbolic code is really the
only representation of code. the compiled representation is an
invisible optimisation, and should be hidden from the programmer. if
i replace all atoms with a struct containing their symbolic version
and a possibly cached behaviour, i can re-interpret on the fly.. this
should give all the benefits of late binding, without the drawback of
having to reload the whole image all the time..

however, cache invalidation probably needs to do this anyway:
invalidate a whole dictionary of code, unless all references can be
found somehow.. probably not. so what's the difference? what would a
proper cache offer? uncompilation for one.. it's probably good to
keep the symbolic data and environment around..

Entry: no more quotation
Date: Fri Mar 2 14:08:38 GMT 2007

quotation sucks.. and it's really not necessary if i install a
default semantics. my previous argument was: no default semantics (no
defaults!) because i need more than one.. however, everything will
run on the VM as primitives, so there is no real good reason to have
no defaults: the symbolic representation might be "the bottom line",
with compilation viewed as optimization/caching.

what needs to be done to fix this? i probably need a better object
representation, a more abstract one. an object has properties, one of
them being its cached rep. so.. what is an object?

  - syntax, form.. this is the 'data' part
  - semantics, in the form of an associated interpreter object

optimizable properties:

  - cached semantics

this is really just OO. need to look at smalltalk.. maybe it's good
to have some ideas propagate. data=object data, interpreter=class.

summary: the idea is to parse into something which retains the
symbolic representation, so semantics can be late bound, and
compilation is still possible, but is done with memoization. clearing
the cache is then possible by scanning the entire memory from the
root and invalidating some bounds. this trick can also be used in PF:
a linear language with late binding but aggressive memoization.

hmm.. i read something in "the essence of functional programming"

  http://citeseer.ist.psu.edu/wadler92essence.html

about values versus processes. to paraphrase:

  - in lambda calculus, names refer to values
  - in compositional languages, names refer to functions

the first one has only values (with functions a special case of
values), while in compositional languages there are only functions,
with values represented as functions. going the intuitive route: a
name is a function, and only a function. an object is only a
function. it has an associated action. data is represented by a
generator.

Entry: a new PF
Date: Fri Mar 2 15:22:12 GMT 2007

summary:

  - object oriented: objects are functions. each object has a
    'syntactic representation S' and an 'associated interpreter I'.
    (the result of applying I to S is X, an executable program which
    acts on a data stack.)
  - the basic composite data structure is a CONS cell.
  - composite data is linear: no shared tails.
  - the interpreter needs to be written in (a subset of) itself, to
    allow easy portability (to C).

problems: all the problems are related to the linearity of the
language. to make things workable, some form of shared structure
needs to be implemented. however, this can lead to dangling
references.
  -> continuations / return stack
  -> mutual recursion

if i clean up the semantics such that dangling pointers are allowed
in some form, like 'undefined word', this should be manageable. to
keep things fast, this needs to be cacheable: it should be possible
to detect whether an object is live etc..

to rephrase: it looks to me like a completely linear language is
really impractical. how do you tuck away non-linearity so behaviour
is still real-time? i keep running into the idea of 'switching off
the garbage collector'.. decompose a program into 2 parts: one that
uses a nonlinear language to build a data/code structure, and a
second one that runs the code. trapped inside the brood idea:
tethered metaprogramming.

  -> a predictive real-time linear core (linear forth VM + alloc)
  -> a low priority nonlinear metaprogrammer (scheme)

together with the smalltalk trick to simulate the real-time linear
core inside the metaprogrammer.

the VM:

  - no if..else..then: only quotation and ifte
  - no return stack access: use quotation + dip

this can be a lot more general than for next gen PF. i can run this
kind of stuff on a microcontroller too, to have a different language:
one with quotation, and no parsing words.. the idea is to make the VM
as simple as possible. i already have a way to implement a native
forth. maybe the catkit project should be just that: CAT is the thing
that runs on the micro? linear CAT?

Entry: linear CAT vm
Date: Fri Mar 2 15:59:41 GMT 2007

  - run: invoke interpreter
  - choose: perform conditional
  - quote: load item from code onto data stack
  - tail recursion: this is really important
  - continuations (return addresses) are runnable

using variable bit depth? code word bit depth is determined by the
number of distinct words. an 8 bit machine is for small programs,
while a 16 bit machine is for larger programs and/or programs that
need to do more math. something in between is also possible; most
practical is 12 bit. but the most important thing is: the data stack
needs to be able to hold a code reference.

for the 18f, i think it's best to go to 16 bit. the forth is for
inconvenient features, while the highlevel language should be just
that: a highlevel language. in order to properly implement tail
recursion, the caller should be responsible for saving the
continuation.

Entry: direct threading
Date: Fri Mar 2 16:33:47 GMT 2007

i'm trying to write an interpreter with these properties:

  - proper tail calls (caller saves continuation)
  - continuations can be invoked by 'RUN'
  - direct threading

in direct threading, threaded code is a list of pointers that point
to executable code, and a continuation is a pointer that points to a
list of such pointers. so yes, these constraints can be satisfied:

  - composite = array of primitives
  - continuation = composite
  - composite code can be wrapped in a primitive using a simple
    header

  TBLPTR -> composite code
  PC     -> primitive code

see direct.f -- summary: the most important change is threaded code +
proper tail calls, by moving the continuation saving to the caller.

Entry: linear languages
Date: Fri Mar 2 19:13:23 GMT 2007

http://home.pipeline.com/~hbaker1/Use1Var.html

  "A 'use-once' variable must be dynamically referenced exactly once
  within its scope. Unreferenced use-once variables must be
  explicitly killed, and multiply-referenced use-once variables must
  be explicitly copied; this duplication and deletion is subject to
  the constraint that some linear datatypes do not support
  duplication and deletion methods.
  Use-once variables are bound only to linear objects, which may
  reference other linear or non-linear objects. Non-linear objects
  can reference other non-linear objects, but can reference a linear
  object only in a way that ensures mutual exclusion."

what he describes a bit further on is an 'unshared' flag: a
refcount = 1 flag, but it looks like this is more in the context of a
mark/sweep GC. an attempt to make some patterns automatic? reverse
list construction followed by reverse! is an example of a pattern
that might be optimizable if the list has a 'linear' type: the
compiler/interpreter could know that 'reverse!' is allowed as a
replacement for 'reverse'.

so as far as i get it, baker describes a 'linear embedded language'.
linear components are allowed to reference non-linear ones, but vice
versa is not allowed without proper synchronisation. so in an RT
setting, this means the only thing that is allowed to run in the RT
thread is the linear part, while the nonlinear part can play its game
outside this realm. so, again:

  - high priority linear RT core (forth)
  - pre-emptable nonlinear metaprogrammer (scheme/cat)

the linear part contains only STACKS + STORE. the nonlinear part can
contain the code for the linear part. the compiler runs in the
nonlinear part. the nonlinear part is not allowed to reference CONS
cells in the linear part. this can be implemented entirely inside of
PLT. on the other hand, having this structure independent of a PLT
image makes it more flexible: the core linear system should be able
to do its deed independent of the metasystem's scaffolding.

baker calls my 'packets' nonlinear types: names with management
information (reference counts): a strict distinction is made. this
allows a nonlinear type to be decoupled from its (possibly linear)
representation object. in PF this means: packets are references to
linear buffers. the result is that the underlying representation can
change, ala smalltalk's 'become'.

conclusion:

  - cons cells are linear
  - packets are nonlinear wrappers for linear storage elements
  - packet access: readers/writers access protocol: mutation is only
    allowed when there are no readers (RC=0). (functional ops)
  - 'accumulation ops' use shared state + synchronized transactions.

Entry: standalone forth
Date: Sat Mar 3 13:25:46 CET 2007

maybe he didn't get it, but writing this compositional language and a
standalone forth are conflicting ideas.. it's not so hard to give up
on parsing words, other than true quoting words: there will be only
one left, let's call it '{'. what's worse is that i need to dumb it
down a bit. i'd rather define a new language, but an ANS forth might
be better for teaching, for the simple reason that i don't need to
write such an extensive manual. maybe it still makes sense to run
both languages on the same VM?

another forthism that's not really necessary: since i'm sharing code
between the lowlevel subroutine threaded forth and the direct
threaded forth, why not make the VM primitives equal to subroutine
threaded forth, instead of having them directly linked to a NEXT
routine? in other words, why not have an explicit trampoline? this
will be slightly slower, but uses less code, since the primitives
don't need a separate binding, which would just call the other code
anyway.

conclusions:

  - interpreter loop allows primitives == native code (STC forth)
  - 'enter' uses short branch -> code needs duplication
  - primitives need no IP saving!!
(the compiler needs to distinguish between primitives and highlevel
words)

the last one is a consequence of doing continuation management on the
caller side: the caller cannot be agnostic! it should be possible to
pass this information to 'enter' somehow, so enter can save/restore
depending on some flag. carry flag? that's ok, as long as this
machine state is guaranteed to be saved.. however, in this case, the
primitive needs to call 'EXIT' when the carry flag is not set! so
still, some compiler magic is necessary, or all words need to
terminate with an EXIT call, independent of whether they end in a
tail call. this is a bit messy... let's try to summarize. the flag is
called the NTC flag: non-tail-call.

  - EXIT = leaves current context
  - WORD -> ENTER conditionally saves the context (carry flag)
  - PRIMITIVE: needs EXIT if the TC flag is set.

again.. there are 4 cases: PRIM/COMP and TC/NTC. what i'd like is to
solve PRIM/COMP completely in 'enter', such that the interpreter can
be agnostic about highlevel words.

  an instruction = primitive + NTC flag

what does the NTC flag mean for the interpreter? nothing. it's just
extra information passed to 'ENTER': it means the rest of the code
thread can be safely ignored. the interpreter completely ignores it,
and just runs forever, assuming the code stream is infinite. all
threading changes are implemented by other primitives.

so, given the current implementation, one solution is to always
compile EXIT, together with a bit that indicates an instruction is a
tail call. this is not very clean.. the exit bit should be universal.
semantics: the bit indicates that the current thread can be discarded
BEFORE passing control to the primitive. then the primitive can
always just save the continuation. (a possible optimization is to
overwrite the continuation, but let's do the former first, since it's
conceptually simpler.) this is different in that the interpreter is
not agnostic about the return stack, but effectively implements
'EXIT'.

Entry: is code composite? run or execute? yin or yang?
Date: Sat Mar 3 16:37:37 CET 2007

in CAT it seems i've converged on using only composite code = list of
primitives as the quoted programs that can be passed to higher order
functions. however, original forth does not take this stance:
threaded code is a list of execution tokens, and execution tokens are
the canonical representation of quoted code when treated as data.

this is wrong. why? reflection becomes more difficult. the stuff on
the return stack is the saved IP. this should be "a real program".
the inner interpreter deals with arrays of primitives, and such
arrays can be wrapped in a primitive by prepending them with ENTER.
however, the data representation of code should really be composite:
no primitive address, but a composite address. primitives ==
internals. it's better to treat primitives as singleton composites
than to treat composites as primitives. in the inner interpreter, the
reverse view is better.

i think this view originates in original forth, and is mainly
historical: primitives came first, and composites were treated as
primitives. i can't think of another reason really..
conclusion:

  ------------------------------------------------------
  programs are composites = array of threaded primitives
  ------------------------------------------------------

i'm going to reflect this in the following change:

  - execute is reserved for primitives
  - run is reserved for composites

also, if you look at native code, the picture is pretty clear:
primitives are machine instructions, and you simply cannot 'execute'
them: they always need to be inside a code body. composite code =
list of instructions, referred to by an address. it's just the
same...

Entry: reflection
Date: Mon Mar 5 02:26:40 CET 2007

have to think about this a bit more. something strange is going on
with this primitive/composite thing. what about having only highlevel
code: composite code just links to more composite code.. there's no
way to plug in primitives here. for purely pragmatic reasons, using
primitives, with highlevel words wrapped in primitives, is workable..

Entry: essentials
Date: Thu Mar 8 16:57:26 EST 2007

  * symbolic -> ast + room for it
  * possibility to 'uncompile' an AST
  * use abstract types (structures) and 'variant-case' in the AST

Entry: delayed list interpretation
Date: Fri Mar 9 10:48:54 EST 2007

thinking about eopl: i need more data abstraction. car and cdr are
nice, but they really are quite lowlevel. there's too much
implementation leaking through. the asm monad is a good example of
abstraction, but code probably needs it too..

about using symbolic code and caching: parsing a list, it can be
either code or data, depending on how it is accessed. maybe it should
really have these 2 identities? if accessed using list processing, it
behaves as a list, but if accessed using 'run', it behaves as code ->
jit compilation cache. the benefit of this is that the semantics can
change. so a list is really an object with different identities. all
list processors should be modified to take an abstract list object.

Entry: platforms
Date: Mon Mar 12 18:54:28 EDT 2007

ai ai ai... i'm spending money again, surfing on ebay.. discovered
this nice ARM7 board on sparkfun, made by olimex. it has a 128x128
color lcd, usb client mode and ethernet. a dream platform for brood,
especially since this is THE standard 32bit chip, getting really
cheap too.. so i want: 8 bit PIC18, 16 bit DSPIC30 and 32 bit ARM7.

Entry: itch
Date: Tue Mar 13 12:47:23 EDT 2007

i want to start changing some abstract data type implementations..
the most important one is probably 'quoted program' or 'composition'.
this has to be distinguished from a chain of cons cells in that it
has more structure. a composition can always be converted to a chain
of cons cells, and a chain of cons cells can be converted to a
composition if

  1. it's a proper list
  2. some interpreter semantics is attached

so a quoted program is the above: a proper list with attached
interpreter semantics. the changes this requires are:

  - all data operations that modify lists need to accept the
    'composition' data type and convert automatically.
  - the parser needs to produce compositions instead of chained cons
    cells.

it's getting old.. but the structure of program (source compile
cached) is probably better written as (primitives), where each
primitive has its own semantics and cache:

  word (atom compile cache)

there are several options for word:

  (source thunk/cache)
  (source thunk cache)
  (source compile cache)
  (source compile env cache)

in general: do we want the environment to be explicitly specified, or
is an abstract representation of the binding operation enough?
one of the requirements is the ability to rebind, so at least cache and binder need to be separate, since the binder uses (in the current implementation) some mutable state. probably the following model is close enough to the current one (basically the same as 'delay' but with the possibility to clear the cache) to not need a lot of changes (or enable incremental ones) and will do what's required:

  prim = (source atom, interpretation thunk, cached compilation)

so:

  - code can be re-interpreted by just clearing the cache
  - all code, independent of semantics, can be specified as lists
  - a 'data' mutator can be defined that strips a list from all executable semantics

so i guess the conclusion is:

  - composite code is a concrete list of abstract primitives
  - primitives contain memoization info

this brings me to restate that in BROOD, the compiler itself is written using OO techniques with mutable state, but the target compilation is completely functional. the reason is this:

  - the host language is mostly about organization -> mutable OO
  - the target compiler is mostly about algorithms -> functional + monads

the main practical reason for using a functional approach in the compiler is the ability to work with continuations for very flexible control structures. the 'constant' part is implemented as an OO system.

Entry: reflection
Date: Tue Mar 13 13:46:25 EDT 2007

another thing i keep running into is the mixed use of 2 calling conventions: scheme N->1 and cat stack->stack. it would be nice to have scheme only provide primitives, and have all other utility code be out in the open. however, given the way some algorithms are implemented now, that is impractical. i can have all the reflection i want, but not necessarily from CAT, since that would make it harder to use scheme to implement the core of things.. maybe it's good to keep this in mind: CAT is just a minilanguage inside scheme, and all the things i need to bring out can easily be brought out if necessary. full reflection is not necessary yet. probably the CPS chapter in EOPL will make this a bit more clear..

ok.. getting rid of the parse/find abstract interface. it complicates things too much.. one thing i didn't think of is that 'find' maps thing -> word. so the compiler for a symbol + find is something that looks up a word AND dereferences the implementation. yep. i noticed it is really a good thing to use closures instead of explicit structures.. of course, this does mean that all the red tape moves to the other side: all things that provide closures need to do the binding.

it's not going as smoothly as expected: all through the code it is assumed a primitive is either a function or a promise. so i guess it's a good idea to change it now. the main problem is the 'find' as expressed above, since i have an extra level of indirection that distinguishes 'find' from 'compile', with explicit delayed compilation (interpretation) instead of implicit. i can probably work around it by providing:

  - compile lifting
  - special primitive registrars

i need to sleep over this.. current problem = pic18-literal: used in a lot of places. produces a primitive but should produce a word. i changed this, need to check. the rest should be straightforward.. also writer-register! is wrong, due to the lift not being wrapped.. maybe better to just lift words instead of prims? all this freedom!!

Entry: spaghetti
Date: Wed Mar 14 08:33:03 EDT 2007

the change above brings up some conceptual confusion.

  - a word is a representation of an atomic piece of code.
    it retains its source representation, and a translator which defines its semantics.
  - lifting is done on the primitives, not on the words. maybe that should change? NO
  - the pattern (register! name (atom->word name compiler)) has 2 occurrences of name. this is ok. the first one is an index, while the second one is there to recompile the word if necessary.

ok, it seems to work now..

Entry: reload
Date: Wed Mar 14 09:48:10 EDT 2007

i run into problems with redefining the structs: data lingering after a reload is not compatible with the new type predicates and accessors. this means code cannot survive a reload. this is a bit ill defined, so i need to make a decision.

  - if i don't redefine 'word' on reload, it can never be redefined, which is a bit of a nuisance.
  - the other solution is to redefine 'word' and change all the data on the stack to reflect this change: the stack is the only thing that survives a reload, so it needs to be properly processed. it looks like i need a temporary structure to solve this.
  - find a better way to implement reload.

the struct thing is really annoying. i need to find a solution to that soon. all the rest works with reload, even using a 'symbolic' continuation passing: after load, the repl loop itself is recompiled. maybe i should separate the files into init and update. that way it is possible to perform incremental updates.

ok. the solution seems to be to install a 'toplevel' continuation that is passed the entire application state (stack). 'load' can then be called with a symbolic code argument == continuation.

Entry: TODO
Date: Wed Mar 14 11:01:25 EDT 2007

got it hooked up. lots of things to fix:

  - pic programmer endianness bug for high word?
  - fix reload / scheme modules (using different files with include) DONE
  - create a monitor/compiler for the 16bit threaded interpreter

a compiler for the threaded code would:

  - map a list of words to their respective addresses
  - perform tail call optimization

i should go straight for cat-like code with code quoting. yep. the most important things to tackle now are modularity and platform independence. aspect oriented programming :) maybe i should leave the module stuff for later, since reloading is not that easy... loading inside a module namespace might be possible though.

Entry: languages
Date: Thu Mar 15 09:40:09 EDT 2007

how to combine 2 different languages in one project? i'm trying to write the purrr language in terms of purrr/18, and i need an easy way to switch between them. i need a methodology.. what is a threaded forth? basically a set of primitives. so what i need is:

  - a list of primitive names
  - a way to compile 'enter'

things to look at:

  * unify all toplevel interpreters, so i can have more
  * separate console.ss into machine specific things.

a toplevel interpreter is

  - a string interpreter
  - an exception handler
  - a continuation

what if i store all the modes in the data store, symbolically? seems to work.

now, about the VM. i think it's best i standardize on the VM mentioned above: call threaded with return (jump / tailcall) bit. this can be written in C too, so that should eliminate most porting problems, with only optimization problems remaining.

ok. back to where i started. should i allow 2 different languages on one attached system? why would i want to do that? debugging of course, but what else? there's a bit too much freedom here. 2 languages, native + ST forth, will complicate things, but will also make things a lot easier to use..
and a threaded forth compiler isn't so incredibly hard to build.. so, i work with one core language purrr/18 and build a threaded forth on top of that using a different mode.

so.. next problem = representation for words. i'm using a simple name prefix = underscore. maybe there's a better way to do this? name prefixes allow use from the lower level language. ok.. rambling on. the way to do it is to just translate the highlevel language into lowlevel forth, and pass it on to the compiler.

  (1 2 +) -> (ENTER ' _lit 1 ,, 2 ,, ' + EXIT)

here EXIT is a special word that installs the return bit in the MSB of the last word.

Entry: TODO
Date: Thu Mar 15 13:17:53 EDT 2007

  - variable abc abc 1 +
  - flash addresses as literals
  - exit bit
  - write a paper about the absence of '[' and ']' and the relationship between literals and (dw xxx)

the first 2 are similar: it would be nice to partially evaluate some code that uses words from the ram and flash dictionaries next to constants.. this introduces another dependency. currently the partial evaluator only resolves constant symbols. it requires a new dependency [ dict -> compiler ] to resolve this problem. there is a possibility to delay the evaluation of the optimization until assembly time, by using closures.. there's a deeper problem here: name resolution needs to be fixed..

let's see.. partial evaluation can't fail in a sense that is recoverable: if the literal optimization fails, it's a true error that fails the entire compilation. this means the evaluation itself can be delayed until the environment is ready, since the control flow does not depend on the result. a delayed evaluation has the form

  \env -> value

while env is: name -> value. the more i think about this, the better i like the idea. ok, so the first 3 can be solved using some form of delayed evaluation, forced in the assembler.

Entry: delayed evaluation forced in assembler
Date: Thu Mar 15 15:56:08 EDT 2007

there is already one kind: symbolic constants. the addition that needs to be made is generic expressions. there are several forms to choose:

  - symbolic lisp expressions
  - symbolic cat expressions
  - scheme closures

the first ones are nice since they are symbolic, so easier debugging. the last one might be simpler to implement. lisp style expressions make more sense here since they have a single value, not a stack. now, this can be combined with the paper on partial evaluation: partial evaluation should then be transformed to compile time meta code evaluation. actually:

  1 2 + -> [ 1 2 + ]L

following colorForth, executed code always results in a literal on the ->green transitions. ok, so this fixes the question above:

  * delayed code is symbolic cat.
  * the assembler does the final evaluation of this code

so what is the context?

  machine constants -> numbers
  variable names    -> data addresses
  forth words       -> code addresses

operations come from some dictionary, probably cat, but need to be escaped somehow. let's say: search meta first, then variables, words, constants. this needs some changes:

  * the assembler needs to be a CAT word, so the stack can be used as context.
  * it's probably better to wrap all symbolic names in a list, so the evaluation is uniform: either numbers or lists.

this seems to work pretty well.

Entry: 16bit threaded forth compiler/interpreter
Date: Thu Mar 15 18:06:23 EDT 2007

let's give them a name: the highlevel forth is PURRR and the lowlevel forth is PURRR/18. here i use the shorthand names threaded and native resp. first problem is the parser, since the forth needs its own parsing words.
that should be the only real problem. since this forth is mainly for higher level stuff, i don't need machine constants: all machine access is solved on a lower layer. actually, the different namespace is a nice excuse for some simplification.

second problem is running code from the brood console. this needs a little trampoline, since the only way to get out of a running interpreter is to call 'bye':

  ' bye -> IP  _run  CONT

Entry: added pattern debug
Date: Fri Mar 16 10:13:07 EDT 2007

the pattern compiler now has a debug method which dumps the source rep of the patterns into the asm buffer for inspection. this is implemented in the form of a match rule that matches ['pattern].

Entry: added robust reloading + logging
Date: Sat Mar 17 09:38:56 EDT 2007

error on reload: the console waits, then reloads. this is to give a chance to correct syntax errors without losing state. also added 'cat.log' output, which enables the use of emacs' compilation-mode. just run 'tail -f cat.log' as the compile command.

Entry: trampoline
Date: Sat Mar 17 09:40:45 EDT 2007

ok, i got something wrong... words stored in the dictionary are primitives. invoking a highlevel word from within a lowlevel native context requires the use of these primitives. remember: highlevel 'run' ONLY takes composite words, while lowlevel entry ONLY takes primitives. IF a primitive contains a highlevel definition AND the primitive points to an ENTER call, THEN the rest is highlevel code. the correct way to build a lowlevel -> threaded trampoline is:

  * set the highlevel continuation, saving the current one, to highlevel code that does (bye ;)
  * call the primitive
  * call the interpreter to invoke the continuation

Entry: delayed eval in assembler
Date: Sat Mar 17 10:18:14 EDT 2007

i should go to an architecture where the number of passes in the assembler is not fixed, but just enough so all expressions can evaluate with correct dependencies. maybe one pass is necessary to at least find out if all the labels are defined.

Entry: literals
Date: Sat Mar 17 12:06:44 EDT 2007

should literals be handled in the interpreter? problems:

  * a _lit instruction cannot drop the current thread, because the value needs to be accessed from the current thread. so the code "123 ;" translates to "LIT 123 NOP|EXIT"
  * moving this to the interpreter by encoding the value in the opcode is possible, but then large values need 2 words. it also requires 2 different lit instructions, implemented as:
    - LOAD + LOADHI
    - LOADLO + LOADHI
    - DUP + LOADSHIFT
    - LOAD + LOADSHIFT

the one with DUP requires only one explicit lit, but it always needs 2 words, even in cases where the data would fit into a single word. i think code density is more important than any other constraint, except conceptual simplicity of the language, which is independent of the VM implementation. so it's probably going to be 2 different lits.

so how to implement that? it's easy to detect if the high byte of an address is zero, since the flags will be set. this could be the clue. the address space this overlays is just the boot monitor, so that's no problem. some bit-fiddling. 'x>IP' uses movff so it doesn't affect flags, which means the zero flag can be tested after the carry flag.

are literals important enough to give them half of the address space?
the answer is probably yes:

  - they occur a lot
  - if they are as cheap as constants, you don't need constants
  - 14 bit signed words will cover most use for numbers (counters/accumulators)
  - other literals are addresses: make sure the memory model respects this optimization

so how do we give them half the address space:

  - effectively: use only 32 kb -> enough for now
  - align words to 4 byte boundaries (and possibly reclaim the storage..)

let's go for the first one: only 32kb address space. the other half could maybe be used for byte access? so, encoding primitives as [ EXIT | LIT | ADDR/NUMBER ] gives

  EXIT -> c
  LIT  -> n after one shift

this looks nice:

  \ inner interpreter loop
  : continue
      prim@/flags               \ fetch next primitive + flags
      exit? if x>IP then        \ c -> perform exit
      literal? if 14bit ; then  \ n -> unpack literal
      execute continue ;        \ execute primitive

  \ interpret doubleword [ 1 | 14 | 0 ] as a signed value.
  : 14bit
      _c>>                      \ [ c | 1 | 14 ]
      1st 5 high? if            \ sign extend
          #xC0 or continue ; then
      #x3F and continue ;

after fixing a bug in 'd=reg' it seems to work.

Entry: parsing
Date: Sat Mar 17 19:00:52 EDT 2007

just fixed the interpreters and direct->forth translators using parser.ss. somehow something doesn't feel right though.. parsing words feel 'dirty'. i'll try to articulate why, since i don't think anything can be done about it.

internally, quoting is no problem: you just build a data type (word) that supports quoted functions/programs/symbols.. in CAT this is done by creating primitives that map stack -> (thing . stack). however, in program source code it is problematic: non-quoted compositional code has a 1->1 correspondence symbols<->semantics, and the semantics of successive words is not related. quoting is about modifying the semantics of symbols. one example where this is done very nicely is colorForth: here the color of a symbol is part of the source code, and represents information about how to interpret a symbol name. in textual form this would be something like

  (red drie) (green 1) (green 2) (green +) (green ;)

here a pair of (color word) represents a single semantic entity. in ordinary forth however, it's not done this way: not all words have a prefix (a color). another way to say it: most words use the default 'color'. so, in a sense, the thing that is 'dirty' is the default semantics. this is not so bad for convenience's sake, but it does require a parser that introduces the semantics. otherwise we would have

  : drie number 1 number 2 word + word ;

which is really what it is parsed to in the end.. the thing i'm being anal about is that CAT has a 1->1 correspondence between syntax and semantics, inspired by Joy. although, this is not entirely true. a syntactic shortcut in the form of (quote thing) is introduced to be able to quote lists and symbols. but this is not entirely necessary:

  '(1 2 3) == (1 2 3) data
  'foo     == (foo) data bar

with these operations being a bit less efficient. that concludes the rant.

Entry: quasiquote
Date: Sun Mar 18 09:00:31 EDT 2007

which leads me to the following. it does make sense to have lists of programs in CAT, where quasiquote would come in handy.

  `(,(+) ,(-))

Entry: program->word
Date: Sun Mar 18 09:09:50 EDT 2007

some nitpicking about constant->word. before, i had quoted programs wrapped using constant->word. this doesn't make sense, since the 'constant' is really a parsed thing, and not a source representation. however, it does enable 'data' to do its work. but why don't i just quote the source of the entire program, and store the parser as semantics?
that would be cleaner, but something doesn't feel right there either.. well, actually, i can just delay parsing completely! that seems like the right thing to do: the source can just be retained in its original form, and initial recursion during parsing is avoided, which directly solves the problem of setting! an atom's semantics.

Entry: lazy eval
Date: Sun Mar 18 10:06:18 EDT 2007

i think i start to see why a lazy language can be so convenient.. i spend quite some time trying to figure out when it's best to evaluate some expression. if this is always "as late as possible" this work should disappear. nevertheless, it's an interesting exercise. for the assembler it might be interesting to write it completely lazily, including the optimizations necessary for jumps, which i still need to implement.

Entry: disassembler
Date: Sun Mar 18 14:40:43 EDT 2007

the disassembler needs to be smarter. i probably need to add some semantics to the fields, and have a platform-specific map translate them: resolver closure + asm code -> [ shared code ] -> disassembled -> prettyprint.

Entry: open files
Date: Sun Mar 18 17:15:50 EDT 2007

something is terribly wrong with the open files.. fixed by manually closing. i think i need to read about how ports get garbage collected, or not.. indeed. they are not, need explicit close or make an abstraction: http://list.cs.brown.edu/pipermail/plt-scheme/2004-November/007247.html

Entry: where to go from here?
Date: Tue Mar 20 01:01:02 EDT 2007

enough muddling about. roadmap:

  - get dtc working with host interpret/compile
  - make it self hosting
  - combine with synth
  - dspic asm + pic18 share/port

Entry: a safe language?
Date: Tue Mar 20 01:22:58 EDT 2007

  [ 1 + ] : inc
  [ 2 + ] : inc2

is it possible to make a safe language without too much trouble? something like PF. without pointers..

  [1 2 3] [1 +] for-each

the interesting thing is that i can use code in ram if i unify the memory model. i think it's time to start to split one confusing idea into 2:

  - a 16/24bit dtc forth for use with sheepsynth dev: control computations
  - a self contained safe linear language for teaching and simple apps

safe means:

  * no raw pointers as data
  * no accessible return stack, so it can contain raw pointers
  * no reason why numbers need to be 16 bit: room for tags
  * types:
    - number  [num | 1]
    - program [addr | 0]

features:

  * symbols refer to programs, special syntax for assignment
  * assigning a number to a symbol turns it into a constant
  * for, for-each, map, ifte, loop

  [ 1 + ] -> inc
  1 -> inc
  [[1 +] for-each] -> addit

now.. lists? the above is enough for structured programming, but map and for-each don't make much sense without the data structures.. so programs should be lists, at least semantically. since flash is write-once, a GC would make more sense than a linear language.. so what about: purrr/18 -> purrr -> conspurrr. maybe it's best to stay out of that mess.. cons needs ram, not some hacked up semiram. what about using arrays? if programs are represented by arrays instead of lists, not too much is lost:

  [1 2 3 4] [PORTA out] for-each  ;; argument readonly = ok
  [1 2 3 4] [1 +] map             ;; argument modified in place (linear)

the latter one needs copy-on-write (see the sketch after this list).

  [[+] [-]] [[1 2] swap dip] for-each

what about

  [1 2 3] [1 +] map -> test

  1. arrays are initially created in ram, as lists?
  2. when assigned to a name, they are copied to flash
  3. assignment is a toplevel operation, effectively (re)defining constants
  4. flash is GCd in jffs style.
  5. words can be deleted.
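a minimal sketch of the intended copy-on-write 'map' semantics, in scheme, with vectors standing in for target arrays. the names are hypothetical host-side pseudocode, not purrr:

  (define (copy-to-ram v)            ; flash is read-only: copy first
    (let ((w (make-vector (vector-length v))))
      (do ((i 0 (+ i 1)))
          ((= i (vector-length v)) w)
        (vector-set! w i (vector-ref v i)))))

  (define (linear-map fn v in-flash?)
    (let ((w (if in-flash? (copy-to-ram v) v)))  ; ram copy is owned..
      (do ((i 0 (+ i 1)))
          ((= i (vector-length w)) w)
        (vector-set! w i (fn (vector-ref w i)))))) ; ..so mutate in place

the point being: the linear (owned) case costs nothing extra, and only the flash case pays for a copy.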
in ram: one cell is 3 bytes: 2 bytes for contents + 1 byte next pointer. this leaves room for 256 cells, or 768 bytes. it might be interesting to make assignment an operation that's valid anywhere: persistent store.. on the other hand, that encourages misuse. so..

  - free lists make no sense in flash
  - they do in ram
  - persistent store rocks

in order to make this work, i need to write a flash filesystem first. problem: does redefining a word redefine all its bindings? it should. so each re-definition needs to be followed by a recompilation. nontrivial. this gets really complicated... can't we represent code as source, and cache it in ram? it looks like variable bindings should really be in ram. but what about the persistent store? damn. dumbing it down ain't easy. i think maintaining the late binding approach is infeasible. maybe it's good enough to clean up the semantics a bit? a 1->1 syntax->semantics mapping (i.e. choose is the only conditional), so code can be used as data using 'map'. maybe that does make sense.. 'map' as 'interleave'. ok, that's enough.

Entry: language in the morning
Date: Tue Mar 20 09:02:11 EDT 2007

after 4 hours of sleep: it's hard to say goodbye to nice ideas when they don't work for practical reasons.. still there's something here. i think i just need to read the PICBIT paper by Marc Feeley and Danny Dube, and base it on that. it looks like i just need to be wasteful: everything is source code, the flash is a filesystem, and the ram contains executable code. the most important of all: it should be towered:

  purrr/18 -> purrr -> cat/18

so to distill again:

  - cons cells in ram
  - a flash file system

it is interesting how the linear/nonlinear language split i'm using and this linear ram / nonlinear flash memory model coincide. the approach in PICBIT seems interesting: using fixed size cells of 24 bits = [2 | 11 | 11], with the types:

  00 PAIR
  01 SYMBOL
  10 PROCEDURE
  11 one of the others

Entry: distributed system
Date: Tue Mar 20 09:53:19 EDT 2007

i was thinking: this tethered approach makes a whole lot of sense in the case of one host controlling a huge amount of identical cores.

Entry: back to dtc
Date: Tue Mar 20 10:42:24 EDT 2007

got compile + interpret working. time for control structures. i'm seriously considering only using code quoting. but how to implement? same as in PF? it's actually not so hard:

  x x x { y y y } z z z
  |
  x x x quot L123 ; y y y ; : L123 z z z

this does require a stack / recursion to associate the labels. another way to deal with it is to solve it in the parser, and use real lists. or the lowlevel forth could be extended to use something like this, which is probably easiest.

Entry: hands on pic hacking
Date: Thu Mar 22 17:10:19 EDT 2007

playing with the synth board. it resets from time to time. found that touching the PGM pin causes this. this pin is floating in my board, so i guess that's where the problem is. the datasheet says:

  CONFIG4L 300006 bit 2: LVP enable=1 / disable=0

indeed, this is on. as long as this is enabled, normal port functions are disabled. moral of the story: disable it, or tie it high, or enable weak pullups.

Entry: sheepsint
Date: Thu Mar 22 17:38:38 EDT 2007

after fixing the PGM bug (LVP disabled now), it still crashes from time to time. i suspect it's some kind of interrupt thing.. let's disable stack reset and see if it still crashes. tss... the watchdog timer was on. stupid.
Entry: modeless interface
Date: Thu Mar 22 18:55:22 EDT 2007

  - modeless interface (unix socket) to send brood commands for emacs
  - normal boot vs interpreter based on activity on reset

i should find a decent protocol to interrupt an app: to attach a console easily, but to have it running most of the time.

Entry: partial evaluator
Date: Fri Mar 23 00:50:37 EDT 2007

i'm probably just getting tired, but isn't it a lot better to do partial evaluation on source instead of assembly code? there is some elegance to the greediness of the algorithm. somehow, this feels ok.. but if i type 1 2 +, it's always going to be equivalent to 3.. if literals can be identified at the time they are compiled, their compilation can also be postponed.. i don't really have a good explanation. what i do know is that this works because it is fairly decentralized.. the price paid is "literal undo", which is not so hard, and also works for pure assembler.

don't know if this is going to make sense.. a symbol's semantics is only defined by what machine code it will be compiled into (concrete semantics). for forth, this is either a function call or some inlined machine code. since the latter is highly machine specific, it doesn't really make much sense to separate that out into partial evaluator + optimizer, since the optimizer is going to add some bit of partial evaluation anyway.. it's better to put some effort into making the code separable: some patterns go for all register machines, some go for all pic chips, ...

as i found so far:

  1. abstractions will arise whenever they are hinted by redundancy or "almost redundancy".
  2. if you build an abstraction you don't use later, you lose. abstractions make code more complicated, and are only justified by frequent use.
  3. don't hesitate to keep towering abstractions until the redundancy is gone. some problems really do need several layers to encode comprehensibly.

what i'm intrigued by:

  4. solve only one thing per layer (one aspect). if the abstractions do not stack, find a way to disentangle them, and weave them back together automatically.

Entry: compiler compiler
Date: Fri Mar 23 10:00:04 EDT 2007

it seems you can't really use macros to write macros without extra effort in mzscheme. it defines level 0 and level 1 environments (normal and compiler), but a level 2 (compiler compiler) cannot be easily used without 'require-for-syntax'. the thing i ran into is this: i want to use a macro to generate a pattern matching expression inside a define-for-syntax function that is used to implement a macro that generates a pattern matching expression. maybe it's best i just switch everything to using modules, and reload the full core when i'm reloading. i'm getting a bit tired of these kinds of problems. questions:

  1. is it possible to reload a module?
  2. how to only recompile what's changed, to reduce load time?

Entry: cat as plt language?
Date: Fri Mar 23 13:54:06 EDT 2007

ok, but what is apply in that case?

  (apply fn args) == (run-composite stack composition)

in other words, exchange single code / multiple data for single data / multiple code. apply then still means: convert data + code into data.

Entry: modularizing cat
Date: Fri Mar 23 14:35:26 EDT 2007

brings up a lot of problems.. some of the macros i'm using, like snarf-lambda, are not very clean wrt names and values.. i also 'communicate using global values', which is not a very good idea.. so it's going to take a bit longer than expected, but the code should be a bit cleaner when it's done. ok, now for the big one.
pic18.ss, generic forth stuff: need to spend some time to separate out the sharables, which is a lot.. i do wonder if i really need both writers and asm state monads.. it is cleaner, but also a bit of a drag.. i need a proper mechanism to do this separation. but first, get this thing to load properly.. got some bugsies here and there. seems the compiler works fine, but the assembler has got some problems. ok. seems to work now. also compilation seems to work.

Entry: macro namespace
Date: Fri Mar 23 19:02:35 EDT 2007

there is really no reason to have multiple macro namespaces. i mean: namespaces are defined using hashes. it's easier to just load the generics, then overlay the specifics, instead of having a lot of special names in the dictionary.. in other words: the pic18* words should be replaced by globally unique things, denoting the fixed functionality:

  * machine constants
  * simple/full forth parser
  * macros
    -> recursive
    -> pattern matchers
    -> writers
    -> asm state modifiers

all specific functionality is added on top by overlaying the code. this used to be done with "load" but is now done using "require". order of execution is preserved in require ???

Entry: double postpone
Date: Fri Mar 23 21:16:39 EDT 2007

i'm running into problems with macro-generating code.. fixed some. cleaned up some in vm.ss. now i have an interesting problem with delayed eval: macro defs (side effects) get delayed till after the macros are used.. ok, i think i got it.. what about tagging names that are supposed to be cat semantics in a certain way?

ok.. this concludes a long run. from the top of my head, things are better now because:

  - badnop is better defined as a forth compiler with the fixed functionality mentioned above
  - the code makes a clear indication if functions are used as cat semantics == code that compiles something into a stack primitive.
  - 'compile' and 'literal' are now CAT macros
  - the state monad uses a more highlevel wrapper

things to do still:

  - constants for disassembler
  - disassembler
  - core restart
  - clean up source file layout, maybe split in more modules + docu

funny... running into an evaluation order problem again.. maybe i should use some kind of module / scheme namespace trick to get rid of this? because load/eval/parse order is kind of arbitrary now.. -> nothing to worry about. it was a stupid typo.

got the meta-patterns macro working too. this is actually an interesting idiom: just wrap a single macro around a body of 'define' statements to alter the way they are used: it allows proper syntax highlighting + individual testing.

Entry: so what is badnop?
Date: Sun Mar 25 16:02:37 EDT 2007

a native forth compiler for register machines, with provisions for harvard architectures, and provisions to build a dtc interpreter on top of a native wordlength forth. the platform specific parts are: the assembler generator, the pattern matching peephole optimizing code generator, and some recursive macros.

Entry: persistent store
Date: Sun Mar 25 18:59:45 EDT 2007

so.. it would be way easier to just have the compiled forms cached on disk. but i guess if that's really necessary i can always write out scheme files and compile them. for the rest: all persistent data should be SYMBOLIC. this means:

  - no compiled CAT code (word)
  - no continuations in asm

this seems really important.. an area where compromise leads to unnecessary complexity. i'm going to leave it open, and implement restart by reload, giving only the parameter. this is turning into a "where to put stuff" quest again.. ok.
keep it like it is, and put the data stack in the state store + perform some checking to see if data is serializable before writing it out.

Entry: debugging tools
Date: Mon Mar 26 15:10:12 EDT 2007

need more debugging tools:

  - some safe way of dealing with the bootblock (mainly isr) OK
  - on-demand console: interrupt app OK
  - proper disassembler
  - 'loket'
  - documentation: how to document the language?

dasm needs some thought. the interrupt app is as simple as polling the rx-ready flag, i.e. "begin params rx-ready? until"

Entry: i need something new
Date: Tue Mar 27 10:22:26 EDT 2007

the dasm might be interesting.. maybe i should do that. but i'd like to do something exciting today :) wrote some badnop docs, changed some names.. maybe i should have user definable semantics accessible in CAT itself? (more reflection?)

Entry: the road to PF
Date: Tue Mar 27 11:13:48 EDT 2007

ok. time to write PF in forth, by gradually bootstrapping into different languages. the lifts are:

  1. vector -> linear lists
  2. non-managed -> refcount managed
  3. untyped -> typed/polymorphic
  4. proper GC
  5. scheme

the first lift is the same as the one i already did, which is lifting native code to a vectored rep. the lower interpreter's composites become the higher interpreter's primitives. however, if data is also being lifted, the change is in no way trivial: primitives won't accept the data until it's moved to a linear stack. so maybe this needs to be separated. the lift to lists is different for data than it is for code. on the other hand, it does look like a nice place to insert some type checking code. need to think a bit more..

Entry: multimethods
Date: Tue Mar 27 11:49:14 EDT 2007

i had this idea of representing types using huffman coding, in a binary tree. this requires a set of fixed types and some information about which ones are used most, but it might be quite optimal. there is a lot of room for optimization here, moving type checks outside of functions etc.. but it will probably require some type specs.

Entry: poke
Date: Wed Mar 28 10:57:34 EDT 2007

let's write poke again, the PF vm. the first thing i need to do is to generate C code from some sort of s-expression. expression conversion seems trivial, i just need to distinguish between the builtin infix operators, and prefix expressions with comma separated argument lists. statements are more problematic. bodies are straightforward, but how to handle special forms like for/while/do?

seems i got most of it running now. main features:

  - an s-expression interpreter with a primitive and a composite level
  - used to implement 2 interpreters, for statements and expressions

now i was wondering whether it would be possible to create some kind of downward lambda. i can't use the gcc extension.. yes, but i do need to allocate ALL functions in structures, meaning explicit activation records, and use lexical addresses. if this is used, it's better to completely forget about any local C variables.

Entry: downward funargs
Date: Thu Mar 29 16:12:22 EDT 2007

so, an attempt to create a 'downward lambda' for poke. allocating on the stack for now, with the later possibility to allocate on the heap. how hard is this to have in some form? simplifications:

  - all cells are the same size
  - values are pointers to 'object'

this needs quite a bit of support:

  - environments
  - closures

the function bodies themselves take:

  - environment
  - arg list (part of environment?)
a function invocation is:

  - create environment extension
  - run function
  - cleanup environment extension

  {
      object_t env[3];       // parent + 2 variables

      // invoke a function 'FUN'
      ({
          // create new environment
          object_t ext[2];
          ext[0] = env;      // link parent
          ext[1] = 123;      // init first and only arg
          FUN(ext);          // invoke fun
      })
  }

this resembles PICO. ok.. going a bit too far here. what about introducing these features when they are really needed? one question though.. if only downward closures are needed, why not use dynamic binding instead? nuff.

Entry: back
Date: Thu Mar 29 17:50:44 EDT 2007

back to the code generator. the reason i wrote this was twofold. one is to have a portable target for brood forth. the main idea there is to rewrite mole into something more graceful, and have a basis for (re)writing PF. and two: i need a language for expressing the signal processing code in PF. this should not be forth, but a multi -> multi dataflow language. maybe just forth + protos?

so. i think the next step should be to transform the current cgen (poke) so it has an extensible name space. maybe it is a good time to look into defining new languages inside PLT, since that's what i'm doing basically, instead of mucking about with explicit environment hashes and interpreters. something to iron out: it's not a new language, it's a cross-compiler: you want to define functionality accessible in one name space using functionality accessible in another name space.

Entry: extending cgen name spaces
Date: Fri Mar 30 10:31:15 EDT 2007

i don't really need to make the hash tables available. it's much easier to just create a new interpreter function which falls back on the basic one defined in cgen.ss. hmm.. i got myself in trouble again. the above doesn't work since statement/expression are mutually recursive. in addition to that, statement uses closures. maybe i do need a hash? ok. i think i got it ironed out a bit. using a hook for both the expression and statement formatters, and calling this hook recursively, does do the trick.

Entry: compiler structure
Date: Fri Mar 30 15:01:41 EDT 2007

so.. basically, a compiler/assembler/whatever has the following 'natural' structure:

  T = target language
  S = source language
  C = compiler language

it's best to separate the S -> T map into:

  primitive macros  S -> T  (small)
  composite macros  S -> S  (big)

you want to write both the S -> T and S -> S maps in C. the reason you want an S -> S map is that it contains higher level code than an S -> T map. one pitfall is to shield functionality in C by not properly mixing in the T name space. the most straightforward way to implement both maps is quasiquoting: quoted S or T, and unquoted C. including the compiler language is more precise:

  primitive: C,S -> T
  composite: C,S -> S

badnop is already organized this way: the primitives are peephole optimizing pattern matchers, where C is scheme. writers and state modifiers are composite, with C being cat. and the recursive macros are a cleaner S -> S map, with C empty.

Entry: lifting
Date: Fri Mar 30 15:22:03 EDT 2007

now for the ambitious part. the thing that got my whole forth/PF thing started is a desire to generate automatic control structure for video DSP building blocks. basically:

  IN:  a highlevel description of how pixels are related through operations
  OUT: a compiled representation processing images / tiles

the core component here is loop folding:

  (loop { a } then loop { b }) -> (loop { a then b })

the win is a memory win: intermediates should not be flushed to main memory.
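to make that concrete, a sketch of the fold as a source-to-source rewrite in scheme. 'loop' and 'then' are symbolic nodes, and the names are hypothetical, not the actual dsp compiler:

  (require (lib "match.ss"))

  (define (fold-loops expr)
    (match expr
      (`(then (loop ,a) (loop ,b))            ; two adjacent loops..
       `(loop ,(fold-loops `(then ,a ,b))))   ; ..become one fused loop
      (_ expr)))

  ;; (fold-loops '(then (loop (a)) (loop (b))))
  ;; => (loop (then (a) (b)))

note the fold is only valid when b consumes a's output pointwise, within the same iteration; any cross-iteration dependency kills the fusion.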
so compilation generates the control structure. compilation 'lifts' the pixel building blocks into something interwoven with the control structures.

Entry: grid processing
Date: Fri Mar 30 16:01:57 EDT 2007

the possible optimizations depend tremendously on the amount of information available on the individual processors, so the idea is to keep the primitive set really simple, and look at their properties:

  * associative (n-ary op consisting of n-1 binary ops)
  * commutative (binary op)
  * linear/linear1

  +    l a c
  *    l a c
  /    l1
  abs

the typical structure to look at is a one dimensional FIR filter, since this can be extended to 2D (space) and 3D (space+time) filters.

  (* gain (+ x (n x 0 -1) (n x 0 +1)))

let's analyze. 1 and 3 are constants, so (/ 1 3) can be evaluated. x is used in an 'n' expression, which we use to denote membership of a grid. let's make all parameters into grids:

  (* (gain) (+ (x 0) (x -1) (x +1)))

so (gain) is a 0D grid, (x 0) is a 1D grid, (x 0 0) is a 2D grid, etc.. composite operations can be specified, for example

  (processor (a b) (+ (a) (b 0) (b 1)))

this means all parameters need to be declared, since we need to know the order. the syntax i'm using here requires ordered parameter lists. i prefer this over keywords, since it is more compact, and we need to fill in all inputs anyway (no explicit defaults).

another interesting operation on an expression is to compute its reverse: an expression represents a dependency graph, which can be inverted. however this is only interesting for multiple inputs, which we won't use yet: apply explicit subexpression elimination and graphic programming. ok, so we need parameter names. another interesting operation is fanin: how many times is a single value used? this is important for memory management (linearization). note that linearization and operation sequencing is almost equivalent to translation to forth.

maybe it's time to go for the first iteration binder. we map a single function to an explicit iterator. i.e.

  (+ (a 0) (a 1))

it has a single 1D grid input, and produces a single grid. ah! something i forgot: what's the output type? a grid of dimensionality equal to the maximum of the input grids. so, an n-dimensional grid is placed on the same notational level as an n-ary procedure. ok. the above can be transformed to the loop body

  (+ (index a (+ 0 i)) (index a (+ 1 i)))

where a runs over the line. the rest is border values:

  (+ left (index a 0))
  (+ (index a w) right)

so the idea is to make the loop body and the 2 borders. implementation (see ip.ss), implicit -> explicit:

  (a 0 1 -1) -> (a ([I 0] 0) ([I 1] 1) ([I 2] -1))

where the [I n] tags indicate the loop depth.

Entry: thinking error
Date: Fri Mar 30 19:53:11 EDT 2007

the error i made previously was to 'precompile' things: bind stuff to tiles, then bind some stuff later in an interpreter. the problem with this is that you're solving the same problem twice. not very good.. a much better idea is to keep everything in a highlevel description, then compile it as composition goes on: one thing i'm dreaming about is to build things in a pd patch, then hit 'compile' for an abstraction, and it will compile an object that performs the operation.

so, the other error was to use low level reps. forth has benefits, but not for writing compilers, which is mainly template stuff: mixing name spaces. you really need quasiquoting and random parameter access. EDIT: this is what's so nice about the scheme macro system: the mixing of compiler and target namespaces works really well.
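the quasiquote idiom in one line: quoted = target namespace, unquoted = compiler namespace. a hypothetical generator for a symbolic FIR loop body, given a compile-time list of taps ('loop', 'mac' and 'x' are made-up target symbols):

  (define (emit-fir taps)
    `(loop ,@(map (lambda (tap) `(mac (x ,tap))) taps)))

  ;; (emit-fir '(-1 0 1))
  ;; => (loop (mac (x -1)) (mac (x 0)) (mac (x 1)))

'map' and 'lambda' live in the compiler, the quoted symbols live in the target. no explicit environment juggling needed.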
Entry: monads and tree accumulation
Date: Sat Mar 31 10:44:58 EDT 2007

writing the source code analysis functions, i run into the following problem: map a tree, but also run an accumulation. now of course it's easiest to just use local side effects here, since they behave functionally from the outside (linear data type construction). but just out of curiosity, what kind of structure is necessary to do this functionally? the basic idea of monads: if you don't save 'extra' data in the environment, save it in the data. this requires 'map' to be polymorphic, so it can act on this type accordingly. i don't think it's worth the trouble here.

Entry: boundaries
Date: Sat Mar 31 12:14:13 EDT 2007

border values: using finite grids, borders need to be handled. basically, invalid indexing operations need to be replaced by valid ones. some strategies:

  constant: (a -1 0 0) -> c
  repeat:   (a -1 0 0) -> (a 0 0 0)
  wrap:     (a -1 0 0) -> (a (wrap -1) 0 0)

how to name border regions? there are several distinct cases, for example a square grid has these:

  (L L) (I L) (H L)
  (L I) (I I) (H I)
  (L H) (I H) (H H)

  L  low boundary
  I  bound to iterator
  H  high boundary

with looping indicated by { ... }, a full 2D loop looks like:

  (L L)     ;; top left
  { (I L) } ;; top
  (H L)     ;; top right
  {
    (L I)     ;; left
    { (I I) } ;; bulk
    (H I)     ;; right
  }
  (L H)     ;; bottom left
  { (I H) } ;; bottom
  (H H)     ;; bottom right

that basically solves the problem. note that it's best to lay out the code in an L I H fashion to keep locality of reference.

on to representation. the loop body is a serialization of an N-dimensional 3-grid (a 2-grid is a hypercube). it's serialized into a ternary tree. how to represent ternary trees? the following representation looks best in standard lisp notation:

  ((L . H) . I)

other variants have the dot in an awkward place. another possible rep is (I H L), which can be written in mzscheme's infix notation as (H . I . L). i'm going for the former, as it allows the use of (B . I) in case L and H are the same, in order to generate the full loop body. EDIT: it's easier to just use s-expressions: (range H I L), and have 'range' be a keyword..

loop borders can be constructed using the data structure provided by 'src->loopbody'. ah! it's possible to separate the operations performing loop order allocation and pre/post expansion, but probably not very desirable.. so let's combine them, so we can get rid of using natural numbers. note: i found out that when i need index lists, i'm doing something wrong: applying a certain order on things... so, in order to generate the tree above, we consume coordinates from left to right. all loop transformations need to be done on the source code before generating loop bodies.

Entry: lexical loop addresses
Date: Sat Mar 31 14:33:53 EDT 2007

i need a notation for addressing loop indices. currently i'm converging on not updating pointers in a loop, but using indexed addressing, since that's something that can be done easily in hardware. an optimization here is to use relative addressing only for the inner loop, so only one index needs to be added, and cache the computation for all other relative accesses. each loop has exactly one index that's being incremented. the depth of the loop determines how many indices are bound. what i'm trying to do is to generate the border conditions that have not all indices bound. how to do that?

  loop a {
    ... data (a) ...
    loop b {
      ... data (b c) ...
      loop c {
        ... data (a b c) ...
      }
    }
  }

the inner loop here needs to be split into 3 parts:

  data (l b c)
  data (a b c)
  data (h b c)

then the 2 unbound parts can be moved out of the loop. so, basically:

  BODY -> (nonfree . free)

as an example, take (+ (a 0) (a 1)), split in

  (+ (a 0) (a 1))          ;; border
  (+ (a (i 0)) (a (i 1)))  ;; body

since code is originally in unbound form, it might be more interesting to perform binding inward. start from the relative description, and split this into a partially bound and partially filled structure:

  border <- relative -> bound

then iterate downward. before this is possible, all code needs to be translated to the full 'virtual grid' form. later on, it can be substituted back to its original form.

Entry: representation
Date: Sun Apr 1 10:22:02 EDT 2007

ok, i think i got the basic idea, so it's time to start using some abstract data structures. on the other hand, if using list structures is possible, debugging is more convenient.. sticking to lists.

Entry: breadth-first
Date: Sun Apr 1 16:52:47 EDT 2007

i think this is the first time i ever encountered a problem that's easier solved using breadth first expansion. hmm.. that's probably plain bullshit.. it's just my particular approach at this moment, using an 'infinite' expansion with an escape continuation:

  (define (expand e)
    (call/cc
     (lambda (done)
       (let ((expand-once
              (lambda (f) ... (done e))))
         (expand (expand-once e))))))

basically, this just iterates expand over and over, and backtracks to the last correct expansion 'e' whenever some termination point is reached in expand-once. ok, abstracted in 'expand/done'.

Entry: separation of concerns and exponential growth
Date: Sun Apr 1 19:15:54 EDT 2007

was thinking.. separation of concerns, hyperfactoring, whatever you call it, is a means to move from linear -> exponential code dev.. once you can separate things into independent parts A x B, increasing functionality in either will increase total functionality by the same multiplication factor. if they are not separated, an increase in complexity doesn't translate to an increase in functionality. this is very badly explained, but i think i sort of hit a spot here. compare the payoff of time invested in building independent/orthogonal building blocks that can be combined, against the payoff of time spent tweaking a small part of a huge system. the added complexity (information, code size) might be the same, but the added expressivity (possible reachable behaviours) is hugely different: multiplication in the first, and addition in the second. it's the difference between adding a bit in a state encoding (exponential), and adding a state (linear).

Entry: the inner loop
Date: Sun Apr 1 19:26:11 EDT 2007

how to encode the innermost loop? for example, start with

  (+ (a (I 0) (I 1)) (a (I 0) (I 0)))

with the inner loop being the last index (arbitrary choice). the main question to answer is: "relative or absolute addressing?" either one uses explicit pointer arithmetic, or one uses index registers. for the outer loops, increments occur infrequently, so it's best to use pointers.

  a -> pa
  (+ (pa (I 1)) (pa (I 0)))

so, the number of registers used for addressing in the inner loop is equal to the number of grids (including the output one), plus one loop index. if addressing modes like BASE+REL+OFFSET are not available, extra pointers or indices are needed. i seem to remember that incrementing pointers using the ALU is bad on intel, and it's better done using the AGU.. i guess there's a lot of room for doing this right or wrong depending on the architecture.
and i swore never to write intel assembly again :) if C is the target language, i guess some experimentation is in order. for simple processors, it seems quite straightforward how to subdivide things so maximum throughput can be attained. i guess the next target is to generate actual code. that should iron out the conceptual problems..

Entry: inner loop cont
Date: Tue Apr 3 09:47:24 EDT 2007

the problem is, the indentation shown by 'print-range' is not the same as the indentation for the C code loop blocks. setup code needs to be moved out of the loops. going from inner -> outer:

  (+ (grid a (I 0) (I 0)) (grid b (I 0) (I 0)))

needs to be translated to

  (update a 0 (I 0))
  (update b 0 (I 0))
  (+ (grid a (I 0) 0) (grid b (I 0) 0))
  (downate ...)

effectively updating the pointers before the loop is entered. i was thinking about just shadowing a single variable 'i'. in that case, what is necessary is to make sure each expression referencing I has only one occurrence (or an occurrence in the same position). instead of constructing an intermediate range representation, it might be more valuable to generate the loop structure directly, following the same approach as before.

  (a (0 1 2)) -> (a (L 0) (1 2))
                 (a (I 0) (1 2))
                 (a (H 0) (1 2))

              -> (let ((a (L a 0))) (a 1 2))
                 (let ((a (I a 0))) (a 1 2))
                 (let ((a (H a 0))) (a 1 2))

so, basically just specializing variable names. this boils down to computing pointers. so, to resume, the downward motion is:

  (expr (+ (a 1) (a 0)))
  ->
  (bind ((a_p1 (S a 1))
         (a_p0 (S a 0)))
    (expr (+ (a_p1) (a_p0))))

... ok, i think i got somewhere:

  > (p '(+ (a 0 0) (+ (a 1 0) (a 1 1))))
  {
    int i;
    for (i = 0; i < (400 * 300); i += 300) {
      float* a_p1 = a + (i + (1 * 300));
      float* a_p0 = a + (i + (0 * 300));
      float* x_p0 = x + (i + (0 * 300));
      {
        int j;
        for (j = 0; j < 300; j += 1) {
          float* a_p1_p1 = a_p1 + (j + 1);
          float* a_p1_p0 = a_p1 + (j + 0);
          float* a_p0_p0 = a_p0 + (j + 0);
          float* x_p0_p0 = x_p0 + (j + 0);
          *(x_p0_p0) = (*(a_p0_p0) + (*(a_p1_p0) + *(a_p1_p1)));
        }
      }
    }
  }

now, there are quite some possible optimizations or simplifications. one is to leave the inner level as indexed pointers. another is to replace stride multiplication with addition.

Entry: scheme syntax
Date: Tue Apr 3 22:35:32 EDT 2007

today i (re)discovered:

  (define ((x) a b) (+ a b))

and was surprised that it also works for

  (define (((x)) a b) (+ a b))

first saw it used in SICM.

Entry: accumulation / values
Date: Tue Apr 3 23:29:08 EDT 2007

i need an abstraction for (linear) accumulation. no need to mess with monads. the pattern i'm finding is:

  * substitute expressions in a tree + accumulate a set

i want a function that returns 2 values: the substituted expression and the accumulated set. note that the use of assignment like this isn't so bad, because it's encapsulated (linear): there are no references to the object until it's ready. also note (again) that using monads requires polymorphic versions of generic list processing operations, and is overkill. the 'lifting' technique used in the compiler does need monads, because the operations are open: each operation modifies a state, and intermediates are accessible, so pure functional programming is a good idea to keep backtracking/undo tractable.
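the 2-value pattern, sketched functionally in scheme (hypothetical names; the real code uses local side effects instead):

  ;; map 'subst' over the leaves of a tree, and accumulate the
  ;; set of symbols encountered, returning both as 2 values.
  (define (map/collect subst tree)
    (cond
      ((symbol? tree)
       (values (subst tree) (list tree)))
      ((pair? tree)
       (let-values (((h hs) (map/collect subst (car tree)))
                    ((t ts) (map/collect subst (cdr tree))))
         (values (cons h t) (append hs ts))))
      (else (values tree '()))))

(the 'set' here is just a bag: duplicates are not removed.) compare with the side-effecting version: one extra value threaded through every recursive call, which is exactly the plumbing a monad would hide.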
Entry: aspect oriented programming
Date: Wed Apr 4 08:56:29 EDT 2007

  1972: Parnas "On Decomposing Systems"
  1976: Dijkstra introduces the term "Separation of Concerns"
  1982: Brian Cantwell Smith introduces "Reflection"
  1991: Metaobject Protocols
  1992: Open Implementations
  1993: Mini Open Compiler
  1997: First paper on AOP
  1997: D
  2001: AspectJ
  2004: JBoss

http://www.cs.indiana.edu/dfried_celebration.html
Anurag Mendhekar: Aspect-oriented programming in the real world

Entry: back to sheepsint
Date: Wed Apr 4 12:49:40 EDT 2007

i need to restart the board design soon, but i do need a fully functional dev env before i can do that. some more things are necessary: a proper stateless message interface (CAT = object) for sending code and performing command completions.

Entry: summary
Date: Wed Apr 4 19:21:27 EDT 2007

THINKING ABOUT PF

been looking into bootstrapping PF from the lowlevel forth core. aspects: polymorphy and types (clos), linear memory management (lazy copy), transition from vector -> list. the latter is interesting since it contains 2 parts: code needs a new interpreter, data needs a lot of new primitives, maybe combined with type checking. i wonder whether it's easier to just start from a cons cell VM directly.

C CODE GENERATION

  * separate statements and expressions
  * plugin expression transformers

POKE

  * using a non-blocked version of C gen

LOOP CODE GENERATION

i think i have the general idea:

  * c code generation working
  * functional specification mapped to assignment
  * nested loops: blocks to bind locally cached index pointers
  * additive index arithmetic
  * inner loop uses a single index

the scheme code looks simple, and well factored. gut feeling says the code is simplified enough for gcc's optimizer to tackle it. i still need to do the border conditions. this will need to be example driven. next month i might try to plug in some code.

Entry: from forth to PF
Date: Wed Apr 4 19:37:52 EDT 2007

1. data

a PF primitive written in forth looks like:

  - (force) collect arguments (list -> vector)
  - method lookup
  - perform primitive forth code
  - (lazy) push arguments (vector -> list)

so the stack is implemented like:

  [ list | vector ]

the vector actually needs to be a circular buffer, because it behaves as a deque: traffic between list and vector is on the bottom end, while primitives operate on the top end, unless the primitives accept their arguments reversed.

2. code

fairly straightforward. because of the difficult impedance match between list and vector machines, i think it makes sense to forget about building one on top of the other, and write only the vm. an interesting question is whether this can be abstracted. and also, can i write the VM in itself? been tinkering a bit with poke.ss and mole.ss, got the basic permutation worked out.

Entry: alan kay name dropping
Date: Wed Apr 4 19:59:37 EDT 2007

from "Proposal to NSF - Granted on August 31st 2006 - Steps Toward The Reinvention of Programming". i'm curious about the albert thing. what i read i don't understand though.. better next time. motivation and inspiration: John McCarthy LISP ... bootstrapping.

Entry: persistence & late binding
Date: Thu Apr 5 16:11:14 EDT 2007

so, elaborating further on that article.. i ran into the problem of saving parsed code, because semantics is stored as a procedure. what about replacing this by a symbol? assuming data will only be read by a system that has the bindings in place, this is a valid approach. then bootstrapping can be solved differently, and all internal representation is just cache. so..
a word = code object

  * a source representation
  * a symbolic semantics (another word?)
  * a cached transformer procedure (concrete semantics)
  * a cached meaning = lambda expression

the cycles in this representation need to be broken somehow. hmm.. this is actually a lot harder than it sounds, since the cache really needs to be a cache. probably needs a from-scratch approach. ok. started the 'symcat' project. for the current project i think i can live with non-savable parse trees, since it's always possible to save source code, and i have a working 'reload core' command for use during compiler development. all in all, the system i'm writing is fairly straightforward. so no more about this really cool idea here. see symcat.

Entry: name spaces
Date: Thu Apr 5 16:59:24 EDT 2007

something that's getting on my nerves a bit is CAT namespaces. small special purpose apps can benefit from the simplicity of a single namespace and short names, but for CAT i'm not so sure any more. also, i'd like to catch undefined names early on.

Entry: standalone
Date: Thu Apr 5 17:06:15 EDT 2007

time for the standalone forth. one of the things i've been wanting to try for a while, but never got to.. i should have a look at flashforth and also retroforth for inspiration. roadmap:

  * 'accept' terminal input into a buffer
  * 'parse' words
  * 'find' a word in the dictionary

compilation is straightforward, but requires some thinking since stuff will need to go to ram first. (it's multipass, i.e. if .. then).

Entry: reflection
Date: Thu Apr 5 19:37:03 EDT 2007

the ideas of reflection and metacircularity probably go hand in hand.. in CAT i'm getting a bit annoyed by having to choose between implementing something as a scheme function, or as a cat function. for example: semantics is implemented as a scheme function, so it's technically not accessible from CAT. let's re-iterate the point of CAT... usually, a forth compiler is written in forth. a cross compiler poses problems in this sense, since the normal 'local feedback loop' doesn't work. the (re)constructed rationale:

  1. forth is extremely modular: a function is a composition of functions
  2. a forth compiler is most naturally expressed in the same way: a forth compiler is a composition of compilers (macros).
  3. most naturally, forth is implemented metacircularly.
  4. i can't do that because the target is too simple -> simulated
  5. the metalanguage best reflects the same structure: compositional
  6. choosing a functional language (CAT) -> monadic composition
  7. CAT is written in scheme to avoid its own bootstrapping problem

the last one actually reads as: CAT is an impedance map from scheme to a compositional language, to make it easier to implement an extensible optimizing forth compiler. if CAT were metacircular, there would be no need for scheme. this approach is not used because:

  - (plt) scheme is packed with features
  - i use a fair amount of scheme to provide primitives. in fact 'primitives' is not really a good word for it..

so it's best to see CAT as scheme in disguise, and as a vehicle for a decentralized compiler/interpreter, bound together by monadic composition. the possibility of writing new CAT words is mainly there for extensibility (writing the compiler), not for the CAT core.

Entry: nested scope
Date: Thu Apr 5 19:55:49 EDT 2007

as i've learned, these features are really necessary to write a compiler:

  * lexical variables
  * quasiquotation
  * pattern matching

however, they mostly serve to adapt to a representation that is inherently imposed, i.e. assembly language syntax.
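all three show up together in something as small as a label resolver over assembly syntax (a hypothetical sketch, not code from the tree):

  (require (lib "match.ss"))

  (define (resolve ins env)                       ; env: symbol -> address
    (match ins
      (`(goto ,(? symbol? l)) `(goto ,(env l)))   ; pattern match + quasiquote
      (_ ins)))                                   ; 'env' and 'l' are lexical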
anything that is non-compositional is better handled with something like scheme. however, if you can design everything from scratch, it's probably quite doable to get by with a couple of combinators and aggressive factoring. but, in the end, some form of lexical scope should be possible, if only for the practical problem of name clashes..

there is only one question: are names functions or values? in lisp, they are values, because functions are explicitly invoked: if a variable is in the head of a list, it's a function. in a compositional language it would involve something like 'i':

  ((a b c) locals ... a i ... b i ...)

treating things as values makes them more natural. an abstraction could be added to do the other (bind as program). then, how to handle the environments?

NOTE: got lexical variables and quasiquotation working in symcat, but only by a more direct cat->scheme translation. i don't think it's really necessary here, since i do most in scheme. also, some name space issues are still not resolved. maybe i can switch for the next rewrite tho :)

Entry: back in the solder lab
Date: Mon Apr 23 17:10:12 CEST 2007

things i need to get working before the end of the week:
  * sheepsint input switches
  * room for xtal on pcb
  * capacitors on pots

random hacking:
  * 3.3V serial interface
  * usb?

Entry: emacs integration
Date: Mon Apr 23 20:03:23 CEST 2007

this screams for a 'once and for all' solution. i'd like to keep brood portable, so using unix sockets for a console, as is done for pf, is not the way to go. since we're running a lisp in a lisp editor, it's probably best to keep the one 'default' interface on stdin/stdout as a lisp channel, and run the console logic in emacs. maybe a bit in the style of slime?

ok.. following slime to ielm.el, modified to connect it to a running scheme process. slime is too big for me to make sense of; i might return later for some features, but i need to get something running first. what i need is multiple languages on the same console, or maybe different buffers? the whole idea is to have most of the parsing in emacs, so emacs can make the editing a bit smarter. maybe i should have a look at:

Entry: erepl
Date: Wed Apr 25 14:18:32 CEST 2007

looks like it's working reasonably well.. things to add:
  * tab completion
  * multiple languages

either parse in emacs, or send out raw lines. the former is better for line editing (it already does that, really); the latter is better because i don't need to rewrite anything, though forth parsers are really simple and i'm not using a tremendous amount of special plt read syntax. i wonder if emacs read syntax is extensible? anyway, what i do need is a way to switch the mode in emacs, and not in the target scheme image.

Entry: fresh install
Date: Thu Apr 26 09:16:42 EDT 2007

i tried a fresh install, but apparently my compile script tries to compile stuff in the plt dist, starting with the deps of "match.ss". "sudo ./go" should work.. so, how to install? should i keep all the source files 'writable'? should i keep it in dev land for a while? maybe best.

Entry: project directory
Date: Sat Apr 28 19:24:10 CEST 2007

i need to solve the following problems:
  - core should be installed system-wide
  - project directory should contain multiple projects

the idea is that 'clicking' on a state file should bring up everything. let's try to make sense of this: the brood system is aimed at developers.
in that sense, it is encouraged to hack the system, which means the scheme files should not be stored system-wide, and they should be writable. this allows the compilation cache to remain as it is. the source dir has a subdir called 'prj' which contains subdirectories, one for each project. these individual subdirectories could be managed using darcs. it's absolutely essential to find a way to have the TARGET determine which project to load. in order to do this, we use the reply of 'ping' as the name of the project. there is one default project for each architecture, which serves as an example.

-> compilation from scheme: right now i invoke mzc; it's probably better to do so from a scheme script.

all this seems to work. next problems:

  * windows / osx : emacs + serial port config
  * using snot : rewrite all language repls to a standard interface : one line (string) at a time; snot requires it to be 1 or more valid s-expressions.

for the last one, i think i found it: just have 'prompt' display the prompt and accept the next line of input. this can be done using a simple coroutine/continuation trick.

Entry: getting to working usb
Date: Sun May 6 13:04:44 CEST 2007

roadmap:
  * constants as forth file
  * platform dependent constants
  * 2550 init
  * get serial monitor working
  * ...

Entry: usb debugging
Date: Mon May 7 13:36:31 CEST 2007

got the kernel messages going etc.. looked at doc/usb/asmusb.asm (johannes adapted this from C code) to find out i need to enable full speed instead of low speed: #0x14 -> UCFG. now i get transactions. time for the highlevel protocol.

Entry: usb device descriptors : usb.ss
Date: Sat May 12 13:25:30 CEST 2007

looks like it's working: i can compile device descriptors from a more reasonable highlevel description. next step is to organize the tables in flash. ignorant of content, the thing it needs to do is to map

  device     -> (n,addr)
  (string,i) -> (n,addr)
  (config,i) -> (n,addr)

the logic then needs to transfer the buffer in chunks, so i need a proper tree structure in flash, preferably one that can handle errors so the device is a bit robust. these things are read-only, so they can be implemented directly as code. for example:

  device/string/config ( id -- string )

which is encoded as

  : device 3 word-table addr0 ,, addr1 ,, addr2 ,,
  : addr0 length , 1 , 2 ,
  : addr1 length , 3 , 4 , 5 ,

here 'word-table' does bounds checking + throws an exception for error handling. it's probably easier to just use 'min' to limit the offset, then install the last redirect as an error handler, so:

  : config 3 min route config0 ; config1 ; config2 ; error ;

Entry: conditionals < and >=
Date: Sat May 12 14:54:37 CEST 2007

in pic18-comp.ss they are implemented as macro predicates, following the standard forth comparison operators: consume 2, leave a condition. ( a b -- ? ). these can be followed by if. i've been looking into a more general way of using the CPFS[EQ|GT|LT] opcodes, by mapping them onto the conditional jump implementation. been avoiding this for a while, because i have unsigned 'max' and 'min'. the thing is 'cbra': it consumes a condition, and compiles a conditional branch. does this really make sense? the other conditionals can be inverted; these cannot: only by swapping jump targets. so:

  - change 'not' to support a new pseudo op
  - change 'cbra' to do this branch-based swapping

looks like it's working. an optimization is possible in case of single opcode instructions, but it's probably better to just code them as macros. needs some thought.
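to make the swapping concrete, here's a minimal scheme sketch. the names ('inverted', 'skip-if-not', 'new-label') are mine, not the ones in pic18-comp.ss: a flag test could be negated directly, but a CPFSxx-style skip cannot, so 'not' wraps the condition in a pseudo op, and 'cbra' compiles that case by swapping jump targets, i.e. skipping over an extra goto.

  ;; 'not': flip by wrapping/unwrapping a pseudo op.
  (define (negate c)
    (if (eq? (car c) 'inverted) (cadr c) (list 'inverted c)))

  (define label-count 0)
  (define (new-label)
    (set! label-count (+ label-count 1))
    (string->symbol (string-append "l" (number->string label-count))))

  ;; 'cbra': compile "branch to TARGET if condition C holds".  the
  ;; inverted case reuses the plain one, but branches to a local
  ;; fall-through label and routes the other case to TARGET.
  (define (cbra c target)
    (if (eq? (car c) 'inverted)
        (let ((fall (new-label)))
          (append (cbra (cadr c) fall)
                  (list (list 'goto target) (list 'label fall))))
        (list (list 'skip-if-not c) (list 'goto target))))

  ;; (cbra '(= a b) 'yes)
  ;;   => ((skip-if-not (= a b)) (goto yes))
  ;; (cbra (negate '(= a b)) 'yes)
  ;;   => ((skip-if-not (= a b)) (goto l1) (goto yes) (label l1))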
Entry: usb descriptors again
Date: Sat May 12 15:53:12 CEST 2007

it's probably best to just keep using 'route' in combination with 'min' and an error handler. let's standardize a 'buffer' or a 'string' to what i already use for the 'ping' command:

  : my-flash-buffer string>f length , 0 , 1 , 2 , 3 , ;

this means that the word 'my-flash-buffer' sets the current flash object (the f register). a string is a flash object which has its length stored in the first byte. so '@f++' on a string object will give the length and leave f pointing to the raw bytes, so successive '@f++' will read out the bytes. the usb descriptors should be stored in exactly the same way: device, configuration and string should just set the current flash object, which is understood to be a purrr string. so, the following output

  ((device (16 1 16 1 0 0 0 8 216 4 1 0 4 3 2 1))
   (strings ((23 3 68 101 102 97 117 108 116 32 67 111 110 102 105 103 117 114 97 116 105 111 110)
             (19 3 68 101 102 97 117 108 116 32 73 110 116 101 114 102 97 99 101)
             (5 3 48 46 48)
             (10 3 85 83 66 32 72 97 99 107)
             (28 3 77 105 99 114 111 99 104 105 112 32 84 101 99 104 110 111 108 111 103 121 44 32 73 110 99 46)))
   (configs ((9 2 25 0 1 0 0 160 50 9 4 1 0 1 3 1 1 1 7 5 128 160 8 0 0))))

can be transformed into:

  : device string>f , ... ,
  : string 5 min route string0 ; string1 ; ... ; string-error ;
  : config 1 min route config0 ; config-error ;
  : string0 string>f , ... ,
  : config0 ...

maybe it's easier to just eliminate the intermediate names, since there is a notion of arbitrariness involved. they are just local labels, as used with if ... then. all in all, just generating a couple of names is probably easiest. ok, done. now loading. the thing to fix next is a global path for any kind of file loading mechanism.

Entry: some weird bug with forth parsing
Date: Sun May 13 13:35:02 CEST 2007

apparently, for parsing macros (color macros) like 'load' and 'path', there is a problem when the macro that implements the behaviour, popping the name from the data stack, is not defined.. i don't know why.. maybe i need to make that macro parsing part a bit more transparent. currently parsing words are a bit of a hack. i need to get to the core of the problem and fix it. again:

  * forth macros are cat words, and as such are 1-1 semantic/syntactic
  * forth parsing transfers parsing words to quoting code: something forth source cannot represent, but parsed cat code can.

maybe i need a symbolic intermediate form, where lists are quoted explicitly? like PF. with a mapping like:

  (load file.f) -> (('file.f load) run)

hmm.. it's probably just a bad day to make decisions. ok. calmed down a bit. load-usb is working now. next: hands-on transfer.

Entry: state machine or task?
Date: Sun May 13 15:55:33 CEST 2007

a task that does usb transfers makes sense. however, since i'm still debugging, i think a more lowlevel approach is better. once i get it running, i can write everything in blocking form.

Entry: jump bits
Date: Sun May 13 15:56:55 CEST 2007

words use relative addressing. this can lead to trouble. what about this:

  * just assemble, but when an address doesn't fit, keep it symbolic.
  * 3rd pass: gather all addresses, and compile stub words containing a goto to the words that were called but not reachable. (see the sketch below.)

this will keep code small, and the assembler simple: no need for variable size goto instructions inside words. the rationale is: this forth is for lowlevel stuff. for highlevel things, use a DTC on top of this: there you don't have a problem.
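a minimal scheme sketch of that 3rd pass, with a made-up representation and range: call sites are (site . target) address pairs, and any call whose target doesn't fit the relative encoding is redirected to a freshly allocated stub word holding only an absolute goto. this assumes the stubs themselves are placed within reach of the callers.

  (define (fits? from to)
    (< (abs (- to from)) 1024))   ; say, +-1k words of relative range

  ;; calls: list of (call-site . target-addr) pairs.
  ;; returns (patched-calls . stubs), allocating stub words from stub-base.
  (define (add-stubs calls stub-base)
    (let loop ((cs calls) (patched '()) (stubs '()) (next stub-base))
      (if (null? cs)
          (cons (reverse patched) (reverse stubs))
          (let ((site (caar cs)) (target (cdar cs)))
            (if (fits? site target)
                (loop (cdr cs) (cons (car cs) patched) stubs next)
                (loop (cdr cs)
                      (cons (cons site next) patched)       ; call the stub
                      (cons (list next 'goto target) stubs) ; stub = long goto
                      (+ next 2)))))))                      ; goto takes 2 words

  ;; (add-stubs '((10 . 20) (10 . 5000)) 100)
  ;;   => (((10 . 20) (10 . 100)) (100 goto 5000))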
Entry: stamp dead
Date: Sun May 13 19:57:41 CEST 2007

serial port driver dead or something? i don't know. it doesn't seem to be a software problem. the chip isn't doing anything. without a scope this is hard to debug... so plan B:

  1. brood + snot (1 evening)
  2. sheepsint buttons + audio out port (1 evening)

-> leuven for scope and other stuff..

Entry: stamp back
Date: Sun May 20 12:01:59 CEST 2007

something going on here.. i tried stamp 2, which refused to work a couple of times, until i got it going. then replaced it with the original 'broken' stamp, and now that one works too. maybe it's just my breadboard.. since i did have to move 2 pins to the left on the breadboard because the 2nd stamp's pin header is too big.

Entry: late binding
Date: Sun May 20 12:47:48 CEST 2007

what i need next is some form of late binding to do incremental debug. the code runs fine up to a point, from which i need to make small changes. reloading there is a drag, so i need a proper construct.

  defer broem
  2variable broem-hook
  : broem broem-hook run-hook ;

some premature optimizations: since these variables don't really need to be accessible, it's maybe better to put them somewhere behind the ram bank, for example shadowed by the FSR registers.. this way a hook can be represented by a 1 byte XT.

Entry: color macros
Date: Sun May 20 18:30:49 CEST 2007

what i mean by color macros is macros that modify the 'color' of subsequent words. currently i have no way to implement new parsing words in forth. this is not a good thing.. something is broken, but i don't know what exactly. probably my understanding...

problem: parsing words use automatic name mapping. this is bad, since it's viral. meaning, once you start doing things like that it's all over the place: there is really no clean way to nest parsing words. so i need a different approach: extend the partial evaluator to include symbols. the deal is this: the PE uses the assembly buffer as a data stack. because some words use the CAT data stack for 'data' items, things get confusing. so, the thing is: i need a single macro that quotes the next atom in the input stream as a literal, and then use that.

Entry: partial evaluation revisited
Date: Sun May 20 19:31:53 CEST 2007

i ran into a pattern: the assembler buffer can be used as a data stack to perform partial evaluation. i don't have a proper way to make this sound, but it seems to eliminate the need for an 'interpret mode' in the sense of classical forth. the interpret mode is replaced by a set of rewrite rules that perform compile-time evaluation. so instead of

  [ 1 2 + ]L

we just have

  1 2 +

with the same result: 3 being compiled. actually, in the latter case purrr will produce [movlw (1 2 +)], so the evaluation can be delayed as long as possible. this can be extended to the following pattern: allow target forth values to be richer than just numbers, but require that they can be combined into lowlevel constructs. since i use this trick a lot, why not make it a feature instead of an optimization? currently the postcondition of compiling a literal is valid assembler code. what about relaxing this to a delayed literal stack, and introducing a 2nd pass to comb out all the remaining, non-optimized literals? once i have this, partial evaluation becomes better defined: quoted symbols can be included and can be used in parsing macros. the CAT data stack can then be used for control operations only. big change. probably requires a temporary fork.
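the "1 2 + compiles to 3" rule as a minimal scheme sketch, treating the head of the assembly buffer as a literal stack. the instruction names ('lit' and the runtime word '_+') are placeholders, not the real pic18 patterns; the buffer is a list with the most recent instruction first.

  (define (compile-add asm)
    (if (and (pair? asm)       (eq? (caar asm)  'lit)
             (pair? (cdr asm)) (eq? (caadr asm) 'lit))
        ;; two pending literals: fold at compile time.
        (cons (list 'lit (+ (cadar asm) (cadadr asm)))
              (cddr asm))
        ;; otherwise fall back to a run-time addition.
        (cons '(call _+) asm)))

  ;; (compile-add '((lit 2) (lit 1)))  => ((lit 3))
  ;; (compile-add '((movf x)))         => ((call _+) (movf x))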
NEXT: 'lit' macro preprocessing step.

is it possible to make 'lit' a pseudo-asm operation? yes, but the disadvantage is that it's not 1->1. is this required? yes: the asm is 1-1 sym<->bin, so this needs to be solved in the compiler. considering the percentage of code that intersects with delaying 'lit', i guess it's best to wait until after the big deadline, and work around the macro stuff now. as a matter of fact, i can still do it the old way, just adding a single 'quote' operator, for example backtick `. that's a good idea, as long as there's a [`] too, meaning macros can have literally quoted symbols in them. with those 2 primitives, all parsing words can be implemented.

Entry: back to debugging -- deferred words
Date: Sun May 20 20:37:11 CEST 2007

if the idea is just to get debugging working, it's easy: execute will do enough.

Entry: back to thinking about the literal stack..
Date: Sun May 20 23:53:26 CEST 2007

there's a juicy fruit on the tree somewhere.. but i can't see it through the thick leaves. a literal stack is an interesting idea, and so is commutation of some constructs with the literal stack.. i noticed that a problem atm is the hardcoding of [lit a b] instructions: the number of arguments is hardcoded. could be fixed with a postproc step, but have to be careful there..

Entry: parsing macros
Date: Mon May 21 11:42:18 CEST 2007

forth parsing words require an input to be attached. my model does not allow that: it requires parsing macros to live in a separate class. hmm.. this is really kind of complicated. what about providing a mechanism to create parsing macros as pure symbolic macros? hmm.. ok, i got symbolic expansion macros now, but that's not the same as recursive parsing macros! i'm having difficulty getting my head around all this.. next step is to write a macro mode which recursively calls the parser. ok, i think i found it now: the trick is to allow composition. the best way to do this is probably to write the parsers as CAT words.

Entry: parsing
Date: Mon May 21 14:16:48 CEST 2007

i think i got it now. i'm just doing parsing wrong: each parser should have an explicit 'read' and 'write' operation. then some glue can be constructed to compose all of them. 'read' reads the next input atom, and 'write' outputs CAT code in parsed or symbolic form. i need to really let this go and get the usb driver working.. rewrite stuff accumulated thus far:

  - explicit literal stack with compile postprocess
  - parser with recursive composition

anyway, the bigger picture becomes visible: 3 different interpreters

  - compiler is kept in compositional mode: every source atom corresponds to a single action in CAT
  - before: parser converts multiword constructs into single word constructs
  - after: assembler uses localized arguments -> not compositional, just a sequence of independent commands

Entry: grounding problems
Date: Mon May 21 16:24:16 CEST 2007

very strange: if i touch the table, the pic resets. some kind of EM interference. i don't really know what's going on, but putting the stamp in a cage worked: just a grounded metal top of a metal box. if i stick the probe in the carpet, i can measure about a 25 V peak-to-peak 50Hz signal. maybe i should just ground my table? ok, i connected the TV cable shield plugged into the cable modem to the case of zzz. without this cable there's 114V ac across. this seems to fix the problem: no more 50Hz on the carpet.

Entry: defer
Date: Mon May 21 17:11:41 CEST 2007

hmm... the only thing i really need is to 'overwrite' a function.
using a separate ram table for deferred words might be a good solution if a lot of them are needed, but it sure does complicate matters. moreover: it requires loading values to ram etc.. what i need is really a cheap hack:

  : someword nopf 1 2 3 ;

the 'nopf' could be overwritten, since it's #xffff. this opcode can then refer to the next definition.

Entry: usb debugging
Date: Tue May 22 13:49:39 CEST 2007

using usbmon, i get this as the first failure after the first request, which is a device request:

  d97cb540 144438646 S Ci:000:00 s 80 06 0100 0000 0040 64 <
  d97cb540 145068664 C Ci:000:00 -84 0

the odd thing is the request length, which is set to 64 and not 8. status code is -84, which means
http://www.mail-archive.com/linux-usb-devel@lists.sourceforge.net/msg25936.html

so why doesn't it respond at all? maybe i need to acknowledge the TRNIF before sending a response?

Entry: the a & f registers
Date: Tue May 22 16:50:37 CEST 2007

i need a proper coding style. let's try the following: the caller is responsible for saving the current object context. this means it's regarded as a low level feature, and bad coding style, to pass arguments in the a and f registers. conclusion: use them only in small lowlevel words, and use functional words or different object representations on higher levels. CURRENT OBJECT = BAD !!

Entry: different macro implementation
Date: Tue May 22 16:57:42 CEST 2007

or better: an extension. currently 'macro' in forth only takes names from the macro dict. what about allowing runtime behaviour here?

Entry: ram copy
Date: Wed May 23 11:17:14 CEST 2007

funny, but i don't have any ram copy facility! the reason is of course that there is only one free indirect addressing register to use. in order to make a faster one, interrupts need to be disabled and one of the two other regs needs to be used. no time to think about that now, so i'm going to avoid mem->mem copy, and save only what i need.. (SETUP request is what i'd like to copy)

Entry: something wrong with XINST
Date: Thu May 24 12:19:49 CEST 2007

this is probably the cause of a lot of my misery: somehow access bank variables don't work right when XINST indirect addressing is enabled. for the workshop i switched back to the old instruction set, with access bank. need to figure out what's going on there later: somehow fetching/storing address 96 doesn't work either.. if i stay low, it works.

Entry: bouncing ball physics
Date: Thu May 24 15:22:02 CEST 2007

a bouncing ball can be made using the natural rollover from 255->0, combined with some coordinate mapping.

  A---B        A---B---A
  |   |   ->   |   |   |
  D---C        D---C---D
               |   |   |
               A---B---A

so, using the high bit to signify whether a coordinate is reversed, the operation simply becomes:

  : bounce clc rot<>c else rot>>c #x7F xor then ;

or even simpler:

  1st 7 high? test if -1 xor then

Entry: johannes config bits
Date: Sat May 26 18:52:37 CEST 2007

  - low voltage program: off
  - HS oscillator
  - power up timer: off

Entry: meta workshop notes
Date: Mon May 28 09:25:16 CEST 2007

all went really well after day 1 of total chaos; very happy with the result in the end. some remarks:

  1. need a proper 'erase-all' in case the chip is messed up
  2. need interaction word composition -> all symbolic
  3. more docs or reference words -> find some automated mechanism
  4. need simpler conditionals
  5. maybe distinguish between @high? and high? -> the btf instructions are odd ducks
  6. investigate extended instruction set troubles
  7. automate 'expose'

Entry: quoting symbols
Date: Mon May 28 09:35:39 CEST 2007

so, why not use syntax for this?
  `hello : 1 2 3 ;

i think i need to preserve parsing words for the simple reason that ':' is a parsing word. changing that behaviour makes things very different from standard forth. however, internally the parsing words should compile to the literal stack. the code above is actually quite clean. it has a symbolic representation as CAT code, in the form of lisp's quote form. this could be translated to forth in a minimal way. i could use this symbolic representation as the output of the parsing stage. an alternative lexer could then be used to make use of the more functional forth described above (one without parsing words, only some symbolic quote mechanism, where macros are purely concatenative). note that since it's not legal to have a literal symbol that is not optimized away, the ':' is redundant: symbols present after compilation are just labels. maybe even better: symbols are always labels. so why not get rid of the space?

  :hello 1 2 3 ;

so, if parsing macros are symbolic transformers, interactive macros could be the same. 'test words' if you want. this could lead to a better simulator. the first version looks better, and has ` compile as a literal.

Entry: literal stack
Date: Mon May 28 10:01:19 CEST 2007

just a quick look at what it would take:

  1. abstract all literal patterns
  2. make a local change in the abstraction

so this boils down to writing a generic pattern generator for literal opti, and a mechanism to execute arbitrary macros as a pattern. this is already there, but a bit of a hack. maybe it should be the default? ok. there is already a 'lit' defined in comp.ss, which can be extended to take multiple arguments.

Entry: cache
Date: Mon May 28 10:09:10 CEST 2007

an annoying thing in the current code is having to reload everything when the implementation of a word changes: the cache never invalidates. or rather, it's not a cache. so i need to change the implementation of 'word' to include a cache mechanism. it would be interesting to plug into the cache mechanism of scheme, but that would require either a lowlevel thing, or something with namespaces.

Entry: bug fixing day
Date: Sun Jun 3 12:32:20 CEST 2007

time to clean up some minor annoyances:

  * serial port settings: use 'system' + platform script when port is opened
  * faster upload (faster baudrate?)
  * snot integration + better emacs integration
  * fix parser -> parse to symbolic code
  * create interpret macros
  * sheepsint: build board tests for proto

Entry: monad stuff
Date: Sun Jun 3 17:15:08 CEST 2007

one problem i have with the way i perform function lifting (monads) is that it's not mixable: i can't just 'tag on' another monad. maybe this should be made a little more explicit. the next thing i need to implement is parsing macros: symbolic preprocessors to map forth to something closer to 1-1 cat code. last time i got lucky: i was able to use code as one of the input streams. now that's not so easy any more: there is an input stream which is not code. i guess the easiest way to tackle this is to just define a prototype CAT function for a parsing word, and work from there.

  in rout -> in+ rout+

with rout a reversely accumulated list of atoms. it's like the assembler proto, but with an extra 'in' state value. the default parser moves an atom from in -> rout. it would be nice to be able to compose parsing macros, so they really should be a special kind of macro: one built on top of ordinary macros, with the input stream on top of stack, and a primitive 'read' which takes an input object. (see the sketch below.)
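the "in rout -> in+ rout+" protocol as a minimal scheme sketch, with hypothetical names: each parsing word maps an input stream and a reversely accumulated output to updated versions of both; the default parser just moves one atom across.

  (define (default-parse in rout)           ; move one atom from in to rout
    (values (cdr in) (cons (car in) rout)))

  (define (parse-tick in rout)              ; a parsing word: quote next atom
    (values (cddr in)
            (cons (list 'quote (cadr in)) rout)))

  ;; drive the parse: look up a parsing word, or fall back to the default.
  (define (parse in rout parsers)
    (if (null? in)
        (reverse rout)
        (let ((p (cond ((assq (car in) parsers) => cdr)
                       (else default-parse))))
          (call-with-values
            (lambda () (p in rout))
            (lambda (in+ rout+) (parse in+ rout+ parsers))))))

  ;; (parse '(1 2 + tick foo) '() (list (cons 'tick parse-tick)))
  ;;   => (1 2 + (quote foo))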
Entry: snotification
Date: Sun Jun 3 20:55:19 CEST 2007

-> entry point = load state + enter main loop
-> main loop = event dispatch

got it mainly working, but i'm experiencing problems with asynchronous messages.. maybe i should get rid of the dots?

Entry: faith, evolution and programming languages
Date: Tue Jun 5 11:21:39 CEST 2007

by Philip Wadler, April 27, 2007, Google Tech Talks
http://video.google.com/url?docid=-4167170843018186532

a bit over my head, but things to look into:

  - a logic corresponding to a programming language
  - contracts
  - haskell type classes for polymorphism

about logic & programming languages:
http://video.google.com/url?docid=-4851250372422374791

Entry: boot config
Date: Wed Jun 6 19:06:17 CEST 2007

i'm looking for a better default for the boot loader, to make sure a project is in one of 2 states:

  virgin: run purrr interpreter on boot, no interrupts
  app: fresh reset vector + isr installed

if i make it so that 'scratch' can safely erase the boot sector, things might get more robust. things that can go wrong:

  - reset not defined, but isr defined
    solution: always define them in the same macro
  - reset or isr defined, but target code is gone
    solution: always erase the boot block on 'scrap'

Entry: forthtv pro
Date: Tue Jun 12 04:18:08 CEST 2007

let's see. what i need is a dual processor 18f1220 system.

VIDEO
  - low bandwidth I2C master (pull: poll at line frequency)
  - video using USART out
  - audio sampled at video line frequency

HUB
  - low bandwidth I2C slave (push)
  - keyboard interface (bitbanged)
  - USART for host serial

this is a nice excuse to prepare brood for multicore projects. note that the 18 pin chips do not have I2C, so i need to go to the 28 pin versions.

Entry: bank select
Date: Tue Jun 19 15:44:50 CEST 2007

keeping bsr at a fixed value, the extra bit in the instructions that access the register file can be used as an address bit. note the 1x20 has only 256 bytes of ram.

Entry: sheepsint core todo
Date: Wed Jun 20 14:52:53 CEST 2007

  - standalone boot + fall into debugger
  - battery operated
  - brood async io
  - 16 bit math for control ops
  - note/exponential lookup tables
  - pot denoise?
  - keep it working (*)

(*) don't know if i can do that yet. what i can do is freeze the software: make a fork of brood. i also can't fix the boot block. but i can fix the app block.. maybe i should go that way.

Entry: text interface
Date: Wed Jun 20 16:07:24 CEST 2007

i probably need to take a deep breath and change the monitor from binary to text. this would make it a bit easier to standardize, and also make it usable without the brood system, for debug purposes..

Entry: application control flow
Date: Wed Jun 20 16:38:38 CEST 2007

  1) boot
  2) mainloop (contains RX check)
  3) on RX, fall into interpreter

then from the interpreter, 2) can be entered. this works like a charm. sheepsint runs fine on 2 AAA batteries too, using an 18LF1220. summary:

  - empty bootblock -> fall into interpreter ('warm')
  - application -> install reset and isr vector (best at the same time)

something which is important though: if there's no serial TX connected to the pic RX pin, something needs to pull the line high. on the CATkit board, the easiest way is to insert a jumper between RX and TX.

Entry: DTC
Date: Wed Jun 20 18:03:14 CEST 2007

time to do the real job: a dtc forth. what i'd like to do is to chop the chip up in 2 pieces. first half is kernel + audio, second half is DTC on top of that. this is not as easy as it looks :) but.. it might be more robust. basically i have the following choice:
  A. go brood/snot and finish that interface (requires emacs + plt)
  B. go binary and use just a terminal emulator

i basically promised B, which, for education and not-too-sophisticated use, is what we need. getting A ready to the point where i can teach it is too much, so i have no choice really. i need a real forth! and i need it before i can do more synth stuff.. or better, while doing it. so what is necessary?

  1. terminal input with XON/XOFF
  2. dictionary
  3. compile link to ram
  4. copy ram->rom

Entry: conditionals
Date: Wed Jun 20 21:16:41 CEST 2007

ok.. using the flag macros can be fast, but it's also really really hard to use if a condition always needs to be a macro. so i need basic '=' etc.. using nfdrop, and a proper if that accepts any kind of byte. not completely tested, but the asm looks ok.

Entry: mini module system
Date: Wed Jun 20 22:19:50 CEST 2007

basically, do something similar to PF: a 'provide' word will skip loading the current file if the word already exists.

Entry: terminal.f
Date: Wed Jun 20 23:10:17 CEST 2007

thinking about this XON/XOFF thing: there is really no way around doing this with interrupts and proper buffers. the problem is really that when we send an XOFF, a byte can already be in progress. in fact, if there's no break, and the host is sending at full speed, it probably is. so a proper interrupt/buffer scheme is necessary. time to dig up those cool 15 byte buffers again :)

Entry: read/write pattern
Date: Thu Jun 21 12:02:31 CEST 2007

something which occurs a lot is an update to memory which i'd like to put in a macro. till now i've always solved this using a macro which expects a memory address. maybe that's the only sane solution? need to think about this.. a bit of a hack, but something that might be interesting: have a 'lastref' macro which compiles a ref to the last referred variable.

Entry: workshop
Date: Thu Jun 21 12:41:26 CEST 2007

this serial terminal thing is not going very fast.. maybe i should focus on finishing the 16 bit words first, then build a tethered DTC on top of that? maybe indeed best not to stress too much. it is working. i just need to add some control to the synth.

Entry: multiplication
Date: Thu Jun 21 21:38:45 CEST 2007

the first thing to do is to create a generic unsigned multiplication, and derive the other muls from that. let's call 'z' an 8 bit shift (256). we need to compute

  (x0 + x1 z) (y0 + y1 z)

where all coefficients are 0 - 255. this gives

  z^0   x0 y0
  z^1   x1 y0 + x0 y1
  z^2   x1 y1

the lowest of the 4 result bytes is unaffected by the 3 other partial products, and the second only by the top one. so, i'd like to do this

  - fast
  - functional, so no temp variables

the variables are presented as x0 x1 y0 y1. every number is used twice. now for the juggling.. done: i gave up on not using ram. it's probably possible to just use the stacks, but it's really inconvenient due to the 'convolutive' nature of multiplication. what i mean is: multiplication has all-to-all data dependencies, and is not easily serialized. if it is serialized, it needs random access (variable names) or at least relative indexing. forth is not good at that.

Entry: refactoring
Date: Fri Jun 22 13:18:53 CEST 2007

some things that need to change in brood to make it easier to understand and modify:

  - words need to be cached, not delayed evaled, so incremental loads are possible
  - parser macros -> purely symbolic, using only a 'quote' word for some 'pure forth'
  - the partial evaluator needs to be properly defined, so more elaborate operations are possible. i.e. explicit literal stack + commutation of operations with literals.
so, in short: CACHE, PUREFORTH intermediate (without parsing words), and explicit PARTIAL EVALUATOR. the PE needs to work together with the PUREFORTH, to be able to have symbols as "ghost values".

Entry: forth vs DSP
Date: Fri Jun 22 13:27:09 CEST 2007

following the remark above about multiplication: most DSP stuff is like that, so i wonder if it makes much sense to write a forth for the dsPIC. anyways, it shouldn't be too hard once i clean up the compiler code a bit.

Entry: sheepsint next
Date: Fri Jun 22 16:34:41 CEST 2007

ok, DTC and multiplier are working. time to get busy :) maybe i do need to think a bit about the memory model though. might be interesting to have full device control.

Entry: memory model
Date: Fri Jun 22 17:40:02 CEST 2007

what about simply:

  * kernel is overlayed with RAM + EEPROM
  * all the rest is flash

note that only the first 32kb can contain VM code, due to the VM using 2 bits. the other 32kb is addressable, but only usable for tables etc.. not important now for the PICs i use. so the ram address space is max #x1000; for data eeprom i'm not sure there is a limit, but we have only #x100 and it's not used. what about we map the flash to the upper 32 kb, and ram from the start? then eeprom could be added later.

Entry: vm macros
Date: Sat Jun 23 22:47:10 CEST 2007

basically, i need control words. so i need a mechanism for vm macros. ok, in place. next is to just write macros, and to add a mechanism for loading. actually, this is kind of interesting, since it requires 'control stack' operations. to re-iterate, i have these kinds of macros:

  - peephole optimizers (asm buffer used as literal stack)
  - control operations (use data stack as control stack)
  - recursive macros
  - simple incremental macros (writer monad)
  - whole-state assembler macros (i.e. global optimization)

if i make the stacks a bit more obvious: literal and control stack need to be independent. the control stack is sort of a literal return stack. so i just need to write accessors that bridge the literal stack (asm buffer) and the control stack (data stack). the more general thing that interests me is to make more functionality available at the forth level, so more powerful macros can be written straight in forth, without having to resort to tricks. in short: i need a meta-forth, not a meta-cat, so cat can be tucked away as an implementation/intermediate language.

Entry: compilation stack and word names
Date: Mon Jun 25 11:40:58 CEST 2007

i find some standard forth words a bit confusing. it's probably easier to start calling the compilation stack 'c' and be explicit about the traffic. there are only 2 label operations: localsym>c (generates a new label) and label>c (compiles a label reference for the assembler). in ordinary forth, labels can be patched, effectively implementing a dual-pass assembly. since we're not using mutation, we just generate a label at the first occurrence (instead of reserving an empty cell and pushing its address) and bind it to opcodes as required later. these symbols will be bound by the multipass assembler later.

Entry: writer macros
Date: Mon Jun 25 11:56:23 CEST 2007

these are confusing. maybe i really shouldn't distinguish between 'writer' macros and 'asm buffer' macros. the writer thing is clumsy and a bit hard to understand, so i'm taking it out.

  + it's simpler: i'm using some I/O style monad '>asm'
  - writer macros can't be isolated any more (the assumption needs to be: modifies the whole state, not just concatenation.)

this doesn't seem to be a big disadvantage.
it's probably better to use some kind of tag system to classify macros according to properties. the only thing i use it for is optimization, where missing a classification means some optimization can't be done, so it won't cause fatal errors.

Entry: make-composer
Date: Mon Jun 25 13:24:19 CEST 2007

another thing i'm running into is my terminology about namespaces. if i have a collection of words, i'd like to specify:

  - source dictionary (semantics)
  - destination dictionary (def)
  - parser (syntax)

currently that's make-composer, but the names used are a bit confusing. this can be done better. maybe i should just rename make-composer to define/parse/find.

Entry: parser words
Date: Mon Jun 25 13:38:01 CEST 2007

this needs a thought about what to do with parsing words, mostly quoted symbols. i guess it's safest to put them on the compilation stack, so i don't need any literal optimizations.

Entry: todo
Date: Mon Jun 25 13:38:54 CEST 2007

  - take out all writer stuff  OK
  - rename asm-buffer-find to find-asm-buffer,
           asm-buffer-register! to register!-asm-buffer,
           state-parse to parse-state
  - fix parser macros: decide on lit/comp stack
  - fix assembler evaluator

i'm not going to change the find/register!/parse names. this is just cosmetics.. about fixing the assembler evaluator: what about requiring all literal arguments to be cat code?

Entry: literal stack + compilation stack
Date: Mon Jun 25 14:37:20 CEST 2007

the important thing about stacks is that you need two of them, i once read. which seems to be the case. currently i'm trying to figure out what should go where by default. the idea of the 'literal stack' is simply to be able to do some computation at compile time. a nice feature here is that a lot of operations become more natural. for example: 1 2 + is really just 3. and this is a mandatory optimization in badnop, something you can rely on as a feature. standard forth would make this explicit:

  [ 1 2 + ]L

the reason i don't use the above is that my meta language is not forth, it's CAT. more importantly, CAT is much more powerful than the simple 8 bit forth is. so, the idea goes:

  - mandatory literal optimization (compile time evaluation)
  - forth extended with 'ghost' types

the ghost types are things that make no sense for the microcontroller, but when they are combined with other ghost types, result in things that do make sense. the most obvious one is assembler labels: ' foo will compile code that loads the (symbolic) address of foo. if this is followed by a macro that consumes it, the whole can be reduced to code that does have a meaning on the microcontroller. i'm not 100% convinced this is a good idea (not being explicit), but it does feel like one. what i'm looking for is to give it a decent meaning, and to find out when to use the literal stack and when to use the compilation stack. another thorn is the way the literal stack is implemented, but that can be fixed later. right now i need to get the semantics right.

i'm not asking the right question.. what's the real problem here? the target chip has a clear separation of ROM and RAM. this is both convenient (code is persistent) and not (they need to be treated differently). what i'd like to do is to make a source file correspond to only ROM. standard forth doesn't do that: loading a file both writes code and initializes data. i guess this is the main reason why things are different for me:

harvard: ram initialization (run-time code) and meta compilation (compile-time code) are strictly separate.
von-neumann: both can be done at the same time (program load time), and blur together.

so what does this have to do with the literal stack?

  - the meta language is not forth
  - i'm trying to disguise this

basically, i'd like to not think about this thing being a cross-compiler, and act as if everything runs on the target. one way of doing that is to require compile time evaluation whenever it is possible. as a result, the simple recursive macro system, which does not refer to the real meta language directly, becomes more powerful: required partial evaluation gives it some run-time power, instead of it merely being passive concatenation of code. so the real question is: how to simplify the target language such that no explicit reference to the meta language is ever necessary, and all macros have a compositional semantics. the way that seems most natural to me is:

  - partial evaluation is the default: act as if everything is done at run time (like "1 2 +"), but write the macros such that they perform compile time evaluation + raise an error when higher level things can't be resolved at compile time.

  - some constructs use the COMPILATION STACK, referred to as 'c'. this is mainly intended for code blocks, and serves a bit the role of the return stack.

this also gives the solution for parsing words: their default semantics is to map something to a literal compiler. a common problem i encountered is a macro which has 2 references to the same name. this is now easily solved using the compilation stack. so the key is really in the words '>c' and 'c>'.

Entry: vm words and literals
Date: Mon Jun 25 15:18:13 CEST 2007

so, looking at the remarks before.. the literal stack is really more than just literals. it could contain words too. words in their normal meaning are calls. so: the assembler buffer is just a stack of symbols, bound to semantics (literal, call, jump). what i really need is 2 new opcodes: lit and word, that will be resolved in the assembler, but that can be used in the optimizer and partial evaluator without too much trouble. so i think i see the roadmap now:

1. fix the assembler to take these opcodes:

     cw   call word  (code)
     jw   jump word
     qw   quote word (data)

   which are really just the primitives used in the VM

2. fix the peephole optimizer to operate on those words

this will give a proper semantics to the literal stack: basically it will then contain words + their meaning, code or data. again a simple pattern: delay low level representation as long as possible. ok. now i need to check first if the monitor code still runs.. it does. time to fix this. it's probably easiest to create an extra assembly step which filters out the pseudo ops. could be interesting to clean up the assembler a bit. i'm writing pic18-compile-post now, and will start using 'values' to do the expansion. at first i thought this values thing was a bit clumsy, but having to wrap things in a list is usually more work: it's better to do this in the consumer using call-with-values than in the producer, when there are a lot more producers than consumers. which is the case here..
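the filtering step as a minimal scheme sketch (the real one is pic18-compile-post, which uses 'values' instead of lists; the concrete expansions here are illustrative, e.g. a quoted word becoming a stack push of a literal). one pseudo op can expand to several real instructions, which is exactly why it can't stay 1-1 in the assembler.

  (define (expand-pseudo ins)
    (case (car ins)
      ((qw) (list '(dup) (list 'movlw (cadr ins)))) ; quote: push literal
      ((cw) (list (list 'call (cadr ins))))         ; call word
      ((jw) (list (list 'goto (cadr ins))))         ; jump word
      (else (list ins))))                           ; real opcode: unchanged

  (define (erase-pseudo-ops code)
    (apply append (map expand-pseudo code)))

  ;; (erase-pseudo-ops '((qw 42) (cw foo) (jw exit)))
  ;;   => ((dup) (movlw 42) (call foo) (goto exit))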
the problem now is that a lot of the literal macros do need their arguments. am i going to try to fix that now? maybe think a bit about how to do that in a smart way.. ok. roadmap:

  - just added c> == '(qw) op>asm
  - replace all lit macros by a qw macro, and remove them from the expose hack

done, now for the calls. TODO:

  - replace branches and calls with pseudo ops
  - fix vm control ops
  - start working on the control part of the synth

Entry: multiple passes: pseudo assembly language
Date: Wed Jun 27 10:41:04 CEST 2007

so the pseudo assembly language is a bit more explicit now. between forth and real assembler there is a representation where the opcodes qw, cw and jw are used. they give a proper typed stack meaning to the assembly buffer. as a consequence, quoting in macros can be eliminated, and pure postfix notation can be used, using the 'word>c' operation, which takes a code word from the assembly buffer and moves the tag to the compilation stack. in short, instead of

  ' word <...>

one can do

  word word>c <...>

where <...> handles the symbol/address. now wait.. if quote is no longer necessary in the compile time semantics, why use it? (it is still necessary in the run time code. back to that later.) the whole idea seems to be: because all behaviour is postponed (compilation), it doesn't have to be stopped before it happens. meaning: if i enter 'broem' at a command prompt, it will execute -> damage done. if i want to not run it, i need to 'quote' it, which means postpone execution. during compilation, everything is postponed, so no need for quotation! wonder if i can make that a bit more formal. this does ring a bell somewhere. been reading about the macro/module system in PLT scheme. something with keeping run time and compile time separated to make dependencies explicit.. anyways.

oops. not completely true: if it's a macro, and you want to refer to it, it needs to be quoted. an 'almost right' thing here.. if there are no macros, it's right. macros are code, the rest is data during compilation. if there's no code, it's true. back to my original point: quoting is postponing execution. maybe i should just try it to see if i get into situations that are awkward, because it does look promising.

Entry: vm compilation: one word to change semantics of parsed code?
Date: Wed Jun 27 11:25:18 CEST 2007

since the compilation buffer already contains the code/data (quote/call) distinction, only a single word is necessary to convert any operation to its vm equivalent. this word should leave macros alone. the problem here is that i need a type (pattern) matching word, so not yet.. simpler: i'd like to remove the quote in 'vm->native/compile' and in the vm-core.f file, so i can easily compose macros. quote is really a preprocessing thing, which is necessary to get from source -> forced data semantics. once parsed to intermediate, no quote is necessary. argh.. so i don't really need to remove the quote there, since it's exactly that: a preprocessing step to generate native forth code. it's ok that this includes a quote operation. so '_literal' and '_compile' take data atoms on the literal stack, which means quoting is necessary for code atoms. this allows VM semantics (the decision for code/data) to be different from the lower level language, which is a good thing. check vm-core.f for some explanation. summary: "purely compositional macros == good thing". it's the basic idea of CAT. see the notes below. question is though: can i make these macros powerful enough to have some kind of lambda construct? postponed macros basically?
the only thing i want to solve now is conditionals, but better to aim for the bigger thing.

Entry: language path
Date: Wed Jun 27 11:41:28 CEST 2007

  FORTH with parsing words = symbolic
    ---> FORTH with only quote = symbolic
    ---> pure FORTH without quote = compositional CAT code
    ---> intermediate assembler = effected macros (real asm) with pseudo asm (qw, cw, jw)
    ---> symbolic machine assembler
    ---> binary

i should give these a name:

  PURRR18/forth (quote + parsing words)
    ---> PURRR18/quote (pure + quoting word)
    ---> PURRR18/pure (purely compositional macro language, as CAT code)
    ---> PURRR18/asm (PIC18/asm augmented with pseudo ops)
    ---> PIC18/asm (my version of the symbolic assembler language)
    ---> PIC18/bin (binary machine code)

Entry: i want it all
Date: Wed Jun 27 12:14:37 CEST 2007

what about postponing macros? i basically want conditional branching at compile time, but full lambda (quoted macros) would be interesting too. now what did i expect? forth is not CAT. this is a game of syntax, in the end.. i'm trying to cram a meta language into the language syntax, without using its quoting mechanism: lists. it looks like i can't make it too powerful without introducing quoting syntax, which is what i'm trying to avoid to keep it simple. the problem which sprouted this line of thought is the VM return operation. the words "_then _;" don't work because ";" expects a word. so i'm going to need an extra primitive to solve this conditional execution. maybe there is only one real solution: make the ' operation a syntactic one, like in lisp.

  - if quote is syntax, an intermediate language is not necessary.
  - if it's not, a parsing stream needs to be available.

the last one is obviously worse, since it makes composition harder. so that's what it will be: quote needs to be syntax, and ' is a special character. so, pure forth in s-expressions is

  <code>   ::= ( {<atom>} )
  <atom>   ::= <word> | <quoted>
  <quoted> ::= <atom> | ( quote <atom> )

to preserve previous syntax, the run-time semantics of "' word" is still "load address of word on parameter stack". so, to summarize again: is quote a lexing operation, or a syntactic operation? the answer seems to be the former. the problem this solves is this: syntactically, code and data are distinct. the full domain is split in 2 parts, but semantically, code is a subset of data. introducing quoting at the lexing level gives:

  - better mapping to CAT (using the same lexical trick)
  - saner semantics: quote is defined independent of an input stream
  - quoting can be used in macros, using forth syntax, keeping the compositional property

in the language path above, the 'pure' and 'pure+quote' will now be the same, so i have

Entry: updated language path
Date: Wed Jun 27 13:31:57 CEST 2007

  PURRR18/forth (quote + parsing words, symbolic form is not CAT)
    ---> PURRR18/pure (purely compositional macro language, has symbolic CAT form)
    ---> PURRR18/asm (PIC18/asm augmented with pseudo ops)
    ---> PIC18/asm (my version of the symbolic assembler language)
    ---> PIC18/bin (binary machine code)

so the entry point is there to preserve original forth syntax, i.e. ": abc". for internal processing, this will be mapped to "'abc make-:" or as s-expression ((quote abc) make-:). the 'make' name i need to think about still.. the reason for having ' as a lexing operation, instead of parsing, is that it eliminates one parsing layer + it maps better to CAT. this is different from forth, but in a way that is probably hardly noticed.

Entry: again?
Date: Wed Jun 27 16:16:34 CEST 2007

so why not just a parsing step?
i need types to do this properly:

  macros: pure+quote -> pure
  forth:  forth -> pure

parsing words are merely frontends for pure. the alternatives are:

  1. lexing produces a stream of symbols and numbers. then there are 2 different parsers that map this to pure forth.
  2. lexing already produces quotes

the first option is really simpler, so let's keep that.

Entry: parsing
Date: Wed Jun 27 16:36:30 CEST 2007

so now i need to redo parsing. currently, it's a bit of a hack. it's not extensible. but do i really want it to be extensible? i need a different 'kind' of word: a parser is not a macro.. they operate on different levels. so let's abstract it out a bit. 2 steps need to be separated:

  forth -> symbolic cat
  symbolic cat -> parsed cat

both are parsing operations structurally, but it's maybe best to give them different names? i got it, except for the quoting stuff.. now, a problem i ran into is that ' abc actually compiles a byte address. i wonder where this will fail if i change that.

Entry: bytes or words
Date: Wed Jun 27 18:25:42 CEST 2007

some conflict here:

  bytes: "' abc org" needs byte addresses
  words: "' abc" can be used as just a symbol

maybe quote is more important. maybe we need to have "execute" take word addresses everywhere? that's also better for the VM. the thing is: data is always byte addressed, while code is always word addressed. a unified address space (bytes) would be nice, but makes things complicated since quoting is not just quoting.. so best seems to me:

  * execute takes word addresses
  * monitor JSR will also take word addresses
  * quoting a symbol name has default semantics to load the word address on the stack

Entry: cosmetics
Date: Wed Jun 27 18:40:35 CEST 2007

TODO:
  - make dtc intermediate code a bit more readable
  - fix prj path as mutable state (arbitrary.. maybe see it as a constant?)

the last one isn't so important.. the first one requires some kind of loopback, and i think it will make things too complicated.. need to think about it.

Entry: dtc control primitives
Date: Wed Jun 27 20:49:18 CEST 2007

i need 'run' and 'jump' prims.. time to get confused about primitives and programs again. if i remember correctly, the lesson is to never let primitive addresses leak into the higher level code: it's not convenient to have to deal with 2 kinds of code words. in cat, i only use programs (lists of primitives), never primitives directly. same here. just like for primitives, i need to choose some kind of basic representation: byte or word addresses for composite code? the only thing i need to take care of is that continuations (return addresses) are compatible with "run". i'm getting confused.. i guess i just need to write if/then/else and we'll see how to continue. it does look like there's no easy way other than:

  LIT L0 BRZ L0:

and

  LIT L0 ROUTE LIT L1 RUN; L0: L1:

ok, so be it. can't win them all.. maybe a good opportunity to use ifte instead of if .. then .. else. so.. primitives. can't 'run' primitives. can run programs. so the idea is that quoting code always quotes programs, so i need something like PF's { and } words. for conditional branching i can use 'route' as a basic word. a cloaked goto or something.

  route \ ? program --

Entry: assembler bug
Date: Thu Jun 28 00:07:21 CEST 2007

performing meta evaluation needs to happen in the 2nd pass, because of the presence of code labels. time to clean up the assembler, and sort out all the different meanings. the bug is simple: just retry if there's an undefined symbol.
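the retry idea as a minimal self-contained scheme sketch (the real assembler also relaxes instruction sizes): a pass resolves gotos against the symbol table collected so far; forward references are undefined on the first pass, so just feed the table back in and run another pass until nothing is undefined.

  ;; one pass: collect labels, resolve gotos against this pass's table
  ;; or, failing that, the previous pass's table0.
  (define (pass code table0)
    (let loop ((c code) (addr 0) (table '()) (out '()) (undef #f))
      (cond
        ((null? c) (list (reverse out) table undef))
        ((eq? (caar c) 'label)
         (loop (cdr c) addr (cons (cons (cadar c) addr) table) out undef))
        ((eq? (caar c) 'goto)
         (let ((t (or (assq (cadar c) table) (assq (cadar c) table0))))
           (loop (cdr c) (+ addr 1) table
                 (cons (list 'goto (and t (cdr t))) out)
                 (or undef (not t)))))
        (else (loop (cdr c) (+ addr 1) table (cons (car c) out) undef)))))

  ;; retry with the accumulated table until all symbols resolve.
  (define (assemble code)
    (let retry ((table '()) (tries 3))
      (let* ((r (pass code table))
             (out (car r)) (table+ (cadr r)) (undef (caddr r)))
        (cond ((not undef) out)
              ((zero? tries) (error "undefined symbol"))
              (else (retry table+ (- tries 1)))))))

  ;; (assemble '((goto end) (movlw 1) (label end)))
  ;;   => ((goto 2) (movlw 1))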
then another problem: literals take 14 bits, but quoted programs are byte addresses. can we resolve this somehow? if i really need the return stack to contain word addresses, that can still be fixed later. now i'm going for 'run' and 'run/b'. ok, it seems to work now.

Entry: vm optimization
Date: Thu Jun 28 09:36:41 CEST 2007

now it's time to reduce code. it's not very fast anyway, so no reason to start spilling bytes. but this is for later. got some stuff to get ready now. i'm happy with how it's looking though. some minor things need fixing, probably the most important one being return stack alignment. something to focus on is to limit the number of macros. i probably only need conditionals; the rest can even be written in forth. macros are only necessary for marking jumps.

Entry: sheepsint 8 bit interface
Date: Thu Jun 28 15:24:27 CEST 2007

so. i need a synth control layer. going to use the ordinary 8 bit forth.

Entry: loading dtc forth
Date: Thu Jun 28 16:03:59 CEST 2007

problem. the mapping from vm -> native forth is not just syntactic. it uses knowledge about target words being macros (as native macros) or dtc target words. this means 'load' will not work properly. so this decision needs to be postponed. easiest is to load both symbols (word and semantics) on the literal stack, and have a macro determine the semantics. ok, seems to work.

Entry: problem with dup and literals
Date: Fri Jun 29 09:49:11 CEST 2007

  123 dup 456

doesn't give 2 literals on the stack.. if i let dup copy the literal, some other things go wrong.. maybe it's best to have dup copy the literal, and solve the other problems in a second pass? i found an optimization that solves it in one pass, by realizing

  1 (2 3 !) -> <...> 1

where <...> stores the value with stack effect = 0. other places where this might go wrong are where an explicit dup is expected.. there are none outside of the '!' i think.

Entry: sheepsint core
Date: Fri Jun 29 10:51:39 CEST 2007

things to fix:
  - noise
  - sample playback

then for control, i need to find ways to map parameters to meaningful ranges. this is where multiplication and exponential table lookup come into the picture, which might be an interesting advanced topic. ok, there's a problem with the buffering: i don't have a fixed sample rate any more, so computed values need to be sent out immediately: i have no idea when the next event will output the previous state! ok, just moved it to the end of the isr.. now there's a bit of jitter, but probably not really noticeable. noise still isn't working. i can't find the problem. probably needs a fresh look. also, notes aren't working..

Entry: unified namespace and rolling back
Date: Sun Jul 1 15:47:32 CEST 2007

for target stuff.. meaning: something defined as a variable should be able to be redefined as just a target word. or not? this is not so easy since all meta objects are compiled into the core, and are not really seen as data.. there is also a conflict between forth's "first find" and my meta language's last redefined. maybe the project file should index macros somehow? so they too can be reverted.. this would be cool for variable names etc..

Entry: VM and TBLPTR
Date: Sun Jul 1 15:57:09 CEST 2007

maybe it's not such a good idea after all.. the deal is this: the VM should be easy to use. anything that needs speed can simply be moved to primitive code, completely eliminating interpretation overhead. i put some effort into making both layers interoperable, so why not use it?
it seems as if each 'useful' feature of the VM makes it a lot slower. why do i care? the whole idea is to make some kind of standard. why not write the VM on top of the memory model, for instance?

Entry: swapf
Date: Sun Jul 1 17:53:10 CEST 2007

something is wrong with the nswap macro: ok, i found it: nothing wrong with the macro. there was an error in the assembler binary opcode.

Entry: control slides
Date: Sun Jul 1 18:17:29 CEST 2007

linear & exponential. in-place updates? probably best to go out-of-place. with wrap-around?

Entry: control timer
Date: Sun Jul 1 18:37:05 CEST 2007

the previous sheepsint had some fixed sample->control rate timer. here i'm using a fixed sample rate for the noise generator (a bit less than 8 khz), which increments a 32bit counter once every tick. this can be used as a general fixed time source. ok, trying to sync to bits of the 32 bit timer, i'm using this code:

  \ control at 244 Hz
  : wait-control
      begin tick0 6 high? until
      cli tick0 6 low sti ;

but the cli/sti isn't necessary: the timer increment is atomic: there's no read-modify-write. one problem though: if the counter is reset, higher bits will never get set! so a better strategy would be to wait for a bit to go low, then wait for it to go high, so the transition is captured.

Entry: fix macro loading
Date: Tue Jul 3 12:07:42 CEST 2007

really annoying to have these not synced to the project.. maybe include them directly in the project file. also need caching: timestamps would work together with mark points. a problem point is the missing variable and function name spaces. once something has been a macro, it will remain a macro. a single dictionary stack is easier to use.

Entry: transient controller
Date: Wed Jul 4 12:43:16 CEST 2007

this is fairly simple if it only needs to save the mixer config (one byte). saving oscillator frequency state requires 6 bytes more. what about making the transient word itself responsible for saving current state, and just using the x stack? if the time base is fixed (32 bit tick timer), control words become fairly simple. remaining question: who is responsible for syncing to the note tick? this is a question of composition: i.e. hihat + kick at the same time requires the hihat word to sync to the note, not the kick. best to keep control syncing independent of note syncing.

Entry: AD conversion
Date: Wed Jul 4 13:53:39 CEST 2007

2 things to determine:

  - acquisition time (sample/hold settling)
  - TAD (per bit sample time)

TAD should be as short as possible, but greater than the minimum TAD, approximately 2uS for the 18F1220. the datasheet says for the F version at 8MHz to use 16TOSC, and for the LF version to use 32TOSC. it was on 16TOSC, 20TAD.. put it to 32TOSC, but can't see a difference. maybe the pots are too noisy. i tried to add a capacitor, 100n and 10u, but no difference..

Entry: noise
Date: Wed Jul 4 15:49:18 CEST 2007

noise is probably more useful as one of the oscillators instead of a fixed 3rd one, just like the sampler. using the 8 bit timer only for the control time base frees up some resources, and decouples noise frequency from control frequency. best seems to be OSC1, keeping in mind the formant mixer. changing the mixers: silence, xmod, formant. and having OSC1 do noise/square/sample.

Entry: bootsector
Date: Wed Jul 4 16:02:52 CEST 2007

maybe it's best to reserve some functionality for chip erase, so i don't need to worry so much about messing up the bootsector. basically, i just need a single piece that never changes, which has the ability to influence the booting process to run the interpreter.
probably an 'ack bang' or something?
- keep boot sector free for fast isr
- reserve 2nd block for reset vector?
seems the core of the problem is that the boot vector and isr vectors are in the same block. what if:
- default reset vector = jump to second block
- add an application vector after this
- second block contains some kind of checking code to determine activation of application or debugger

Entry: metaprogramming
Date: Fri Jul 6 13:03:44 CEST 2007

more things from forth. i've been using the first couple of macros that use the compilation stack explicitly. i could probably move more code to be accessible from the forth macro language. to have a forth like [ and ] section would make sense. the point where i want to stop is s-expressions: once i'm introducing that syntax into forth, there's nothing stopping it from becoming something completely different. one of the aims really is to keep out s-expressions. however, it's not so hard to have some kind of 'begin ... end' construct that maps directly onto cat code.

Entry: noise as osc1
Date: Tue Jul 10 22:22:35 CEST 2007

tested. seems to work.

Entry: macros and cat
Date: Tue Jul 10 22:25:32 CEST 2007

name space mixing in macros. the ultimate goal is to have a forthish CAT that i can just include in PURRR/18 code. currently the 'c' words, combined with the literal stack, work pretty well. i need to think about cleaning up the semantics a bit. there's a lot of nice things hidden here.. one of those is: you need 2 stacks. mapping behaviour in an asymmetric way (i.e. return stack / data stack) is arbitrary "human meaning" to ease understanding of components so they can be composed.

Entry: nand synth
Date: Tue Jul 10 23:45:29 CEST 2007

works like this:
- 4 schmitt-trigger based oscillators, cap select (decade) + pot
- chained: 2nd AND gate input turns oscillator off
- the NOT in the chain prevents subsequent oscillators from being OFF at the same time
so, the first oscillator A produces a square wave. during A's ON period, the second oscillator B produces a square wave; during A's OFF period, the second oscillator gives ON. and so the story continues...

....AAAAAA....AAAAAA....AAAAAA....AAAAAA
BBBB.BB.BBBBBB.BB.BBBBBB.BB.BBBBBB.BB.BB
C.C.CC.CC.C.C.C.CC.C.C.CC.CC.C.C.CC.CC.C

etc. this can give quite a complicated pattern after a couple of steps. one thing is missing though: there is no resync. all capacitors keep state between oscillator ON/OFF switches, so no formant-like tricks.

Entry: noise as sync source
Date: Wed Jul 11 12:12:56 CEST 2007

It's possible to use 'filtered pitched noise' by using the RESO mixer, together with a noise OSC1. however, the opposite, an oscillator resynced by noise, i don't have atm. maybe OSC0 should be able to do noise too? OK, that's a different game

Entry: boost converter hack
Date: Mon Jul 16 19:13:12 CEST 2007

as mentioned before (probably in brood 2 ramblings.txt), it is possible to use a protection diode as rectifier for a signal -> power converter, by connecting a signal with a large enough duty cycle directly to an input pin, and connecting a cap across the power pins. related: it should be possible to convert that scheme into a boost converter, by connecting a power supply to an input pin using an inductor, and using the pin's output stage as a switch to charge the inductor (by connecting the point to ground). when the pin is switched to input, the inductor discharges the energy stored in the magnetic field, and charges the capacitor through the protection diode.
this way, the uC can regulate its own supply voltage. this scheme just needs an initial push to charge the capacitor such that enough energy is stored to boot the program that starts the feedback mechanism.

Entry: filter bank on PIC18
Date: Mon Jul 16 22:13:34 CEST 2007

so, if i want to run a digital filter on the PIC18, for, say, some demodulation, what performance am i looking at? running on 5V and a xtal, i can get to 10 MIPS. for audio rate signals, say up to 5kHz, this gives 2000 instructions per sample. that's not quite nothing. using half of this for the filtering, and the other half for the decoding and the actual application, we're looking at 1000 instructions of DSP to burn. looks to me there's plenty of room. to make it sound good, tones need to be quite stable. at least 1/16th of a second. say 6.4kHz, this is about 400 samples.

Entry: PSK31 and meshing
Date: Mon Jul 16 22:27:06 CEST 2007

i think for the waag, we need to keep the basic objective simple: PSK31, as it is tried and true, and there is decoding/encoding software to actually test it.

Entry: human naming nature
Date: Tue Jul 17 01:21:33 CEST 2007

One of the things that's nice about a compositional language like CAT is that it forces you to aggressively factor. Simply because programs become too hard to understand if you don't. Factoring is really identifying (naming) substeps. In a compositional language, factoring is really totally arbitrary, from a machine point of view at least. Not for the programmer. Since function arguments are not named, names have to be introduced elsewhere. This is that extra bit of 'meaning' in a program which transforms it from the mess a computer just executes, to some meta-executed thing represented in a human mind.. Those are really not the same. Being able to program something and 'knowing' how it works are different things. The 'knowing' is hard to explain sometimes. It's just a force of (human) nature, really.. For a program to be actually readable, a bit more than the connectivity (topology) is necessary: the information encoded in the names themselves seems to help the human brain to understand the connectivity, or at least give it some analogy. Maybe a bit like embedding a topological thing in a geometry to make it more 'real': programming is embedded in the real world of thoughts by associating some natural language with it. The two ways to do this are either the lambda calculus (lexical scope) or combinators.

Entry: get off that lazy ass
Date: Thu Jul 19 13:49:18 CEST 2007

i think i'm not made to idle around. depresses me. people tell me i need to try harder, give it a couple of weeks of idling to find out the true joy of life. i don't have time for that :) so.. next things to tackle are:
* fix the boot loader so the ICD2 can stay safely in the box for really stupid mistakes.
* interaction macros
* SNOT and sending code from emacs
* the slow highlevel forth on virtual memory

Entry: the boot block
Date: Thu Jul 19 13:54:04 CEST 2007

conditions:
* BLOCK 0 = empty OR 0000 and 0006 contain jumps to BLOCK 1 (soft reset). this ensures that an empty boot block is valid + interrupts and application invocation result in a reset when they are not defined.
* during boot, a DEBUG condition is checked. this will force it to run the interpreter to await commands.
* if the DEBUG condition is false, the application (addr 0002) is executed. if there's no application, a soft reset is run. (so eventually the chip responds).
* installing a new application:
  - clear boot block
  - install security jumps
  - install isr code
* possible conditions:
  - a pin
  - a boot wait + serial activity
  - break condition on serial port
-- installing the bootblock can be done in a single interaction macro: compile an init macro, then when this succeeds wipe the bootblock, and upload a new one. the deal is that the boot sequence up to the DEBUG check is NEVER changed! it's not enough to have your application perform such a test. this can go wrong in its boot sequence before the check is executed, or even during the check. get it right once, then keep it like it is. another possibility is to have the serial port operate from interrupt. that way sending a break signal could actually stop the program. however, this is more complicated and reduces freedom for custom isrs. -- thinking about it, why the one at 0006? ok, it prevents problems if there's a reset vector but no application vector installed. better to be safe. ok, default really is an empty boot block: means the app is gone. whenever APP and ISR vectors are installed the 'reset-vector' macro needs to be included.

Entry: new stuff
Date: Mon Jul 23 13:44:31 CEST 2007

done doing goto10 admin stuff. time to make a list of things that need a different approach.
BROOD:
* streams (don't save intermediate state)
* macro namespaces
* interaction macros
* clean up pattern matching macros
* SNOT
* clean up / document / reflect on the forth macro semantics (partial evaluation + parsing words)
PURRR:
* boot block updates
* highlevel forth on virtual memory

Entry: name spaces
Date: Mon Jul 23 13:54:02 CEST 2007

i guess i need proper name space mixing for the macro system. it should all be just scheme functions, not hashtables full of structures. currently i have the following name spaces: cat, state, store, meta, asm-buffer, forth-parse, macro, badnop. so.. let's see if i actually understand the plt scheme namespaces. a namespace is something that maps symbols to storage cells for words like 'eval' and 'load'. so instead of using hash tables and explicit lookup, using namespaces one could use 'eval'. the advantage is that run time 'eval' could be avoided, and macros could be used where possible. so, what do i want really..
* access macros using scheme names in scheme code.
* compile (eval?) a symbolic cat function straight to a scheme fn
* be able to change cat macro name bindings just like scheme
questions i need answered:
* can an entire namespace be hidden in a module?
* is it possible to dynamically add stuff to a module? (i guess so, using module->namespace)
* how to 'merge' namespaces?
* can i abstract the rather awkward symbol prefix merging?
* is prefix merging really awkward?
name spaces in scheme:
* once evaled/compiled, an expression is bound to a certain name space and independent of the current one

Entry: callout
Date: Mon Jul 23 22:30:28 CEST 2007

i need some knowledgeable people to discuss this stuff with. don't know where to find them though. things to try:
* plt list
* comp.lang.forth
* picforth list
* gnupic list

Entry: BROOD 4 takeoff
Date: Tue Jul 24 00:00:00 CEST 2007

EDIT: this is where the ramp up to brood 4 starts, with the move from interpreter -> macros.

Entry: really on top of scheme
Date: Tue Jul 24 19:02:23 CEST 2007

so, i need to get rid of the explicit interpreter. or not? i'm mostly concerned with name spaces here, not implementation.

  (1 2 +) -> (lambda stack (apply cat:+ (cons 2 (cons 1 stack))))

what about preserving original source form? do i actually still use that?
yes, when printing code. for example, doing (1 2 +) creates a quoted CAT program, which when compiled doesn't have a source form. so, how to associate original source form to lambda expressions? i really should define my interface first. i don't need to use raw functions as representations. the 'stuff' that's bound to names can just as well remain a word structure. in the end, i'm doing nothing but replacing hash tables by name spaces. so..
* modules: separate code into logical entities
* namespaces: allow run-time eval/compile
the latter part is not really necessary for the core! so, i should build macros first, make sure i have a direct map from:

  CAT (or any monad language derivative) -> 'raw' cat -> scheme

raw cat is just cat with scheme words. so how to do this?
- all CAT code is compiled: use modules
- how to separate name spaces: (i.e. how to prefix names?)
so.. it's seeping through. names are compile time stuff. macros are compile time stuff. anything that juggles names should be a macro. so (cat +) is a macro, which expands to a lambda expression, or a variable. it's not enough to have it expand to just a lambda expression. storage should be shared, so (cat +) should return a binding in case of a single expression, or a composition (cat 1 +) in case of multiple arguments. so, what about this: any CAT-like language uses the (: ...) syntax, where the macro : (i.e. 'cat:') transforms the code into a function that maps stack -> stack. this way everything is directly accessible from scheme. for example (cat: 1 +) is a lambda expression. neat. even, ':' could signify THE cat. then 'cat-compile' is no more than (eval (cons 'cat: src)). note that i don't really need to ever run any programs. cat is just functions, and in scheme, they can be applied to data. the thing is, i don't need an interpreter. i just need a proper way of associating compiled code with original source form (reflection). this does mean giving up some reflection: the current source/semantics association probably needs to change. it's not a small rewrite..

Entry: the macro way..
Date: Thu Jul 26 11:17:42 CEST 2007

let's start with some basics. apparently structures can be used to implement behaviour of procedures, using struct-type properties. this should be enough to convert completely to macros. i started cat-base.ss. so, here we go.. all the freedom is there again.
* i'm starting with one modification: low level CAT source representation is reversed. this makes writing the macros a bit easier. this makes (a . b) be 'compose a AFTER composition b', so:

  (pn-compose a b c) == (apply a (pn-compose b c))

* 3 phases are separated:
  - compile: atom -> representation of behaviour (apply/cons)
  - compose: list of words -> nested apply/cons
  - abstract: application -> lambda expression
  compile can be recursive due to the presence of quoted programs
* reversal is introduced early on: it's too confusing to have it around after the nested 'apply/cons' is in place. i'm switching from the pn- to the rpn- prefix at the point of abstraction (converting code to a scheme lambda expression).
* snarfs can be stolen from the previous implementation. maybe the code reversal should use a generic reverse macro too. (done)
* now all that's left is to solve the name resolution.

Entry: separating syntax from semantics
Date: Fri Jul 27 13:05:09 CEST 2007

I got the syntax working. Now i'd like to build an abstraction that takes a binder macro, and produces a compiler macro:

  cat-bind -> cat::

Assuming the structure of the language remains the same.
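for reference, the kind of expansion all of this aims at: a minimal sketch only, not the actual brood code. 'cat-sketch:', 'stack-apply' and 'cat:+' are hypothetical names here, and the real compiler resolves free names through the cat namespaces instead of plain scheme bindings.

;; minimal sketch: compile (1 2 +) into a stack -> stack lambda.
;; atoms that evaluate to procedures are applied to the stack,
;; anything else is pushed as a literal.
(define (stack-apply x stack)
  (if (procedure? x) (x stack) (cons x stack)))

(define-syntax cat-sketch:
  (syntax-rules ()
    ((_) (lambda (stack) stack))               ;; empty program = identity
    ((_ a as ...)
     (lambda (stack)
       ((cat-sketch: as ...) (stack-apply a stack))))))

;; a primitive as a stack function:
(define cat:+
  (lambda (stack) (cons (+ (cadr stack) (car stack)) (cddr stack))))

;; ((cat-sketch: 1 2 cat:+) '())  =>  (3)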
The problem is i keep running into compilation phase problems and i don't really know why. It's quite intriguing, this macro programming. Not quite the same as regular lisp hey :) It's a bit like a lazy language with pattern matching. Maybe it is a lazy language? Would be nice to read a bit about this.. Anyways, i do start to see some programming patterns. I have a problem in that i'd like to keep both semantics and syntax abstract. Currently, i pass around 'compile', but it's too general. I'd like to specialize only some compile behaviour, and keep the rest open. So: message passing! That seems to work quite well. Now, on to semantics.

Entry: macro expansion
Date: Fri Jul 27 16:36:51 CEST 2007

One problem i run into is that (cat: ....) seems to be looking for symbols in the toplevel. I guess if i know why, i'm a big step further in understanding this whole module/namespace stuff.. From the manual: 5.3 Modules and Macros "... uses of the macro in other modules expand to references of the identifier defined or imported at the macro-definition site, as opposed to the use site." This looks like the 'no surprises' rule, or the 'dynamic binding is evil' rule to me. The toplevel can still be used for dynamic binding, hence the macro expands to (#%top . xxx::+). So it looks like i have only one choice. Either i make sure the names are available at the point where the macro body is defined, or i put them in the toplevel explicitly. Let's see if the former is doable. Ok, trivial but still feels a bit weird. Maybe i'm too much accustomed to late binding by 'load/include', which is, as far as i get it, exactly what the module system tries to avoid.
* Circular dependencies are allowed within a module
* Not in between modules
* Undefined symbols in a module are not allowed.
* Any late binding is to be done in the toplevel (but feels dirty)
Ok, time to clean up the utility code.

Entry: control structures
Date: Fri Jul 27 18:24:03 CEST 2007

.. become a lot easier to implement:

  (define (xxx.choose no yes condition . stack)
    (cons (if condition yes no) stack))

Entry: where to store the functions?
Date: Fri Jul 27 18:34:08 CEST 2007

This remains a question. I thought it was necessary to have them in a scheme name space. Not true. As long as they can be identified at compile time, and mapped to storage, all is well. Not true, and also not convenient, because i really can't find a good way to do it except for explicitly creating an empty name space and dumping all the references there. Another thing: i don't really need the extra level of indirection a name space cell provides: it is ok to just mutate the word structure that's permanently attached to a certain name. It already behaves as a cell: instead of NAME -> CELL -> WORD we could just have NAME -> WORD since every cell is a word. So why not just dump stuff into hash tables? If (compile function sym expr) returns a word structure, all is well. Since my language doesn't have anything other than words, each name simply IS a word. Make that nested hash tables, so i have a mutable real store to go with the functional store. Maybe i can even unify them?

Entry: macros really are better
Date: Fri Jul 27 18:46:44 CEST 2007

* no VM, no custom control structures that invoke the interpreter. just 'apply'.
* functionality can still be stored in a hash table: each name refers to a fixed cell = word struct.
* hash table needs to be available at compile time

Entry: 2 stores
Date: Fri Jul 27 19:10:32 CEST 2007

Why not store the functions in the functional store?
The main reason is that the functional store is supposed to be dynamic, and the mutable store static, never mutated, except for debug purposes. But debug is always! So is there a better reason?
* It's not serializable.
* It's fully derived from source, and just a cache.
So a better division is:
- everything that's completely derived from source, and doesn't change during a regular, non core-dev session goes into the hash store.
- all the rest, the real state which is the result of computations (like assembler labels), goes into the functional store.

Entry: compile time hash
Date: Tue Jul 31 20:14:25 CEST 2007

let's do this namespace thing: a hash module, used at compile time and later run time to solve all binding problems. something i forgot: a namespace has both runtime and compile time semantics; however, i need to transfer everything explicitly from compile time to run time if i want to use a hash.. now i am really confused. does this even matter? the hash is not accessible at run time, but it is possible to have it around at compile time and just have a macro spit out some values.. the real problem is: modules can be compiled independently, and all state accumulated over such a run needs to be saved if it is to be used somewhere else. so what i'm trying to do will probably not work.

Entry: got snot working async
Date: Wed Aug 1 22:49:16 CEST 2007

so now it's time to do some real work. i still don't want to give up the idea of putting cat names in modules, and using eval to compile code at runtime. it really can't be that hard. would be a good exercise to find out what a namespace needs, besides being empty, to just compile code..

Entry: cat and #%top / lexical variables?
Date: Thu Aug 2 09:00:15 CEST 2007

what about this: redefine #%top in the cat syntax expander to go look in the cat namespace. this should enable the use of lexical scope to do name resolution. i found something easier: using 'identifier-binding', names that are not lexical can be drawn from a namespace object. this gives maximal scheme<->cat interplay, while keeping the namespace mechanism we had before. so:
- compilation to lambda expressions
- top level name resolution
are now separate. at this point it looks like i'm where i was before, only with the word rep changed a bit, and lexical scope.

Entry: namespace again
Date: Thu Aug 2 10:19:01 CEST 2007

so all name resolution is a runtime thing. at runtime, a tree of hash tables is available which contains permanent bindings to word structures. the code expands to forms that get bound to this word structure whenever they are executed, using 'delay' forms. so, with this delay mechanism in place, is there a need for storing semantics in word structures? probably not. .. something is not right:
- can't have (apply (delay expr) body ...)
- can't insert a word structure at compile time either
i wanted to do the latter to avoid a delayed expression. the only solution is to use a different apply. ok, i got it now. just using delay in the macro and force in the applicator.

Entry: lot of work
Date: Thu Aug 2 18:40:57 CEST 2007

got myself into a lot of work because i'm not respecting interfaces.. maybe fix that temporarily? it was necessary for the control structures because they're low-level, but maybe not for the rest of the code? next: the 'compositions' macro, parameterized by:
* source name space
* target name space
* compilation macro
maybe it's best to take a step back, and respect the interfaces..
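going back to the delay/force trick in the namespace entry above, a minimal sketch (hash table calls per old mzscheme; 'registry', 'ns-ref' and 'word-sketch' are hypothetical names): the macro wraps the dictionary lookup in a promise, and the applicator forces it, so words can be compiled before their definitions exist. after the first force the binding is permanent, like the word structures mentioned above.

(define registry (make-hash-table))   ;; name -> stack function
(define (ns-ref name) (hash-table-get registry name))

(define-syntax word-sketch
  (syntax-rules ()
    ((_ name)
     (let ((w (delay (ns-ref 'name))))  ;; lookup delayed until first use
       (lambda (stack) ((force w) stack))))))

;; (hash-table-put! registry 'dup (lambda (s) (cons (car s) s)))
;; ((word-sketch dup) '(1 2))  =>  (1 1 2)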
it looks like this is going to work, so i can just as well make the step and replace the entire vm code.

Entry: weird macro bug
Date: Thu Aug 2 20:52:12 CEST 2007

;; This driver could be generalized into eager evaluation for macros.
(define-for-syntax (process-args op stx stx-args)
  (datum->syntax-object
   stx
   (map (lambda (x)
          (if (and (list? x) (eq? ': (car x)))
              (op (cdr x))
              x))
        (syntax-object->datum stx-args))))

;; This utility macro calls another macro with an argument list
;; reversed if it is tagged with ':'. This is necessary for PN <->
;; RPN conversion.
(define-syntax reverse-args
  (lambda (stx)
    (syntax-case stx ()
      ((_ m . args)
       #`(m . #,(process-args reverse stx #'args))))))

The code above doesn't work.. Something about the syntax gets lost maybe? Expanding the macro seems to do the right thing though..

Entry: base functionality working
Date: Fri Aug 3 10:34:04 CEST 2007

got cat/cat.ss as an absolute minimum: anonymous and named functions. (like lambda and define).

Entry: macro weirdness
Date: Fri Aug 3 10:38:41 CEST 2007

i'm confused again.. syntax-rules macros are like normal order application: in (macro arg1 arg2) the arg1 and arg2 forms are left alone until after the expansion of macro. This is how it should be i guess (the only way to get non-eager evaluation in scheme is by constructing macros). But somehow it's hard to switch between both ways of writing code.. One of the things i miss is being able to parametrize a macro with an 'anonymous macro'. Something that behaves as a transformer, but does not have a name. More specifically:

  (compositions (lambda-macro ...) ....)

Is this possible, or am i just confused about something?? and another one: why is it so difficult to get this working:

(define-syntax lex/cat-compile
  (syntax-ns-compiler cat-ref (cat)))

(define-syntax syntax-ns-compiler
  (syntax-rules ()
    ((_ ref (ns ...))
     (syntax-rules (global)
       ((_ c global s e) (apply-force (delay (ref '(ns ... s))) e))
       ((_ args (... ...)) (cat-compile args (... ...)))))))

i'm importing the module that has 'syntax-ns-compiler' as require-for-syntax, but i get the error:

ERROR: cat/stx.ss:146:10: compile: bad syntax; function application is not allowed, because no #%app syntax transformer is bound in: (cat-compile lex/cat-compile dispatch 3 (pn-compose lex/cat-compile (2 1) s))

but this works:

(define-syntax define-syntax-ns-compiler
  (syntax-rules ()
    ((_ name ref (ns ...))
     (define-syntax name
       (syntax-rules (global)
         ((_ c global s e) (apply-force (delay (ref '(ns ... s))) e))
         ((_ args (... ...)) (cat-compile args (... ...))))))))

i don't get it.. Update: the answer might be that the latter is a pure rewriting macro, and thus doesn't need any phase separation.. The former does, and the problem is just that i don't understand the separation here..

Entry: list operations on code
Date: Fri Aug 3 13:44:12 CEST 2007

since all compiled code should have its source rep still attached, generic list operations are possible. i'm inserting a call to 'source' for most of them. Now, why not have 'run' accept data? This will make the language simpler, and representation just a matter of optimization. So.. a consequence here is that there always is a default or base semantics. Maybe that's better.
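a sketch of what 'run accepting data' could look like ('word' and 'run' are hypothetical names, not the rep.ss code): compiled code is a struct carrying its source, and raw lists get a default semantics of 'run each word, push everything else'.

(define-struct word (source fn))        ;; compiled code + its source rep

(define (run x stack)
  (cond ((word? x) ((word-fn x) stack))
        ((null? x) stack)
        ((pair? x) (run (cdr x) (run-atom (car x) stack)))
        (else (error "cannot run:" x))))

(define (run-atom a stack)
  (if (word? a) ((word-fn a) stack) (cons a stack)))  ;; literals push

;; (define dup (make-word 'dup (lambda (s) (cons (car s) s))))
;; (run (list 1 2 dup) '())  =>  (2 2 1)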
Entry: Conclusion
Date: Fri Aug 3 14:52:34 CEST 2007

Maybe a bit early since i don't have the old stuff ported yet, but the main conclusions seem to be:
* name space storage can be kept abstract: it's ok to do part of the binding at runtime, as long as this behaviour is abstracted (cat/lang.ss)
* defining a new language as syntax instead of explicit interpretation is good, because scheme's scoping stuff carries over: it's possible to only replace the global name space, but keep lexical variable bindings.
And, macros can be simple, if you stick to syntax-rules. The more general syntax-case can become very confusing very fast. The most important thing to remember for syntax-rules is that it is a DIFFERENT language than scheme! It is normal order (breadth-first) instead of applicative order (depth-first). So.. time to look into CPS a bit more. There's this SRFI 53 i might have a look at, but before that, i had a go at rev-k and rev-arg in stx-utils.ss. seems to work..

Entry: and beyond
Date: Sun Aug 5 01:14:43 CEST 2007

So.. Maybe it is time to make a proper module based CAT language. Modules really are a nice way to factor a design.. and i am already running into the simplest of problems: name space clutter. A lot of temp functions i'm using are littering the name space.

Entry: porting
Date: Sun Aug 5 16:08:52 CEST 2007

so, i started porting badnop to the new cat core. the first nontrivial problem i run into is 'state-parse'. Maybe i should keep 'define-symbol-table'. This needs some thought, since the whole namespace thing changed. In effect, it's the same: there are still hash tables with functions. Wait.. the 'make-composer' things need to be macros now.. so, what's needed is sourcedict, compiler, target. currently, 'cat' is sourcedict+compiler, and 'cat!' adds destdict. i need better naming for this, since it's so general..

Entry: mzscheme things to look into
Date: Mon Aug 6 10:35:47 CEST 2007

* what is a 'transparent repl'
* moving more snot functionality to scheme
* snot and syntax coloring

Entry: anonymous macros
Date: Tue Aug 7 00:30:38 CEST 2007

is it at all possible to have anonymous macros? what i need is to parametrize one macro with an implicitly defined other macro. maybe this is not necessary: it is possible to have 'local' macros, meaning macros defined by other macros, with names from syntax templates. those names never clash, so it serves the purpose.

(define-syntax compositions
  (syntax-rules ()
    ((_ (gen-def! . gen-args) . definitions)
     (begin
       (gen-def! CAT! . gen-args)
       (compositions CAT! . definitions)))
    ((_ def! (name body ...) ...)
     (begin (def! name body ...) ...))))

Entry: lifting
Date: Tue Aug 7 09:09:06 CEST 2007

when i want to do lifting, a decision needs to be made based on whether a symbol is present in one namespace or not. this is a run-time decision, since i'm using late binding. that doesn't look too difficult. i think i have it now, overriding the 'global' and 'constant' methods. the rest should just work. but. it's good to have a better look at monad theory and the 'lifting' formulation to clean up my terminology a bit. Let's see:

  map   (a -> b) -> (M a -> M b)
  unit  a -> M a
  join  M (M a) -> M a

Setting a 'stack' as the base type t, the monad type M t will be a stack with added state. map is trivial and already used; however, the other two operations are hidden somewhere else: in the words that implement the monad dictionary. Does it make sense to make them explicit?
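a sketch of making them explicit, with the state represented as a list consed onto the stack. empty-state and merge-state are placeholder choices here, not the actual implementation:

;; M t = (state . stack), state = list of accumulated items
(define (empty-state) '())
(define (merge-state s1 s2) (append s1 s2))

(define (unit stack) (cons (empty-state) stack))      ;; t -> M t
(define (lift fn)                                     ;; map
  (lambda (m) (cons (car m) (fn (cdr m)))))           ;; keep state, map stack
(define (join mm)                                     ;; M (M t) -> M t
  (cons (merge-state (car mm) (cadr mm)) (cddr mm)))

;; ((lift (lambda (s) (cons 1 s))) (unit '()))  =>  (() 1)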
The thing that confuses me is that i am doing the 'lifting' automatically, based on a name space distinction. All the functions inside the monad dictionary actually do the mapping, joining and returning, but in a way that's not factored into those 3 operations.

Entry: state lifting works
Date: Tue Aug 7 11:54:47 CEST 2007

now i need to think about some proper abstraction names, so the 'compositions' declarations look nice and readable. maybe it's best to standardize on the following syntax:

  (compositions (syntax (dst ...) (src ...) ...) def ...)

* 'syntax' refers to the macro used to compile the body of the code. this is actually a compiler which needs source semantics.
* '(src ...) ...' are the namespaces representing the source semantics used by the compiler.
* '(dst ...)' is the namespace used to store the resulting code object.

Entry: program quoting and lifting
Date: Tue Aug 7 13:29:37 CEST 2007

i ran into this before.. in lifted code, how do i quote programs? because of automatic lifting, the only sane way is to default to non-lifted cat semantics. so i need to fix it up a bit.. looks like it's fixed now.

Entry: things that need fixing
Date: Wed Aug 8 12:10:32 CEST 2007

Probably the parser in forth.ss needs to be rewritten.. maybe as macros? The thing that needs to change is that the parser always returns symbolic cat code. No tricks with inserting internal representations. another thing i need to fix is default semantics: what to do if a symbol is not found? maybe using a parameter? done.. so the parser macros. if it's entirely built on top of the ordinary cat macros, i could disentangle them and get them to work first, then rewrite the parser macro preprocessor. so let's start top-down.

Entry: macro.ss and literal + compile
Date: Wed Aug 8 21:09:02 CEST 2007

now i get it: they need to be in (asm-buffer), and c> and c>word in (macro) need to refer to them. that way 'macro-prim:' can be used together with lexical binding. ok, that works. actually, it's quite cute this way. lexical scope to mix scheme and cat code is nice.. this makes me think: if i implement the preprocessing macros also as lexical extensions, that property remains. maybe that's overkill? maybe the current code is ok, as long as i make it fully symbolic?

Entry: hygiene and the rewrite-patterns macro
Date: Thu Aug 9 11:35:50 CEST 2007

It's fairly complicated, but the name bindings introduced are only:

  make-word-compiled
  lift-macro-executable
  lift-transform

what if i factor it into 2 parts:
- a nonhygienic part that creates just the match clauses
- a hygienic part that binds the function and macro names
It looks like this is sort of working. Now what about preserving syntax information in the expression parts of the match clauses?

  (match --- (pattern expression))

so the expression part can refer to lexical variables etc.. let's do that, but first see if this non-hygienic version works. one important question: when peeling off syntax with syntax-e, and using datum->syntax-object to put it back, is the original syntax that wasn't peeled off preserved? it really has to be.. seems to work.. at least the expansion does, but i can't see what can go wrong with the quoting..

Entry: reduce
Date: Thu Aug 9 14:34:52 CEST 2007

transforming

  ((a . 1) (a . 2) (a . 3) (b . 4) (b . 5))

into

  ((a . (1 2 3)) (b . (4 5)))

is called 'reduce', at least that's what i recall... but i think the more general 'fold' is also sometimes called reduce.. so i'm going to call it 'collect' for now.
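a minimal version, assuming equal keys are adjacent as in the example:

(define (collect alist)
  (if (null? alist)
      '()
      (let ((key (caar alist)))
        (let loop ((l alist) (vals '()))
          (if (or (null? l) (not (eq? (caar l) key)))
              (cons (cons key (reverse vals)) (collect l))
              (loop (cdr l) (cons (cdar l) vals)))))))

;; (collect '((a . 1) (a . 2) (a . 3) (b . 4) (b . 5)))
;;   =>  ((a 1 2 3) (b 4 5))    i.e. ((a . (1 2 3)) (b . (4 5)))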
Entry: require-for-syntax
Date: Thu Aug 9 17:55:48 CEST 2007

look at the macro compiler-patterns. find a way to put the utility functions in a module without getting the error:

pattern-core.ss:94:11: compile: bad syntax; function application is not allowed, because no #%app syntax transformer is bound in: (begin (ns-set! (quote (macro +)) (make-word-compiled (quote +) (lift-macro-executable (lift-transform (lambda asm (with-handlers (((lambda (ex) #t) (lambda (ex) (pattern-failed (quote +) asm)))) (match asm ((((quote qw) b) ((quote qw) a) . rest) (appen...

i don't get it. when i make them local to the transformer expression, all is well, but using 'require-for-syntax' doesn't work. i tried the following isolated case:

;; Utilities for syntax object processing.
(module stx-utils mzscheme
  (provide (all-defined))
  ;; Reverse a syntax list.
  (define (reverse-stx stx)
    #`(#,@(reverse (syntax-e stx)))))

(module test mzscheme
  (require-for-syntax (file "~/plt/stx-utils.ss"))
  (define-syntax reverse-quote
    (lambda (stx)
      (syntax-case stx ()
        ((_ list)
         #`(quote #,(reverse-stx #'list)))))))

and this seems to work fine, so i'm doing something else wrong..

Entry: CPS macros are fun
Date: Thu Aug 9 18:29:06 CEST 2007

but not really practical when syntax-case is around. now that i'm understanding it a little better, there isn't any reason to keep the CPS macros for list reversal. the other thing to consider is the 'compile' macro. i'm using something akin to CPS there too, only it's more like message passing: pass the current object (self).

Entry: datum->syntax-object
Date: Thu Aug 9 19:17:06 CEST 2007

thinking a bit more.. i'm still not convinced that #`(#,@(syntax-e #'some-list-stx)) is doing what i think it is doing: the manual says datum->syntax-object is used, but does it see the syntax substructure? reading the manual again, now that i know what i'm looking for: "(datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx]) converts the S-expression v to a syntax object, using syntax objects already in v in the result." for (with-syntax ((pattern stx-expr) ...) expr): "If a stx-expr expression does not produce a syntax object, its result is converted using datum->syntax-object and the lexical context of the stx-expr." then for quasiquoting syntax: "If the escaped expression does not generate a syntax object, it is converted to one in the same way as for the right-hand sides of with-syntax." so i guess we're ok!

Entry: symbolic macro names
Date: Thu Aug 9 20:05:26 CEST 2007

something i run into is the (macro x) function from comp.ss: (macro:) won't work because the variables in the patterns are symbolic! this is really confusing.. i'm replacing the symbolic function with 'macro-ref' to make it more clear this is a run time symbolic lookup, not something that can be bound once.

Entry: lexical quoted
Date: Thu Aug 9 20:41:58 CEST 2007

with the new syntax approach, i can use lexical variables like

  (let ((xxx (lambda (a b . stack) (cons (+ a b) stack))))
    (base: 1 2 xxx))

Which is convenient. However, i ran into at least 2 cases where the more convenient thing to do is to insert a constant instead of a function. However, the semantics of a symbol is always a function in CAT. Except.. when it is quoted! So what about this:

  (let ((yyy 123))
    (base: 1 'yyy +))

Meaning (base: 1 '123 +) ??? This is very convenient, but looks a bit weird. The reason is of course that stuff after base: is NOT SCHEME. Quote in the cat syntax only means: "this is data".
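a sketch of the expansion-time check that makes this possible: 'cat-quote' is a hypothetical stand-in for the quote handling inside the rpn compiler (the real thing recurses into quoted programs), and identifier-binding returning 'lexical is per old mzscheme.

(define-syntax (cat-quote stx)
  (syntax-case stx ()
    ((_ datum)
     (if (and (identifier? #'datum)
              (eq? 'lexical (identifier-binding #'datum)))
         #'datum       ;; lexically bound: substitute its value
         #''datum))))  ;; free: stays quoted data

;; (let ((yyy 123)) (cat-quote yyy))  =>  123
;; (cat-quote zzz)                    =>  zzz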
The benefit of this is that it somehow resembles pattern variable binding as in syntax-rules. A better explanation is this: The scheme and cat namespaces are completely separate: scheme has toplevel and module namespaces, while cat has everything in a separate hierarchical namespace. The only way they can interact is through lexical variables: this is the only set of names that is fully controllable. In cat expressions:
* free identifiers come from the associated name space
* identifiers bound in scheme are
  - used as functions when they occur outside of quote
  - used as data when they occur inside of quote
This can be implemented by mapping quote -> quasiquote, and unquoting a symbol whenever it is lexical. It seems to work fine. Quote for macros is now also fixed. Another attempt to justify myself: The quote operator in the cat language is NOT the same as the quote operator in scheme code. More specifically: lexical variables will be substituted whether they are quoted or not. i.e. both (base: abc) and (base: 'abc) will be substituted if the variable abc is bound. The quoting just indicates the atom is not to be interpreted as a function, but to be loaded on top of the data stack. The substitution is there to make metaprogramming easier.

Entry: pattern transformer extensions
Date: Fri Aug 10 10:23:06 CEST 2007

I'm trying to perform the pattern extensions properly. A true test of this phase separation thingy, since i have a couple of phases here:

  0 matcher runtime
  1 execution of pic18-pattern transformer
  2 execution of pic18-pattern transformer generator

an extra problem is that i'm matching transformer names -> syntax transformers. this gets a bit complicated, because the name of the pattern generator, i.e. 'unary->mem', is used both as a macro template and as a function name, so the transformer generator needs to be generated! too many levels of nesting: this has to be simplified somehow.. wait: the thing that needs to be generated is a pattern expander function, which can be used in pic18-comp.ss to create the extended compiler-pattern macro. ok, i'm running into the problem again: if i put pic18-meta-pattern and pic18-process-patterns in a different module, and require-for-syntax it, i get the #%app error again.. so until i figure that out, maybe best to always use local transformer procedures? i guess it has something to do with binding identifiers. here the problem seems to be 'lit', in the binary-2qw pattern.. i ran into the problem again, in pattern-utils.ss / extended-compiler-patterns and it was something like:

  #`(namespace . #,( ---- ))

which needs to be

  #`(namespace #,@( ---- ))

because the #, expands to an s-expression, which is then just inlined too, leading to 'process-patterns' not being quoted. weird stuff can happen.. ALWAYS CHECK EXPANSION when this #%app thing occurs! wait.. that's not it.. damn! i'm going to leave it at not using any syntax for it.. it's not too bad, and understandable.

Entry: for-syntax
Date: Fri Aug 10 14:08:57 CEST 2007

I still get into trouble with higher order stuff where i completely don't understand what's going wrong. Well, i guess it will come with time. I'm glad i got syntax-case to a point where i don't need to use unhygienic constructs any more. And, if i get into trouble with name bindings, it's always possible to put local functions in a transformer expression. I'm done for today though.. head is hurting :)

Entry: preprocessor
Date: Fri Aug 10 14:50:57 CEST 2007

Time to adapt the parser/preprocessor, and change it to something purely symbolic.
Seems to work. It's a lot simpler too, now that the macro representation supports quoting etc..

Entry: cat name space organisation
Date: Fri Aug 10 17:45:16 CEST 2007

Things changed:
* i found a way to easily debug modules and scheme code using snot
* cat code is now fully embeddable in scheme
* things got separated out a bit more
So, what i'd like to do is to separate pure scheme code from stuff that accesses the global 'stuff' name space. This doesn't include private name spaces that are only written in a single module, like 'asm' or 'forth', but it sure does include 'base'. base is full of junk.. maybe that's the real problem? Maybe i should just implement more in scheme, and have this base thing as a scripting language only... Or. I need to add an easy syntax for creating 'local words' using the lexical stuff i have now.

(letrec ((a (base: 1))
         (b (base: 2))
         (c (base: 4)))
  (begin
    (ns-set! '(base broem)  (base: a b + c /))
    (ns-set! '(base lalala) (base: a b c))))

(local ((a 1) (b 2) (c 4))
  ((broem a b + c /)
   (lalala a b c)))

this requires a different syntax, since the anonymous compiler needs to be available always. very straightforward. works like a charm.

Entry: wordlist search path
Date: Sat Aug 11 15:21:18 CEST 2007

i need to change the 'find' macros so they accept multiple paths.. one thing i'm wondering about is how 'force' is implemented.. somehow i suspect that the thunk is not erased... it probably is.. found it:

(define (force p)
  (unless (promise? p)
    (raise-type-error 'force "promise" p))
  (let ((v (promise-p p)))
    (if (procedure? v)
        (let ((v (call-with-values v list)))
          (when (procedure? (promise-p p))
            (set-promise-p! p v))
          (apply values (promise-p p)))
        (apply values v))))

the thunk is erased. the only thing to optimize is to not use the values stuff, but a single return value. probably not worth it. Haha, something i didn't see at first there: the p is either a procedure or a list, so i can't make a single atom of it, because it could be a procedure :) another trick: creating the ':' macro at the spot where the '(language)' dictionary is created and populated with primitives uses the module dependency system to somehow enforce dependencies on namespaces, which are not checked.

Entry: next: namespace
Date: Sun Aug 12 21:10:45 CEST 2007

more specifically, it's time to start using eval on the 'macro:' syntax, and i run into the problem that this happens in a toplevel where it's not defined. can you tie 'eval' to a module context? update: what about making this namespace explicit? i just need a single namespace object which contains all the relevant compilers. hmm... it looks like the easiest way to implement this is to require each 'lang:' macro to be associated with an 'eval-lang' function. argh... can't do that, since those also need eval.. looks like i can't escape this namespace thing..

Entry: lazy data structures
Date: Mon Aug 13 01:33:23 CEST 2007

so, what do i need to make the assembly process lazy? match needs to work on lazy lists.

Entry: disentangling
Date: Mon Aug 13 10:43:14 CEST 2007

i can't just separate out all the code that defines names in the namespace, because the namespace is used for other things. there's some conflict here.. what about i start doing it anyway:
- more namespaces
- populate each name space at the same place where the scheme code that uses it is exported.
the real trick is of course to see the direct specification of namespaces as 'internal'. this should be wrapped by functions.
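a sketch of such a wrapper, in the spirit of the ns-set! used above: nested hash tables behind a small functional interface (hash table calls per old mzscheme; 'ns-find' is a hypothetical name).

(define *ns* (make-hash-table))

(define (ns-set! path val)             ;; path = '(base broem) etc.
  (let loop ((h *ns*) (p path))
    (if (null? (cdr p))
        (hash-table-put! h (car p) val)
        (loop (or (hash-table-get h (car p) (lambda () #f))
                  (let ((sub (make-hash-table)))
                    (hash-table-put! h (car p) sub)
                    sub))
              (cdr p)))))

(define (ns-find path)                 ;; -> word function, or #f
  (let loop ((h *ns*) (p path))
    (and h (if (null? (cdr p))
               (hash-table-get h (car p) (lambda () #f))
               (loop (hash-table-get h (car p) (lambda () #f))
                     (cdr p))))))

;; (ns-set! '(base dup) (lambda (s) (cons (car s) s)))
;; ((ns-find '(base dup)) '(1))  =>  (1 1)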
Entry: name space trick doesn't work
Date: Mon Aug 13 15:56:35 CEST 2007

the problem seems to be that data constructed using the struct from rep.ss is different from data constructed using run time evaluation in the separate namespace.. i need a different approach. first to identify the problem:
- mzscheme has strict phase separation
- "(eval '(macro: ,@src))" works when the runtime name space, the current one when that code is executed, has that macro available.
- somehow, rep.ss gets loaded twice, since i run into incompatible instances.
now, why is it loaded twice? one thing i could eliminate was a for-syntax dependency on rep.ss, so the messages are a bit less confusing now. it looks like the namespace trick creates a new instance. that's where my trouble is. let's simplify it a bit. the only thing that really needs dynamic compilation is 'macro:'. so i'm going to put that dependency in the code itself. but this should be independent of the problem, so i leave it like it is. i found a solution: just make sure the current namespace has the right symbols. there's no other way. currently i just dynamic-require them in, but i don't know if this is better than just requiring them at load time, or namespace creation time.. it does feel a bit dirty though.. why use modules if they start injecting stuff into your namespace? Maybe i should just pass the macros upstream.. whatever..

Entry: quoting parsers
Date: Tue Aug 14 12:59:37 CEST 2007

They seem to work now. Map well onto the literal stack / typed macros approach. The question is how to map them. It doesn't look like a good idea to keep the same symbol, but how to change it then? I'd like both ' load and load to work. Prefixing them with 'def-' seems not right. What about "/load"? A single symbol seems the right thing.. ~#$%& are ugly. "*load" seems a good compromise. I got it sort of working, and factored it out a bit. However, it might need a bit of name cleanup to distinguish the following source representations:
- a filename
- a list in symbolic forth format
- a list in symbolic macro
- the latter compiled into executable code
I switched to the following naming convention:
- files use 'load'
- strings use 'lex'
- list -> compiled code uses the ':' prefix
- all others operate on lists directly

Entry: default semantics
Date: Tue Aug 14 13:55:56 CEST 2007

Also, i start to wonder if it's a good idea for 'run' to take literal lists as an argument. The only real benefit is Joy-like introspection, but since in badnop most source reps are macros, this doesn't make sense: source is not the only thing, semantics needs to be added, so default semantics might not be good.

Entry: long run times
Date: Tue Aug 14 14:46:10 CEST 2007

seems to go into a loop somewhere.. time for a break. it seems it's just really slow! and all the time is spent during compilation. looks like i got some quadratic things going on in the expansion.. so, i suspect this is syntax-rules.. i'm using a lot of rewriting to avoid syntax-case.. maybe i should just come back from that? already eliminated the if-lexical? macros. ok.. let's see. first make the expansion a bit less dramatic: some things can be abstracted in a function. then i replace rpn-compile by a single syntax-case macro. it still calls the compile macro, which can be customized by stuff built on top.. there's no difference in speed, so i guess it's somewhere else.. so it's this:

  (rpn-compile *forth* 'macro:)

if *forth* is about 150 atoms, there's about a second of delay. maybe it's the nesting of the macro?
i wonder if i can write a macro that's faster.. let's try something different. currently, the rpn-abstract macro is using fold. it's still calling the 'compile' macro. looks like that's what i need to replace. so, how to implement modified behaviour? instead of using macros, why not use functions? i do need proper phase separation to do this. let's see if that's possible by moving stuff out to for-syntax-rpn.ss

Entry: running into #%app trouble again
Date: Tue Aug 14 22:15:17 CEST 2007

This is the smallest example i could find that doesn't work as i expect..

(module for-stx mzscheme
  (provide (all-defined))
  (define (break-stx fn args)
    #`(#,fn #,@args)))

(module test-stx mzscheme
  (require-for-syntax "for-stx.ss")
  (define-syntax (bla stx)
    (syntax-case stx ()
      ((_ fn . args)
       (break-stx #'fn #'args)))))

Then, putting the 'break-stx' definition inside the define-syntax def works fine.. Ok, if i change the quoting mechanism above to:

  (define (break-stx fn args)
    (datum->syntax-object fn (cons fn args)))

It does work. Now i'm really confused. I found this on the plt list:

http://groups.google.com/group/plt-scheme/browse_thread/thread/327013d5c6f61017/9a12e93d683a5f94?lnk=gst&rnum=2#9a12e93d683a5f94

(require-for-template mzscheme) in the module that generates syntax seems to solve it. So, on to replacing the old 'compile' macro with a functional approach, which works a lot better. There's really no reason to mess with syntax-rules for anything other than simple patterns.

Entry: disentangling
Date: Wed Aug 15 14:19:05 CEST 2007

  rpn-tx.ss       lowlevel syntax generation, parameterized by 'find'
  rpn-runtime.ss  runtime support for the above
  rpn.ss          bind a 'find' closure generator to lowlevel syntax
  ns-utils.ss     support code for namespace lookup, to be used in find closures
  state-stx.ss    namespace namespace -> state syntax "language:" compiler
  base-stx.ss     namespace -> base syntax "language:" compiler
  composite.ss    create named words from compiler

Entry: mission accomplished
Date: Wed Aug 15 15:15:54 CEST 2007

Looks like i got it back online. The transformer works a whole lot faster now. Let's repeat the conclusions:
- don't use syntax-rules if you need CPS tricks. it's ad-hoc and slow. use syntax-case with real functions instead.
- when using complicated syntax-case macros (compilers for embedded languages), separate out the transformer procedures and the template runtime support into different modules, so they can be tested separately. I did this for pattern.ss -> pattern-tx.ss and pattern-runtime.ss

Entry: better error reporting
Date: Wed Aug 15 16:21:00 CEST 2007

so... it would be great to be able to relate errors to where they occur in the source code. however, to use the builtin syntax readers i need to move both the lexer and the parser so they can operate on / generate syntax objects. pretty clear what's to do next then:
- rewrite the forth parser so it operates on syntax objects + create a proper 'forth:' macro that goes with it.
- make the lexer behave as 'read-syntax'
when this is done, i should be able to compile forth files straight away. The first part was easy: the driver works. The rest should be straightforward. However, moving this to compile time requires some phase magic... I was thinking about doing a proper phase separation in the forth code too. Instead of defining macros as a side-effect, it's probably better to isolate them.

Entry: predicates->parsers
Date: Wed Aug 15 21:57:18 CEST 2007

I don't remember why exactly the map 'vm->native/interactive' is not purely syntactic. it really should be..
refer to previous code to find the previous functionality, but i'm breaking it and taking out the 'dict' dependency, and will replace predicates->parsers with something that doesn't evaluate. the previous 'predicates->parsers' behaviour is too dense. took me a while to understand it. better to separate it out into different mechanisms:
1. syntactic transformation
2. run time symbol lookup

Entry: produced first monitor.hex
Date: Thu Aug 16 00:31:51 CEST 2007

looks like i got it mostly running now. didn't test the code yet since a lot of things are still broken, mostly the interactive part. but it looks ok.

Entry: brood 4
Date: Thu Aug 16 00:36:17 CEST 2007

enough things changed, and i've been in a broken state for a bit now. this means it's time to up the version, and rewind the brood 3 archive to a working state. it's archived as brood-3 on apatheia. this is the last patch included:

  Mon Jul 23 21:29:12 BST 2007 tom.goto10.org
  * namespaces and next projects

at that time i was changing stuff in the boot block.. i'm not sure if that code actually works.. might be better to revert a bit more, back till after the workshop.

Entry: next
Date: Thu Aug 16 01:01:53 CEST 2007

- test the target code, see if the monitor still works
- fix the interaction code
- fix the vm interaction/compile code
- fix snot for interaction/compile mode
- factor some badnop code: use local words

Entry: separate compilation
Date: Thu Aug 16 10:51:12 CEST 2007

got me thinking: can't i do the separate compilation trick for macros? i already ran into the non-transparency problem several times: trying to define some code with some macros not defined.. one of the problems is 'constant': it needs run time compilation so i can't just do this... another is that macros defined immediately start influencing compilation of code after their loading. but.. can the loading of forth files be made free of side effects? or at least somehow separated? let's see what kind of side-effects we have:

  constant-def!  2constant-def!  macro-def!

those are easily isolated into separate dictionaries to separate 'core' from 'project' macros and constants.. as long as project macros are loaded AFTER core macros, they can be safely deleted as a whole. the short version: it's impossible to change it now without real phase separation..

Entry: literal pattern matching
Date: Thu Aug 16 11:20:58 CEST 2007

Patterns like

  ((['qw a] ['qw name] *constant)
   (begin (def-constant name a) '()))

are a bit redundant.. a better notation would be "(a name *constant)".

Entry: assembler cleanup
Date: Thu Aug 16 11:36:25 CEST 2007

Can't i get rid of the 'constants' namespace? Again, why are they different from macros? To postpone symbol -> number conversion until assembly time. So they can't be macros, because at assembly time all macros have run.

Entry: compilation syntax
Date: Thu Aug 16 13:42:21 CEST 2007

i'm thinking about adding some syntax to compile code using different syntax.. (a b c) is still default semantics quoted code, but (lang : a b c) is interpreted as compiled with 'lang'. or maybe (lang: 1 2 +). let's see if i can do this first on the rep.ss level: just store a symbol naming the rep. probably the first thing that needs to change is state-stx: have it take an anonymous compiler as 2nd op.. it's really annoying to be at the border of compile/run the whole time! first, the above is not really possible since the state-stx fallback code is not derived from a named compiler.

Entry: override semantics
Date: Thu Aug 16 15:05:36 CEST 2007

Introduced the (language: ...)
syntax for overriding language semantics while quoting code. It's implemented as follows: the default 'program' compiler checks if the first symbol in a list ends in ':'; if so, the whole expression is passed to the scheme expander, otherwise the default 'represent' method is used to compile the code anonymously. It's a small step from here to a 'lambda:' macro. I also fixed the semantics annotation. However, it is possible to run into code which doesn't have the semantics annotated because it works with an anonymous macro.. This could be cleaned up, but i guess it serves the debugging purpose now: 'ps' displays macros as (macro: ....)

Entry: name mangling
Date: Thu Aug 16 16:01:27 CEST 2007

Maybe i should give the name mangling a go again.. If i recall, the thing i did wrong last time was to get rid of syntax information for names, so they were mapped to toplevel names. This helper seems to do the trick:

(define (prefix pre name)
  (->syntax
   name ;; use original name info
   (string->symbol
    (string-append
     (symbol->string (->datum pre))
     (symbol->string (->datum name))))))

So basically now i have a mechanism to use the mzscheme module system for handling namespace and dependency management. I bet i can use some kind of 'module-local?' predicate on the syntax to find out if a name is local to a module, and if so use that instead. I guess it's a good time to find out if i have the namespace stuff sufficiently abstracted. Something about naming conventions: the 'rpn-' modules do not need or depend on the namespace implementation. I do need a different kind of 'compile' macro, but for the rest it works perfectly. Maybe time to rename some things.. All good and well, but how do i combine the two? Doesn't seem like a good idea to do it directly.. This works better as all or nothing.. So, combining runtime namespace lookup and static modules.. how to? One of the things to change is to not inherit from a namespace, but from a named compiler macro. What about starting from the ground up? Making the base language static, then moving things from dynamic -> static? Starts with snarfing. Instead of snarfing to a dictionary, snarf to a prefix. Start with separating primitive.ss into snarf.ss and ns-snarf.ss. So in principle, it should be really easy now to move the implementation of base.ss to static functions without anybody noticing. That is, if i can somehow make delegation work using just a language: macro instead of namespaces..

Entry: the royal DIP
Date: Thu Aug 16 17:35:27 CEST 2007

i guess the solution is to use 'dip' from base to create state syntax abstractions. and maybe, to add an optimization that (+) does not create an extra lambda, but returns the primitive + right away. i guess the optimization can be left until later.. so.. the idea is to make the delegate compiler abstract. this requires quite some change, but should make the code a lot simpler.. it would also fix the annotation problem mentioned above. so

  (ns-base-stx badnop: (badnop) base:)

instead of

  (ns-base-stx badnop: ((badnop) (base)))

let's call this 'extend-base-stx'. haha. gotcha! of course, the delegate: is a static thing, and the namespace delegation is a dynamic thing, so there's no way to compile this: the information necessary to decide about delegation or not is not available at compile time, when the delegation needs to be frozen.. it needs to work the other way around!!! if a symbol is not defined at compile time, the resolution can be postponed until runtime. so i guess the gentle way to move things to static implementation is to use the 'module-local?'
predicate mentioned before. that way module-local symbols can bind first, and cannot be overridden. the number of methods in the compiler is getting larger. maybe use real objects? prototypes? also, if i use a decent prefix, symbol capture is not a real problem, so can i put it on always? maybe just a dot or a pound sign..

Entry: pff... done coding..
Date: Thu Aug 16 20:49:59 CEST 2007

today was a bit intense. i'm starting to get a bit more of this syntax / lexical / static stuff.. it would be nice to make more things static. there are only a couple of places that have 'plugin' linking. one of them is 'literal' and 'compile' in the macro-prim dictionary, so it looks as if i do need some dynamic binding. however, i wonder if it's not better to solve this using units. more standard tools = better, now that i know what i want at least.. one thing is bugging me though. some paradoxical thought: i'd like to define words that fall back on another dictionary. however, using static linking there is no such thing: a symbol is there, or it is not. and there's no override.. maybe i should stick to dynamic.. it's really different and there's no easy migration to static.

Entry: if i go static
Date: Thu Aug 16 23:29:47 CEST 2007

one name mangled namespace is enough, since i can use ordinary modules to organise code and hide details, just like in scheme. let's stick to 'rpn.' so i built that in: names like 'rpn.xxx' that are visible at compile time get used as functions, and bind variables 'xxx' in the compositional code, just like lexical variables. it looks like delegation from dynamic -> static parts is not possible. since this is quite a deep thing to change, i'm not going to. it's still possible to move highly specialized code into modules to shield it from the main dictionaries. what is possible is to add a static interface to words in 'base'. they could still include code to register to the dynamic space also, but at least this would enable freezing some functionality. so, maybe this: all base words are exported
- as rpn.xxx variables from the rpn-base.ss file
- in a dynamic dictionary from base.ss, which gets the functionality from rpn-base.ss
is a bit confusing.. maybe leave it as is..

Entry: because i can
Date: Fri Aug 17 00:05:30 CEST 2007

there's a lot of 'because i can' code in thethered.ss ... as i found out, some tasks are just easier to code in scheme. if it's anything algorithmic, meaning intricate data dependencies, you're usually better off writing a scheme program. they are easier to understand, probably because they are a bit more verbose, and because 'automatic' permutation and duplication of names avoids mental gymnastics for stack juggling. there is nothing in the way now that i have both 'base:' and 'prj:' in scheme, and 'scheme:' in cat. for what is the cat code useful then? simple patching and scripting, there it clearly wins. as long as not too much data juggling is needed, cat is really easier for patching things together. also, imperative code looks nicer in cat. because cat is just composition, it looks sequential. it happens that all (most) imperative code i use is for communication. in scheme imperative code always seems ugly.. maybe it's because synchronisation is easier to imagine in a linear instruction flow: threads of execution joining together at certain points, breaking the linearity of composition?

Entry: joy
Date: Fri Aug 17 01:28:07 CEST 2007

added a joy interpreter.
Entry: interaction
Date: Fri Aug 17 10:45:30 CEST 2007

got a bit off track again.. time to fix interaction. first thing to do was to put the 'tinterpret' and 'tsim' code in prj.ss together with the supporting code dip/s and ifte/s

so.. why is this so ugly? by default, quoted programs and run + ifte use a functional context to limit surprises. however, sometimes i want to do things like:

  (tsim (prj: dup tfind not) dip/s
        (prj: tinterpret) ifte/s)

the xxx/s words are the analogs of xxx but pass stack + state to the programs, and 'prj:' compiles state words. is there a way to do this automatically? probably not using my current setup, unless i make 'run' understand state words, which means they should be type tagged. since that only takes away the /s notation, i'm not going to do this. so the convention: functionals do NOT pass state to quoted programs, while the corresponding /s words DO

but... if one uses types to do this automatically, the core 'apply' routine should be made aware of state, and rep.ss should implement some kind of tagging for state words.. what would be the real problem?

Entry: Monads
Date: Fri Aug 17 12:10:45 CEST 2007

i don't know much about type theory, but i think i understand how my ad-hoc approach relates to monads, using the unit-map-join formulation.

  X is the state type
  S is the stack type
  ( . ) is cons

  unit :: S -> (X . S)
  map  :: (S -> S) -> ((X . S) -> (X . S))
  join :: (X . (X . S)) -> (X . S)

so 'unit' introduces a new state object on the data stack. 'map' will create a function that does what it did before, but ignores the X part, and 'join' will accumulate one piece of state into another. the first two are trivial, and i use them fairly explicitly. but the last one seems to be hidden a bit deeper, because i never use it explicitly: every state dictionary has a couple of words that bring stuff into the monad, but they have type:

  A is an assembly opcode

  asm :: (X . (A . S)) -> (X . S)

here 'A' is not the same as 'X', but in spirit it does the same flattening operation. looks like i'm missing some of the fun. clearly the 3 law formulation has some benefit due to a higher level of abstraction, but what would it bring me to make this a bit more explicit?

first of all, i need a proper type system. the monad objects should be somehow tagged. that way 'unit' and 'join' can be made polymorphic. 'map' should not be polymorphic, given i implement monads as 'things on the top of the stack'.

  ;; map
  (lift (dip) curry)

the other two are problematic. 'join' is possible to do, since monad types could be tagged, so it _could_ be made polymorphic. but 'return' / 'unit'.. such polymorphism won't work because i can't infer the type! i.e. 'return' is normally plugged into some expression that expects a monad type. i have no way of determining something like that, so 'return' should have explicit annotation, probably best using just a different name. for example, the assembly 'return' for a single opcode would be

  (asm:return '() cons)  ;; wrap the single opcode in a list
  (asm:join append)      ;; concatenate the 2 state lists

what i do is to just combine those 2 operations into one that conses a single opcode onto the assembly list. i think i sort of get the gist of it.. or not?

so, the other formulation uses bind:

  (X . S) -> (S -> (X . S)) -> (X . S)

so bind is like 'join' in that it combines a monad data type with a function that maps from outside the monad to inside, and returns a monad type.
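to make that last formulation concrete for myself, a minimal sketch in scheme (the names and the choice of representing X as a list that joins by append are my assumptions here, not code from rep.ss):

  ;; a monadic value is a pair (X . S): state X consed onto the stack S.
  (define (stack-unit s)             ;; unit :: S -> (X . S)
    (cons '() s))                    ;; push a fresh empty state
  (define (stack-bind Xs fn)         ;; bind :: (X . S) -> (S -> (X . S)) -> (X . S)
    (let* ((X  (car Xs))
           (Ys (fn (cdr Xs))))       ;; run fn on the bare stack
      (cons (append (car Ys) X)      ;; join: accumulate the two states
            (cdr Ys))))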
note: this works because i have only one type (a stack: each function maps stack -> stack). so:

* in the general case, the source and destination monads for the 'bind' operator do not need to be the same, but in my approach they are, since there is only one type that can be "monadified"

* i do not have the concept of a type constructor (types do not have an abstract representation), so i can leave that out.

so, a stupid question maybe: how do you get stuff OUT of a monad? i think there's something i didn't get. the type signature of bind is:

  M t -> (t -> M u) -> M u

so i guess if M u == t, bind can get things out of a monad. in general, it can get t out of M t (multiple times!), apply the function (t -> M u) (multiple times), combine 'stuff' from M t with the (multiple) M u, and return an M u.

conclusion: not having 'real types' makes all this a bit difficult to formulate.. it might be a nice exercise to try to do it anyway. a nice base for some more reading on the subject, maybe "Monadic Programming in Scheme" http://okmij.org/ftp/Scheme/monad-in-Scheme.html which talks about the case where there's a single monad, or where types of different monads do not get mixed.

Entry: source annotation
Date: Fri Aug 17 16:45:25 CEST 2007

really.. does it make sense to NOT have the source annotation be formal, if with a little more effort it can be? It's sort of formal now.. Things that are not uncompilable have #f semantics; the others are created straight from the named macro, so they should be right, or by composition from such, so they should be right because all code is syntactically concatenable.

It sort of strikes me as odd that i can't have 'curry' or 'lift' defined in a generic way, because quotation of data is not standard. I could try to force it. Anyway, for 'lift' i only need base semantics/syntax. Wait, lift is possible if semantics is defined, but it requires that quoted programs are always available. (even in forth macros!)

Entry: and so on..
Date: Sat Aug 18 01:48:03 CEST 2007

time to get back to pic programming... i didn't really anticipate this static change and the move to brood 4, but things really are better this way. once the pic part is back online, it's time to look at interaction macros, or how to create interactive meta functions. so, timeline:

- interaction macros
- the standard 16bit forth (requires interrupt driven serial I/O and an on-chip lexer + dictionary)
- write something about compile/runtime and the different ways to fake the single machine experience.

Entry: state
Date: Sat Aug 18 14:34:18 CEST 2007

got interaction working. i changed it so the commands available in the interaction mode need to be specified explicitly. this has to be done for commands that take arguments anyway, so why make an exception for 0cmd? see interactive.ss

so, i've been doing the snot-run thing, which works quite well. it's a joy that state is stored elsewhere, so my functional core can just be reloaded. however, there are a few spots where i'm still using state.. one is IO. since it's non-functional anyway, storing the name of the serial port couldn't hurt, right? wrong.. on restart, it needs to be reset. i made the 'boot' word which loads all the macros from source. this is slow, i guess because of the constants? so maybe i should just put the constants back as a scheme file..

Entry: KAT and TAK
Date: Sun Aug 19 14:05:15 CEST 2007

I'm looking for a better way to explain the pattern matcher. Usually generalizing helps.
The reason why it seems special is that it is only used with a "macro pattern" and a "quasiquoted scheme template".

Entry: no phase separation
Date: Sun Aug 19 15:16:21 CEST 2007

Now that i finally understand the point of phase separation, i wonder if i can do something similar with the forth? Maybe it's not necessary for small projects, but it does feel a bit weird to first struggle to write scheme code that obeys mzscheme's phase separation rules, to see that it's a good thing, and then to go back to some non-separated way.

I see a roadmap on how to do this: just turn everything into scheme syntax. The result after loading is a single function that generates the program when evaluated. That way i know i'm going to get there. On the other hand, i do not know what to give up then. My whole design needs to change.

Another way is to do it incrementally. First make sure i can separate code into macro definitions and the rest. For just macro definitions this is not so difficult. However, constants are a different story, since they require compile time computation.. I guess that's where the problem is:

Entry: constant
Date: Sun Aug 19 15:27:10 CEST 2007

What are constants? A phase separation violation! In contrast to normal macros, which obey separation because they do not use any values created at compile time, macros generated by 'constant' join 2 phases. The 'constant' word could be termed a "phase fold". The compiler after 'constant' is not the same one as before: it is extended with a macro. This kind of behaviour prevents modularization of code, because it is not clear what the definition of the new macro depends on; the only thing that can be assumed is that it depends on all the previous code, and that all the following code depends on the new macro.

The solution is that this behaviour needs to be unrolled: instead of updating the compiler on the fly, an extension phase (where macros are defined) needs to precede a compilation phase (where macros are executed). There is a general way to unroll 'constant': split the code in 3 parts: the part before, the definition of the new macro, and the code after. This is rather cumbersome and entirely unnecessary.. However, in the case of Purrr18 it is usually possible to transform the code into a macro definition. Instead of writing

  1 1 + constant twee

one could write

  macro
  : twee 1 1 + ;
  forth

This enables the macro definition to be distinguished from the rest of the code, to clarify the dependencies of a file's plain code on the macros defined in that file. The only reason not to do it the second way is that it loses the name 'twee' in the eventual assembly code.

Removing 'constant' could lead to better transparency in the code: compiled macros could then be seen as 'only cache'. Note that i would do this just for more transparency, not to eliminate undefined symbols: macro name binding is still late.

Entry: phase separation
Date: Sun Aug 19 16:34:26 CEST 2007

So, a forth file contains both macros (M) and forth (F) code. The forth code always depends (->) on the macros (M -> F). If a forth file depends on another forth file, the macros of the former depend on the macros of the latter, and the forth code depends on both the macros and the forth code of the latter. Due to transitivity, the arrows from M -> F between files can be omitted, so one gets something like

  Ma -> Fa
  |     |
  v     v
  Mb -> Fb

where the arrow from Ma -> Fb is left out.
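spelled out as mzscheme modules, the picture would be something like this (hypothetical file names, just to show the requires):

  ;; ma.ss : macros of file a
  (module ma mzscheme (provide twee) (define twee 2))
  ;; fa.ss : a's forth code depends on a's macros
  (module fa mzscheme (require "ma.ss") (provide fa-code)
    (define fa-code (list 'lit twee)))
  ;; mb.ss : b's macros depend on a's macros
  (module mb mzscheme (require "ma.ss") (provide drie)
    (define drie (+ twee 1)))
  ;; fb.ss : b's code depends on b's macros and a's code,
  ;; but never needs ma.ss directly: that's the omitted arrow.
  (module fb mzscheme (require "mb.ss" "fa.ss")
    (define fb-code (list drie fa-code)))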
What this would buy me is a solution to the problem of keeping the macros consistent with the state of a target: target state is a consequence of compiling all the Forth code in a project. However, as a side effect, a project defines macros that are used to generate this code in the first place. There needs to be a clean way to 'reload' these macros from the source code, so we can connect to a target with the macros instantiated.

I'm trying to see how to make this more rigorous: how to make incremental compilation work without having to manage dependencies yourself? Basically, how to map the nice module system of mzscheme to incremental Forth development. This is clearly not for now. It requires a lot of change. One of them would be management of storage on the controller: if dependencies of separately compilable modules are fully managed, incremental uploads are still possible, and become 'transparent'. I.e. changing a module but not changing its dependencies makes it still possible to update a system on the fly, but in a transparent way. I'm still quite happy with the ad-hoc hacked up way of incremental development. But knowing this is possible might make the itch a bit stronger.

Entry: dynamic updates and functional programming
Date: Sun Aug 19 16:53:53 CEST 2007

I guess most of this train of thought started after i got to using sandboxes with SNOT. Currently it works +- like this: SNOT (the bootloader)

* manages memory: stores project state in a single toplevel variable
* manages a purely functional sandbox
* implements a REPL

outside of the system, the edit-compile cycle runs: changes are made to the collection of functions that act on the state, and a compiler recompiles those that have changed. then 'restarting' the system is almost instantaneous: the state remains, only the operations on the system change. the requirement for this is of course that all state is stored in a fairly concrete way: representation must not change from one version of the system to the next. if representation changes, a small 'converter' could be made..

What i'd like is something like a smalltalk environment, but for scheme. A lisp environment with incremental loading comes close, but transparency is necessary. Smalltalk solves this by being completely dynamic: compilation is just cache, and code can be edited on the fly. There is no 'off', it's always running. MzScheme solves this by being static, but with well-managed dependencies: separate compilation to make 'restarting' cheap. There is an 'off', but it can be made small. Using the approach above, managing ALL state separately renders a virtually always-on system. The off period can approximate zero, since it's just "swapping a root pointer" once the code is compiled. Compilation can take longer if changes to core modules are made, but there remains a 1-1 correspondence between the system and the source code. I guess it's possible and not even too difficult to delay compilation in the scheme case, making compilation behave more like a cache.

Entry: purification
Date: Sun Aug 19 17:17:13 CEST 2007

So i need to eliminate state. There are 2 cases where i've introduced state because i thought it "wouldn't hurt"..

* the target IO port
* the project search path

The rest really behaves just as cache. So if i'm allowed to be really anal about eradication, these things need to change. The project search path is the easiest. Target I/O is more difficult because it requires moving from a functional to a monadic implementation.
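roughly what that move would look like (a sketch with my own names, not the current tethered.ss code):

  ;; thread the port as part of the state that every tethered word
  ;; receives and returns, instead of reading a global variable.
  ;; word :: (state . stack) -> (state . stack), with state = the port.
  (define (target-send st+stack byte)
    (let ((port (car st+stack)))
      (write-byte byte port)        ;; the effect stays at the edge
      st+stack))                    ;; state is passed along unchanged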
Entry: eliminating global path variable
Date: Sun Aug 19 17:41:15 CEST 2007

to be able to eliminate the path state, i probably need dynamic variables (parameters). is this cheating? not really.. since i'm using with-output-file already, and that doesn't really feel like cheating. this would also solve the problem with IO of course.. still i'm not convinced it's not cheating.. one could say it's not cheating because the value has finite extent?

so why not implement monads as dynamic variables? because dynamic variables are not referentially transparent, which you would want when you 'run' a monad: it should act just on the state provided, not on something else... so why are parameters different then? are they less evil when they are constant? they represent 'context'.

* one thing is sure: they are less evil than global variables due to limited extent.
* if they are constants, they are less evil than when they are not.

To really answer the question is to implement dynamic variables with monads, and see how they are different. The problem i'm facing in my ad-hoc state hiding approach is that i can't combine monads: when i'm running something in the macro monad, i can't access anything else. To have access to the path, the monad should be bigger and include 'compilation context'.

The real solution is of course to make compilation independent of file system access. Source code needs a preprocessing step that expands all 'include' statements. Since it's only one keyword, this can be implemented in the forth-load function. That function already implements 'file system dereference', so why not include path searching? Ok, made it so. 'load' is now a load-time word, so file system access is concentrated in one point. 'path' is removed: this needs to be specified in the state file, because it really is a meta-command.

Entry: cleaning up interaction
Date: Sun Aug 19 19:07:52 CEST 2007

This is the biggest change. Probably best to separate it out into a different monad. The state associated with interaction is:

* I/O port
* target address dictionary
* assembly code

With this data it can start to assemble code and upload it to the target. But.. looking at the contents of the state file, there is not much else!

  (forth)        ;; might come in handy for interaction
  (file)         ;; in case we want to access the file system
  (config-bits)  ;; on the fly reprogramming? some day probably
  (consoles)     ;; this is the only real meta data

not necessary for interaction.. maybe it's not worth it to split interaction off of prj. maybe it's even just a bad idea: you'd want the 'fake console' to have power over the whole project, which is impossible without giving it all the state. let's just clean up tethered.ss and move functions out to badnop.ss

but... i'm using with-output already. so why not just have the i/o commands do the same? done. this immediately solves the problem of having more than one device attached, i.e. a distributed system with all identical devices.

Entry: side-effect free macros
Date: Sun Aug 19 20:00:19 CEST 2007

i was thinking: if macros are side-effect free, constants can be eliminated, because it's always possible to see if a macro is just a constant: execute it, and if the result is '((qw )) it is! the only thing you would need constants for is to 'uncompile'. another thing: what about making the partial evaluator reference macros if it can be guaranteed the macros perform only computations that can be completely reduced to values? i need to disentangle this a bit..
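a quick sketch of that test (hypothetical glue: 'run-macro' stands for however a macro gets applied to an empty assembly state; 'qw' is the literal-pushing pseudo opcode):

  (define (constant-macro? run-macro m)
    (let ((asm (run-macro m '())))   ;; run against an empty assembly state
      (and (pair? asm)
           (null? (cdr asm))         ;; exactly one instruction ...
           (eq? 'qw (caar asm)))))   ;; ... and it is a literal push (qw)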
Entry: no values
Date: Mon Aug 20 15:12:00 CEST 2007

i owe this to Joy: it's really good to have no "function value quoting", i.e. just (foo) instead of something like 'foo. this leaves ' free for quoting literals, and has the benefit of a simple abstraction syntax.

Entry: distributed programming
Date: Mon Aug 20 15:15:37 CEST 2007

The next hardware project is going to be krikit. It's going to be a distributed system of small devices.

Entry: done
Date: Mon Aug 20 16:27:08 CEST 2007

yes, i guess so.. no pressing changes ahead, except for the macro/code separation, side-effect free macros, and maybe dependencies.. which is a biggie. another thing is interaction macros. so the todo looks like:

- move the words "constant, macro/forth, variable" and the 2-variants to a preprocessor stage which can separate code into macros and forth code.
- add interaction macros

Entry: brood.tex
Date: Mon Aug 20 23:03:09 CEST 2007

i'm starting an explanation of macro embedding with a purely functional approach. while i'm on the right way with my notion of compilable, the effectful part is less obvious. the idea is this: [ a 1 2 + b ] can be simplified to [ a 3 b ] if a and b are effects. somehow i'm missing something important. maybe the situation is symmetric? instead of having a language and a metalanguage which both share some evaluation domain, they also have functions that act on their full domain only.

i think i sort of got the duality now: the target depends on run time state which is not representable, meaning only pure functions can be evaluated.

... my explanation is not completely sound.. when i'm talking about the target and host language, i never make the explicit conversion. there's something wrong there. almost right, but not quite. a compilable macro is something which can be 'unpostponed'. meaning, it is a function that all by itself produces a program that can be evaluated on the target.

... another thing is that the macros, in the way i implement them, are not the macros i'm describing in the paper. my macros are EAGER: they are a combination of the partial evaluation strategy AND their original meaning. the macros in the paper (at least the partial evaluation strategy) are monadic. for compilable macros this makes no difference, but for other algorithms, order does matter.

Entry: monoids and stacks
Date: Wed Aug 22 16:09:11 CEST 2007

something which has been tickling me for a while because i don't have it formalized in my head: functional programming with stacks.. how does this work, really? what's the relationship between state and stack?

so, compositional programming languages use compositions like [fg] to express programs. all functions are unary. that's nice for giving some framework about evaluation order (it being arbitrary, if there's a representation of composed functions). so: functional compositional languages make it easy to talk about partial evaluation: it's just the associativity law. whether this is of any practical use depends on whether we can partially evaluate FUNCTIONS to something simpler. so let's start with inserting that thought in the paper..

then the other one is about locality of action. the fact that a language is compositional doesn't really do much about this. you need a way to ensure separation. this is where stacks come in. but this is more about continuations than about being able to perform partial evaluation..
really, the only thing i need to know is

* POSSIBLE: that [1 +] is equivalent to [1] followed by [+]
* ECONOMIC: that the representation is actually simpler

that's the end of the story. the fact that the thing uses stacks is relevant to proving that [1 +] is equivalent.

i need to clean up notation.. i'm using two different notations for application: one rpn, and one pn. let's stick to pn, because i use functions somewhere else, and reserve rpn only for compositions.

... there's another thing that's really wrong in my explanation. something i noticed yesterday already... macros are about the IMPLEMENTATION of partial evaluation. i really have only a single language! that's what it feels like when programming, too. so i think i can plow over my whole text again... frustrating, but i'll get there eventually.

maybe this is why i like programming so much. making sense is only defined from the point of works/notworks. math is too free for me.. i am not strict enough.

ok. the plan: get rid of the notion of 'macro' and introduce it only later. keep everything abstract, just show a way to translate forth into a functional language operating on state + metastate. looks like i'm getting somewhere.. and this is going to turn up some conceptual bugs. looks like i needed to spend this time plowing through misconceptions..

again, this is wrong.. ARGH! the compiler is not a map. it's a syntactic transformation. what i call a compiler now is just the property 'compilable'. so the compiler is something that proves a program is compilable! ok, i've got it sort of explained now. so this composite function thing is about semantics, which leaves more room to talk about the implementation of the proof constructor (compiler).

just added a note about function definitions. creating new names is either something which happens outside of a program, or has to use side effects. currently it's the latter, but i'd like to move to the former.

Entry: real compositional language?
Date: Wed Aug 22 22:37:47 CEST 2007

actually, the step to a real compositional language is not so big any more. just adding the parsing words '[' and ']' for program quotation, and possibly an optimization for ifte -> if else then conversion, should do it. all other constructs can then be translated into higher order functions.

Entry: phase separation
Date: Wed Aug 22 22:44:48 CEST 2007
Name: phase_separation

i guess now that base.tex seems to be about bull-free, the next step is phase separation for forth files. basically this means:

1. collect all names and macros. this includes constants, variables, AND the macros used for compiling function calls.
2. compile the code.

so.. it should in principle be possible to have proper semantic separation of names before a source file is compiled. currently, words have a default semantics (target word). however, i could catch undefined names if i catch all occurrences of ':', and register a macro for each of them that will compile a procedure call. that way i can remove problems with macro/code confusion...

so, point 2: a name always maps to a macro explicitly. otherwise it is not defined. no more default semantics. the macro might choose to compile a call instruction using a symbolic reference.
this means the language becomes a bit less flexible: ':', '(2)variable' and '(2)constant' are no longer accessible from forth, and become preprocessor directives that change the code into a form like:

  (macros
    (a 1 2 3 +)
    (b 5 -))
  (constants
    (c 1)
    (d 2 5))
  (tape
    ((broem) a bla (lalala) bla broem))

where the tape is the layout of code memory with labeled entry points. this structure is there to preserve multiple entry points (fallthrough) and multiple exit points.

if macros are side effect free, constants can be eliminated. they are simply macros that evaluate to a literal sequence, if they evaluate at all. i can even keep the current context for 'constant': suppose a forth file starts with the code

  1 2 + constant broem

then the loose code "1 2 +" can be interpreted as a macro. the consequence is of course that it's not possible to define constants after the first function.

hmm.. i do need constants if i want constants in the assembly. because to get them there, every constant needs to have a macro associated with it that will compile the constant value.. so let's leave them in, but employ the mechanism above to give them macro semantics. maybe a constant is a macro that evaluates to a literal, so the actual macro code can be stored somewhere else? maybe the more important thing is to unify compile-time constant evaluation with macro execution? not really.. ai ai.. time to go to bed..

Entry: set & predicate
Date: Wed Aug 22 23:56:09 CEST 2007

it never occurred to me before, but a set is indistinguishable from a predicate function. operations on sets are then

  (define (union a b)
    (lambda (x) (or (a x) (b x))))
  (define (intersection a b)
    (lambda (x) (and (a x) (b x))))

a thing you can't do here is iterate over the elements.

Entry: a day in bruges
Date: Fri Aug 24 10:32:26 CEST 2007

tourist in my own country.. anyway, i made some notes:

* partial evaluation/optimization: replacing a composition [fg] by a specialized function is always possible in a compositional language. the reason why it doesn't work for me is mainly 'hidden quotation'. for example the sequence "1 THEN +" contains a jump target, which is not purely compositional. solution: only pure quotations; all branching should be optimization. Forth is too dirty, i need a syntax preprocessor. is there a way to have "[ 1 + ] [ 1 - ] ifte" as the base form, and translate it into "if 1 + else 1 - then"? should i move all macros that break compositionality to a different level?

* terminology/concept cleanup: define compilability in terms of the existence of a retraction.

* proper credit:
  MOORE: required tail recursion, multiple entry (fallthrough) and exit points.
  VON THUN: program quotation + combinators, program = function composition, and constants are functions; monadic extensions: top of stack is hidden.
  DIGGINS: typed view + things you can't do (whole stack ops kill stack threading)
  FLATT: separate compilation + phase separation

* semantics of jumps? they get in the way of the FCL formulation. a jump could be a non-terminating evaluation? is there a way to make this sound?

* closures versus quoted programs: note that quoted programs are not closures, since they are not _specialized_. for closures you really need dynamic behaviour: at run time, some values need to be fixed. something that could emulate closures is the consing of an anonymous function with a state atom. this operation is called 'curry' in kat. it could be combined with a monadic state for more elaborate emulation of closures & objects.
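modeled naively (my sketch; the real kat code may differ): since literals in a quoted program just push themselves, consing a data atom onto a program is enough:

  ;; 'curry': fix one value of a quoted program by consing it on,
  ;; so running the result sees the atom as top of stack.
  (define (kat-curry atom program)
    (cons atom program))

  ;; (kat-curry 1 '(+)) => (1 +), a program that behaves like "add 1".

consing a whole state atom this way is what starts to look like a closure or an object.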
Entry: i hate it when this happens
Date: Fri Aug 24 13:31:09 CEST 2007

i have something in my head about the relationship between compositional stack languages, monads, virtual machines for the implementation of functional languages, the lambda calculus and combinators.. but i can't quite express it due to lack of literacy.. argh.. once again i've heard the bell ring but can't tell where the clapper hangs.. so:

1. compositional language -> put partial evaluation and meta programming in a simple framework, independent of the set!
2. elaborate on the set's substructure.

-- so, 1. gives a framework on how to build a compiler. but without stacks, composition isn't really useful. so the stacks are needed as a tool to create general functions that can be applied in several concrete settings. so these functions need to somehow be independent of SOMETHING. that something is the way in which run time data is organized. need to find a better explanation..

something that hit me just now: a computation in a stack language always involves saving some state, and recombining it later.. there are 2 ways this happens:

* most functions leave the bottom of the stack intact
* 'DIP' leaves a part of the top of the stack intact

this is probably related to normal order and applicative order reduction.

-- another problem.. why is it so hard to get this formulated correctly? in my exposition about parsing words, i cannot really use "variable abc" as a good example, because it really is not compositional code; it needs to be disentangled.. the conclusion is right though: in order to disentangle this system, it is necessary to remove some reflection, to 'unroll' the dependencies. and the picture is really about dependencies. functional programming is more about getting your graph free of cycles than anything else.. maybe that's the reason for stressing the Y combinator: how to introduce cycles, but not really. there's another example in dan friedman's book, essentials of programming languages. i can't find it now, but somewhere around implementing an environment there is a need for a circular reference, and he uses a trick to not have to do this.. maybe it's about how to make things static. to keep them from moving so they can be looked at peacefully and quietly :))

-- basically:

- stack = environment (de Bruijn index)

Entry: so.. what's the most important thing now?
Date: Fri Aug 24 16:06:47 CEST 2007

a lot of ideas still need some fermenting. but there's one that's quite clear: names cannot be created dynamically, because that kills the representation as a declarative language. so i need a preprocessing step that takes out all creation of new names. this makes some things problematic. one of them is multiple exit/entry points. multiple entry points can be translated:

  : foo a b c
  : bar d e f ;

->

  : foo a b c bar ;
  : bar d e f ;

then at the point where '(label bar)' is assembled, the jump to bar can be eliminated. multiple exit points need to be translated to an else clause:

  : foo if a b c ; then d e f ;

->

  : foo if a b c else d e f then ;

so it looks like it's not just names, but also 'implicit names' or labels.

Entry: environment and stack
Date: Sat Aug 25 09:14:26 CEST 2007

let's elaborate on this a bit more. the stack can be seen as related to an environment, which is a way to implement substitution in lambda expressions. to simplify, suppose we have only unary lambdas:
  (lambda (a)
    (lambda (b)
      ((+ a) b)))

this can be rewritten using de Bruijn indices (starting from 0) as

  (lambda (lambda ((+ 1) 0)))

where the numbers refer to an index into the environment array. this gives an easy way to represent a closure as a (compiled) lambda expression plus an environment. maybe the missing ingredient in my understanding is the SK calculus?

Entry: paper again..
Date: Sat Aug 25 10:42:23 CEST 2007

in fact, i need to distinguish between syntax and semantics a bit better. a compiler works on syntax (a representation). von thun has some text about this.. again, i'm amazed by how untyped you can be in scheme! i'm just performing operations on lists, without ever having to clarify what things are.. interpretation is a consequence of what functions you apply to the symbols.. so, let's say that "working with symbols" is always untyped. symbols are a universal tool of delayed semantics. maybe that's the idea behind formal logic, right? by just specifying HOW to operate on symbols, you never need to explain what you are actually doing.

Quite an adventure, trying to provide a model for the language and compiler.

* read Flatt's paper about macros again
* logic and lambda calculus.
* monads and their relationship with compositional programs.
* a purrr module system + compositional language

http://zhurnal.net/ww/zw?StokesTheorem

Funny. I have that book on my shelf, and i tried to start reading it on thursday. I guess it has a major truth. Once the necessary structure is in place, the conclusions are often trivial. So all the effort is in the creation of structure. Sounds like programming. Try "Once things are clearly defined, the solution is at most a single line.", "Write the language, and formulate your solution in it.", "Ask the right question."

Entry: fully declarative and compositional
Date: Sat Aug 25 11:45:01 CEST 2007

declarative: all names defined in a source file are to be known before the body of the code is compiled. that way, a program is a collection of definitions.

compositional: make all branching constructs fit the compositional view by using combinators only.

both are largely independent, but should lead to a better representation. advantages:

D - side-effect free macros
  - detection of undefined words
  - possibility of modularization (later)
C - correct optimizations in the light of branching

let's learn a lesson from the past.. i can't afford to break it again. the changes that need to be made can be made without changing the semantics so much that a radical rewrite of forth code is necessary. all constructs used at this moment need to be preserved. is there an incremental path? the following syntactic transformations are necessary:

1. constant -> macro
2. variable -> macro
3. word definition -> macro
4. split a file into macro + code

Entry: monads in Joy
Date: Sat Aug 25 13:56:47 CEST 2007

http://permalink.gmane.org/gmane.comp.lang.concatenative/1506
http://citeseer.ist.psu.edu/wadler92essence.html

so that's what i've got to do today. after reading manfred's comments, i think i need to read more of his work before i attempt to re--invent his ideas. the paper by wadler gives some relation between monads and cps. it might contain what i need to explain the relation between monads and stacks, probably reaching the conclusion that stacks are monads. let's see if i can learn something from this. for each monad, provide bind and unit. one complication is that functions in cat return a stack. let's see if that makes things worse.
  unit: x -- M x
  bind: M fn -- N

bind extracts values from the monad, applies fn to each of them, and constructs a new monad from the output. it's easier to use 'join', since 'map' is so trivial. wait, is this really the case? map is

  (a -> b) -> (M a -> M b)

from http://en.wikipedia.org/wiki/Monads_in_functional_programming

  (map f) m ≡ m >>= (\x -> return (f x))
  join m    ≡ m >>= (\x -> x)
  m >>= f   ≡ join ((map f) m)

this is a little different than what i've been talking about before.. maybe it's best i try to formulate this in scheme first. See brood/mtest.ss

-- the misconceptions:

  M a -> (a -> M b) -> M b

does not mean the monads are different! it merely means: unpack, process, repack. so what is 'map'? map really is map! see the next entry for more interesting stuff about monads in scheme..

Entry: monads in scheme
Date: Sat Aug 25 16:18:59 CEST 2007

  ;; Monads in scheme.
  (module mtest mzscheme

    ;; Monads are characterized by
    ;; - a type constructor M
    ;; - unit :: a -> M a
    ;; - bind :: M a -> (a -> M b) -> M b

    ;; In words: something that creates the type (ad-hoc in scheme),
    ;; something that puts a value into a monad (unit) and something
    ;; that takes values out of a monad, applies them to generate
    ;; several instances of the monad, and combines them into one.

    ;; Let's create some monads in scheme, using ad-hoc typing:
    ;; representation is not abstract, and there is no type check.
    ;; Start with the list monad.

    (define (unit-list a) (list a))
    (define (bind-list Ma a->Mb)
      (apply append (map a->Mb Ma)))

    ;; Using monads, functions need to be put into monadic form. Simply
    ;; wrapping them with 'unit' is usually enough.
    ;; (bind-list '(1 2 3) (lambda (x) (unit-list (+ x 1))))

    ;; So what is 'map' for the list monad? Haha! It's map!
    (define (map-list a->b)
      (lambda (l) (map a->b l)))
    )

so now introduce polymorphism. instead of storing stuff in a hash, it's easier to just store a pointer to the monad structure in the record for a certain monad, i.e. use single dispatch OO. so i've got a polymorphic bind, and a fairly decent interface that abstracts away the polymorphism, so 'unit' and 'bind' for each monad can operate on the representation only.

  (define-monad Mlist
    (lambda (a) (list a))
    (lambda (Ma a->Mb) (apply append (map a->Mb Ma))))

so.. what can i do with this? maybe best to try to translate some examples from wadler's paper into this mechanism, or to define a 'do' macro. the Haskell code for the list monad

  do {x <- [1..n]; return (2*x)}

is a bit too mysterious.. let's try something simpler: the maybe monad. wait, all my functions are unary.. damn. how to take multiple values into a monad? can't really do that.. will need explicit currying. this uses letM* from http://okmij.org/ftp/Scheme/monad-in-Scheme.html

  (define-macro letM
    (lambda (binding expr)
      (apply (lambda (name-val)
               (apply (lambda (name initializer)
                        `(>>= ,initializer (lambda (,name) ,expr)))
                      name-val))
             binding)))

so i transform this to my code.. try this:

  a = do x <- [3..4]
         [1..2]
         return (x, 42)

  a = [3..4] >>= (\x -> [1..2] >>= (\y -> return (x, 42)))

now

  (define-syntax letM*
    (syntax-rules ()
      ((_ () expr) expr)
      ((_ ((n Mv) bindings ...) expr)
       (bind Mv (lambda (n) (letM* (bindings ...) expr))))))

leads to this:

  (letM* ((a (Mlist '(1 2 3)))
          (b (Mlist '(10 20 30))))
    (unit Mlist (+ a b)))

  #(struct:monad-instance
    #(struct:monad Mlist # # #)
    (11 21 31 12 22 32 13 23 33))

wicked.
the macro expansion gives

  (bind (Mlist '(1 2 3))
        (lambda (a)
          (bind (Mlist '(10 20 30))
                (lambda (b)
                  (unit Mlist (+ a b))))))

let's see if the type of 'return' can be inferred in a structure like this. no. the return type of the entire expression is determined by the return in the letM* block. this type is arbitrary and only determined by the context of the expression, to which we have no access in scheme. one possibility to fake this is using a dynamic variable.

so, i guess it makes more sense to switch to map and join as basic operations? no. i ran into a problem with double wrapping of structures that requires the 'join' operation to be aware of the wrapping. so i'm going to revert the changes.

next exercise: the state monad. i never really understood this. a state monad contains a function that will return a value and a new state.

  -- "return" produces the given value without changing the state.
  return x = \s -> (x, s)

  -- "bind" modifies transformer m so that it applies f to its result.
  m >>= f = \r -> let (x, s) = m r in (f x) s

EDIT: see monad.ss

Entry: kat monads
Date: Sat Aug 25 21:35:12 CEST 2007

the problem seems to be that 'return' and 'bind' need to be formulated in a way that properly deals with the stack. somehow it seems to get in the way. let's take a new look at it, modeling things on 'map'.

  fmap   s.a->s.b  Ma -- Mb
  join   MMa -- Ma
  return a -- Ma
  bind   s.a->s.Mb Ma -- Mb    (bind = fmap join)

the thing which bothers me is 'map'. something is smelly about map in joy, because of the stack "doing nothing". it's strange that 'for-each' feels really natural, because it has threaded state. but map somehow feels wrong..

Entry: for-each is left fold
Date: Sat Aug 25 21:45:30 CEST 2007

for-each is foldl, which is sort of 'universal iteration'.

  '() '(1 2 3) (swons) for-each  ==  '(3 2 1)

foldr is more like 'universal recursion', and i don't have a direct analog in kat. maybe i should create one like this:

  '() '(1 2 3) (cons) foldr  ==  '(1 2 3)

Entry: state monad
Date: Sun Aug 26 13:08:41 CEST 2007

a state monad is a nice example of a computation. nothing 'happens' as long as the monad is not executed explicitly by applying the value to some initial state. i think this is a nice starting point to formalize what i'm doing, since it's about the same principle: build a composition that represents the compilation, and execute it on an initial state. so, really, monads are a way to formulate any computation as a function composition. doesn't that sound familiar? the things to find out are:

* how my very specialized way of state passing fits in the general monad picture.
* why does 'map' feel so strange in Joy/KAT ?
* what is a continuation in KAT?

the last one i can answer, i think. it's a function that takes a stack, and represents the rest of the computation. so the continuation of 'b' in [abcd] is just [cd]. i've added call/cc to base.ss

let's re-read von thun's comments

Entry: closures & stacks
Date: Mon Aug 27 14:49:10 CEST 2007

something to think about: a compositional language can have first class functions without having first class closures, and without this leading to any kind of inconsistency. like 'downward closures only'. this brings me back to linear vs. non--linear. a key observation is that linear data structures are allowed to refer to non--linear ones, as long as the non--linear collector can traverse the linear data tree (an acyclic graph in the case we work with reference counts as an optimization). but non--linear structures are NOT allowed to refer to linear structures
(because otherwise they would not be able to be managed by the linear collector). this makes the non--linear collector trivially pre--emptable by linear programs. PROVE THIS!

Entry: linear memory management
Date: Mon Aug 27 17:57:00 CEST 2007

something to think about is how to embed a linear language in scheme as a model. as long as its primitives never CONS, this should work. i'm trying to formulate a machine that can express the memory management part of a linear language, if it is given a set of primitive functions. see linear.ss

this is an attempt to make poke.ss work, but from a higher level of abstraction. something i did wrong on the first attempt was to change the tree structure WHILE still using the old addressing mode. permutation of register contents needs reference addressing, so my macros are wrong. this means i need a different representation of REFERENCES. let's say a reference is:

- a pointer to a cons cell
- #t for CAR and #f for CDR

funny. i'm running into a problem numbering binary trees. the most visually pleasing numbering is breadth first:

  1
  2 3
  4 5 6 7
  8 9 10 11 12 13 14 15

this corresponds to the binary encoding 1abcd... where a is the first choice, b the second, etc... the one i chose intuitively was 1...dcba, which is not so handy, but is more efficient to implement when the labeling doesn't really matter that much.

ok, i got it working. i have a tree permutation 'engine' which is accessed by numerical node addresses. now what does this buy me? a simple way to talk about embedding linear trees. in practice, some of the nodes are constant, and are better put in registers.

Entry: binary trees
Date: Mon Aug 27 21:51:58 CEST 2007

still not 100% correct.. i'm losing nodes. ok. i'm making a mess of it, but i think i can conclude the following:

1. it is possible to use a tree as the data universe
2. normal forth operations can be written as binary and ternary permutations on a tree
3. such a tree is conveniently addressed numerically

what i'm about to do is:

- create an embedding of normal forth operations in a single tree, by:
  * fixing the positions of the stacks
  * associating each operation to a permutation
- find a way to efficiently generate code for these operations, with the possibility of mapping some fixed nodes to registers.

AHA! one pitfall i knew about, and i ran right into it. there's one operation which is not allowed: if R points to a cons cell, it is not allowed to swap the contents of R with the CAR or CDR of that cell, because this creates a circular link, effectively losing the cell. more generally, it is not allowed to exchange R1 and R2 if they are in the same subtree. baker's machine contains no operations that can lead to such permutations: it only talks about exchanging the contents of registers with cons cells. this is different.

i'm trying to write the permutation for '>r', written as (D . R). the following sequence of permutations is legal:

  ((d . D) . R) -> ((d . R) . D) -> (D . (d . R))

which is (5 3) followed by (2 3). can this be written as a single cycle (2 3 5)? one would say yes.. so i guess i had a bug? since it created a circular ref in my previous implementation. now i can get (2 3 5) to work, but (5 3 2) doesn't! i think i don't understand something essential here.. this is getting interesting!

i think i see the problem now. one is that my permutations are inverted, and two is that (2 3 5) is not legal, but (5 3 2) is. how to distinguish legal from illegal permutations?
and the inverse of (5 3 2) is not (2 3 5) but (2 3 7). it looks like this encoding of the nodes is not very useful for tree permutations.

Entry: legal permutations
Date: Mon Aug 27 23:39:23 CEST 2007

it looks like a more interesting approach is to start with operations that are legal and invertible, and find their closure. the difference with baker's machine is that i'm trying to use only one root.

hmm.. there has to be a way to see if a permutation is legal.. why is (2 3 5) not legal? because 5 gets the value of 2, which points to 5. so a condition is that a register x cannot receive the contents of a register y if x is in a subtree of y. in (5 3 2) no such assignment happens:

- 5 is not a subtree of 3
- 3 is not a subtree of 2
- 2 is not a subtree of 5

'subtree of' can be computed by comparing box addresses:

  [1]
  [2|3]
  [4|5] [6|7]

  [1]
  [10|11]
  [100|101] [110|111]

a is a subtree of b if b matches the head of a. this way, no circular refs can be introduced. instead of thinking about cons cells, think of binary trees. it indeed does not make sense to swap nodes if one node is a subtree of another node.

what about enumerating all legal binary permutations on an infinite binary tree?

  ()  identity
  (2 3)
  (2 6) (2 7) (3 4) (3 5) (4 6) (4 7) (5 6) (5 7)
  (2 12) (2 13) (2 14) (2 15) (3 8) (3 9) (3 10) (3 11)
  (4 12) (4 13) (4 14) (4 15) (5 12) (5 13) (5 14) (5 15)
  (6 8) (6 9) (6 10) (6 11) ...

back from tree rotations, which are not general enough... in binary:

  ()
  (10 11)
  (10 110,111) (11 100,101) (100,101 110,111)

back to numbers, per level (bits):

  1: /
  2: (2 3)
  3: (2 6,7) (3 4,5) (4 5,6,7) (5 6,7) (6 7)
  4: (2 12,13,14,15) (3 8,9,10,11) (5 8,9,12,13,14,15) ...

it's quite hard to specify without exclusion statements.. but i guess i got what i was looking for: limited to only binary permutations, the legal ones are easy to characterize.

what about using multiple coordinates, and then embedding them in a numeration? it is always possible to encode an n-tuple of natural numbers as a single one by interleaving the bits. a legal binary permutation of node A and node B (A < B) can be written as the tuple (A - 2, s, d) where s denotes the same level trees and d the depth from it. this is really clumsy and doesn't work..

it looks like what i am looking for is a primitive dup and drop. the reality is, these are not primitive!
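before moving on, the 'subtree of' test from above in scheme (a helper of my own, using the 1abc... numeric addresses):

  ;; node a lies in the subtree rooted at b iff the bits of b are a
  ;; prefix of the bits of a: strip bits off a until it reaches b's level.
  (define (subtree-of? a b)
    (cond ((< a b) #f)               ;; a sits above b's level
          ((= a b) #t)
          (else (subtree-of? (quotient a 2) b))))

  ;; (subtree-of? 5 2) => #t : exactly why (2 3 5) is illegal, since 5
  ;; would receive the contents of its own ancestor 2.
  ;; (subtree-of? 5 3) => #f   (subtree-of? 11 5) => #t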
Entry: tree rotations
Date: Tue Aug 28 00:03:47 CEST 2007

can i work with just tree rotations? yes. moving an element from one stack to another is a tree rotation. the essence of a tree rotation is:

- reversal of P -> Q to Q -> P
- movement of one of Q's subtrees to P

so a rotation is parameterized by 2 adjacent nodes P -> Q, and the left... wait! it's not a rotation, since the subtree that moves is the one in between. it is a rotation if the stacks are encoded as

  ((D . d0) . (r0 . R))

then a rotation is simply

  (D . (d0 . (r0 . R)))

trees which represent associative operations have a value which is invariant under tree rotations. is this helpful at all, or am i moving away from my point? with 2 stacks, a data stack and a free stack, motion can be implemented by rotations. this is not general enough.. i have no need to preserve ordering.

Entry: different primitives
Date: Tue Aug 28 00:55:17 CEST 2007

so with a 2-stack system (D . F), with D rooted at 2 and F at 3, the primitives are:

  D  = 2
  D+ = 5
  D0 = 4
  D1 = 10
  F  = 3

the free list needs to be flattened. this can be done when reserving a new cell or when dropping a data structure. the latter is probably best, since it is

* more predictable: deleting a large structure takes time
* all references to externally collected objects can be removed

so, i do have a need for rotation! if the CAR of the free list is not NULL, rotate the free list, then DROP the newly exposed top and DROP the part we rotated to the stack.

  : >free (D+ F) (D F) ;
  : swap  (D0 D1) ;
  : free> (D F) (D+ F) ;   \ [a.k.a. nil / save]
  : drop  null? if >free ; then rotate drop drop ;

like baker remarks, a lot of operations can be coded so they avoid copying of lists. i have a lot of this code in PF already.. the moral of the story is:

* this linear stuff is quite nice to build a language on top of, but you need a decent layer below it to create a proper set of optimized primitives to make it work efficiently.
* using a single tree works just fine, but is probably not necessary if the basic structure (like where the D, R and F stacks are) doesn't change.
* only use binary permutations of disjunct trees. disjunct trees are easier to spot for binary permutations.
* numbering trees in 1abc... fashion works well, and is easy for drawing diagrams.
* drop needs to deconstruct its argument.

the hash consing thing in this paper i don't get: http://home.pipeline.com/~hbaker1/LinearLisp.html

but but... about ternary permutations: they are easier to understand, because the rotation i'd like to perform has to be factored in a non-intuitive way.. maybe it's just the rotation operation that's difficult to express that way? instead of focusing on the movement of the data stack's first CONS cell, it's easier to focus on the movement of the cell we want to get rid of. so in the picture painted above, the operation 'rotate' is actually 'uncons' and would be (9 5) (4 5). i think that settles most of the questions. the rest is fairly straightforward to fill in.

Entry: next
Date: Tue Aug 28 14:23:06 CEST 2007

after this small detour about trees, NEXT on the list:

* clean up syntax preprocessing & purely functional macros
* investigate HOF syntax for Purrr18
* determine if Purrr is a valid project, or if it's best to aim for Poke.

Entry: ANS Forth - poke - PF
Date: Tue Aug 28 14:26:07 CEST 2007

the last question is quite an important one.. if i'm planning to write a language for education, do i really want ANS Forth? the only reason would be to have something 'standard', but for what reason? better documentation? i never used ANS Forth, and the more i get into this language simplicity thing, the more i start to dislike it. i think i have all the elements for a decent linear VM ala PF. it should fit on a pic18 and.. a cleaner language is easier to teach. moreover, a poke language can be made safe. is it worth it to stop somewhere in the middle and use a slightly more optimal language, instead of one based on CONS cells? this is not something to decide in an instant, but i think life is already complicated enough without filling it with problems created by weirdness in ANS that i don't use.. Forth is dead. long live KAT & PURRR :)

Entry: Haskell
Date: Wed Aug 29 13:15:22 CEST 2007

just watched Simon Peyton-Jones' OSCON 2007 tutorial, which clarified a lot of things. he talked mostly about type constructors, type classes, and the IO monad.

* IO a is world -> (world, a)
* a type class is implemented as a record of functions that 'travels independently' from values, i.e. dispatch based on return type.
* type constructors are also used for destructuring. this generalizes the 'list' constructor, and tuples (which are not constructors i think..)
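the first bullet in scheme terms, as a toy (my own model of the idea, nothing more):

  ;; IO a = world -> (world, a): an action maps a world token to an
  ;; updated world plus a value; bind sequences two actions.
  (define (io-return x)
    (lambda (world) (cons world x)))
  (define (io-bind m f)
    (lambda (world)
      (let ((w+x (m world)))
        ((f (cdr w+x)) (car w+x)))))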
Entry: hash consing
Date: Tue Aug 28 20:23:22 CEST 2007

so what's that all about? see http://home.pipeline.com/~hbaker1/LinearLisp.html "Reconstituting Trees from Fresh Frozen Concentrate". first, that section is not about hash consing, but about something different: "our machine will be as fast as a machine based on hash consing". i don't get it..

Entry: compositional and?
Date: Wed Aug 29 14:30:25 CEST 2007

i was wondering what the deal is with the compositional view. it allows a simple framework for metaprogramming, but that's all.. i made this a bit more clear in the paper.

Entry: curry-howard
Date: Wed Aug 29 16:42:46 CEST 2007

quite remarkable. i'm running into cases where operations in the code that i thought were merely a hack, like the 'snarf' operation, turn out to be quite important for a monadic formulation of a stack language. in other words: i'm extracting some mathematical structure by naming the types of all the transformations that are present in the code. i think i'm just going to do this exhaustively.. in other words, by hacking around semi-blindly, following just an ideal of 'elegance', i end up with a nice description of what i'm doing in a categorical sense.

Entry: arrows
Date: Fri Aug 31 00:29:36 CEST 2007

reading 'programming with arrows' by hughes. this 'dip' business is really arrows.. just rewrote brood.tex to give a categorical relationship between a TUPLE language and a STACK language. what remains is to explain their difference... it's been quite a day.. what did i learn really? given a tuple language, mapping it to a stack language makes explicit the need for run time 'cons' if the tuple language can create closures.

ok, i need to go over this again since i lost direction a bit.. the CTL -> CSL bit is good though, since it reflects a 'real' part of brood, namely the relationship between scheme and kat. I'm still not really satisfied with the explanation. I probably need some more time thinking about closures and dynamic memory:

- how to combine a low level language with just stacks and function compositions, both implemented as vectors, with a linear memory model that supports closures.
- how to add 'constant trees' to a linear memory tree.
- what about trees and reference counts.

Also, i need to read Hughes' paper about arrows. what about this vague rambling:

- data stack = future data
- return stack = future code

Entry: stacks and continuations
Date: Fri Aug 31 18:19:05 CEST 2007

from wikipedia http://en.wikipedia.org/wiki/Continuation

  Christopher Strachey, Christopher F. Wadsworth and John C. Reynolds brought
  the term continuation into prominence in their work in the field of
  denotational semantics that makes extensive use of continuations to allow
  sequential programs to be analysed in terms of functional programming
  semantics.

for the linear memory case, i need to implement:

- closures (== cons)
- continuations (== a stack copy)

to do this efficiently, i need baker's approach to linear data structures, which can be implemented using reference counts because they cannot be circular. something tells me i'm chasing something really obvious.. i guess the next thing to tackle is to describe the linear language, and write a C model for it, i.e. to implement POKE.

Entry: CSL vs CTL
Date: Fri Aug 31 22:03:26 CEST 2007

i talked myself into a pit.. what about "1 2 3 +"? how can this be seen as a CTL? only by making + operate on more than 2--tuples.
this means all arrows T_i -> T_j are also in T_{i+n} -> T_{j+n}

Entry: linear
Date: Fri Aug 31 22:29:04 CEST 2007

the next thing to do is to create closures without garbage collection. this would make PF interesting. so the deal is: tree structured data allows for 1--ref structures which can be optimized using reference counts. i guess this is the hash consing business.

hash consing =
- a table of CONS cells
- on (cons a b) -> check if the cell is in the hash: if so, increment the refcount, else make a new one

so that should be able to speed it up.. it's a bit smelly though.

Entry: poke
Date: Sat Sep 1 12:28:16 CEST 2007

yep.. time to get practical. this linear thingy is the most problematic one.. i guess the things i need to investigate are:

- write a linear memory manager in terms of a low-level set of operations (forth machine)
- write the linear machine's interpreter in itself.

i'd like to take a different approach with this: first write it in a testable highlevel setting, then just map it to lowlevel code. remarks:

* by making the code storage nonlinear, a large problem is already solved: the return stack does not need to copy continuations. the return stack is a program == a primitive program | list of programs.
* CDR coding. all code in flash consists of CDR-linked lists, but encoded such that they can be represented as vectors. this works very well with the remark above. it looks like this solves my earlier problem of vectors vs lists.
* no branches. only combinators.
* types:
  - primitive
  - integer
  - cdr-coded nonlinear cell
  - ram cell
* type encoding: since there are only 4 types, 3 of which are memory addresses, it can be solved with a memory map, and N-2 bit integers.

there's one important part i forgot: VARIABLES. those don't really fit in the picture..

Entry: partial application vs. curry
Date: Sat Sep 1 23:35:14 CEST 2007

  curry: ((a,b)->c) -> (a->(b->c))

then partial application is e.g.

  curry (+) 123

so maybe i should follow christopher in http://lambda-the-ultimate.org/node/2266 and call what i'm calling curry 'papply'. and apparently, partial evaluation != partial application. so how do they differ?

Entry: XY and stack/queue
Date: Sat Sep 1 23:53:44 CEST 2007

the [d r] thing i described about continuations yesterday is made explicit here: http://www.nsl.com/k/xy/xy.htm XY by Stevan Apter

Entry: goals
Date: Sun Sep 2 12:44:01 CEST 2007

the reason brood.tex doesn't work well is that i'm not setting out the goals of the project. i started wandering when talking about categories... so the goals are:

- create a language based on the ideas behind Forth, which
  * is easily mapped to a target (i.e. has very lowlevel elements)
  * is less resistant to static analysis than Forth.
  * requires small resources in base form (i.e. just some stacks)
  * contains some highlevel constructs that can be easily optimized, i.e. quoted programs ala Joy.
  * serves as an implementation language for a CONS based language, either a linear or a nonlinear one.

Entry: references
Date: Sun Sep 2 13:15:59 CEST 2007

time to collect some references.

Entry: language levels
Date: Sun Sep 2 13:46:47 CEST 2007

- macro assembler / virtual forth machine: purely static. macros do not rely on any run time kernel support.
- macros with run--time support: some constructs that cannot be translated to straight assembler require run time support code. for example indirect memory access using '@' and '!'
- dynamic memory: cons

what i'm guessing is that i need to get my dependencies straight.
Entry: poke
Date: Sat Sep 1 12:28:16 CEST 2007

yep.. time to get practical. this linear thingy is the most
problematic one.. i guess the thing i need to investigate is:

- write a linear memory manager in terms of a low-level set of
  operations (forth machine)
- write the linear machine's interpreter in itself.

i'd like to take a different approach with this: first write it in a
testable highlevel setting, then just map it to lowlevel code.

remarks:

* by making the code storage nonlinear, a large problem is already
  solved: the return stack does not need to copy continuations. the
  return stack is a program == a primitive program | list of
  programs.

* CDR coding. all code in flash is stored as CDR-linked lists, but
  encoded such that they can be represented as vectors. this works
  very well with the remark above. it looks like this solves my
  earlier problem of vectors vs lists.

* no branches. only combinators.

* types:
  - primitive
  - integer
  - cdr-coded nonlinear cell
  - ram cell

* type encoding: since there are only 4 types, 3 of which are memory
  addresses, it can be solved with a memory map, and N-2 bit
  integers.

there's one important part i forgot: VARIABLES

those don't really fit in the picture..

Entry: partial application vs. curry
Date: Sat Sep 1 23:35:14 CEST 2007

curry: ((a,b)->c) -> (a->(b->c))

partial application is then e.g. (curry (+)) 123: fixing the first
argument gives a function that adds 123. so maybe i should follow
christopher in:

http://lambda-the-ultimate.org/node/2266

and call what i'm calling curry 'papply'. and apparently, partial
evaluation != partial application. so how do they differ?

Entry: XY and stack/queue
Date: Sat Sep 1 23:53:44 CEST 2007

the [d r] thing i described about continuations yesterday is made
explicit here:

http://www.nsl.com/k/xy/xy.htm
XY by Stevan Apter

Entry: goals
Date: Sun Sep 2 12:44:01 CEST 2007

the reason brood.tex doesn't work well is that i'm not setting goals
for the project. i started wandering when talking about
categories... so the goals are:

- create a language based on the ideas behind Forth, which:
  * is easily mapped to a target (i.e. has very lowlevel elements)
  * is less resistant to static analysis than Forth.
  * requires small resources in base form (i.e. just some stacks)
  * contains some highlevel constructs that can be easily optimized,
    i.e. quoted programs a la Joy.
  * serves as an implementation language for a CONS based language,
    either a linear or nonlinear one.

Entry: references
Date: Sun Sep 2 13:15:59 CEST 2007

time to collect some references.

Entry: language levels
Date: Sun Sep 2 13:46:47 CEST 2007

- macro assembler / virtual forth machine: purely static. macros do
  not rely on any run time kernel support.
- macros with run--time support: some constructs that cannot be
  translated to straight assembler require run time support code. for
  example indirect memory access using '@' and '!'
- dynamic memory: cons

what i'm guessing is that i need to get my dependencies straight.
this means:

- get rid of side--effects in macros (all names are identified in the
  first pass)
- create a purely compositional base language with 'required
  optimization'

so where to start? it's a big job, but it really needs to be done
before i start implementing linear CONS. it looks like the end result
here is going to be quite different from what i have now. i'm
basically moving from a linear to a block structured language.

Entry: block structure
Date: Sun Sep 2 13:54:45 CEST 2007

the real question is: should i implement the block structured
language on top of the linear one, or provide a set of macros to
translate forth into a block structured language, which is then
transformed back into a linear one? it seems reasonable to keep the
forth layer as the lowest one, and translate into it. so basically i
need a lexer with list support. time to factor out the basic
problems:

* stream.ss
* stream-match.ss

Entry: lazy lists
Date: Sun Sep 2 16:35:20 CEST 2007

added stream.ss and a corresponding matcher. funny how reverse
accumulation is no longer needed when you use lazy lists! maybe i
should propagate this to the asm buffer? there is one problem with
the asm buffer though: it is used as a stack. anyways.. i can make
the lexer lazy. DONE. it's simpler now.

Entry: on lazy lists
Date: Wed Sep 5 17:29:30 CEST 2007

let's see if i can say something intelligent about this.. what i
notice is that streams make you avoid the following pattern:

* read list, process, accumulate as push.
* reverse the list

lexing/parsing fits this shoe nicely. so.. are streams processes?
instead of using '@cons', one could just as well write:

- read
- process
- write

so what is the difference? it looks like the lazy list approach is
less general, since it has only one output? multiple outputs need to
be handled using multiple lists, while the process view uses one
process and multiple streams. and yes, these are processes, since the
non-evaluated tails act as continuations. every '@cons' should be
read as write+block.

so, what about the asm? it still needs to be used as a stack,
however, multiple passes can now be done lazily.

Entry: onward
Date: Wed Sep 5 22:27:58 CEST 2007

i keep getting distracted.. i got some work to do! the first one is
elimination of side effects in macros: all side effects in the brood
application are to be cache only. this is an important part that will
open the road for more interesting changes, hopefully leading to a
fully compositional lowlevel language with a module system.

Entry: monads and map
Date: Wed Sep 5 22:40:42 CEST 2007

so.. what about writing a macro for this 'generalized map - not quite
a real monad - collect results in a list' pattern? i guess this is
just unfold.. no it's not.. got this macro + usage:

  (define-syntax for-collect
    (syntax-rules ()
      ((_ state-bindings terminate-expr result-expr state-update-exprs)
       ;; loop over the state bindings, consing result-expr onto an
       ;; accumulator until terminate-expr is true.
       (let next ((l '()) . state-bindings)
         (if terminate-expr
             (reverse! l)
             (next (cons result-expr l) . state-update-exprs))))))

  (define (@unfold-iterative stream)
    (for-collect ((s stream))  ; state: the remaining stream
      (@null? s)               ; until it's empty
      (@car s)                 ; collect the head
      ((@cdr s))))             ; and advance

but it looks just ugly, so i'm going to forget about it.. i guess, if
this pattern shows up in code, it means i'm not using a proper
hof. what about writing it as a hof instead of a macro? i think i'm
getting a bit tired.. just reinvented unfold.. no, it's unfold*

Entry: linear parser
Date: Thu Sep 6 00:25:26 CEST 2007

the parser can definitely be moved to streams. the fact that it
contains syntax streams is not really relevant to the structure of
the algorithms..
for example: i'm using 'match' in forth.ss. it changes a lot: the
prototype of the parsers now is @stx -> @stx, but the code should be
a lot easier. due to the linearity of forth / compositional code,
writing a macro transformer as a stream processor instead of a tree
rewriter makes a lot of sense actually.. the preprocessor will
translate a token stream -> s-expressions. occurrences of syntax-case
can be replaced by @match. which is exactly what i avoided in a
previous attempt.. maybe i should just create a @syntax-case macro
that's similar to the @match macro, taking partially unrolled syntax
streams. hmm.. pure syntax-case is a bit clumsy.. but the 'no rest'
parser macro i'm using does fit pretty well.

something i've been talking about before:

  syntax-case: matcher for compilation: merges 2 namespaces
               (pattern var + template)
  match:       matcher for execution: only a single lexical namespace

i don't know how to make the pattern more explicit, but it boils down
to something like this: if you're using match together with
quasiquote, you're actually COMPILING something, not computing
something. in that case, pattern matching using syntax-case might be
more appropriate, even if you're not using scheme macros, because of
the merging of template and pattern namespaces (which have to be
mixed explicitly when using quasiquoting). actually: syntax-case
matches 3 namespaces:

- pattern
- template
- transformer namespace

Entry: SRFI-40
Date: Thu Sep 6 10:16:17 CEST 2007

it's been fun, but time to move to a standard implementation:

http://srfi.schemers.org/srfi-40/srfi-40.html

it would indeed be strange if this were not somehow standardized..

  (require (lib "40.ss" "srfi"))

but, 40 has problems:

http://groups.google.com/group/plt-scheme/browse_thread/thread/637cc74047a7ada9

anyway, thing to remember: streams can be ODD or EVEN

http://citeseer.ist.psu.edu/102172.html

i'm using EVEN style: (delay (cons a b)) instead of (cons a (delay b))

so what exactly is the problem with
http://srfi.schemers.org/srfi-45/srfi-45.html ? it can be seen in
@filter, as explained in the srfi-45 document: a sequence of

  (delay (force (delay (force ...))))

is not tail recursive. this is because 'force' cannot be tail
recursive: it needs to evaluate, and cache the value before
returning. srfi-45 solves this by introducing 'lazy'. easy to see in:

  (define (loop) (delay (force (loop))))

ok. so i'm sticking with my own lazy stream implementation. most of
it should be fairly easy to replace with some decent standard library
later. i don't think i'm doing anything special..
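for reference, the EVEN style in concrete scheme -- roughly the shape
of the stream primitives used above, though the definitions here are
a sketch, not the actual stream.ss code:

  ;; EVEN style: the whole pair is suspended, so nothing is
  ;; computed until the stream is forced.
  (define-syntax @cons
    (syntax-rules ()
      ((_ a b) (delay (cons a b)))))

  (define @nil (delay '()))
  (define (@null? s) (null? (force s)))
  (define (@car s)   (car (force s)))
  (define (@cdr s)   (cdr (force s)))

  ;; ODD style would be (cons a (delay b)): the head is computed
  ;; eagerly -- one element too soon for something like a lexer.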
Entry: linear parser begin
Date: Thu Sep 6 15:36:48 CEST 2007

- all parsers are @stx -> @stx
- parser-rules: easily adapted (used by predicates->parsers)
- named-parsers

i'm forgetting something.. a parser needs to distinguish between
'done' and 'todo': the driver will stitch the stream back
together. otherwise each parser needs to explicitly invoke the driver
routine as the second argument to '@append'. the reason we use a
driver is to make each individual parser agnostic of its
environment..

concretely: the current implementation can be largely reused, but
list tails need to be replaced by streams. then the remaining
question is: does a primitive parser return 2 streams, or a list and
a stream? again:

- if a parser does 1 expansion, it needs to return 2 streams.
- if it does multiple, it suffices to return only one.

it's best to let the driver decide, so the first one is more
general. making both streams makes the interface simpler.

looks like the only thing this needs is a proper syntax-case style
syntax stream matcher so i'm not juggling too many syntax<->atom
conversions. need to think about that a bit better, to see what the
prototype needs to be.

Entry: parser rewrite
Date: Thu Sep 6 22:46:20 CEST 2007

the end is near.. the code seems to simplify a lot. need to write 2
more generic parsers:

- delimited
- nested

interesting.. this stream business is deeper than i thought. i do run
into a problem though: (values processed rest). what if rest is only
determined when processed is completely evaluated? by moving the
'append' to somewhere else, the forcing order can no longer be
trusted. does this really matter?? i need a break.

ok.. i got it worked out as '@split', which returns 2 values: the
first is a stream before a delimiting value, and the second is the
stream after. the code i have now needs a certain evaluation order. i
can make it independent of that by forcing until the rest-stream
becomes true. that works. also got @chunk-prefixed working: it
separates a prefixed stream into a stream of prefixed streams.

Entry: macro mode
Date: Fri Sep 7 13:24:38 CEST 2007

i found out that ';' can just as well be used in macro mode for 'jump
past end', if macro mode can only contain prefixed definitions. this
will bring multiple exit points to macros. can change this later.

anyways.. all parsers are now token (syntax) stream processors. it
should be really straightforward from here to:

- separate macro and code definitions
- perform separate compilation for forth files (macro definitions)

about the use of ';' in macros: this probably needs some dynamic
variable because of context: a macro representing a forth file != a
normal macro. in a forth file ';' means return to sender, in a macro
it means jump past end.. maybe i should avoid this?

Entry: bored
Date: Wed Sep 12 21:28:51 CEST 2007

i had some days off writing an article for folly, and my mind is
wandering away from the lowlevel forth stuff.. talking to a friend
yesterday i realized i need something different. i'm getting
stuck. let's rehash the problems i'm facing right now:

- i need pure functional macros: no side effects except hidden in
  cache / memoization. this requires a true code dependency
  system. doing this half-assed makes no sense, so i should at least
  have something like mzscheme, possibly piggybacked on top of
  it. that however is not easy, since this will probably mess up my
  namespace stuff. so i'm a bit stuck because i can somehow foresee
  the problems that are coming after i fix up my macros.

- i want to give up on the portable ANS forth idea, and design a safe
  PF-like linear language. the stumbling block there is variables,
  since they're incompatible with the linear idea, at least when done
  using references to cells. maybe i can use some trick here? can
  variables be managed externally so they never need to be deleted?
  can they be seen as data roots like machine registers? something is
  not right in my intuition here..

EDIT: Mon Oct 8 21:06:17 CEST 2007
Pure functional macros work now, and make things a lot better, but
this linear language variable thing i'm still quite puzzled by.

Entry: sticking to forth as basis
Date: Sat Sep 15 05:07:40 CEST 2007

reading http://lambda-the-ultimate.org/node/2452 forth in the news

i'm more and more convinced that forth should be the lowest level,
not some block structured higher level construct, which would require
more elaborate optimizations. it's best to have the pure control
structs (i.e.
for next) as direct macros, and implement the higher code block
quoting constructs in terms of them. forth has this way with return
stack juggling that's very powerful for making new control
structures. this is hard to do efficiently when you tuck it all away
in combinators..

Entry: brood paper
Date: Sun Sep 16 14:47:05 CEST 2007

actually.. it would be interesting to go over my ramblings and make a
list of things i got really wrong, or saw too simplistically. then
see what solution i got or how i came to understand the issues.

- monads are not just hidden top of stack items
- the relationship between closures and CONS
- syntax-rules and composition
- pattern matching and algebraic types
- lazy lists vs. generators: lists remain 'connected'
- 'natural' compiler structure: scoping rules, quasiquoting and
  syntax-case (3 levels)
- more specifically: quasiquote vs syntax-case: when to use macros?
  is it code or data?
- looping and boundary conditions (i.e. image processing)
- cdr coding and lists as arrays
- importance of side-effect free 'loading' + relation to phase
  separation.

Entry: linear structures, variables and cycles
Date: Mon Sep 17 16:06:58 CEST 2007

in a linear structure (tree, or acyclic graph if hash consing is
used) cycles are not possible. so how do you represent datastructures
that have some form of self-reference? the thing we're looking for
here is something akin to the Y combinator: instead of having a
function refer to itself, a different function is used to "tie the
knot". let's start with:

http://scienceblogs.com/goodmath/2006/08/why_oh_why_y.php

i'll try to put it in my own words, see next post. the link above has
an interesting comment on self-application. also, the wikipedia page
has some interesting links:

http://en.wikipedia.org/wiki/Y_combinator

so how do you apply this trick to data structures? my guess would be
to start from data structures in the lambda calculus, and then make
things more concrete.

Entry: Y combinator
Date: Mon Sep 17 18:55:07 CEST 2007

a fixed point p of the expression F satisfies F(p) = p. the Y
combinator expresses p in terms of F as p = Y F. combining the two we
get:

  F (Y F) = (Y F)

simply expanding this gives exactly what we want:

  Y F = F (Y F) = F (F (Y F)) = F (F (F (...)))

where the dots represent an infinite sequence of self
applications. that's all folks. in order to implement useful
recursion, simply write the 'body' F, and Y will take care of the
rest.

let's make this a bit more intuitive. suppose we want to create a
function f which is defined recursively in terms of f. look at F as a
function which produces such a function f, F : x -> f. the recursion
is a consequence of the infinite chain of applications

  f = Y F = F (F (F ...)) = F f

so what are the properties of F? first it needs to map f -> f. and
second, if a finite recursion is desired, it needs to do this in a
way that creates a 'bigger' f from a 'smaller' one, eventually
starting from the 'smallest' f which does not depend on f: this leads
to a finite reduction when normal order reduction is used.

let's solve this problem in scheme, for Y F = factorial. so we know
that:

  factorial = F (F (F (...)))    or    factorial = F factorial

in words, F is a function that returns a factorial function if it is
applied to a factorial function. so the factorial function is a fixed
point of F. the Y combinator finds this fixed point as
factorial = Y F.
the rest is fairly straightforward: a nested lambda expression which
uses the provided 'factorial' function to compute one factorial
reduction step:

  F = (lambda (factorial)
        (lambda (x)
          (if (zero? x)
              1
              (* x (factorial (- x 1))))))

the thing which always tricked me is 'fixed point', because i was
thinking about iterated functions on the reals used in many iterative
numerical algorithms like the newton method. in the lambda calculus,
there are only functions and applications, so a fixed point IS the
infinite nested application, since that fixed point value doesn't
have another representation, while a fixed point of a function on the
reals is just a point in the reals.
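to actually run this in scheme, which is applicative order, the self
application has to be delayed by eta-expansion, otherwise (Y F) loops
forever before F is ever applied. the standard trick, sketched:

  ;; eta-expanded (applicative order) Y: the inner
  ;; (lambda (n) ((x x) n)) delays the self application.
  (define (Y F)
    ((lambda (x) (F (lambda (n) ((x x) n))))
     (lambda (x) (F (lambda (n) ((x x) n))))))

  (define factorial
    (Y (lambda (fac)
         (lambda (x)
           (if (zero? x) 1 (* x (fac (- x 1))))))))

  (factorial 5) ; => 120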
Entry: algebraic data types
Date: Tue Sep 18 13:44:48 CEST 2007

look no further.. plt-match.ss actually has this kind of stuff, at
least the pattern matching associated with algebraic types. and i
think it is extensible.

http://download.plt-scheme.org/doc/371/html/mzlib/mzlib-Z-H-34.html
http://en.wikipedia.org/wiki/Algebraic_data_type

  "In computer programming, an algebraic data type is a datatype each
  of whose values is data from other datatypes wrapped in one of the
  constructors of the datatype. Any wrapped data is an argument to
  the constructor. In contrast to other datatypes, the constructor is
  not executed and the only way to operate on the data is to unwrap
  the constructor using pattern matching."

Entry: pic network
Date: Tue Sep 18 20:40:10 CEST 2007

1. simple: 2 wires
2. robust: working boot loader

Entry: parser-tools lexer
Date: Thu Sep 20 19:20:11 CEST 2007

i'm replacing the lexer with the one from parser-tools. this is a lot
easier than writing your own. what a big surprise; too bad i
postponed it for so long..

Entry: message passing
Date: Thu Sep 20 21:15:05 CEST 2007

hmm.. message passing concurrency seems to be the real solution for
tying a core and metaprogrammer together. i should find out how to
formalize message passing (i.e. Peter Van Roy and Seif Haridi's book
"Concepts, Techniques, and Models of Computer Programming"
http://www.info.ucl.ac.be/~pvr/book.html)

Entry: work to do
Date: Sat Sep 22 19:42:35 CEST 2007

* documentation
* bootloader (+- DONE)
* independent of emacs?

preparing for waag & piksel, the most important problem to solve is
to make the bootloader robust. this is probably best solved as:

  serial cable plugged                -> start console
  unplugged (i.e. with jumper to gnd) -> start app (at 0x200)

all interrupt vectors moved to the 0x200 block. then this block can
be made write-protected, so there's absolutely no way to mess it up
-> can eliminate the ICD2 connector on boards.

Entry: purrr manual questions + necessary fixes
Date: Sun Sep 23 13:30:49 CEST 2007

* can i get at least a 16--bit library running without making it
  stand-alone?
* how difficult is it to unify macros and words from the user
  perspective? -> interaction always compiles a 'scrap' function.
* is it possible to write all control structures in terms of tail
  recursion?

the more philosophical ones:

* exceptions are imperative features.. is this bad? when is this bad?
  it's like using continuations, which is interesting for
  backtracking etc. i'm leaning toward pure functional programming,
  but some imperative features are really OK as long as they are
  shielded. i.e. global mutable variables are clearly not.
  (namespace: single assignment = ok + possible to hack for debug).

Entry: new bootloader fixes
Date: Mon Sep 24 12:37:41 CEST 2007

i got the monitor working, now i need to get the synth back up.

some things that need fixing on the debugging side:

* a correct jump assembler (+- DONE: throws exception)
* a correct disassembler (+- DONE: lfsr broken)
* constants in console (DONE)
* cache macro compilation
* a command to erase a block of code during upload

note about field overflows: for data values, it should be ok: it's
quite convenient to assume they are finite size. for example, banked
addressing. for code it's an error, since you don't have any control
over this while programming.

Entry: error reporting
Date: Mon Sep 24 14:15:54 CEST 2007

yes, i am at fault here. never really gave it much thought, but it's
starting to become a problem. my error reporting sucks. one of the
most dramatic problems is the loss of line numbers to relate errors
to original code. a solution for this is to use syntax objects
everywhere. second is the way errors are handled in the
assembler. currently i have some code that's a bit hard to
understand: i got used to hygienic macros, and symbol capture looks
convoluted to me. maybe i just need to rewrite that first?

hmm.. what about systematically replacing 'raise' with something more
highlevel? one of the things that is necessary is a stack
trace. there was some talk on the plt list about this recently. let's
have a look. there is (lib "trace.ss") which doesn't really do what i
need, since it's active.

what about taking this error reporting seriously, and giving it its
own module? would be good to eventually document all possible errors
etc. what about the following strategy: every dubiously reported
error will be fixed, no matter what it takes.

  >> c> ERROR: #: no clause matching 1 argument: (qw)

this is a stack underflow error. i was thinking about installing an
error translator in rep.ss, but this kills the tail
position. therefore, errors need to be translated at the top entry
point, which in this case is in prj.ss. it's really not such a simple
problem.. need to define what information i'd like to get: errors
need to be reported at 'interface' level, which is either compile/run
of files/words. compile errors are most problematic since they need
to be related to source location..
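as a note to self, the shape of the translation i have in mind at the
top entry point, where losing tail position is harmless -- just a
sketch, 'with-friendly-errors' is a made-up name:

  ;; wrap a top level console command; translate any low level
  ;; exception into a one-line report instead of a scheme backtrace.
  (define (with-friendly-errors thunk)
    (with-handlers ((exn:fail?
                     (lambda (e)
                       (printf "ERROR: ~a\n" (exn-message e)))))
      (thunk)))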
Entry: state mud
Date: Tue Sep 25 14:05:35 CEST 2007

the prj.ss file should do nothing more than fetching/storing state
and passing it to pure functions. i am a bit appalled by the way
things work in prj.ss, because this state binding tends to swallow
everything.. maybe it's not such a good idea after all? i guess it is
still a good idea, but its only function should be to manage
state. let's rehash the state stuff:

* only prj.ss contains permanent state
* I/O uses read-only dynamic scope for the read/write ports
* macros etc.. are supposed to be read-only cache
* all the rest is functional

UPDATE: Thu Sep 27 22:56:03 CEST 2007
- moved some functionality to badnop.ss
- adopted a left/right column notation for state/function

Entry: boot code and incremental upload
Date: Tue Sep 25 15:05:23 CEST 2007

the basic rule for forth is: code is incremental. if you need to
patch backward, you need to do an erase + burn cycle. how to do this
automatically? it's probably not so hard to solve by performing (CRC)
checks on memory.

Entry: core syntax
Date: Tue Sep 25 18:05:36 CEST 2007

just writing the purrr manual and i got back to this language tower
thing... i really need a core s-expression based syntax for code with
multiple entry and exit points, instead of forth.

Entry: or
Date: Tue Sep 25 19:44:24 CEST 2007

Something that's really handy in scheme is a short-circuiting
'or'. i'm in need of something like that to define interactive word
semantics: try executable words first, then try variable names, then
try constants (or later macros). In scheme this is easy because
variables can be referenced multiple times; in CAT this is awkward
due to the explicit copying/restoring of the argument stack.

Some backtracking formulation would be nice, but generic backtracking
is overkill. It also requires explicit handling of the continuation
object. Escaping continuations work fine here, and they can be stored
in a dynamic parameter, so no explicit manipulation of continuation
objects is necessary. With 'check' being a word that aborts the
current branch if the top of the stack is false, using the quasiquote
(see next post) this is simply:

  `(,(foo check do something check more stuff)
    ,(bar check do something else)
    ,(in case everything fails)) attempts

The apology: In a compositional language, escape continuation (EC)
based backtracking might take the role of a conditional expression,
because it's often easier to go ahead and backtrack on failure than
to perform a number of tests/asserts ahead of time which might
CONSUME your arguments, so you need to SAVE them first. An EC can be
used to restore the contents of the stack before taking another
branch. The disadvantage of course is that words that use 'check' are
only legal within an 'attempt' context, and are not referentially
transparent. I guess this is ok.. same as using catch/throw. I do
feel a bit like a cowboy now.. What about distinguishing 'bad'
exceptions from 'good' ones? Using exceptions in CAT has always been
awkward, but the 'attempts' syntax here seems nice.
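a scheme model of the same mechanism, to pin down the semantics --
'fail-k' is a made-up name here, and the real CAT version restores
the saved argument stack instead of just escaping:

  ;; each attempt runs with an escape continuation bound in a
  ;; dynamic parameter; 'check' aborts the current attempt.
  (define fail-k (make-parameter #f))

  (define (check ok?)
    (unless ok? ((fail-k) #f)))

  (define (attempts . thunks)
    (call-with-current-continuation
     (lambda (return)
       (for-each
        (lambda (thunk)
          (call-with-current-continuation
           (lambda (abort)
             (parameterize ((fail-k abort))
               (return (thunk)))))) ; first surviving attempt wins
        thunks)
       (error "attempts: all branches failed"))))

  ;; e.g. (attempts (lambda () (check #f) 'a)
  ;;                (lambda () 'b))  => b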
Entry: quasiquote
Date: Tue Sep 25 22:12:34 CEST 2007

what about postscript style [ ] quotation to create data structures
with functions? i can't use [ ] or { } since mzscheme sees them as
parentheses. only angle brackets are left alone.. so either i create
a syntax extension, i.e. (list: (bla) (foo) (bar)), or i use an angle
bracket structure. since the latter will work, i'm using that: <* *>

what about just using the quasiquote here? i'm not using it anywhere
else and i'm already using quote. it's only legal on programs: and
unquote means: insert program body here.

Entry: assembler optimizations / corrections
Date: Wed Sep 26 02:05:11 CEST 2007

A) jump size optimization

currently i have none. recently i introduced at least error reporting
on overflow. i think the deal is that doing it 'really right' is
difficult; i'm not sure there exists an optimal algorithm. the
simplest approach is:

* convert small -> long jump
* increment/decrement jumps before/after the instruction
* update the dictionary accordingly

it's probably easiest to do this on an already fully resolved buffer
(after the 2nd pass). this algorithm is confusing due to the
forward/backward absolute/relative distinction. also, doing this
without mutation seems troublesome.

B) jump chaining

was really easy in the original badnop due to the use of
side-effects. somehow this problem looks as if there's some weird
control structure that might help solve it in a more direct way.

OK... finding the optimum is apparently NP-complete
http://compilers.iecc.com/comparch/article/07-01-037

> [There was a paper by Tom Szymanski in the CACM in the 1970s that
> explained how to calculate branch sizes. The general problem is
> NP-complete, but as is usually the case with NP-complete problems,
> there is simple algorithm that gets you very close to the optimal
> result. -John]

or not?
http://compilers.iecc.com/comparch/article/07-01-040

  If you only want to optimize relative branch sizes, this problem is
  polynomial: Just start with everything small, then make everything
  larger that does not fit, and reiterate until everything
  fits. Because in this case no size can get smaller by making
  another size larger, you have at worst as many steps as you have
  branches, and the cost of each step is at most proportional to the
  program size.

so, it looks like the simple approach of using short branches and
expanding/adjusting + checking is good enough.

Entry: platforms
Date: Wed Sep 26 05:11:06 CEST 2007

been thinking a bit about platforms. some ideas:

* 32 bit + asm makes no sense. GCC is your friend here, and should
  generate reasonably good code for register machines. split the
  language into 2 parts: POKE for control stuff, and some kind of
  dataflow language for dsp stuff.

* AVR 8 bit doesn't make much sense either. there is GCC, and i
  already spent a lot of time optimizing 8 bit opcodes.. learning the
  asm sounds like a waste of time.

* don't know if PIC30 makes a lot of sense. it is an interesting
  platform (PDIP available), and they are reasonably powerful, if a
  bit weird.

maybe focus on PIC18, and a small attempt to get a basic set of words
running for PIC30?

Entry: capacitance to digital
Date: Wed Sep 26 05:26:25 CEST 2007

http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=2599&param=en531579

  CAPACITANCE TO DIGITAL CONVERTER

  To convert the sensor's capacitance to a digital value, three
  things have to happen. First, comparators and the flip flop in the
  comparator module must be configured as a relaxation
  oscillator. Second, the desired sensor must be connected to the
  relaxation oscillator. Third, the frequency of the oscillation must
  be measured.

  The configuration of the comparator and the SR latch require
  configuring the comparators, the SR latch, and the appropriate
  analog inputs. Connecting the sensor to the oscillator requires the
  control software to select the appropriate analog input to the
  comparator module's multiplexer. It must also select the
  appropriate input to any external multiplexer between the sensors
  and the analog inputs of the chip.

  To measure the frequency of the oscillation, TMR1's clock input
  must connect to the output of the relaxation oscillator, and a
  fixed sample period will be controlled by TMR0. To start a
  frequency measurement, both TMR0 and TMR1 are cleared. The TMR0
  interrupt is then enabled. When the interrupt fires, TMR1 is
  stopped, and the 16 bit frequency value in TMR1 is retrieved. Both
  TMR0 and TMR1 can then be reset for the next measurement.

  To keep the accuracy of the frequency measurement consistent, the
  interrupt response time for the TMR0 interrupt must be kept as
  constant as possible, so no other interrupt should fire during a
  measurement. If one does, then the measurement must be discarded
  and the frequency measurement must start over.

  Once the 16-bit value is retrieved, the detector/decoder algorithms
  can determine if the shift in frequency is a valid touch by the
  user or not.

  For more information on the interrupt services routine for TMR0,
  and the initialization of the relaxation oscillator, refer to
  application note AN1103 on Software Handling for Capacitive
  Sensing.

Entry: todo list
Date: Wed Sep 26 19:46:52 CEST 2007

URGENT:

* word reference manual

  the primary goal would be to have documentation available at the
  command console or during emacs editing, instead of just in paper
  form. a tutorial can come later.
  now where do i specify it?

* code protect the boot sector (OK)
* interaction macros (needs syntax + minor support in prj.ss)
* readline console
* command line completion
* make it installable (-> solve library deps? : install in collects?)
* check battery / BREAK resistor
* simplify prj.ss into chunks that operate on state explicitly (OK)

NOT URGENT:

* macro cache (maybe explicit files? read about compilation)
* scheme library split (module path handling)
* bootloader: automatic boot (2ND) sector patching
* assembler changes + functional macros

Entry: boot code
Date: Thu Sep 27 15:57:28 CEST 2007

this is actually pretty important. once i start sending out kits,
it's not so easy to change the bootloader. some things to note:

- monitor.state -> (dict ...) format is the only important part
- boot sector is independent of any macros: only words count
- machine model obviously needs to stay stable (hasn't changed in
  years)
- binary api (the monitor commands) needs to stay stable

what about making the bootstrap interpreter simpler? get rid of
anything other than 'receive transmit execute', and leave the rest to
a dictionary? if there's ever a problem for portability or whatever,
this might be the way to go: this interface allows all functionality
to be hidden in the dictionary associated to the boot kernel. right
now it's still quite pic-specific. some things will become less
efficient though.. also, the way the code is organised, sending
commands will become more difficult. the set i have now is complete
enough, and reasonably efficient. let's keep it simple and stick with
the current one.

another thing: fixing the boot block. let's try that: setting 30000B
to A0 does the trick (CONFIG6H : WRTB)

Entry: using the ICD2 pins.
Date: Thu Sep 27 15:53:34 CEST 2007

last couple of days were a bit too much on the dreaming side. i need
something concrete to fix. i was thinking about simplifying the
programming interface. was thinking about using the ICD2 pins to also
do debug serial comm. but why? if my boot kernel is stable, this is
entirely unnecessary, except for reset!

Entry: ramp up to purely functional macros
Date: Thu Sep 27 20:57:29 CEST 2007

the parser. STAGE 1:

- rewrite 'constant' as a macro definition
- separate macros from the body code, which is seen as a single
  function with multiple entry/exit points.

problem still not solved: 'variable'

currently, variable creates a constant containing a symbol, and
'code' that performs the allocation later during the assembly
phase. so in fact, it's not so problematic.

Entry: prj.ss
Date: Thu Sep 27 22:59:52 CEST 2007

simplified it a bit: made state ops more explicit, and moved
functionality to badnop.ss. this looks like a nice approach in
general. i do wonder why i still need 'functional state' at the
prj.ss level: most state updates are intermingled with
microcontroller state updates which are dirty anyway. one thing: it
keeps me honest. on the other hand, i'd like to move to some "image"
representation. cached macros would be cool. maybe i should look at
that now.

Entry: macro cache
Date: Fri Sep 28 00:04:07 CEST 2007

it looks like the bulk of the 'revert' time is spent in needlessly
compiling code. there aren't so many run-time created macros, and
constants are currently not 'eval'ed. maybe i should make that so i
can snarf them out. hmm.. spaghetti. the problem is that constants
are still treated separately. i can't unify them with macros until
macros are purely functional, so they can be evaluated to see if they
produce constant values.
solution dependencies:

  file parsing to distinguish macro/code
  then: purely functional macros
  then: elimination of assembler constants

however, doing the first one requires elimination of assembler
constants! looks like this is the reason why i can't get an overview
of the problem: it's quite a big loop. anyways, i can write the
parsing step and test it leaving the side-effecting macros
intact. then move to side-effect free macros and change the constant
parsing to translate constants to macros.

so. maybe i need an S-expression syntax first, so i can translate
code to it! for macros this is easy: i'm already using one. for
composite code however, it becomes more difficult due to the multiple
entry-exit points. this can be left alone in a first attempt.

Entry: product vision statement
Date: Fri Sep 28 01:05:20 CEST 2007

http://www.codinghorror.com/blog/archives/000962.html

  for (target customer)
  who (statement of need or opportunity)
  the (product name)
  is a (product category)
  that (key benefit, compelling reason to buy)
  unlike (primary competitive alternative)
  our product (statement of primary differentiation)

for embedded software developers who want to program small embedded
systems, the Brood system is a tool chain that supports incremental
bottom up development. unlike C, our product has integrated
metaprogramming through built-in macros.

something like that.. interesting.

Entry: documentation
Date: Fri Sep 28 14:03:33 CEST 2007

write a purrr manual in tex2page by sending queries to the brood
system. this should use an interface similar to snot. brood needs to
be centered around services, of which snot is one. so let's try this:
services with

- direct access to brood for SNOT and RL
- document generation

Does services.ss run inside the sandbox? YES. So all calls from
snot.ss -> services.ss go through a sandboxed eval. Services.ss
itself does not need to take care of this, and can use direct calls.

the deal is this: a CONSOLE needs to separate:

- TOPLEVEL (represented by eval)
- STATE (a data structure stored independent of the toplevel)

Entry: persistence
Date: Fri Sep 28 19:26:54 CEST 2007

i must not forget that the way i use persistence is a SOLUTION, and
not the original problem. the real problem is a conflict between two
paradigms:

* TRANSPARENCY as in MzScheme's module system
* image persistence and run--time self modification

as usual, my problem is rooted in ignorance. i've been jabbing about
the distinction between the two above for a while, but the real
problem is compiler compilation time. i need to have a look at
MzScheme's unit system. it should be possible to reload units after
recompiling them because they are mere interfaces.

Entry: services
Date: Fri Sep 28 23:24:59 CEST 2007

hmmm.. i didn't really get anywhere today. but at least i figured out
what 'services' should be. it's just the stuff that snot has access
to, but without the snot interface. i renamed it 'console.ss' and
took it out of 'snot.ss', which is now just a bit of glue.

Entry: forth preprocessing
Date: Sat Sep 29 15:51:12 CEST 2007

parsing and lexing. it's divided in a somewhat unorthodox way.

LEXING

there are 2 front ends:

  forth-lex          :: string    -> atom stream
  forth-load-in-path :: file,path -> atom stream

the lexing part flattens the load tree. i.e. during lexing, the
source code is made independent of the filesystem.

PARSING

this is where i have to break things, so let's commit first.

1. flat forth stream -> compositional forth stream with macros removed
2. constants -> macros

let's see if i understand: constants are bad.
there is no way around the fact that constant swallows a value: it's
the worst case of reflection. this is not compatible with the current
parser. keeping it would require lookahead. so 'constant' needs to be
replaced entirely by 'macro' in source code. looking at the previous
entry [[phase-separation]], what is required is indeed a parsing step
that can translate

  1 2 + constant x   -->   macro : x 1 2 + ; forth

yes, this is of course possible, but is it really worth it? maybe
it's better to clean up the Purrr language semantics now than to
carry around the code that allows this. ad-hoc syntax is a
nuisance. so, current path: CONSTANTS are being removed. that was
easy :)

now, for variables.

  variable abc

does 2 things: it creates a macro that quotes itself as a literal
address, and it adds code that tells the assembler to reserve a RAM
slot. maybe i should use 'create' and 'allot'? (back to that later)

currently the parsing seems to work, except for the macro/code
separation step. for this i need a stream splitter. in stream.ss i
have '@split', which just splits off the head of a stream, not true
splitting.

status:
- parsing step: ok
- load! step: ok (like the previous load, but with macro defs separated)

next:
- remove all side-effecting macros
- change the assembler to take values from macros

remarks:
* is dasm-resolve still possible? (value -> symbol)

status:
- monitor.f -> monitor.hex gives the same code

Entry: cleanup
Date: Sat Sep 29 21:21:54 CEST 2007

core changes seem to be working. the rest is cleanup.

TODO:
- fix variable (OK)
- fix interaction constants (OK)
- fix sheepsint (OK)
- extract macros from forth file -> compositions + save as cache (OK)
- fix interaction macros that reduce to expressions
- trick macros into generating their symbol during compilation, and
  value during assembly. (restore disassembly constants)
- clean the assembler name resolver

Entry: storing application macros in state file
Date: Sat Sep 29 22:31:46 CEST 2007

why not? this solves a lot of problems.. and they are available in
source form, so there's no problem storing them symbolically.

Entry: profiling
Date: Sun Sep 30 03:15:31 CEST 2007

just from eyeballing it, but still quite remarkable: loading
monitor.f from source to S-expressions takes a lot more time than
either compiling the macros or compiling the code to a macro and
running it. both of the latter are instantaneous. ha! actually,
that's very good news. improving the speed of the lexer seems a lot
easier than improving the speed of the compiler.

looking a bit further, sheepsint.f seemed to be faster. the reason is
thus the constants. maybe i should just put them back to
s-expressions? they don't change much after all.

Entry: upload speed
Date: Sun Sep 30 03:40:57 CEST 2007

It's quite annoying that the upload speed is so slow. I need a way to
change the speed on the fly.

EDIT: baud rate: commit goes a little bit faster when the baud rate
is changed from 9600 to 38400, so the limiting factor is probably the
flash programming.

Entry: parsing and printing
Date: Sun Sep 30 16:17:09 CEST 2007

there are a couple of places in the brood code where (regular)
parsing and printing are done in a relatively ad-hoc way using
'match'. maybe i should have a look at extending match to provide
better pseudo "algebraic types".

EDIT:
http://www.cs.ucla.edu/~awarth/papers/dls07.pdf (*)
http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt

(*) looks really interesting.
also, i need to have a decent look at COLA:
http://piumarta.com/software/cola/

EDIT: i changed the syntax for the peephole optimizer to something
more akin to algebraic types & matching.. still a bit of a hack, but
there's a better adapted quoting mechanism now.

Entry: deleted from brood.tex
Date: Sun Sep 30 19:25:53 CEST 2007

Some important assumptions I'm making to support the current solution
are that code updates need to be made \emph{while running}, and that
the target is severely \emph{resource constrained} such that all
compilation and linking needs to be done off--target. This excludes
\emph{late binding} of most code. Another assumption I'm making is
that some binary code on the target will never be replaced, and will
drift out of sync with the evolution of the language in which it was
written. An example of this is a \emph{boot loader}. Such code needs
to be viewed as a black box. This approach violates transparency.

To give this section some context, I have to make my \emph{beliefs}
more explicit. I believe that a compiler is best implemented using
pure functional programming, because it is in essence a
\emph{function} mapping a source tree to a binary representation of
it. This idea is easily extended with \emph{bottom up} programming,
where part of the source tree generates a compiler to compile other
parts of the source tree. In order to make this work, I believe you
need \emph{transparency}. By this I mean that all \emph{reflection}
(compiler compilation) is \emph{unrolled} into a directed acyclic
graph representing a code dependency tree.

On the other hand, I believe that a microcontroller is best modeled
as a \emph{persistent} data structure. A microcontroller is a
\emph{physical object}, and should be modeled as such,
\emph{independent} of the compiler that is used to create the code
comprising the object state. This is what makes Forth interesting:
the ability to \emph{incrementally update} without having to
recompile everything. Due to limited hardware support (flash ROM is
not RAM), \emph{late binding} becomes problematic, and also induces a
significant performance penalty. This makes \emph{early binding} a
reasonable alternative: in the end the objective is to at least
provide the possibility to write efficient code at the lowest level
of the target language tower.

This is the heart of the paradigm conflict. Where do I switch from a
transparent language tower to \emph{dangerous} manually guided
incremental updates? Maybe the question to answer would be: why does
one want to have this kind of low--level control anyway? The real
answer is that at this moment, I don't really know how to create a
transparent system. The real reason for that is that I've been locked
in a certain paradigm. Let's explore what would happen if we lean
towards either of the two extremes. If the whole system were
transparent, the controller code would need to be treated as a
filesystem if incremental updates were still to be used. After code
changes, one could simply recompile, relink and upload only the parts
that changed. This is the sanest thing to do.

Entry: misc improvements
Date: Sun Sep 30 21:35:40 CEST 2007

note that 'load', the way it currently works, doesn't
'commit'. actually, that's mostly not how it's used! also, automatic
commit might be nice for compile mode.. on the other hand, compile
mode is kind of an advanced feature also.

Entry: structures for music
Date: Mon Oct 1 05:48:46 CEST 2007

this is more of a tutorial pre.
i saw aymeric was using the stack to store sequences, which is not a
good idea.. i see 2 other ways: flash and ram. i kinda like the x / .
approach for pattern synths. the trick is to do multiple voices, so i
really need some kind of multitasking. say i have 3 patterns:

  : bd o . . . o . . . bd ;
  : sn . . . . o . . . sn ;
  : hh o . o . o . o . hh ;

what do 'o' and '.' do? let's assume that recursion is not allowed in
these patterns. what can we hide in a single invocation? a simple
trick is to use dictionary shadowing: the words could call some fixed
word, which is re-implemented later.

  : instrument do something ;
  : bd o . . o . . bd2 ;
  : bd2 . . o . . o bd ;

we could have:

  : o instrument yield ;
  : . yield ;

hmm.. it's probably better to directly use names instead of this
name-capture thing. if recursion is disallowed, it should be possible
to store each thread in a single byte, so a lot of threads are
possible. in that case, an explicit interpretation and automatic
looping might be better, using routing macros.

Entry: purrr reference documentation
Date: Mon Oct 1 16:13:01 CEST 2007

documentation for each macro. this contains 2 things:

- stack effect (type)
- 1 line human readable doc which possibly points to more information.

so a word's meta info looks like

  (+) ((type . (a a -- a))
       (doc  . "Add two numbers"))

if i can't do types yet, i should at least put the stack effect in a
form that can be used later to do types. it's also probably a good
idea to add the meta-data separately, to not clutter the code.

so, how to infer types? from the lowest level (pattern matching
macros) i can infer a lot. first some cleanups: i'm taking out the
'compiled' field in the word structure, because it's better to just
save the source of macros before they're being compiled, instead of
trying to recover them later.

what about word-semantics? i forgot the reason why sometimes it
cannot be filled. been poking in the rpn.ss internals and i guess
it's best to have the state tx take a compiler for backup. but, this
doesn't work for some other reason i can't remember.. tata:
spaghetti. let's see if i can hack around it now by simply providing
a language name for backup.

Entry: i need closures
Date: Mon Oct 1 20:26:22 CEST 2007

yep.. too much crap going on with trying to call from prj -> base and
having to pass arguments.

EDIT: when i wrote 'compose' i made sure to not allow composition
between words with different semantics. however, i'm not so sure that
was a good idea.. i only want to use closures on functional words,
not on state words. maybe i should let go of this control freakish
behaviour, since the source rep is only for debug: reconstructing
from that source doesn't work reliably for all words..

Entry: dsPIC
Date: Tue Oct 2 03:46:01 CEST 2007

maybe it's time to try it out, and gently grow it into being. some
challenges:

- 3 bytes / instruction
- 16 bit datapath
- addressing modes

the flash block erase size is 96 bytes, but address-wise this counts
as 32 instruction words.

  The dsPIC30F Flash program memory is organized into rows and
  panels. Each row consists of 32 instructions, or 96 bytes. Each
  panel consists of 128 rows, or 4K x 24 instructions. RTSP allows
  the user to erase one row (32 instructions) at a time and to
  program four instructions at one time. RTSP may be used to program
  multiple program memory panels, but the table pointer must be
  changed at each panel boundary.
I don't understand why it says 'four instructions at a time' and then
later on talks about 32 at a time: "The instruction words loaded must
always be from a group of 32 boundary." And the confusion goes on:
"32 TBLWTL and four TBLWTH instructions are required to load the 32
instructions." this looks like a typo.. let's download a new version
of the sheet. got DS70138C now. they're at version E. it's got the
same typo. so assume i need to write per 32 instructions + some magic
every 4K instructions (updating a page pointer?). apart from the
latter it's quite similar to the 18f, just a larger row size.

it looks like this thing is byte addressed, but for each 2 bytes,
there's an extra 'hidden' byte! lol

ok, there is a sane way of looking at it: the architecture is 16-bit
word addressed, but every odd word is only half implemented:
instruction width is 3 bytes.

it looks like it's best to steer the forth away from all the special
purpose DSP tricks like X/Y memory and weird addressing modes. looks
like an interesting target for some functional dataflow dsl
though. there are 2 kinds of instructions: PIC-like instructions that
operate on WREG0 and some memory location, and DSP-like instructions
that use the 16 registers.

roadmap:
- find an 8bit -> 16bit migration guide from microchip
- partially implement the assembler up to PIC18 functionality

Entry: direct threaded forth
Date: Tue Oct 2 07:26:49 CEST 2007

i'm toying a bit with the vm forth. and was thinking: it's not
necessary to go stand-alone. it's much better to test this vm forth
as another target.

Entry: type signatures from pattern matching macros
Date: Tue Oct 2 14:38:47 CEST 2007

It should be possible to mine the 'source' field of pattern matching
macros for types, or at least the stack effect, of functions. the
first matching rule is always the most specific one: if that fits a
certain pattern.

the REAL solution here is to change the pattern matcher to REAL
algebraic types instead of this hodge-podge. moral of the story:
whenever pattern matching occurs on list structure, what you really
are looking for is algebraic types. yes... i'm not going to muck
around in this ad-hoc syntax. i need a real solution: something on
top of the current tx. i need real algebraic types. there is this:

http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/3/doc.txt

but for my purpose it might be better to just stick with the current
concrete list representation for the asm buffer. what about:

  (([qw a] [qw b] +) ([qw (+ a b)]))
  ->
  ((['qw a] ['qw b] +) `([qw ,(+ a b)]))

looks like 'atyped->clause' in pattern-tx.ss is working. it's indeed
really simple to implement on top of the matching clauses. looks like
'asm-transforms' works too.

i ran into one difficulty though. call it polymorphism. in the
original syntax:

  (([op 'POSTDEC0 0 0] ['save] opti-save) `([,op INDF0 1 0]))

cannot be expressed in the new syntax. however, this is
exceptional. it's probably a good idea to make this polymorphism
explicit. EDIT: it is possible to use unquote!! a bit of abuse of
notation, but ...

let's write the pic18 preprocessor on top of asm-transforms instead
of compiler-patterns. ok. done. old one's gone. now it should be a
lot easier to write some documentation or type inference..

i tried to tackle the 'pic18-meta-patterns' but i don't seem to get
anywhere. the current syntax is way too complicated. it really
shouldn't be too hard by taking a more bottom up approach instead of
trying to use 'callbacks' that force the preprocessing of some
macro's arguments.
write a single generator macro for each kind. trying again. this is
the thing i want to generate:

  (define-syntax unary
    (syntax-rules ()
      ((_ namespace (word opcode ...))
       (asm-transforms namespace
         (([movf f 0 0] word) ([opcode f 0 0])) ...
         ((word) ([opcode 'WREG 0 0])) ...))))

from this:

  (asm-meta-pattern
    (unary (word opcode))
    (([movf f 0 0] word) ([opcode f 0 0]))
    ((word) ([opcode 'WREG 0 0])))

the thing which seems problematic to me is the '...', more
specifically:

  (pattern template) ... -> (pattern template) (... ...) ...

that doesn't seem to work. it looks like the 'real' problem here is
due to the fact that i'm expanding to something linear.. i'm
inserting stuff. i wonder if it's possible to modify the asm syntax a
bit so it will flatten expressions.

wooo.. macros like this are difficult. i'm currently doing something
wrong with mixing syntax-rules with calling an expander
directly. best to stick with plain syntax-case and direct expansion:
that's easier to get right. the deal was: sticking with syntax-rules
as the result of a first expansion worked fine, i just needed to put
the higher order macro in a different file for phase separation
reasons.

so.. the remaining step is to collapse the compiler-patterns-stx
phase, and add the current source patterns to the word source field,
which would yield decent docs. ok, done.

  > msee +
  asm-match: ((((qw a) (qw b) +) ((qw `(,@(wrap a) ,@(wrap b) +))))
              (((qw a) +) ((addlw a)))
              (((save) (movf a 0 0) +) ((addwf a 0 0)))
              ((+) ((addwf 'POSTDEC0 0 0))))
  >

that should be easy enough to parse :) CAR + look only at qw. the
'wrap' thing is something that needs to be cleaned up too.. i tried
but started breaking things. enough for today.

this is what i get out for qw -> qw:

  (((qw a) (qw b) --) ((qw `(,@(wrap a) ,@(wrap b) #f))))
  (((qw a) (qw b) >>>) ((qw `(,@(wrap a) ,@(wrap b) >>>))))
  (((qw a) (qw b) <<<) ((qw `(,@(wrap a) ,@(wrap b) <<<))))
  (((qw a) drop) ())
  (((qw thing) |*'|) ((qw thing)))
  (((qw a) (qw b) ++) ((qw `(,@(wrap a) ,@(wrap b) #f))))
  (((qw a) (qw b) swap) ((qw b) (qw a)))
  (((qw a) dup) ((qw a) (qw a)))
  (((qw a) (qw b) or) ((qw `(,@(wrap a) ,@(wrap b) or))))
  (((qw a) (qw b) and) ((qw `(,@(wrap a) ,@(wrap b) and))))
  (((qw a) neg) ((qw `(,@(wrap a) -1 *))))
  (((qw a) (qw b) xor) ((qw `(,@(wrap a) ,@(wrap b) xor))))
  (((qw a) (qw b) /) ((qw `(,@(wrap a) ,@(wrap b) /))))
  (((qw a) (qw b) *) ((qw `(,@(wrap a) ,@(wrap b) *))))
  (((qw a) (qw b) -) ((qw `(,@(wrap a) ,@(wrap b) -))))
  (((qw a) (qw b) +) ((qw `(,@(wrap a) ,@(wrap b) +))))

i also made a 'print-type' function. for '+':

  ((qw qw) => (qw))
  ((qw) => (addlw))
  ((save movf) => (addwf))
  (() => (addwf))

this might be useful.. but what's more useful is the building of a
framework that enables this for all functions. it works for the
assembler primitives only.
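the parsing really is that easy. a sketch of the mining on the clause
lists above ('clause-type' is a made-up name):

  ;; derive an (inputs => outputs) tag signature from one clause:
  ;; the last element of the pattern is the word name, everything
  ;; before it is a matched instruction, tagged by its car.
  (define (clause-type clause)
    (let* ((pattern  (car clause))
           (template (cadr clause))
           (ins (reverse (cdr (reverse pattern))))) ; drop the word name
      (list (map car ins) '=> (map car template))))

  (clause-type '(((qw a) (qw b) +) ((qw (+ a b)))))
  ;; => ((qw qw) => (qw))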
Entry: TODO
Date: Sat Oct 6 20:54:13 CEST 2007

- live command macros
- put live commands in a namespace
- add doc tags: math/control/predicates/...
- write a small tutorial:
  * assembler + PIC18 architecture
  * logic, addition and 8 bit programming (hex + binary)
  * the x and r stacks
  * route (DONE)
  * predicates & conditionals
  * run time computations & ephemeral constructs
- fix macro cache re-init + initial state loading (DONE)
- fix quoting in macros
- fix hardcoded paths (rename brood/brood)
- rename compilation stack

Entry: unsigned demodulator
Date: Sat Oct 6 21:30:01 CEST 2007

the pic18 has a hardware multiplier, which is nice. however,
computing a signed multiplication takes quite a hit compared to
unsigned. i was wondering if i can do an amplitude-only demodulator
using only unsigned multiplications. the entire function is
unsigned -> unsigned.

  signal -> mixer -> I / Q -> I^2 + Q^2 -> LPF

[EDIT: deleted a long erroneous entry. the thinking error was about
the commutation of the LPF and the squaring operation. the above
expression just gives the average signal power.]

the correct formula is:

  X -> (I,Q) -> LPF -> || . ||^2

that's completely symmetric wrt phase. the LPF is straightforward: a
simple 1-pole will probably do if i keep the bitrate low. a 2^n-1
coefficient is easy to implement without multiplication. the I=XC and
Q=XS multiplications can probably be simplified since X = x-h and
C = c-h have no DC components. here h = 2^(bits-1).

  I = X C = (x-h)(c-h)
          = xc - hx - hc + h^2
          = xc - h(x + c - h)
          = xc - h((x-h) + (c-h) + h)
          = xc - h(X + C + h)

averaging, and using that X and C have zero mean:

  avg(XC) = avg(xc) - h^2

which is quite intuitive: take the average of xc, but remove the dc
component.
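a two-line sanity check of that identity in scheme (8 bit samples,
so h = 128):

  (define h 128) ; 2^(bits-1)
  (define (signed u) (- u h))

  ;; the signed product X*C, computed from the unsigned product
  ;; xc plus cheap correction terms:
  (define (identity-holds? x c)
    (= (* (signed x) (signed c))
       (- (* x c)
          (* h (+ (signed x) (signed c) h)))))

  (identity-holds? 37 211) ; => #t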
Entry: frequency decoding
Date: Mon Oct 8 00:26:37 CEST 2007

for krikit, the choice to make is to either decode the whole spectrum
(listen to everything at once) or listen only to a single band. this
is a choice that has to be made early on.. some remarks.

* FFT + listening to all bands is probably overkill. it's not so
  straightforward to implement, so the benefits should be big. FFT
  for point-to-point only makes sense when combating linear
  distortion.

* single frequency detection is really straightforward. the core
  routine is a MIXER followed by a complex LPF. the output is phase
  and amplitude.

* using a sliding window average LPF together with orthogonal
  frequencies allows for good channel separation. this works for
  steady state only, so some synchronization mechanism is necessary.

* sending out a single message on multiple frequencies: easy to do
  with pre-computed tables for 0 and 1. phase randomisation to avoid
  peaks is possible here.

* i'm afraid of linear distortion due to room acoustics.. maybe
  FM/FSK should be used?

* if non-linear distortion is not a problem, DTMF frequencies are not
  necessary.

* using exact arithmetic, it is easy to update/downdate a state
  vector for a rectangular window LPF. this update can be performed
  at the input of the mixer.

* bandwidth limitation for transmission.

http://en.wikipedia.org/wiki/Olivia_MFSK

Entry: network debugging + pic shopping
Date: Mon Oct 8 15:27:25 CEST 2007

- avoid remote reset: use WDT
- central power gives panic switch
- use a standard bus protocol for comm (I2C ...)

The 18f1220 doesn't have I2C, so it might be better to go for a
different component. Lowest pin count is 28. Let's take the one with
the most memory to have some room for tables and delay lines. I'm
thinking about the 18f2620: 64 kbytes flash, 3968 bytes ram (maxed
out). This is also a nice target for a standalone language. These are
all the same, with some things missing:

            EEPROM (b)   FLASH (kb)
  18f2620   1024         64
  18f2610   0            64
  18f2525   1024         48
  18f2515   0            48

Entry: 8 bit unsigned mix -> complex 16/24 bit acc
Date: Mon Oct 8 15:32:30 CEST 2007

I've been toying a bit with a mixer + accumulator building block, and
it seems it can be quite simple. Some remarks:

- Perform the signed offset correction outside of the accumulation
  loop.
- Perform the update/downdate for the rectangular window at the
  input, due to commutation with the mixer.
- As long as the result of accumulation fits in the word length,
  overflow is not a problem.

If a signed number X is represented by an unsigned number x, the
difference is X = x - h, where h = 2^{n-1} is 'half'. Per signed
multiplication there is an offset of h^2 = 2^{2n-2}. What this means
is that once per 4 accumulations, the correction term disappears due
to word overflow if 2 bytes are used. However, the maximal filter
output occurs at full scale input, which will overflow the
accumulator if more than 4 accumulations are used, so maybe it is
better to use a 3 byte state. In any case, if the number of
accumulations is a power of 2, removing the unsigned offset is a
simple bit operation.

Entry: transmission
Date: Mon Oct 8 16:46:26 CEST 2007

Using hardware PWM with 8 bit resolution I can send out at 39kHz,
assuming fosc = 40MHz (40MHz / 4 / 2^8 = 39.06kHz). This is still
well above the maximal signal frequency of about 3kHz, and won't pass
the speaker, so an analog filter is not necessary. Differential drive
(half bridge) could be used.

One thing to note is that only the ECCP (enhanced) module can do
multi-channel PWM. The normal PWM is only single output, and all 28
pin chips have just 2 x CCP. The 18 pin 18f1x20 has a single ECCP,
and the 40/44 pin 18f4xxx also have one. Looks like that's quite a
limitation.. On the other hand, a CMOS inverter could be used
on-board.. Is that worth it? Probably not. A simple coupling
condenser will do the trick.

Entry: self programming 5V
Date: Mon Oct 8 18:01:24 CEST 2007

Something i just noticed in the 2620 datasheet: self-programming
works only at 5V?

Entry: no apology!
Date: Thu Oct 11 01:43:40 CEST 2007

i tried a couple of times this week to explain the "ephemeral" macro
idea, but it's just insane. i need a real solution:

- macro code needs to know whether a certain word is defined or not.
- if a partial evaluation can't be computed, the error should be:
  * none, if the corresponding library code can be found.
  * "partially implemented literal construct" or something..

what i do need to explain is "Why leaky abstractions are not
necessarily bad." This is a core Forth idea quite opposed to the safe
language ideal. I'm using a lot of that stuff, and I guess it's good
to make a list of these.

looking at the code, this 'need-literal' error only happens in 3
places: 'toggle', 'set' and 'bit?'. i just took them out: they refer
to code words now, up to the user to implement.

Entry: Purrr semantics
Date: Fri Oct 12 16:25:44 CEST 2007

As explained in Brood, there is only a single semantics for a Purrr
program: it is a compositional, purely functional language. A Purrr
program consists of a set of (recursive) macro definitions, and a
``body'' which defines a compilable function with reduced semantics.

It would be really cool if i could get rid of the explicit
'compilation' step, and make everything just declarative. What i'd
like to do is to apply this approach to scheme. Maybe that's what
PICBIT is doing?

Entry: train notes about syntax, semantics and metaprogramming
Date: Fri Oct 12 17:41:09 CEST 2007

I can identify 3 distinct uses of macros:

- control flow (begin ... again)
- optimization (1 2 +)
- explicit meta (using the words '>m' and 'm>')

The latter is actually the same as the first. The 'm' stack is like
the 'r' stack: it is used to implement nesting constructs.

conceptual problem: jumps. this can be solved by writing all jumps as
recursion and using higher order functions (combinators). together
with using only a single conditional statement, the solution is to
enable syntax for quoted macros.
this leaves:

* conditional (IF)
* quoting operation (LAMBDA)
* dequoting operation (APPLY)

The core ideas behind the macro language are:

* purely functional (no side effects)
* everything is first class
* purely compositional (no syntax)

Then, the target language should inherit as much as possible from
these properties:

* functional word subset (data stack)
* possibility of HOF (with/without closures) using byte codes?
* mostly pure compositional semantics, with a little syntax sugar

Construct a powerful metaprogramming system by starting with a pure
language, and making the transition (projection) from pure/ephemeral
-> non-pure/concrete explicit. In Purrr this is the decision to use
macros or words to implement functionality.

Is metaprogramming a form of message passing? Sending "reconfig"
messages?

MERGE TODO:

- check how PLT classes solve name space issues + use this for the
  macro namespace.
- fix macro quoting and nesting. maybe write a program as a list of
  macros instead of 1 macro as now? it's isomorphic, but possible to
  manipulate.
- don't solve nesting in the source preprocessor: that is to remain
  regular, and the parser is to be explicit (compilation 'meta'
  stack). maybe this requires a real extensible descent parser?
- check how Factor implements closures
- make interaction words extensible
- check >> and 2/ simulation and partial evaluation words

Entry: notes remarks
Date: Fri Oct 12 17:43:50 CEST 2007

There are only 2 kinds of distinct primitive macros:

- partial evaluation macros (written in the pattern language)
- nested structures (written in CAT)

Composite (recursive) macros can combine both. This seems to be the
way to explain how things are going + a way to clean up the code a
bit and reduce the number of primitive nesting macros. Apparently,
that's already accomplished.. on the other hand, these are entry
points to a type inference system..

edit: just re-implemented >c and word>c as pattern matching words.

I am in trouble: I want to explain why I diverged from explicit
Forth--style metaprogramming to move to compositional macro semantics
with partial evaluation, and why at the same time i'm not going the
full length: instantiation is still limited to a subset of the full
macro semantics. The thing is: having metaprogramming constructs in
the language disguised as 'compatible semantics' is a good idea:
explicit primitive macros can be reduced quite a lot. So what's the
question??

Entry: debug bus
Date: Fri Oct 12 17:57:41 CEST 2007

- identical clients
- ad-hoc 1-wire instead of SPI/I2C/async/...
- host = master
- binary tree-like physical structure
- cables/connectors ?
- multihost: just use shared terminal

EDIT: maybe an ad-hoc network is best avoided at first.. let's get
something simpler working before trying crazy stuff.

Entry: quoting macros
Date: Fri Oct 12 19:53:43 CEST 2007

this looks a bit like the final frontier. currently i can't write
Forth in terms of a compositional language. with the current pattern
matching language, it would be trivial to do so if i had a
representation of anonymous macros. basically i want:

  [ 1 + ] [ 1 - ] ifte

that's easy enough if '[' and ']' are part of a parser
preprocessor. however, anything defined in terms of those, like 'if'
'else' 'then', needs to be implemented as parser macros also! this
complicates things..
i see only 2 solutions:

- implement all nested words as parser words
- figure out a way to unify parsers and macros

what about this: allow the use of the syntax '[' and ']' as a macro
quoter, but write words like 'ifte' in terms of Forth, instead of the
other way around.

again: i'd like to have an explicit compilation/macro stack lying
around, however, quoted macros are nice to have. this is
non-orthogonal, but does it really matter? i don't know what to think
about this..

Entry: Haskell
Date: Sat Oct 13 15:32:18 CEST 2007

I've been looking for an excuse to use Haskell for something
non--trivial. The demodulator (and, unrelated, the iirblep filter)
might be a good problem to tackle. OTOH, the real exercise is
probably to write a prototype in Scheme, test it, and then write a
specific compiler to translate that algorithm into C or Forth. So
maybe best the demodulator in scheme (see filterproto.ss) and iirblep
in Haskell?

Entry: the purrr slogan
Date: Sat Oct 13 18:45:18 CEST 2007

in order to explain what purrr actually is, it is best to set out
these points:

* Purrr is a macro assembler with Forth syntax. It is implemented in
  a purely functional compositional macro language.

* Because of the similarity of the procedural Forth language and its
  metaprogramming language, most metaprogramming can be done by
  partial evaluation, blurring the distinction between the concrete
  procedural language and the ephemeral macro language. In a sense:
  PE is not just an optimization, but an *interface* to the
  metaprogramming language.

* The PE is implemented as greedy pattern matching macros (is this
  important?)

Entry: removed from purrr.tex
Date: Sun Oct 14 16:52:10 CEST 2007

\section{The Big Picture}

Purrr can be used in its own right, but it is good to note that Purrr
is part of the Brood system, which is an experiment to combine ideas
from Forth, (PLT) Scheme and compositional functional languages into
a single coherent language tower. Purrr can be seen as an
\emph{introspective boundary} in this language tower: the core of
Purrr is to be the basis of this language tree, but the scope of
Purrr is limited to a low--level language with Forth syntax and
semantics and some meta--programming facilities disguised as Forth
macros. For example, it is not possible to access the intermediate
functional macro representation directly from within Purrr at this
moment; this still requires extension of the compiler itself using
the Scheme and CAT languages.

This separation between the Purrr language and its implementation
serves to keep the programmer interface to Purrr as simple as
possible, while the details of the language tower are worked out to
eventually lead to a more coherent whole. Purrr by itself is
reasonably coherent, although it is somewhat limited in full
reflective power by this language barrier. Eventually, Purrr should
be just an interface (with Forth syntax) to the low level core of the
compositional language tower in Brood.

Because Purrr is implemented only for the Microchip PIC18
architecture, there is no tested \emph{standard} machine layer: most
functionality is fairly tied to the PIC18. I am confident however,
that refining the split of the current code base into a shared and a
platform specific component is fairly straightforward. Due to the
ease of creating an impedance match in a Forth--like language, I am
refraining from an actual specification of this standard layer until
the next platform is introduced. By consequence, the border between
the machine model and the library might shift a bit.
Purrr's macro system is the seed for a declarative functional
language. Such a language would have no explicit macro/forth
distinction as in Purrr.

Entry: new ideas from doc
Date: Sun Oct 14 16:52:21 CEST 2007

It looks like things are getting cleaner: by taking this partial
evaluation thing seriously, CAT primitives can be largely
eliminated. Just the words >m and m>, together with some stack
juggling words like m-swap, are enough to implement the whole
language. I just need to clean up a bit more so this idea can be
sealed as a property: no primitives except for a stack!

For documentation purposes it might now even be a good idea to write
most code in compiler.ss and pic18-compiler.ss in Purrr syntax,
leaving only the true primitives in s-expr syntax. EDIT: that's a bad
idea until the forth syntax can represent everything the s-expr
syntax can.

The remaining cleanup brings me to the backtracking for/next
implementation. With just quoted macros and a 'compile' that executes
macros, this can be removed from the primitives.

Entry: writing lisp code in emacs
Date: Mon Oct 15 01:38:51 CEST 2007

watching the slime screencast

* insert balanced paren: M-( with prefix arg

Entry: quoting macros
Date: Mon Oct 15 17:10:44 CEST 2007

Apparently, it was already implemented. I rewrote the for/next
backtracking so now it's expressed as recursive macros, except for
the part that tests the data structure constraint. I guess what i
have now is that compositional language forth dialect. The only
problem is that my Forth parser doesn't support it. I just need to
write some macros to transform code that uses literal quoted macros
into other constructs. Start with ifte:

  ;; Higher order macros.
  (([qw a] [qw b] ifte)
   ((insert (list (macro: if 'a compile else 'b compile then)))))

/me got big smile now :)

Entry: practical stuff : starting a new project
Date: Wed Oct 17 14:13:13 CEST 2007

I need to make my old 18F452 proto board work again, so this entry is
a seed for a "getting started" doc: how to get from nothing to a
working project. EDIT: i'm switching to a 18F2620, so doing it over
again.

Assumptions:

* the project is part of (your branch of) the brood distribution
* you're using darcs version control

1) Make a directory in brood/prj, and add it to darcs

     cd brood/prj
     mkdir proto
     darcs add proto

2) Copy the following files from another project, i.e. prj/CATkit,
   and add them to the darcs archive

     cd proto
     cp ../CATkit/init.ss .
     cp ../CATkit/monitor.f .
     darcs add *

3) Edit the init.ss file to reflect your project settings.

skip steps 4-6 if you have a chip with a purrr bootloader

4) Edit monitor.f for your chip

   That file includes the support for the chip in the form of a
   statement:

     load p18f2620.f

   Look in the directory brood/pic18 to see if such a file exists. If
   it does, go to step 5). If not, you need to create one and
   generate a constants file from the header files provided by
   Microchip. I.e.:

     cd brood/pic18
     ../bin/snarf-constants.pl \
        < /usr/share/gputils/header/p18f2620.inc \
        > p18f2620-const.f

   The .INC file can alternatively be found in the MPLAB
   distribution, in the MPASM directory.

   Now you need to create the setup file for the chip. Start from a
   chip that is similar

     cp p18f1220.f p18f2620.f

   And edit the file to reflect changes necessary for chip startup
   and serial port initialization. Don't forget to add the files to
   darcs, and send a patch!
     darcs add p18f2620*.f
     darcs record -m 'added p18f2620 configuration files'
     darcs send --to brood@zwizwa.be http://zwizwa.be/darcs/brood

   In case you can't send email from your host directly, replace the
   "--to brood@zwizwa.be" option with an "--output darcs.bundle"
   option and send the resulting darcs.bundle file.

5) To compile the monitor, in the interactive console type this:

     project prj/proto
     scrap

6) Make a backup copy of the monitor state

     cp prj.ss monitor.ss

   And flash the microcontroller using the monitor.hex file. In case
   you're using the ICD2 together with piklab, the command line would
   be:

     piklab-prog -t usb -p icd2 --debug --firmware-dir <dir> \
        -c program monitor.hex

   Here <dir> is the directory containing the ICD2 firmware, which
   can be found in the microchip MPLAB distribution.

7) Next time you start the console, go back to the project by typing:

     project prj/proto

8) Now you can start uploading forth files using commands like:

     ul file.f

   This will erase the previously uploaded file and replace it with
   the new one. If you want to upload multiple files, use the 'mark'
   word after upload to prevent deletion:

     ul file1.f
     mark
     ul file2.f

   Now the next 'ul' will erase file2.f before uploading a new
   file. To erase files manually, use the 'empty' word.

--- LIVE MODE ONLY ---

  bin/purrr
  project prj/CATkit
  ping

Entry: this is a simultaneous fix/todo log for the previous entry
Date: Wed Oct 17 14:28:38 CEST 2007

- add default entries to dictionary on init
- single baud rate spec? mine it from forth source, or the other way
  around..
- standard naming for the state file?
- for chips that come with a bootloader: need to save the pristine
  file
- fix state file rep so it is a standard s-expression tagged with
  'project'
- fix absolute path
- add 'serial' tag to port
- add a 'chip erase' or a fake one using "mark empty"

the 3 different state files:

- init.ss     "most empty" state
- monitor.ss  state file of bootloader only
- prj.ss      current state

these names are set as default, but can be overridden. ok. done.
'monitor.ss' is never written by the application, so ppl with just a
monitor.ss file can revert to just that file (not implemented yet).

Entry: operations on dictionaries
Date: Wed Oct 17 15:34:26 CEST 2007

I'm trying to factor the dictionary operations a bit. I already ran
into 'collect' which takes a list of tagged pairs, and collects all
occurrences for each unique tag. Doing this stuff purely functionally
becomes difficult if performance is an issue: naive algorithms are
quadratic. Hash tables could accelerate this. It seems overall that
mutation is the thing to choose here..

Trying to write these hierarchical combination things i'm getting
convinced that it's a bit of a mess.. (name . value) pairs are well
defined, but hierarchical structures require polymorphy. To make the
analogy with ordinary functions, basically you're dealing with a
function that maps a value to a value OR another function.. Maybe the
whole abstraction is broken? I need to think about this.. something
profound seems to be hidden here. I'm going to hack around it for
now.

I think I get it.. and it's trivial again. A hierarchical hash table
(HHT) is an implementation of a finite function which maps tag
SEQUENCES to values. All operations on HHTs have the semantics of
operations on finite functions. From this it follows that paths need
to be created if a value is stored. It doesn't make sense to have to
create the directory before storing a value. Otoh, storing a value in
a tag sequence where one of the top nodes is not a hash is an error.
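a minimal sketch of exactly that semantics (modern racket hash table
names, for illustration; not the actual brood code):

  ; hierarchical hash table: a finite function from tag sequences to
  ; values. storing creates intermediate nodes on demand; hitting a
  ; non-hash in the middle of a path is an error.
  (define (hht-set! hht path value)
    (if (null? (cdr path))
        (hash-set! hht (car path) value)
        (let ((node (hash-ref hht (car path) #f)))
          (cond
            ((hash? node) (hht-set! node (cdr path) value))
            ((not node)   ; create the path on demand
             (let ((new (make-hash)))
               (hash-set! hht (car path) new)
               (hht-set! new (cdr path) value)))
            (else (error "not a node:" (car path)))))))

  (define (hht-ref hht path)
    (if (null? (cdr path))
        (hash-ref hht (car path))
        (hht-ref (hash-ref hht (car path)) (cdr path))))

  (define d (make-hash))
  (hht-set! d '(macro pic18 dup) 'some-macro) ; creates (macro pic18)
  (hht-ref  d '(macro pic18 dup))             ; => some-macro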
Entry: PIC write protect
Date: Wed Oct 17 20:20:23 CEST 2007

write protection works well and all, but i can't get it undone! i
think it works without problem in mplab, but using the piklab
programmer, erasing the chip doesn't seem to work... what is needed
is a full chip erase. it doesn't look like piklab is doing this
correctly. on to installing mplab again..

OK i got it: memory that's protected requires a BLOCK ERASE, and such
an operation needs Vdd > 4.5V

Entry: macro nesting
Date: Thu Oct 18 14:15:39 CEST 2007

time for the hairy problem: the syntax-rules -> syntax-case
equivalent for macros. what do i need?

there is only one decent way of doing this: use scheme
metaprogramming. i like forth and all, but for numeric stuff, it's
just easier to have variable names.. let's invent some new construct:

  \ load a scheme file implementing macros
  load-scheme filename.ss

been hacking a bit, but i need a plan..

* s-expression files contain scheme expressions, not forth files with
  s-expression syntax. this effectively needs a scheme parser down
  the line, something that can convert the inline atoms to a proper
  invocation.

what about this: make it possible to load plt modules from
forth. modules are stored as a single s-expression. hmm... again..
some questions:

* how to store a module definition in the state file, so it can be
  instantiated?

all macros in the '(macro) dict get evaluated using def-macro!, which
does:

  (define (def-macro! def)
    (ns-set! `(macro ,(car def))
             (rpn-compile (cdr def) 'macro:)))

rpn-compile evaluates `(macro: ,def) so this won't work to store
modules. it's probably best to represent the macros differently in
the state file, so it's just scheme code, and then create a module:
evaluator. that's the main problem:

* how to store the source of things that generate macros, in this
  case a scheme module, so they can be re-instantiated from the state
  file.
* do this without introducing ANY limit on what can be included in
  the scheme file.
* without introducing yet another special case. in fact it's probably
  better to remove a special case driven by this requirement.

let's go back to how macros are parsed. ok. they are included as a
(def-macro: . ) expression in the atom stream. i guess this needs to
change to include a (def-module: . ) form. why not change the
def-macro: thing to a more general def-scheme: syntax?

  (def-macro: name . body) -> (def-scheme: (def-macro! name body))

or.. have def-macro! support modules. i guess that's the simplest
way. ok.. changed the tag to "extend:" and changed the function that
implements the extension to "extend!"

i'm running into some bad behaviour.. need to formalize

Entry: forth translation
Date: Thu Oct 18 16:58:21 CEST 2007

Time to formalize the forth parsing. Some notes:

- it's actually just a lexer: no nested structures are handled in
  this stage: all is passed to the forth macros, which use the macro
  stack to compile nested structures.
- FILE: the first stage does only file -> stream conversion. this
  includes loading (flattening the file hierarchy)
- PARSE: the second stage does 'lookahead' parsing: all
  non-compositional constructs get translated to compositional
  ones. this also includes macro definitions.

The problem I run into is the FILE stage, which also needs to inline
scheme files, but gets messed up by the forth parser. I just need to
tag them differently.

Entry: error reporting
Date: Thu Oct 18 22:26:25 CEST 2007

using 'error' instead of 'raise' is a good idea since continuation
marks are passed.
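for reference, the mechanism in a nutshell (racket names as they are
today; illustration only):

  ; 'error' raises an exn structure that carries the continuation
  ; marks at the raise point, so a handler can recover something
  ; resembling a trace:
  (with-handlers
      ((exn:fail?
        (lambda (e)
          (continuation-mark-set->list
           (exn-continuation-marks e) 'word))))
    (with-continuation-mark 'word 'outer
      (list (with-continuation-mark 'word 'inner
              (error "time-out")))))
  ; => (inner outer)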
the rep.ss struct marks CAT words, so something resembling a trace
can be printed. the cosmetics can be done later, this is good enough
for now. done. maybe i want to convert some exceptions that are clear
enough back to raise so they don't print a stack trace.
(reserved-word time-out)

Entry: hardware prototyping
Date: Fri Oct 19 11:22:00 CEST 2007

TODO:

- sine wave generation
- debug network
- connect a modulator and a demodulator

the first one seems rather trivial to me, so let's do the network
today. first thing is to give up on an ad-hoc bus: that's ok for
uni-directional stuff, but bidir is a pain. so let's go for something
standard.

i got the samples in yesterday. got them running on the breadboard
with intosc. if we can pull off the project on 8MHz, we can run on
2xAAA cells: the 18LF2620 needs only 2V, but it needs 4.2V @
40MHz. i'm going to stick to intosc for now.

next: I2C

* 2 lines are used: RC3 = clock, RC4 = data. these need to be
  configured properly by the user. on the 18F2620 their only other
  function is digital IO.
* registers:
  - SSPBUF = serial data I/O register
  - SSPADD = device address
  - SSPCON1, SSPCON2, SSPSTAT = control registers
* errors:
  - write collision
* firmware controlled master mode: seems it's just more work, so
  never mind..

Entry: TODO
Date: Sat Oct 20 13:30:29 CEST 2007

- get I2C working between 2 18F2620 chips on the breadboard at
  intosc, as fast as possible.
- fix purrr.el : stupid broken indentation is annoying the hell out
  of me. clean up the file first, then automate the indentation rules
  generation etc..

Entry: message passing interface
Date: Sat Oct 20 14:00:30 CEST 2007

Since I2C is a shared bus architecture, care needs to be taken to
place operations in a sane highlevel framework. The interface i want
is asynchronous message passing. Messages should either be bytes, or
a sequence of bytes (in which case 'message' contains the size, and
the a/f regs contain the message)

  message address i2c-send

Let's suppose for now there is only a single process per machine, and
build multiple process dispatch on top of single process. To do this
bi--directionally, an event loop needs to poll for
messages. Dispatching of highlevel messages (internal addresses) can
be done as a layer on top of single message passing.

So i need a send and a receive task, and i have to make sure they
don't collide:

* it's always possible to RECEIVE, so that should be the background
  task. this simply waits until a message arrives.
* it's only possible to SEND if the bus is free, so a SEND might
  block.

The problem is that a message might come in while waiting to send out
a message. Therefore messages need to be queued. The moral of the
story: a send can never block a process, only a receive can.

So what is a task? It is a function that maps a single input message
to zero or more output messages. The output can be zero in a
meaningful way, because the task has internal state. So basically, a
task is a closure, or an object. The driver routine can be a single
task, since the hardware is half-duplex. See pic18/message.f for the
implementation attempt.

Something to think about: the ISR needs to be completely decoupled
from the tasks that generate output messages. This is the whole point
of buffering: if there is straight line code from RX interrupt ->
computation task, the tasks that might run a long time will not be
pre-empted. So: the RX ISR and the dispatch loop are distinct.

what it looks like (yes i need to pick up hoare's book again..)

\ Message buffering for a shared bus architecture.
\ The topology looks
\ like this:
\
\                            wire
\                             |
\                             | G
\          A             E    v     F
\ wire ----> [ LRX ] ----> [ LTX ] ----> wire
\               |             ^
\ . . . . . . . | B . . . . . | D . . . . . . .
\               v C           |
\             [ HRX ] ----> [ HTX ]
\
\ Code above the dotted line runs with interrupts disabled, and
\ pre--empts the code below the line. Communication between the two
\ priority levels uses single reader - single writer buffers. The 6
\ different events are:
\
\ A) PIC hardware interrupt
\ B) RX buffer full condition
\ C) TX buffer full condition (execute task which writes to buffer)
\ D) wakeup lowlevel TX task from userspace
\ E) wakeup lowlevel TX task from kernelspace
\ F) PIC hardware send
\ G) wakeup lowlevel TX task from bus idle event
\
\ A task is an 'event converter'. The 4 different tasks are:
\
\ LRX) convert interrupt (A) to rx buffer full (B) and tx wakeup (E)
\ HRX) convert rx buffer full (B) to tx buffer full (C)
\ HTX) convert tx buffer full (C) to tx wakeup (D)
\ LTX) convert wakeup (data ready: D,E) to hardware send.
\
\ The pre--emption point is A: this causes no problems for the
\ low--priority task because of the decoupling provided by the
\ receive buffer. The only point that needs special attention is the
\ LTX task, which can be woken up by the different events D, E and G,
\ and care needs to be taken to properly serialize message
\ handling. To do this, both D and E should invoke LTX with
\ interrupts disabled. For E this is trivial: just call the LTX task;
\ for G it is already ok since it's an isr, so D needs to explicitly
\ disable interrupts.
\

Entry: todo today
Date: Sat Oct 20 15:52:49 CEST 2007

- write highlevel buffer code and try it out with the current serial
  before moving to I2C
- write mini 'hierarchical time' tutorial for sheepsint
- check mail just sent to technocore for details of the next couple
  of days.

haha.. did none of them :) i suck at planning. what i did do is to
write a synth tutorial that is an introduction to the hierarchical
time thing + some explanation of a pattern language. what this doc is
leading me to is the need for some kind of dynamic variable binding
for code words: i already have 'hook.f' but something more general
should be used. something which directly deals with variables.

Entry: re-inventing C++
Date: Sat Oct 20 16:11:22 CEST 2007

i'm running into the need for polymorphy: i want to express generic
algorithms in a sane way. because of the philosophy of purrr, this
has to be done in a static way, with dynamic built on top of that
later maybe. oops, this is going to lead to a whole lot of doubts
about namespace management.. Let's concentrate on the practical
issues first. EDIT: i'm going for name mangling.. see below.

Entry: hierarchical time
Date: Sat Oct 20 18:13:38 CEST 2007

One thinking error i made is: if a note word is SYNC followed by
CHANGE, then you can't compose words that start at the same sync. as
a result, SYNC needs to follow CHANGE, and the toplevel invocation
needs to provide proper synchronization.

Entry: the 'i' stack
Date: Sat Oct 20 23:14:45 CEST 2007

what about this: i'm using an extra byte stack, and 'x' is a symbol
that's useful in other contexts.. why not call the stack the 'i'
stack, since it's already used as a loop index in for .. next loops?
hmm.. great idea, but not really feasible without an automated
identifier replace.. it's everywhere.

Entry: dynamic words
Date: Sun Oct 21 00:50:39 CEST 2007

basically, i need to find words to properly handle execution tokens.
there are 3 uses for a symbol related to dynamic code:

* declare
* invoke
* change behaviour

if it's a variable, invocation will be explicit. because i don't want
the thing on the stack, an extra level of indirection should do it:

  2variable BLA
  BLA invoke
  : changeit BLA -> ...... ;

another possibility is to use a parser word, which i'm not so keen on
using. which syntax is better depends on the usage: do invocations
dominate, or do behaviour changes? i used the "->" word in ForthTV to
set the display task: that's a single vector, invoked in only one
place, but mutated in a lot of places. let's go for this
approach. results in vector.f (hook.f is basically the same, left
there for forthtv)

Entry: todo
Date: Sun Oct 21 16:15:32 CEST 2007

- hierarchical time
- highlevel buffer code (requires some polymorphy)
- tonight: fix purrr.el, clean up stuff in doc/

Entry: hierarchical time
Date: Sun Oct 21 16:37:48 CEST 2007

so what's the problem? you want to have a class of words which "snap
to" a timing grid, but you want to be able to call a collection of
fine scale words from coarse scale words, without messing up the
sync. the problem is that if you do:

  : foo 8 sync-tick bar bar ;
  : bar 7 sync-tick .... ;

there are too many waits: "8 sync-tick" followed by "7 sync-tick"
waits for the next 7-scale tick. somehow the sync word needs to know
that the current time is already ok. either:

* assume that the caller does the outer bounds, and have callees do
  only subdivision. this works, but is cumbersome.
* find a way to see that we're running synchronized.

how can a 0->1 transition in bit n be recognized in the bits < n?
they're all 0. but that's not very helpful. damn i need coffee.

the question to ask is: did we recently sync? this can be answered by
copying the whole counter register to some place, and computing the
diff. this also allows triggering on edges.

what about this: use some dynamic scoping for syncing. there is only
one word 'sync' which will synchronize on clocks given the current
time scale. for each time scale one needs a word that can compute the
current phase count. this needs a bit offset and the last sync
point. the bit offset might be easily stored as a bit pattern.

global: the counter, the last sync point

  \ compute time difference from last saved sync point, using mask to
  \ ignore fine scale.
  : sync-diff sync-counter @ sync-last @ - sync-mask @ and ;
  macro
  : sync-inphase? sync-diff nfdrop z? ;
  forth

actually that doesn't solve anything.. it's quite easy to wait until
a condition changes, but it's a lot less easy to determine whether
the condition just happened. really, the only thing i see is to have
patterns like this:

  _|_|_|_

which can be nested in larger scale patterns like

  _______|_______|_______|_______
  _|_|_|_ _|_|_|_ _|_|_|_ _|_|_|_

there the first and last syncs are removed, and only the subdivision
is synced to. it's then the responsibility of the caller to turn
things on and off. it looks to me that this is a real pain to work
with.. maybe i should just write a couple of words and see if it's
actually sane to get something working..

one thing i thought about was to bind the current sync level to the
word "|"

  : hihat [[ noise 10 for | next ]] ;

where the [[ and ]] save and restore the synth config on the x
stack. that's 7 bytes per level, which is a bit too much probably, so
stick with manual saving/restoring.

ok. there really is only one decent solution: escape continuations.
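in scheme terms the idea is just this (call/cc sketch; on the target
this needs real multitasking support):

  ; run a forever-looping 'virtual sample' word, but let the caller
  ; decide how long it lasts: abort via an escape continuation.
  (define (run-for-ticks word ticks)
    (call-with-current-continuation
     (lambda (escape)
       (let loop ((t 0))
         (when (= t ticks) (escape 'done))
         (word t)                ; one synchronized step
         (loop (+ t 1))))))

  (run-for-ticks (lambda (t) (display t) (display " ")) 8)
  ; prints 0 1 2 3 4 5 6 7, then => done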
in order to make proper use of synchronization, the caller needs to
indicate how long a word is allowed to last. now, instead of that,
think of there being only one voice at all times, which simply
accepts events from a separate entity. so the synth looks like:

  [ CONTROL ] -> [ VIRTUAL SAMPLE PLAYER ] -> [ CORE SYNTH ]

each virtual sample is a word that loops forever. this requires
multitasking.

Entry: generic functions
Date: Tue Oct 23 16:09:09 CEST 2007

When trying to implement the buffer algorithm, i ran into the need
for abstract objects: each buffer (queue) is going to have the
following interface:

  read
  write
  read-ready?
  write-ready?  (maybe.. in case a buffer-full condition is used..)

I have enough with a static object system: anything dynamic has to be
handled explicitly on top of that using byte codes (route) or
vectored words. So what is needed is simply a static (compile time)
method dispatch.

Should there be special syntax for messages, or do we just use a
single flat namespace, with some words dedicated as messages? For
example: 'read' could be such a message: always requiring a literal
object. This seems simplest, let's try that first and change it if it
is not appropriate. So:

- WHERE is 'read' defined
- HOW is 'read' defined

Suppose we use a 'method' keyword for creating new methods. This
probably trickles down to making the parser also generic. Let's use
CLOS terminology.

So what am I doing? I am providing a means for static namespace
management so I can write generic algorithms (as macros). As of this
point NO effort is made to implement dynamic generic algorithms: this
should be built on top of the static version. My approach is going to
be very direct: if more abstraction is needed i will fix it
later. Currently multiple dispatch is not yet implemented. The
interface should be:

  class BLA                      \ create a new object (a macro namespace)
  method FOO                     \ declare a new method
  object BLA method: FOO ... ;   \ define a new method FOO of object BLA
  BLA FOO                        \ invoke method FOO for object BLA

So, how to implement.. This was the easy part:

  ;; Dictionary lookup.
  (([qw tag] [qw dict] dict-find) ([qw (dict-find dict tag)]))

Now the thing to do is to store the dictionary somewhere. This has to
mesh with the macro definition part of purrr.. let's see (using
s-expr macro definition syntax on the rhs)

  class BLA   ==  (BLA '())
  method FOO  ==  (FOO 'FOO dict-find compile-message)

Here 'compile-message' depends on what exactly is stored in the
dictionary: macro objects or a mangled symbol. It's tempting to just
go with symbol mangling: that way ordinary syntax can be used, and
the interface to the rest of the language is really straightforward.

Let's go for the simple symbol mangling, which doesn't even need
dictionaries: a class is a collection of methods. classes are
identified by a symbol. a method is a macro which dispatches to
another macro based on the symbol provided.

  class BLA    ==  (BLA 'BLA)
  method FOO   ==  (FOO 'FOO dispatch-method)
  : BLA.FOO ... ;
  FOO BLA      ==  'FOO 'BLA dispatch-method

Entry: problem in macros defined in forth syntax: quote doesn't work properly
Date: Tue Oct 23 17:43:02 CEST 2007

suppose i want this:

  : broem ' broem ;  ==  (broem 'broem)

how to do that? currently this just gives an infinite expansion
because the quote is not recognised. why? because inside the
'definition' parser, the parsing words won't work.. this is probably
a good thing, but quote does need to work.. let's separate parsing
words from quote parsing. the lex stream should be made a bit more
clear.
  FORTH -> [load flattener]
        -> [forth stuff: parsing words + definer environments]
        -> [quoting] -> SEXP

Entry: locals for macros?
Date: Tue Oct 23 20:08:46 CEST 2007

Once more than 50% of a macro's code is stack juggling words,
something needs to be done about it. The macro below is a typical
'multi-access' pattern: an EXPANSION instead of a CONTRACTION.

  \ transfer bytes from one object to another
  macro
  : need not if exit then ;
  : m m-dup m> ;
  : transfer-once \ source dest --
      swap >m >m
      ' ready? m msg need m-swap
      ' ready? m msg need m-swap
      ' read m msg m-swap
      ' write m msg m-swap
      m-drop m-drop ;
  forth

What i really want is a locals syntax for macros that perform a lot
of expansion:

  : transfer-once { src dst }
      ' ready? src msg need
      ' ready? dst msg need
      ' read src msg
      ' write dst msg ;

The macro system already has a syntax for locals, so i just need to
add this to the parser + choose the right semantics (code or data).

EDIT: also, what about just . (dot) for the name binding operation?

Entry: locals
Date: Tue Oct 23 21:53:04 CEST 2007

Actually i did this before. I guess in brood-2 there's a syntax that
takes words like this:

  (a b | a b +)

Resembling Smalltalk's syntax for anonymous functions. i just saw
Factor also uses the vertical bar. What i could do is to combine this
with my special quoting syntax:

  (a | a)   ==  execute
  (a | 'a)  ==  identity

Following the rationale that words are mostly functions, and constant
functions are the exception. This kind of syntax took me a while to
get used to, but it makes a lot of sense: it has led to a lot of
simplified mixing of scheme and cat code. So what about combining
that with destructuring?

  ((a . b) | 'a 'b +)

Hmm.. Let's leave that as an extension. There's no reason not to
however..

I think I need a dose of good old fashioned confidence to go for the
quoted approach. What is more important: to stay true to the fact
that symbols are functions, or to go for the lambda-calculus approach
of using symbols as values + explicit application? Even though it
looks strange, the issue is: do i stick with my previous realization
that this is a good thing despite its strange look?

so the choice is either (classic):

  (a b | a b +)    ==  +
  (a | a execute)  ==  execute

this has the interesting property that permutations are easily
expressed. or do i go with my approach:

  (a b | 'a 'b +)  ==  +
  (a | a)          ==  execute

What I could do is to use 2 forms of binding, and i guess that's what
i did before: have | do the stuff above and || do the normal thing,
or the other way around.

  (a : a)   ==  execute
  (a | a)   ==  id
  (a : 'a)  ==  id

using the ':' has the added benefit of reminding you of a
"definition".

Entry: lambda
Date: Wed Oct 24 23:11:02 CEST 2007

Having had a night to sleep on it, i think it's going to be:

  (a b | a b)  ==  id

* Lambda is simply too important to gratuitously do differently.
* Data parameters are used more than function parameters, which in
  turn are easily quoted.
* It is compatible with the current stack comment notation.

Entry: implementing lambda
Date: Thu Oct 25 13:20:50 CEST 2007

apparently i need to be careful where to introduce local variables in
the syntax expansion. as long as there's a lambda expression
enclosing a (xxx: a b c) macro, all lexical variables are identified
properly, but in this case they are not:

  (define (bar? x) (eq? '\| (->datum x)))

  (define (represent-lambda c source)
    (let-values
        (((formals pure-source)
          (split-at-predicate bar? (syntax->list source))))
      #`(make-word
         '#,((c-language-name c) c)
         (quote #,source)
         (lambda #,(if (null?
                         formals)
                        #'stack
                        #`(#,@(reverse formals) . stack))
           #,(fold (lambda (o e) (dispatch c o e))
                   #'stack pure-source)))))

the 'dispatch' operation doesn't recognize lexical variables yet,
because the enclosing lambda macro hasn't updated the symbols.. so
lambda syntax should be introduced at a higher level. i need a
shortcut, only for macros, and then work up the abstraction if
necessary. the thing to extend is the 'macro:' form itself.

hmm.. i'm making a bit of a mess of it.. the lexical scoping for the
macros is a bit special, and is probably best handled using the
pattern matching transformer stuff: the lexical variables in macros
should be bound to literal arguments in the assembly buffer.

  (a b | a b +) -> (([qw a] [qw b] it)
                    (insert (list (macro: 'a 'b +))))

which is really awkward in the current composition.. it's probably
easiest to make a special purpose matching word as a straight lambda
expression. something like:

  (match stack
    (((('qw b) ('qw a) . rasm) . rstack)
     (let ((a (literal a))
           (b (literal b)))
       (apply (macro: a b +) (cons rasm rstack)))))

Actually.. This is quite universal, except for WHERE to find the
arguments.. Anyways, let's get on with it.

  (make-word 'macro-lex: '(a b \| a b +)
    (match-lambda*
      (((('qw b) ('qw a) . rasm) . rstack)
       (let ((a (macro: 'a))
             (b (macro: 'b)))
         (apply (macro: a b +) (cons rasm rstack))))))

The first macro using lexical variables, in synth-soungen.f:

  macro
  : sync bit | \ --
      begin yield bit tickbit low? until
      begin yield bit tickbit high? until ;
  forth

Subtle, ay :)

Entry: theory
Date: Thu Oct 25 21:07:55 CEST 2007

in order to finish brood.tex, it looks to me that type theory is not
really the most important thing to brush up on: partial evaluation
is. there's a lot of stuff here:

  http://partial-eval.org/techniques.html

i need to give it some proper attention, if only to relate my
intuitions to things people have spent some thought on.

Entry: multiple exit points
Date: Thu Oct 25 21:48:31 CEST 2007

instead of writing macros containing 'exit', which are really a
loaded gun, it might be better to write a proper while abstraction
that uses multiple conditions. unfortunately, an 'and' is not very
easy to optimize..

  macro
  : need not if exit then ;
  : m m-dup m> ;
  : transfer; src dst | \ --
      begin
        ' ready? src msg need
        ' ready? dst msg need
        ' read src msg
        ' write dst msg
      again ;
  forth

why is this complicated: because i don't want to use 'and'. what i
want is a word 'break' which breaks from a loop on a condition. maybe
'transfer;' is good enough: since i already have arbitrary WORD exit
points, i can use this to get any control structure exit point: it
also prevents juggling of the control stack (macro stack).

Entry: move
Date: Thu Oct 25 22:19:09 CEST 2007

for this i need 2 pointer registers. thing is: i'd like to use the x
stack's register to do this a bit efficiently, but then i can't use
for .. next ! implementation detail anyway..

Entry: buffers
Date: Fri Oct 26 13:04:40 CEST 2007

next up are data buffers. i have some code that uses 14 byte buffers
together with some dirty trick of storing read/write pointers in one
byte for easy modulo addressing. i could dig that up again?

what is a buffer?

- 2 pointers: R/W
- base address of memory region (statically known)
- size (statically known)

suppose i represent it as 2 literal values: rw-var offset

see buffer.f for draft (committing now) but.. isn't it wise to write
some code for generic 2^n buffers? where a buffer consists of 2
variables and a mask indicating its size. ok, did that but it leads
to more verbose code.
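for reference, what the generic 2^n buffer boils down to, as a scheme
model of the semantics (illustration only, not the forth code):

  ; 2^n circular buffer: read/write indices free-run and are only
  ; masked on access; the difference (mod index width) gives the fill
  ; count, so no extra cell is needed to tell empty from full.
  (define (make-buf n)  ; size 2^n
    (vector (make-vector (expt 2 n) 0)  ; memory
            (- (expt 2 n) 1)            ; mask
            0 0))                       ; read / write indices
  (define (buf-used b)
    (- (vector-ref b 3) (vector-ref b 2)))
  (define (buf-write! b byte)
    (let ((w (vector-ref b 3)))
      (vector-set! b 3 (+ w 1))
      (vector-set! (vector-ref b 0)
                   (bitwise-and w (vector-ref b 1)) byte)))
  (define (buf-read! b)
    (let ((r (vector-ref b 2)))
      (vector-set! b 2 (+ r 1))
      (vector-ref (vector-ref b 0)
                  (bitwise-and r (vector-ref b 1)))))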
a different strategy could be to store the read pointer or the
difference at the point where W points; this saves a cell that's
normally used to distinguish between empty and full. hack for later..
anyways, i'll stick with the current one: it's probably good
enough. i need to move on. nibble-buffer.f tested.

Entry: 0= hack
Date: Fri Oct 26 13:29:59 CEST 2007

i'd like to figure out a way to efficiently implement the 0= word,
which turns a number into a condition. the problem is that 'drop'
messes up the zero flag, so i used a 2-instruction movff trick
before.. but using drop should be possible when using the carry
flag. hmm.. nfdrop is only 2 slots.. i don't think i can do better
really.

Entry: I2C comm
Date: Fri Oct 26 16:16:20 CEST 2007

how to get this going? the typical 'debug the debugger' problem: I2C
is going to be used for the debugging network, but until that works..

master/slave: to preserve symmetry, it might be wise to use a
dedicated single master node which runs debug code, so all the kriket
nodes can be identical (slaves). ideally, all cricket chips are free
from ICD2 and SERIAL ports, and have only power, ground, and I2C
clock and data.

send/receive: let's stick with the ordinary monitor protocol over
I2C. the thing to do is to make a hub.

Entry: SD-dac
Date: Fri Oct 26 22:07:00 CEST 2007

A Sigma-Delta Modulator (SDM) can be thought of as an
error-accumulation DC generator: given a constant input, it will
generate the correct average DC output, with a quantization error
noise spectrum that is high--pass. A first-order SDM is an extremely
simple circuit: it consists of an accumulator with carry flag output:
at each output instance, the current output value is added to the
accumulator, and the resulting carry bit is taken as the binary
output, and discarded.

I had this idea of running an 'inverse interrupt' machine: instead of
losing time in the ISR, just run an infinite loop, but allow at each
instance one primitive to run, which needs to spend an exact number
of cycles. Probably not worth the hassle, but could be interesting
for a really tight budget. Anyways, this could be an alternative to
PWM for kriket sound generation. It should in theory give better
quality, but probably that also needs a deeper accu. With the fast
interrupt it's only 3 instructions:

  movf  OUTPUT, 0, 0
  addwf ACCU, 1, 0
  rlcf  PORTLAT, 1, 0

assuming it's bit number 0 in the port, and the rest of the bits we
don't care about (i.e. are inputs). the problem here of course is
that it's not just output that counts: the output also needs to be
computed. Looks like it's not really worth it. Best to use a PWM
interrupt with plugin generator code. At 2MHz, getting the carrier
above audible frequencies would put the divider at 64, and the
carrier at 31.25 kHz. (The interesting thing here is that it could
also be used for bit-bang midi output at the same time :)

To get this going: best to add a small modification to sheepsint to
switch it into PWM mode.

Entry: FM sheep
Date: Fri Oct 26 23:16:24 CEST 2007

ok.. let's see what's necessary to make an FM (PM) synth in the style
of the Yamaha oldies. using a proper synchronous fixed time sharing
approach a lot is possible:

  1. 31.2 kHz  1 x SDM output
  2.  7.8 kHz  4 x 8 bit synth voices
  3.  9.7 kHz  32 x envelopes

for all this i have 64 instructions. one envelope per operator is
more than enough. i've been checking out the code for table lookup,
and it can be brought down to 4 instructions

  movf  PHASE
  movwf TBLPTRL
  tblrd
  movf  TABLAT

but i doubt if 8 bit phase resolution will be enough..
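a quick scheme model of that lookup loop, to see what 8 bit phase
buys (sketch; names are made up):

  ; 8 bit phase-accumulator oscillator: phase and increment are
  ; bytes, the table has 256 entries.
  (define two-pi (* 8 (atan 1)))
  (define table
    (build-vector 256
      (lambda (i) (round (* 127 (sin (* two-pi (/ i 256))))))))

  (define (osc inc samples)
    (let loop ((phase 0) (k 0) (out '()))
      (if (= k samples)
          (reverse out)
          (loop (modulo (+ phase inc) 256) (+ k 1)
                (cons (vector-ref table phase) out)))))

the frequency is inc * fs / 256, so at the 7.8 kHz voice rate an 8
bit increment quantizes pitch to steps of about 30 Hz, which is
indeed coarse for a musical scale.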
Entry: hub board
Date: Sat Oct 27 15:07:51 CEST 2007

make a hub board, first for serial, then for I2C. the idea is that a
hub board can be placed in between a normal serial board and a PC
host: its only goal is to provide control over the serial slaves. the
condition is that all slaves have identical code, which means the
host can indeed switch without problems between different slaves:

  [ PC ] --- [ HUB ] === [ S1 ] === [ S2 ] === ...

requirements:

* the interface that implements this should be transparent: there
  should be no need for calling code on the hub directly. (except for
  debugging the hub, where the host has just the hub's dictionary).

to do this, i suggest we use the next slot of 16 interpreter commands
to pass through monitor commands to the hub.

again: if i manage to get things working this way (async serial hub)
i have no need for I2C to do networking.. in fact, in order to get
I2C working i'd better build a proper debug network! and more: if i
get this serial passthrough to work, moving to a synchronous 1-wire
approach should be no problem. ok, i have 50 solutions now.. TODO:

- make it work for serial = standard
- use serial to bootstrap 1-wire
- MAYBE use I2C after that, probably too complicated

Entry: 1-wire revisited
Date: Sat Oct 27 15:47:27 CEST 2007

yes, why not.. it's a cheap hack but might be worth it. and i already
have provisions for it on the CATkit board, so the solution should be
re-usable. (CATkit: COMM is RA4).

let's stick to the ordinary monitor protocol with RPC semantics:
(host asks question, client responds / acknowledges). this is already
half-duplex, so it fits nicely in a shared bus context. a simple
start bit, 8 data bits, stop bit could be used for comm using the
following waveform:

  1 1 X 0
  1 1 X 0

with X the 'shared bus' point, we can have a bidirectional link:

* there's always power in a cycle (at least 50%)
* bus is high when idle
* there's a sync point 0->1 for slave sync
* the send/receive is under software control

the protocol could be something like:

* master just sends (start bit, 8 data bits, stop bit)

for the CATkit board, the sync could replace the fixed TMR2. let's
try the following:

* fix CATkit's no-serial cable detection. (OK)
* drive a CATkit board with a square pulse
* use TMR2 to perform timed read or write

next: config RA4

* open drain output (needs external pullup - master side?)
* does have protection diodes to both sides

so, in theory it should be possible to feed the chip through the
protection diodes.. but as far as i can see, it doesn't boot
properly. after adding a diode RA4 -> VDD it boots on DC. i don't
understand..

so, on to the controller. from the host side, everything is
synchronous, so timing should not be an issue. driving a couple of
busses in parallel poses no extra problems.

hardware: the dallas 1-wire bus apparently drives the targets through
a resistor, instead of a transistor. i was wondering how to prevent
hazards on the bus, and this is probably it: brief inspection shows
that a faulty client can easily bring down a network by shorting
during the charge phase. a resistor also limits the charging
current. so i guess resistors are good. (i wonder if the weak
pull-ups can perform this task.. probably better not.) the pic has
quite a large maximum current sink (25 mA), which would determine the
minimal size of the pullup resistor, i.e. at 5V the minimum is R =
200 ohms.

simplifications WRT dallas 1-wire:

* one slave per wire: no elaborate synchronization protocol
  necessary: all flow control is done in software using the purrr
  protocol.
  (the host initiates a transfer by sending a couple of bytes and
  waits for the reply)

* multiple slaves: they need to be addressed. in that case, some
  protocol is necessary. i.e. addr = 0: broadcast, no
  reply. otherwise: address followed by a couple of data bytes.

* can use a 4-phase regime 10XY, where the receiver samples in
  between X and Y.

* in case no comm is needed: master leaves the line high: no
  unnecessary drain from pulling the resistor low.

using RA4 on the 18F1220. and for sending? can probably use an
18F1220 as a hub too, if it uses just one output. which output to
use? only RA4. maybe one bus is really enough? this way i could use
simple RCA splitter cables to build a network.

ok, i thought i needed an open drain output. apparently not: just
switching between 0 and Z is enough.

Entry: CATkit/krikit debug board
Date: Sat Oct 27 21:39:41 CEST 2007

* in debug mode: one bidirectional power/clock/data line per slave
  (raw byte protocol: no address). this makes it a drop-in for the
  normal async serial io for the monitor. in 'midi' mode the port can
  easily run unidirectional shared. bidirectional shared is a
  software problem that can be solved later.

* using the 18f2620 for the driver. the package is small enough to be
  practical. it can run without xtal at 8MHz and has on board i2c for
  more elaborate networking later on. it has enough pins to add some
  status output. (i.e. RGB led)

* port B is used for communication. RB4:RB7 have interrupt on change,
  so could be used for more elaborate slave comm later.

* running CATkit on a full line through 1k gives a 2V drop = 2mA:
  that sounds about right.

since this is low bw debug comm, it should be possible to just leave
the line idle = high. that means no clock is coming in. so what about
this:

- run CATkit TMR2 at a higher rate, i.e. 31.25 kHz. this would give
  * a decent timebase for SD sample tests
  * a 7.9 kHz bitrate for debug comm
  * the ability to send MIDI data from the CATkit board

I wrote the code for the network debugger. The 4-phase modulator and
the receiver/transmitter framing words are done and tested. The
remaining thing is how to switch between receiver and
transmitter. Probably something like this:

- start with receiver
- receiver gets idle -> check tx buffer -> tx / rx gets data -> start
  rx state machine
- transmitter stop -> check tx buffer -> tx / rx data -> tx state
  machine

these can be taken into one loop, and activated depending on an rx/tx
flag.

Entry: sheepsint urgent todo
Date: Sat Oct 27 23:06:44 CEST 2007

LIST MOVED DOWN

Entry: nasty sub bug?
Date: Sun Oct 28 14:32:32 CET 2007

the following code leads to incorrect asm:

  123 @ 124 @ - dup

  movf  123, 0, 0
  subwf 124, 0, 0

that should be subfw?? the problem is in "123 @ -". took the - and --
words out of the 'binary' meta patterns and fixed it.

Entry: rtx
Date: Sun Oct 28 16:41:44 CET 2007

looks like it's +- working, at least the transmitter. one little
problem still: if a client syncs to a 0->1 transition, what happens
when it picks up in the middle of a data stream? suppose #x55, which
is just a bunch of:

  0100 0111 ...

syncing to the right frame is not a problem: per bit there is only
one 0->1 transition to sync to. so the problem is that each client
should start with an idle line. it's the same problem as async
serial.

so.. receiver for sheep. let's stick with a RX state machine only.
the deal is this:

- interrupt on change: detect 0 -> 1, reset TMR2 + RX state machine

all logic from hub.f can be re-used, except for the top sequencer,
which should be

  route ; ; ; rx-bit ;

Entry: comm on catkit
Date: Sun Oct 28 17:38:26 CET 2007

there are 2 ports left: RA4 RA6. both are not very interesting: no
interrupt on change, or interrupt facility. interrupt pins that can
be reused are:

  RB5 (INT1/TX)
  RB2 (INT2)      not without cutting traces or removing R8
  RB0 (INT0)      not without removing the last pot
  RB5-RB7 (KBI)   multiplexed with switches
  RB4 (KBI0/RX)

can it be done using polling only? i.e. manually synchronize on each
start bit or something. need to think a bit more, but it looks like
manual polling is going to be problematic.

the easiest thing is really RB2/INT2: it's a proper interrupt, and
its functionality is not used atm. maybe i should leave catkit out of
it and try to get it to work on krikit first.. catkit needs an update
anyway, and this could be a nice addition. reminder:

- ditch AUDIO- for INT2
- external rectifier diode
- serial RX 100k pull-down
- fix pot distance
- fix switch distances
- room for LED

Entry: Manchester
Date: Sun Oct 28 19:30:21 CET 2007

i'm wondering whether it's not simpler to use Manchester code. (BPSK
with square waves) symbols are 01 and 10. once synchronized, the
signal can be locked by allowing resync on the fixed transition at
half symbol. syncing can be done on an idle line, all one (10).

catch: for uni-directional with sender = master this works fine, but
bidirectional is problematic.

Entry: eliminating the pullup resistor
Date: Sun Oct 28 19:40:41 CET 2007

In case there's only one slave, the pullup resistor can be eliminated
by using a current-limiting resistor to prevent a short-circuit on
collision.

Entry: slave on krikit
Date: Sun Oct 28 19:59:46 CET 2007

Got one spewing 123, now i need another one listening. The slave uses
RB0 (INT0). Apparently i can't pull the line all the way
down.. Probably on-resistance (i'm pulling down 100 ohm..)

Sequencing is an interplay between INT0 and TMR2:

  INT0 -> reset timer phase + call 'rtx-next'
  TMR2 -> 'rtx-next'

The other one got 123, and some shifted version out of sync. To get
better sync during debugging, bytes could be interleaved with a 10
bit idle preamble. This would guarantee resynchronization after the
first faulty reception.

Entry: strong 1
Date: Mon Oct 29 05:17:06 CET 2007

  --- Vdd
   |
  [Ru]       /--[Rl]-o SLAVE I/O
   |         |
  MASTER o---o------o--|>|--o SLAVE Vdd
                                |
                               === C
                                |
                               --- GND

  0 1 2 3
  0 1 X X

phase 1 is 'strong drive' directly from Vdd, not through a pullup
resistor. this avoids strong sink currents and a large voltage
drop. during phases 0 and 1, MASTER is OUT, also if it's sending in 2
and 3. when receiving, the master is Z, so Ru pulls up the line. a
slave can still mess up by pulling the line high, but the short
circuit is prevented by Rl.

Entry: intermezzo: macro vs. return stack
Date: Mon Oct 29 15:50:14 CET 2007

actually, this is quite simple. if i change the terminology a bit,
compilation of local labels for jumps and run-time control flow using
execute and exit could be unified somehow.

Entry: about named macro arguments
Date: Mon Oct 29 19:46:11 CET 2007

maybe it's better to stick to prefix syntax to not gratuitously move
away from forth syntax. after all,

  : 2@ var | var @ var 1 + @ ;

is not too much different from

  : 2@ | var | var @ var 1 + @ ;

It will also simplify the implementation.

Entry: urgent stuff
Date: Mon Oct 29 20:52:04 CET 2007

time flies. i need to get the debug network running today.
it should not be more than patching the interpreter to the rtx: the
hub should just be a loop that polls the serial port, and possibly
executes some special purpose commands. the slave needs a new
dispatch table connecting rx and tx to the slave rtx. so, todo:

- get this debug patch-through to work: nothing fancy, just repeat
- fix some of the urgent problems

Entry: hub interface
Date: Mon Oct 29 23:25:15 CET 2007

i'd like to do this changing as little as possible. connect to a hub
just like any other project, but there should be a way to execute its
application without needing knowledge about the dictionary of the hub
device. let's change interpreter.f to

  \ token --
  : interpret #x10 min route
      ; receive ; transmit ; jsr ; lda ; ldf ; ack ; reset
      n@a+ ; n@f+ ; n!a+ ; n!f+ ; chkblk ; preply ; ferase ; fprog ;
      e-interpret ;

the last word should lead to a reset if it's not implemented, or to
the interpretation of an extra set of byte codes. in any case, it is
required to be filled in by specific monitor code.

now: if there's no extension implemented, should invalid commands be
ignored or not? there's no proper way to react to invalid commands,
since they can quote the following bytes, leading to a completely
non-interpretable state... just reset is probably good enough.

another problem: if the hub just passes through, how to control it
after switching to passthrough mode? serial break is an option. need
to figure out how to send that in mzscheme though.. there should be a
more elegant solution, but this requires either the traffic to be
quoted, or the new interpreter to actually understand (parse) the
traffic to see what comes through. the latter is not so easy because
of quoted bytes. a better solution is to completely override the boot
interpreter. that way all traffic can be properly redirected.

i guess i'm making it too difficult. the real problem is: this hub
thingy doesn't fit in my debug or run view: the cable can't determine
whether a boot interpreter should be started or not. let's start
there.

next: a name for the protocol.. i'm going for E2: it's the binary
representation of 0 and 1: 0100 0111 = E2 with lsb first.

Entry: the big questions
Date: Tue Oct 30 00:25:54 CET 2007

probably the huge shot of caffeine i got today, but i'm in delusion /
big-idea mode again.. i run into a lot of bootstrap problems. today's
bootstrap problem is debugging the debugger. somehow i think
bootstrapping is really the only significant problem.. it's the
"getting there" that's important practically, not so much the
staying: that should be obvious. i find it a fascinating subject. i
need to read more about it:

* need to play with piumarta's cola stuff: objects and lisp as yin
  and yang (though lisp has its own yin and yang: eval and apply. i
  wonder if this is the case for objects? probably something with
  v-table lookup).
* need to read about 3-lisp and reflective towers
* i'm not so sure if writing a proper language bootstrap is valuable,
  but somehow it looks like yes. brood is a bootstrap exercise
  really. i'd like to end up, not necessarily at scheme, but at a
  dynamic language to run on small machines.. maybe cola is the way
  to proceed?

another thing i need to read up on is partial evaluation and C code
parsing and refactoring, but that's secondary really.. maybe
bootstrap is indeed the only real problem

Entry: parsing again
Date: Tue Oct 30 04:07:36 CET 2007

* added packrat parser code from Tony Garnock-Jones. this should "end
  all _real_ parser woes" when i switch to a different syntax
  frontend.
* for the forth regular parser, i just need to add proper syntax for
  a regular syntax stream pattern matcher: i have no real recursive
  parser need for the forth (really out of principle: to stick to the
  roots and make the language simple to understand. there's something
  to say for a simply parsed language when teaching!)

* the only reason i'm using syntax streams is to be able to recover
  source location information and to use syntax-case. the latter is
  probably not the right abstraction. what i want to say is something
  like:

    (parser-pattern
     (macro forth)
     ((macro forth) ----))

  where the '' is bound to a syntax stream.

from portable-packrat.scm :

  (packrat-parser expr
    (expr ((a <- mulexp '+ b <- mulexp) (+ a b))
          ((a <- mulexp) a))
    (mulexp ((a <- simple '* b <- simple) (* a b))
            ((a <- simple) a))
    (simple ((a <- 'num) a)
            (('oparen a <- expr 'cparen) a)))

i read on the wikipedia page that a packrat parser is necessarily
greedy. i'm not sure in what sense..

Entry: finite fields
Date: Tue Oct 30 07:32:21 CET 2007

http://www.lshift.net/blog/2006/11/29/gf232-5

in 8 bit, the biggest prime is 2^8-5 = 251. i'm not sure what this is
useful for though.. some error checking / correcting stuff? the
article talks about "a" finite field, as if it mostly doesn't matter
which.. ah.. the wikipedia article on coding theory mentions
subspaces of vector spaces over finite fields. a naive way would be
to use i.e. a 3-space in the 4-space over GF(251).

Entry: fixing the assembler
Date: Tue Oct 30 16:35:18 CET 2007

made the dictionary into a parameter to move code from internal ->
external definitions. now i need to abstract away the control flow of
the assembler: eliminate assemble-next.

let's see, the types of control used are:

* comma -> expand to list of instructions
* comma0 -> same, without updating instruction pointer
* register -> dictionary mutation

primitive ones:

* again (retry) -> retry assembly with updated (dictionary?) state

properties:

* assembling an instruction is 1 -> 0, 1 or more

looks like the major difficulty is in assembler operations that
recurse.. currently it's handled by just pushing an opcode in the
input buffer and calling next. i'm going to make this recursion
explicit. how should non-resolved symbols be handled? just returning
the instruction seems best. i do need to fix absolute/relative
addressing. what about leaving restart to the sequencer? the idea is
to provide some expansion to plug-in assemblers (asm-find)

so the point where i need to make some changes is the way 'here' is
used. chicken and egg:

* can't determine 'here' until all previous instructions were
  assembled.
* can't assemble an instruction until it is known how far a forward
  reference is.

what about trying to solve this with backtracking? is that overkill?
maybe backtracking with memoization? maybe assembly itself is cheaper
than memoization :) maybe every instruction should be compiled to a
thunk that takes just the absolute address? hmm.. need some time to
sort it through.. it should be possible to write this in a lazy way..

roadmap:

- get it to work like it did before
- change the implementation of 'here' to a parameter
- create a graph data structure from 'label'
- figure out the control flow for some backtracking like thing
- write some graph opti (i.e. jump chaining)

another remark: having 'labels' as pseudo instructions is bad. they
should really be true graph elements: pointers to instructions.

hmm.. i need a break. lap..
the assembler still goes wrong 'somewhere' :) maybe i should fix the 'here' thing first before trying to get it to run, since it's somehow messed up. i need to start over:

Entry: here kitty
Date: Tue Oct 30 22:54:39 CET 2007

what's with 'here'? now that it's separated out a bit more, it's easy to see it is a bit of a mess: i'm using org-push and org-pop so i can't just eliminate it.. i need to separate these concerns:

  * ORG / ORG-PUSH / ORG-POP = telling where things go. it's easier to cut out the intermediate part and handle it separately. a bit of a crazy way of doing things..
  * 'here' = using self-referencing blabla

i need a proper way of expressing all these dependencies: once the 'org' stuff is dealt with, and the absolute/relative problem is solved, the remaining problem is one of relaxation.

Entry: relaxation problem
Date: Wed Oct 31 00:45:26 CET 2007

some choices need to be made before enough information is present, but instead of completely starting over (backtracking), the form of the problem is such that the intermediate solution can be updated. as long as a complete dependency graph is present, the solution is quite trivial: just recurse over all dependencies. some hints for finding the right data structure:

  1. instructions that do not reference code locations, either as jumps or just as literal words, are irrelevant and can be ignored.
  2. labels point _in between_ instructions
  3. keep the cause of events abstract: any instruction that has a reference can grow.
  4. this is related to functional reactive programming

let's stick to the idea of instruction cells: each cell contains a single symbolic opcode with arbitrary length. thinking of this as cells sending messages to each other, there are 2 kinds of messages:

  - tell the next cell it has moved
  - tell cells that depend on a label they need to update

looks ok at first, but for non-contiguous code that doesn't have a non-decreasing code distance between several nodes, this might not terminate.. if i make sure code never shrinks, this should be ok though.. hmm... i need to read a bit about this. i guess in general it's "linker relaxation". so most important notes:

  * downward updates from a size change can be eliminated
  * to ensure termination, only expand/contract in one direction: that way it will at least stop at the case where all references are expanded.
  * if a size change happens as a consequence of an update

Entry: a more traditional approach
Date: Wed Oct 31 03:43:54 CET 2007

http://compilers.iecc.com/comparch/article/07-01-038

  There is a type of assembler that does exactly the same thing on
  every assembly pass through the sourcecode. Pass 1 outputs to
  dev>nul and is full of phase errors, pass 2 has eliminated most (or
  all) phase errors (output to nowhere) and pass 3 usually does the
  job in 99%+ cases whereupon code is output. On each pass through
  the sourcecode (or p-code in your case) you check for branch out of
  range, then substitute a long branch and add 1 to the program
  counter, causing all following code to be assembled forward+1, then
  make another pass and do the same thing again until no more branch
  out of range and phase errors are found due to mismatched
  branch-target addresses.

That doesn't require my esoteric approach and seems a lot simpler really: just keep it running until the addresses stop changing. So just do as before, but:

  * keep a phase error log
  * use a generic branch instruction which gives short or long branches
  * every pass is completely new
  * split 'old' and 'new' labels, make new labels mutable?
  * put 'here' in a dynamic variable
  * make a quick scan for labels to find out undefined ones

NEXT: prepare assembly code so multiple clean passes are possible:

  - get rid of 'mark' for example.
  - put 'here' in a parameter
  - remove all dictionary manipulations
  - find a way to handle var and eeprom.. maybe a separate pass/filter?

the goal is clear enough.. just some disentangling to do first.. a different approach:

  * use the previous approach, but keep the dictionary after every pass (clean it in between)
  * keep a log of the name registrations to determine phase errors.

Entry: comparators and square waves
Date: Wed Oct 31 05:00:41 CET 2007

before trying anything with sine waves, it makes sense to at least have a go at pure binary signals spanning the entire bandwidth. i'm curious how far i can go in completely eliminating amplification and using only a comparator. i do lose all signal presence detection capability, and amplify noise tremendously. but this does transform everything into a software / filtering problem. i guess with some good codes i can actually get things through..

Entry: shopping for opamps
Date: Wed Oct 31 19:59:42 CET 2007

@ maxim for low voltage rail-to-rail. i can get as low as 2.7V for

  MAX4167  5MHz,   1.3mA   (DUAL)
  MAX494   0.5MHz, 0.15mA  (QUAD)

Entry: name spaces and objects
Date: Wed Oct 31 23:32:41 CET 2007

i'm trying to figure out how to make the name mangling work well enough to create a static metaprogramming interface which supports generic programming at the macro level:

  * write algorithms in macro form
  * instantiate them statically as many times as you need

what i'm really missing is higher level macros. with those, i can build anything i want really.. so why is it impossible to have those? i probably need to give up on forth syntax.. (let me finish my verbose buffer code before i try to answer..)

ok. i don't know really. let's first try to get things like this out of the way:

  : bbf.tx-empty>z  bbf.tx buffer.empty>z ;
  : bbf.rx-empty>z  bbf.rx buffer.empty>z ;
  : bbf.rx-room>z   bbf.rx buffer.room>z ;
  : bbf.tx-room>z   bbf.tx buffer.room>z ;
  : bbf.>tx         bbf.tx buffer.write ;
  : bbf.>rx         bbf.rx buffer.write ;
  : bbf.tx>         bbf.tx buffer.read ;
  : bbf.rx>         bbf.rx buffer.read ;
  : bbf.clear-tx    bbf.tx buffer.clear ;
  : bbf.clear-rx    bbf.rx buffer.clear ;

what i want is just

  ' bbf.tx- compile-buffer
  ' bbf.rx- compile-buffer

i can't even do variables since they are macros..

  * yeah, i need to be able to generate macros
  * and fix name clashes within a compilation unit: both words and macros.

maybe the trick is really to define 'compilation unit' properly? in my current approach, a macro can't pop up during expansion of code. i need to get the philosophy right:

  * a flat namespace is nice for an application: everything is concrete. we're "among friends" and last names are not necessary.
  * it sucks for writing library code

the solution in mzscheme that works for me is functions + modules. a local module namespace can be used for small specialized utility words. i'd like to have something like that in forth. the problem is: i'm taking a really static stance in which macros play a central role, not functions. this works as long as macros are sufficiently powerful, which means higher order macros.

now, let's pull the problems i'm having apart: i wrote some buffer code, which is just macros. to instantiate a buffer one doesn't simply do "bla create-buffer" or something, but it is necessary to specialize a lot of functions manually. that's completely unacceptable.
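as a reference point: here is roughly what the buffer instantiation would look like on the scheme side, where syntax-case can manufacture a whole family of names from one declaration (syntax-rules alone cannot do this). buffer-read, buffer-write and buffer-clear are placeholders of mine, not actual brood words:

  (define-syntax (define-buffer stx)
    (syntax-case stx ()
      ((_ name addr)
       (let* ((base (symbol->string (syntax-object->datum (syntax name))))
              (id (lambda (fmt)
                    (datum->syntax-object
                     (syntax name)
                     (string->symbol (format fmt base))))))
         (with-syntax ((read-word  (id "~a>"))
                       (write-word (id ">~a"))
                       (clear-word (id "clear-~a")))
           (syntax
            (begin
              (define (read-word)    (buffer-read  addr))
              (define (write-word b) (buffer-write addr b))
              (define (clear-word)   (buffer-clear addr)))))))))

so (define-buffer tx #x100) defines tx>, >tx and clear-tx in one go: exactly the kind of name generation the purrr macro layer can't express.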
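and circling back to the multipass assembler from the entries above: the 'run complete passes until the addresses stop changing' idea is tiny when written down. a sketch, where first-pass-labels and assemble-pass are names of my choosing, not the actual brood code:

  ; relaxation as a fixed point: re-run clean passes until the
  ; label table stops moving.
  (define (assemble/relax code)
    (let pass ((labels (first-pass-labels code))   ; quick scan: collect labels
               (n 0))
      (let ((labels* (assemble-pass code labels))) ; full clean pass
        (cond ((equal? labels labels*) labels*)    ; no phase errors left
              ((> n 100) (error 'assemble "phase errors do not converge"))
              (else (pass labels* (+ n 1)))))))

termination hinges on the branch choice being monotonic: if a branch is only ever widened (short -> long) and never shrunk back, addresses are non-decreasing between passes and the loop has to stop.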
Entry: higher order macros
Date: Thu Nov 1 01:06:36 CET 2007

In order to solve some particular template problems, i'd like to have higher order macros. this amounts to, instead of splitting up a source file as

  MACROS -> PROCEDURES

splitting it up as

  ... -> MACROS^2 -> MACROS -> PROCEDURES

of course, there should be no limit to the tower. The real problem is: i have no sane syntax space left! In macros i can do this:

  macro
  : make-a-123 ' a-123 *: 123 exit ;

which is already pretty ugly because of quoting issues. But what am i going to invent to make higher level expansion work?

One thing is sure: taking out the reflection (making macros side-effect free) killed the possibility of generating names at compile time, EXCEPT for function labels. But those are really just data: it's a hack that doesn't really count. So I have a GOOD THING: independent declaration instead of sequential variable mutation for creating new macro names, that causes a BAD THING: limited reflection due to improper phasing. Actually, I already knew that, but i'm starting to feel it now: artificial limits are no good. Even if they serve a higher goal.. Maybe that makes them not artificial?

The limit i created is actually there for a reason: to use partial evaluation to make it possible to perform compile time operations without the need for an explicit meta language: without the need for quotation like `(+ ,a ,b) or its beefed-up syntax-case / syntax-rules variant. (funny how the only 'meta' part of the language is the macro stack: it punches holes in reality somehow ;)

So let me pat myself on the back:

  * the current macro / forth thing is GOOD. it is easy to use, easy to understand, and avoids most quotation issues that arise in practice by relying on partial evaluation. it gets pretty far without the need for an explicit metalanguage.
  * it's NOT GOOD ENOUGH because it's the top level: it can't be metaprogrammed itself!

The metaprogramming operations i'm looking for are those that create new macro NAMES. Creating new macro BODIES should not be so terribly hard: it is in fact what should be used for the quotation based language. So the core of the business should be the question why this works in scheme:

  (define-syntax make-macro
    (syntax-rules ()
      ((_ name body)
       (define-syntax name
         (syntax-rules () ((_) body))))))

  box> (make-macro bla (+ 1 2))
  box> (bla)
  3

Entry: poke
Date: Fri Nov 2 04:27:04 CET 2007

i'm taking a day off.. so technically i'm not allowed to write in this log. however, i got into PF today, and wrote a rant on the PF list about mapping and partial evaluation. maybe it's time to start writing poke, or nailing down the requirements.

the idea behind poke is to have a machine model for DSP like tasks that can be set up (metaprogrammed) by say a scheme system. the idea behind an application is this:

  1. a program is compiled for a VM.
  2. a new VM is instantiated (on a separate core/machine)
  3. the VM now runs in real-time: doing its own scheduling and stack based memory management, being able to communicate with its host system and other VMs

each VM is a linear stack/tree machine. i'd like to do this without writing a single line of C code: have it all generated. that's the only way to be serious about generating *some* code. it should have an s-expression interface with which it talks to a host scheme system. this acts as message passing: no shared state allowed. this syntax should have an easy extension for binary data. it should be 'ready' for multiprocessing.
what i mean with this is: each processing core should be able to run a single machine instance, so instances should be able to talk among each other in a simple way, and there should be a scheduler available on the VMs to handle the message passing. i was thinking about a 'binary s-expression' approach to limit inter-machine communication parsing overhead. data should still be list-structured though, and word-aligned. for a human interface, a simple front-end could be constructed. arrays can be allowed for ease of wrapping binary data. internally, cons based lists are used for all representation. cdr coding is used to be able to represent programs linearly. memory inside the machine consists of stacks only. each machine uses a limited set of data types, making re-use of lists efficient.

aim for the highest possible gcc code generation efficiency: i see no point in targeting anything other than gcc, so all extensions are allowed. i just checked (see doc/gcc/tail.c) for tail call support and it seems to work when putting functions in a single file. it also works with the functions in different files, apparently. that's good news. state passed: 3 stacks: DS/RS/AS.

the target language should be a pointer-free safe language. this is going to be a bit more difficult, probably have to split in safe / unsafe parts. the 'system' language and the 'inner loop' language are different and should be treated as such. i probably should start with the latter and build the control language as a layer on top. the former is a forth-like language extended with linear tree memory and the latter is a multi-in-multi-out language to be combined with combinators.

  1. all C code generated: need a generator.
  2. message passing interface using s-expressions.
  3. run-time memory (stacks/trees) is locally managed
  4. other (code) memory is static/readonly, loaded by host
  5. safe target language (from a certain point up)

so poke seems like a really straightforward extension to forth. getting it compatible with PF will be quite something though.. all this is pretty low priority. the only difficulty is how to deal with pointers for optimizing the linear stack/tree data structures. 'safe poke' :)

Entry: mix
Date: Fri Nov 2 05:31:52 CET 2007

then the thing that could be used immediately in PDP, PF and PD modules: a language to describe inner loops and iterators, to yield C code that can be linked straight into the projects.

Entry: instantiating abstract objects
Date: Fri Nov 2 15:19:40 CET 2007

i'm giving myself one hour to think about how to fix the verbosity of the following code:

  macro
  : tx #x100 tx-r/w #x0F ;  \ put buffers in RAM page 1
  : rx #x110 rx-r/w #x0F ;
  : tx-ready? tx-empty>z z? not ;
  : rx-ready? rx-empty>z z? not ;
  : tx-room?  tx-room>z z? ;
  : rx-room?  rx-room>z z? ;
  forth
  2variable tx-r/w
  2variable rx-r/w
  : tx-empty>z tx buffer.empty>z ;
  : rx-empty>z rx buffer.empty>z ;
  : rx-room>z  rx buffer.room>z ;
  : tx-room>z  tx buffer.room>z ;
  : >tx  tx buffer.write ;
  : >rx  rx buffer.write ;
  : tx>  tx buffer.read ;
  : rx>  rx buffer.read ;
  : clear-tx tx buffer.clear ;
  : clear-rx rx buffer.clear ;

the ONLY difficulty here is that i can't generate macros, including variables. is there another way to solve the problem? is it possible to hide everything in one single macro? yes. if tx-empty>z is never expanded as a function this is actually possible.
then what remains is just:

  macro
  : tx #x100 tx-r/w #x0F ;  \ put buffers in RAM page 1
  : rx #x110 rx-r/w #x0F ;
  forth
  2variable tx-r/w
  2variable rx-r/w

  tx >buf
  tx buf>

maybe i can somehow make an 'un-inline' function work? like memoization? something which gets me halfway there is a blocking read/write operation: only for dispatch loops does this become problematic.

conclusion: i guess it's ok to go for this approach: on the subject of code reuse, there are 2 options. either you write it as procedure words, or as macros. using the procedure word approach will lead to smaller code size but slower speed (since run-time dispatch is probably necessary). using the macro word approach can lead to fast inline code which might be not optimal for code size.

Entry: e2 debugging
Date: Fri Nov 2 17:41:24 CET 2007

current setup: hub (master) connected to krikit (slave) which runs a loopback. there is communication, but somehow a start bit gets lost. there are 4 places where it can get lost:

  1. hub transmit    (OK: clear on scope)
  2. slave receive   (OK: sending #xFF all ones gives reply)
  3. slave transmit  (OK: reply has start bit)
  4. hub receive

i have no trigger scope or logic analyser so i need to construct a steady state error condition i can sync my scope to. i can measure slave transmit if i manage to add some wait code in the hub. such code is probably necessary for other purposes.

so. running a couple of experiments makes it clear that 1-3 are ok. the problem is with the hub receive that doesn't see the start bit. i don't see the problem. as far as i can isolate it, somehow the start bit gets missed by:

  - the rx state machine is in the wrong state
  - the rx/tx switch comes a cycle too late
  - ...

i need something that's easier to test. i suspect the rx/tx switching is the cause, so maybe i can make a better switcher? i did notice a slightly borked waveform for the start bit however.. let's see if i can get a better view and see where that's coming from.. that was wrong. i start over:

  - fixed timer compensation, now at least the signal is stable
  - clear to see that there's a phase problem

i'm wondering if it's not just a speed problem. the timer is running every 64 clocks.. well.. it's easy to test by just running it slower really. YES! it was.. running 4x slower fixes the problem. time to do some profiling then!

Entry: e2 + interpreter
Date: Fri Nov 2 23:21:38 CET 2007

i'd like to make 'transmit' and 'receive' late bound. that way it's easy to switch the interpreter's default I/O. but i need to do it cheaply: using vector.f and execute.f requires too much boot space. wait.. that's the case for catkit. for the 2620 i have a lot more room. maybe i should go that route then, and solve the catkit problem when it poses itself. time to make some decisions:

  * allow both serial + e2 ?
  * build e2 in boot loader?

actually, i do need e2 in the boot loader, working as a safety measure. hmm.. let's get it to do what it needs to do first.

ok. i can ping krikit. fixed the saving/restoring of the a reg so i can access the stack. code upload doesn't work yet. i guess it has to do with a missed 'ack' due to interrupts being disabled. maybe i should build the ack into fprog?

NOTE: about saving the a reg. if there are interrupts, the a reg needs to be saved anyway (or its use protected with cli), so maybe it's best to just always save on clobber? alternatively, always save on clobber in the isr.

i added an ack to fprog and ferase, but apparently that's not enough. one line can be written, then it messes up.
some code is needed to properly resync the transceiver after programming so it picks back up at the next idle INT0. for debugging purposes, i should make a version that uses polling only, so it can be used to set up interrupts. thinking about it, i probably need to modify all opcodes so they give a sync themselves, so no buffering is required. (the uart has 1 byte). hmm.. it's not so simple really. actually, it is: all interpreter tokens have RPC semantics: they return at least one value, except '00' which is a sink, and 'reset' which can't have a return value. the 'ack' opcode can then be eliminated, and possibly replaced with 'interpret'.

  nop, reset      -> no ack
  receive         -> ack
  transmit        -> value
  jsr, lda, ldf   -> ack
  n@a+, n@f+      -> stream of bytes, no ack necessary
  n!a+, n!f+      -> ack
  ferase, fprog   -> ack
  chkblk, preply  -> stream of bytes, no ack necessary

this should get rid of the requirement to have buffered io. remaining timing issues can be handled with appropriate delays. an interesting extension when 'receive' and 'transmit' are made dynamic is to have them read from memory. that way a small program could execute from ram.

Entry: boot protocol changed
Date: Sat Nov 3 02:27:28 CET 2007

  * fprog and ferase now give an 'ack' themselves. this is necessary for receivers that suffer when interrupts are disabled.
  * the #x00000000 password is eliminated: with boot code protection this isn't necessary.

Entry: separate compilation + name spaces
Date: Sat Nov 3 15:14:27 CET 2007

as a consequence of the way compilation works, it is possible to rely on the fact that, per compilation unit, names can be overwritten. what i mean is that it is possible to 'load' the same file twice, but with different words/macros bound in its environment. this comes close enough to the 'dictionary shadows' paradigm i'm used to in PF, and which actually works pretty well: it avoids the need for a name space mechanism fairly effectively. an extension to this could be to allow for exports: provide only those macros and words necessary.

then another extension: why not install the macro source in the target dictionary? there's no real reason not to, and it makes 'mark' work for macros (given that i delete and re-instantiate the macro cache). or.. i could use this as an indicator for using the macro cache or not.

one thing that has been bugging me: if i define a word or a macro, i do want it to override the previous word or macro. i should make a list of the name space trade-offs for writing a forth really.

Entry: roadmap
Date: Sat Nov 3 15:51:02 CET 2007

  - get programming to work over e2 (restart receiver after fprog: add a macro hook to interpreter.f) (done)
  - fix acks in interpreter.f and tethered.ss (done)
  - make it work without interrupts and put it in the boot loader
  - figure out the 'strong power 1' phase, and test with slave.
  - test over longer twisted pair cable.

Entry: no middle road
Date: Sun Nov 4 01:04:57 CET 2007

some thoughts about 'accumulative' code, for lack of a better word. in light of the recent remarks about higher order macros, i have the impression i am mixing 2 paradigms in a not so elegant manner:

  1. functional programming, mzscheme's independent compilation with 'unrolled' metaprogramming dependencies.
  2. the accumulative image model, where a language grows by accumulating more power, which then can immediately be used to define new constructs.

i knowingly took out a part of 2. to get purely functional macros that could be safely evaluated for their value at interaction time.
however, the interactive compilation does work in an incremental way. it looks as if i am forced into some middle road compromise.

Entry: embedded programming in 2007
Date: Sun Nov 4 15:53:35 CET 2007

the question i'd really like to answer, without too much bias (toward the tool i wrote): what is the point of writing static, early bound code in 2007, even if we're talking about microcontrollers?

  * is there really a 'complexity barrier' below which one HAS TO move to quasi manual compilation and allocation?
  * will this barrier remain in existence, or will better tools make a more high-level approach possible?

EDIT: some things i was thinking about yesterday:

  * leaky abstractions are hard to work with. starting from assembly and "thinking up", using purrr to help you write the application, is the right approach. starting from some high-level understanding of the language and having to learn all its limitations doesn't really work. the problem seems to be the manual resource management: time, space, and synchronization between global variables and hardware devices.
  * it seems i lose most of my time in low-level configuration issues which give little feedback on error, and in dealing with situations that are hard to debug due to dependence on external events. low level design really is a debugging problem: setting up experiments to try to isolate errors. hence the loads of specialized (hardware) tools used in professional environments.

Entry: concatenative introduction email
Date: Mon Nov 5 18:44:15 CET 2007

Dear All,

Allow me to introduce myself. My name is Tom Schouten. I live in Leuven, Belgium and I'm 32 now, if that helps paint a picture. I've been interested in concatenative programming for a while and lurking here and there.. To educate myself, I wrote quite a lot of code in the last couple of years, and I'd like to share some of the results, but maybe even more the resulting questions. (warning: long post, story of my life :)

My background is in electrical engineering. My heart lies in music DSP. I've been working up the ladder from electronics, to machine language and C/C++, through Pure Data (a data flow language) to Perl & Python, to finally end up at Scheme and functional programming. I'm flirting a bit with Haskell, but really just reading, because most recent interesting functional programming texts use that language.

http://en.wikipedia.org/wiki/Pure_Data

The problem I'm trying to solve to guide me a bit is "Build tools to write DSP code, mostly for sound and video processing, in a high level language." I ran into limits of expressiveness writing video extensions in C for Pure Data, about 4-5 years ago. Apparently there are no freely available tools that solve this problem, so I take that to be my mission.

About 3-4 years ago I started writing Packet Forth (PF) as an attempt to grow out of my C shoes. It was at the time I discovered colorForth, and I was wondering if I could create some kind of cross breed between Pure Data and Forth. PF now looks a bit like Factor on the outside, though is less powerful. PF uses linear memory management (data is a tree), with some unmanaged pointers for data and code variables. PF's point is to be a scripting language which tosses around some DSP operations written in C. It doesn't aim to be a general purpose language.
The darcs archive is here: http://zwizwa.be/darcs/packetforth/

Some more high-level docs aimed at media artists are here: http://packets.goto10.org

I got a bit frustrated with the internals of PF, mostly because there is still too much verbose C code, and a lot of C preprocessor macro tricks that could best be done with a _real_ C code generator. So I dived a bit deeper into Forth, and early 2005 I started at the bottom again: I wrote an indirect threaded forth for Pure Data (mole), and started BADNOP (now dubbed BROOD 1), an interactive cross compiler for the Forth dialect Purrr, an 8-bit stack machine model for Flash based PIC Microcontrollers.

http://zwizwa.be/darcs/mole
http://zwizwa.be/darcs/brood-1

Mole made me 'get' Forth finally: the first versions of PF were mostly blind hackery to get to know the problems before the solution. For mole, I actually followed tradition a bit more (Brad Rodriguez' "Moving Forth"). This led to a more decent PF interpreter.

The forth I wrote to write the cross-compiler for the PIC Forth was a mess. I was experimenting with some quotation syntax but realized that what I was really looking for was lisp, or a lisp-like concatenative language. At that time, early 2006, I discovered Joy, so I ditched the compiler and rewrote it in CAT (not Christopher's Cat) which was written in scheme (BROOD-2). After some refactoring and rewriting due to beginner mistakes I am now at BROOD-4, with the CAT core written as a set of MzScheme macros. This CAT is a dynamically typed concatenative language with Scheme semantics. I consider it an intermediate language. Currently it is only used to implement the Purrr compiler (a set of macros) and the interaction system.

http://zwizwa.be/darcs/brood

Purrr is as far as I know somewhat novel. All metaprogramming tricks one would perform in Forth using the [ and ] words are done using partial evaluation only. I've tested this in practice and it seems to work surprisingly well. I am still struggling a bit with the high level Purrr semantics though. Concretely, it is a fairly standard macro assembler with peephole optimization. Nothing special there. On the other hand, its macro language is a purely functional concatenative language which is 'projected' onto a real machine architecture after being partially evaluated. I tried to explain these concepts in the following papers:

http://zwizwa.be/darcs/brood/tex/purrr.pdf
http://zwizwa.be/darcs/brood/tex/brood.pdf

(for the latest versions it's always better to use the .tex from the darcs archive)

The latter needs some terminology cleanup, but it contains an explanation of the basic ideas, and an attempt to clarify the macro semantics in a more formal way. I'm interested to learn what I need to read in order to frame these concepts in proper CS speak... It looks like I'm either terribly ignorant of something that already exists (I went through a couple of stages of that already), or I found a clean way of giving cross-compiled Forth a proper semantics.

On a lighter note, I'm using Purrr to build The Sheep, a retro synth inspired by the 1980's approach to sound generation. It runs on CATkit, and has been used successfully many times in beginner "physical computing" workshops, as electronics is called in non-engineering circles :)

http://zwizwa.be/darcs/brood/tex/synth.pdf
http://packets.goto10.org/packets/wiki/CATkit

(the scary guy in the picture is not me :)

Entry: krikit board design decisions
Date: Mon Nov 5 17:52:11 CET 2007

  - 4 x AAA -> need at least 5V.
    alternatively, use a 9V cell and a transistor for speaker output.
  - RGB led onboard
  - debug connector = battery connector (RCA plug)

Entry: TODO
Date: Mon Nov 5 21:40:23 CET 2007

list has moved to TODO file.

Entry: polling E2 interpreter
Date: Mon Nov 5 23:15:31 CET 2007

it's not entirely without trade-offs to choose a polling interpreter for E2 in the boot code.

PRO: independent of interrupt routines, which is useful for debugging application isrs.
CON: completely synchronous and non-buffered. this requires some careful coding in order not to miss any data.

maybe the boot code should contain both versions? this leads to objects really: a vtable is a dynamic route word.

  2variable stdout
  : do-stdout stdout invoke ;
  : e2-stdout stdout -> route rx ; tx ; on ; off ;

hmm.. i messed up slave.f: diff tomorrow..

Entry: macros and procedure dictionary
Date: Tue Nov 6 06:03:54 CET 2007

maybe the trick is to just get rid of the distinction between procedure words and macros: a single namespace, with procedure words being equal to

  : bla 123 compile ;

this, combined with a preprocessing step that identifies all labels in the source text. a single namespace is easier to understand. separate compilation units give shadowing, while inside a single compilation unit circular references are possible. what i want this to move toward is a more and more static declarative structure. maybe i should re-implement namespaces and build them on top of the mzscheme module system. i doubt the solution i can live with eventually will be significantly different from mzscheme's.. maybe a bit more liberal? or is that just because of the current implementation? maybe i should make the compiled macros into a real cache, and store a master version as an s-exp tree..

Entry: redefine
Date: Tue Nov 6 15:59:00 CET 2007

i need to

  * make it illegal to redefine macros: they use a caching mechanism which replaces names with values (procedures).
  * make it illegal to define a label that is already a macro

the real problem is that redefines need a proper semantics in CAT. for the forth, i think shadowing redefines are best: 'empty' is practical, and it should work for macros too. CAT is currently designed so redefines are illegal: this allows the use of values instead of boxed values. some possible routes out of the mess:

  - prohibit redefinitions
  - use shadowing + a proper cache
  - use boxed values (reset the code inside the 'word' struct)

a deeper question is: why not use mzscheme namespaces for all macros? answer: because i rely on late linking. is there a way around this? it probably makes it too complicated, since i need to figure out a way to map it to BOTH modules and units.. let's stick to the current hash table name space, and go for the boxed approach: mutate the words themselves, instead of their hash table entry.

OK. that seems to work. remaining problem: defining words that are macros. a way to solve this is to define each word in the dictionary as a macro, compiling [cw ] let's not.. i've added a warning, which made me realize that i do use this: macros can call words with the same name as a fallback. that mechanism might be worth more than a safety net.. no, a safety net is more important: i can fix the delegation using a symbol prefix. what about doing this automatically? the last matching pattern is always mapped to a runtime call? i do need to fix dangling macros though. let's see if i can run into that case again.. ok, it's clear: a dangling macro can be disastrous. this is a mess..
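for reference, a minimal sketch of what the boxed approach means in mzscheme terms. the struct and dictionary here are stand-ins of mine, not the actual CAT code (old-style mzscheme struct fields are mutable, so set-word-code! comes for free):

  (define-struct word (name code))        ; mutable field -> set-word-code!
  (define dict (make-hash-table 'equal))

  (define (define-word! name proc)
    (let ((w (hash-table-get dict name (lambda () #f))))
      (if w
          (set-word-code! w proc)         ; redefine = mutate the box:
          (hash-table-put!                ; everybody holding the word
           dict name (make-word name proc)))))  ; sees the new code

  (define (flush-word! w)                 ; dangling macro -> loud error
    (set-word-code!
     w (lambda args (error 'dangling-macro "stale word:" (word-name w)))))

the point being that a redefinition never touches the hash table entry itself, so previously linked references stay valid, and flushing can replace the code with a tripwire instead of leaving a silently dangling procedure.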
assume there are 2 classes of macros: CORE and PRJ. PRJ needs to be flushed whenever the project changes. i am not sure whether macros from CORE will actually bind to those defined in PRJ. there is no such plugin behaviour as far as i can tell, but nevertheless it is possible to go wrong, so i should do this:

  flush cache =
    - invalidate all prj macros (make them raise an exception)
    - detach them from the namespace

looks like i got it now: ns-flush-dynamic-words! + support

Entry: asm rewrite
Date: Wed Nov 7 00:58:54 CET 2007

found another asm bug: variables get allocated on each pass now. this doesn't seem to be fatal though, just inefficient. sheepsint works, so it can't be the weird hub.f bug i'm chasing..

[EDIT] several things might change here, but it could be a good idea to keep the current operation until i have time to clean it up a bit. cleanup would be:

  - move 'here' to a separate dynamic variable
  - handle different dictionaries better.

the problem now is that 'allot' gets called multiple times without reset.. it's probably best to filter it out in a preprocessing step.

Entry: sheep transients
Date: Wed Nov 7 04:45:41 CET 2007

'sound' needs to be a stack: a circular one, initialized with valid sounds, or a delimited one so a sound can end in 'done' to fill the rest of a control slice with another sound. the point of this is to create a concatenation at run time. it is of course possible to do this at compile time, but the fun would be in *mixing* sounds.. i think i have the solution there: each pattern tick a 'program' is erased, and filled with instruments that are played after each other, with the last tone = silence.

Entry: low impedance signal source
Date: Wed Nov 7 07:00:27 CET 2007

i'm trying to understand the difference between these 2 statements:

  * for a low-impedance source you best measure current, while for a high-impedance source you best measure voltage.
  * a current source has high impedance, and a voltage source has low impedance.

the deal is that these are 2 different kinds of "measurement" because of the entirely different scale of energy involved: for sensors, you want maximum energy transfer, but to "measure" a current or voltage source, you want minimal energy transfer. looking at a sensor as a voltage or current source, you want to "max it out".

implementation: so, doing this with an opamp is really trivial. bias (+) on Vdd/2, feed back from (out) to (-) using Ra, and connect the current source between the virtual ground (-) and (+).

             R
       /---/\/\/\/\--\
       |      __     |
       |     |  \    |
    /--||--o-| - \___|___o Analog -> uC
   |\ @    _o_| + /
   | ||@    |  |__/
   |/ @     |
    \------o
           |
      o--/\/\/\/--o---/\/\/\/--o
     GND         2.5V         5V

then:

  * connect the speaker to 2 analog inputs, so they can be switched in analog high Z mode: not good to bias digital ins at Vdd/2
  * run the opamp and bias network off of a digital output

let's see if analog Z1/Z2 can be PWM outputs. no such luck.. maybe use a transistor to shield the detached (Z) driver pin from the Vdd/2 bias voltage. or just not use pwm... remaining question is whether the opamp, when powered down, can take a large differential input voltage.

EDIT: the circuit doesn't work without a capacitor: the coil is a short circuit at DC, connecting (-) and (+). due to nonzero offset voltage, this saturates the amp.

EDIT: i understand now why measuring current is not a good idea. the impedance of the device is dependent on frequency: 0 for DC, rising linearly. if you measure current, the signal will have a strong low frequency content.
however, if you measure voltage through a resistor that's say 10x larger than the stated impedance, the response is flattened out since the resistor dominates. so the classic one works better:

   SPK          Rg
    o      /---/\/\/\/\--\
    |      |      __     |
    |  Rs  |     |  \    |
    o--/\/\/\--o-| - \___|___o Analog -> uC
   |\ @      ____| + /
   | ||@    |    |__/
   |/ @     |
    |       |
   === Cs   |
    |       |     Cn
    |   o---o----||--o
    |   |
    o--/\/\/\/--o---/\/\/\/--o
   GND         2.5V         5V

Here Cn reduces noise by lowering the AC impedance wrt the high DC impedance point at (+). Rs = 47R and Rg = 100k give decent results: about 2000x or 66dB. The value of Rs should be as low as possible to reduce noise. I'm comfortable now i understand the trade-off.

EDIT: i switched to using closed loop current measurement again, this time limiting the overall gain to about 100x (using Rg=1K), followed by a second stage with 100x gain. this seems to work better. i suppose my original problem was just due to too high gain, running into opamp limitations.

EDIT: going back to the circuit with Rs: in that one the opamp's input could be decoupled from what goes to the speaker by switching the (-) and the top of the bias network to ground.

Entry: e2 hub
Date: Wed Nov 7 17:37:52 CET 2007

now i need to find a way to program the e2 hub. one problem in the boot protocol is that i have no way to packetize the stream.. what i want is a hub which is mostly in repeater mode (for the commands 0->15) but responds to other commands itself. there are 2 alternatives: either write a 'fake' interpreter which simulates the state machine that parses the debug input stream, or change the protocol so it is delimited. the former is a stupid short term thinking hack.. let's packetize.

hmm... this is quite a change again: thinking about optimizing the problem. oops, bad word :) a way to do this quickly is to just prepend every message with the size. that way the core interpreter can ignore it, but the repeater can transfer without being able to interpret. let's try that first. notes:

  - should put 'ack' at 1, so a stream of ones gives ack messages

hmm.. i chickened out. it's a lot of changes at once. lots of places to go wrong. will cost me some frustration.. let's find another way, go for the stupid hack. if i can make the message length not dependent on context, meaning a previous message length, i can probably derive the lengths manually. the only problems here are the block transfer words; stuff that comes back from the uC can be echoed without problems. (i'm thinking about ping reply..) ok, made the protocol context-free in the host -> target direction. so, next:

  * make hub understand protocol (OK)
  * add hub commands
  * move to polling implementation in bootloader

hub commands: these should be an abstract interface for things one would like to do with a hub. arbitrarily:

  set client (0=hub)
  on client
  off client

hmm.. i don't see it so clearly. what am i trying to accomplish? by default i should be in 'hub application' mode, but it should be possible to switch to hub debug mode too. the latter can be a permanent switch (requiring access to the dictionary to switch back). i got it sort of figured out now. TODO:

  * make hub switch between hub-interpreter and interpreter using an external resistor. do this when the hub is finished.
  * until then, find a way to start the repeater without having the hub dictionary loaded.

Entry: how much amplification?
Date: Thu Nov 8 01:22:04 CET 2007

i have 12 bit at my disposal. amplification is mainly determined by the ratio of distances.
i'm measuring current, which should be proportional to sound pressure, which is 1/r^2. so, suppose i use a gain factor of 64 = 8^2; this gives a range ratio of 8. say 1 - 8 meters: don't put them closer than a meter, nor further than 8.

Entry: poke & precompiled loops
Date: Thu Nov 8 06:14:42 CET 2007

i think i need to separate out the c code generator so i can start generating code for PF and Pure Data, which will not be anything forth-like so doesn't really belong in brood. in fact, the way i think of it now (something akin to functional reactive programming) it will be quite the opposite.

Entry: RS next order
Date: Thu Nov 8 17:42:19 CET 2007

  * linear regulators
  * 9V clips / battery holders
  * transistors?
  * high ohm resistors
  * xtals + caps / resonators
  * blue bell wire
  * schottky diodes
  * small signal diodes

Entry: more modem design decisions
Date: Fri Nov 9 02:44:47 CET 2007

  - modulation or not? some modulation is necessary since i can't transfer DC. but the frequency response looks really too non-flat to go for a wideband approach. i need to experiment.
  - FIR or IIR? what is needed is a decimating filter. i can probably get much further with a crude windowed FIR than an IIR.

Entry: demodulator
Date: Sat Nov 10 15:12:58 CET 2007

i'm not going to waste time on trying out a pure square wave modulation. let's stick with some simple demodulator, and have a look at the numbers. i currently have the debug net running at 8kHz. this should also host the filter tick, which consists of:

  - read adc + update filter state
  - once every x samples, wake up the detector tasklet

what if i start out with FSK, because it requires no synchronization, and use a square window where the 2 frequencies are placed at each other's zeros. so.. a square window does have perfect rejection for the harmonic frequencies. it's only the stuff that lies in between that is problematic. ok, this is obvious. the problem can be entirely moved to synchronization and linear distortion due to transitions. if the receiver listens during a steady state part of the signal, perfect rejection is possible. so the main questions are:

  - how to synchronize?
  - how to limit transitions?

which brings me back to PSK.. maybe it is just simpler to use? as long as the start of a symbol can be detected (threshold) and the phase can be corrected (preamble), the rest seems not so hard really.

again, from a different angle: demodulating PSK is a synchronous mixer followed by a low pass filter. i assume that a rectangular window is going to be good enough as an LPF, which just leaves the problems of signal detection and synchronization. if i leave the non-synchronized receiver on constantly, it outputs a 24 bit complex number. during synchronization this needs to move toward zero. the phasor will rotate once per window length. which direction? if the direction is known, it's possible to detect a crossing. the direction is determined by the rotation direction of the mixer phasor.

i need to up the frequency: 2 MIPS is not enough. maybe i should do that first.. then the output stage, then the receiver, then a decoder. next actions:

  * have a look at PSK31 demod code
  * build the output stage (either PWM or SD)
  * build board + move to 40 MHz (monday: can't find xtals, maybe test on the dude?)

Entry: PSK31
Date: Sat Nov 10 18:56:01 CET 2007

PSK31: Peter G3PLX
http://det.bi.ehu.es/~jtpjatae/pdf/p31g3plx.pdf

some ideas from the paper:

  * this is a protocol for live communication. error correcting codes introduce delays.
  * use relaxed bandwidth for the filter for smaller delays and lower cost.
  * take advantage of the high frequency stability of modern HF radios
  * demodulation by using 1 bit symbol delay and comparison. ??? i don't get this one.
  * synchronize using the amplitude modulation component!
  * viterbi decoder for the convolutional code

Entry: a single port for debugging
Date: Sat Nov 10 20:37:18 CET 2007

wait a minute. if i manage to plug the E2 protocol through to the icd port, i could standardize on a single set of connectives. however, the connection is not standard, but it is 4-wire (can run over telephone cable), is synchronous and has a clock too. what this would solve is the bootstrap upload problem, which is a nasty one..

Entry: transmission bandwidth
Date: Sat Nov 10 20:51:22 CET 2007

something i never really understood: Fig. 4 in the PSK31 paper shows the bandwidth for random data. why is this so wide? why is reversal not the highest bandwidth? other questions: try to explain what this 'bit delay' demod is + how the amplitude demodulation sync works.

Entry: BPSK synchronization
Date: Sat Nov 10 21:25:37 CET 2007

there are 2 kinds of synchronization necessary: carrier synchronization, and bit clock synchronization. the former can use a PLL, the latter can use the 1->0 transition. suppose the following bit encoding: 8N1, with 1 = idle, and 0 = start bit. during idle the phasor needs to be predictable. this is either a fixed value, or an oscillation between 2 signal states. picking the former, this gives

  1 = carrier
  0 = inverse carrier

during idle, the synchronizer works: this is a PLL state machine which turns a single phase increment left or right depending on which quadrant the phasor is in. there are 3 bits determining the quadrant. there needs to be an AGC which reduces the 24 bit phasor to an 8 bit phasor for easier demodulation and synchronization.

Entry: so why not use AM?
Date: Sat Nov 10 21:50:43 CET 2007

somehow both FSK and PSK seem too complicated. maybe i should start with AM, then later (never) continue down the road and try FSK (double AM) and PSK (with synchronization). the most important interference we're going to find is bursts. it should be possible to eliminate these using stop bits: 1 = on, 0 = off, which means a burst will probably lack a stop bit.

algo: a continuous square window filter with signal detector feeds into a simple sampler. if the sampler is not active, every 1->0 transition will wake it up. before starting with AM, i can just use some noise modulated protocol. hell, anything that can get a 1 across.

Entry: roadmap
Date: Sat Nov 10 22:16:08 CET 2007

EDIT:
  * try strong phase and run it off E2 (OK)
  * level detector, use the RGB led.

Entry: E2 next
Date: Sun Nov 11 01:02:47 CET 2007

apparently the E2 signal interferes quite a bit with the amplifier, which is not such a big surprise. so i guess it's time to mature the debug network a bit:

  * switch to idle mode (keep high) when there's no host -> target comm.
  * find out what the initial 'missed ping' is all about.

i'm going to add stop bit checks to at least eliminate 1->0 bus glitches as a source of errors. that wasn't the problem... something is wrong with bus startup. maybe i need to make sure 'off' will actually switch the power state? looks like the error is with the slave init..
i get a predictable reply to a ping after bus-reset:

  > 13 >tx
  > rx-size p
  3
  > rx> px
  2D
  > rx> px
  AD
  > rx> px
  F7
  > rx-size p

i made a little progress here:

  > hub-init
  ERROR: time-out: 1
  > rx-size p
  0
  > 5 >tx 8 >tx 0 >tx
  > rx-size p
  1
  > rx> p
  131
  > 9 >tx 2 >tx
  > rx-size p
  2
  > rx> px
  FF
  > rx> px
  D0
  >

this sends the commands for fetching the 2 bytes at rom address #x0008, which indeed should be #xFF, #xD0. this is reproducible. so i can conclude that the bytes get received properly. something goes wrong in either the slave transmitter or the host receiver.. i give up.. i can't find it. a workaround which seems to be stable is to send an 'ack' which will send back a garbled byte. apparently, unplugging and replugging the E2 connector gives the same behaviour: the first byte coming from the slave is corrupted. so it can't be host side..

Entry: amp notes
Date: Mon Nov 12 15:29:42 CET 2007

i changed the circuit back to 1K input impedance, 100K feedback in the first stage. the second stage has 1K + 100nF, and 100K feedback, and I have no idea why this works: less noise, and it seems to have a good response in the intended range.. maybe because most sounds have a 1/f response? i don't know... it responds well to whistles, which is nice. this is a 10 kHz pole... so it's basically set up as a differentiator? maybe because i have GBW rolloff this works? i'm puzzled. i tried with a LM358N and it gives a lot more noise. i tried a TL072CN and it gives too much bandwidth!

so, i use a compensated integrator with 1K/100n in the source and 220K/4.7n in the feedback section. looks like this is final enough.. maybe beef up the amp just a tiny bit more..

PARTS LIST:

  2 x 220K
  1 x 100K
  2 x 10K
  3 x 1K
  2 x 15pF
  1 x 4.7nF
  2 x 100nF
  2 x 10uF
  1 x 10MHz
  1 x 18F2620
  1 x TL072CN
  1 x LED(red)
  2 x 6 PIN HEADER

                                        C2 4.7nF
                                    /-----||----\
                                    |           |
   SPK       Rg 220K                |  R2 220K  |
    o     /-/\/\/\--\               o--/\/\/\---o
    |     |   __    |               |    __     |
    | Rs 1K  |  \   |      R1 1K    |   |  \    |
    o--/\/\/\--o-| - \___o--||--/\/\/\--o-| - \_____o LINE
   |\ @      ____| + /  C1 100nF       __| + /
   | ||@    |    |__/               |  |  |__/
   |/ @     o---------------------------/
    |       |
   === Cs   |
    | 10uF  |
    |       |     Cn
    |   o---o----||--o
    |   |       10uF
    o--/\/\/\/--o---/\/\/\/--o
   GND         2.5V         5V

First stage gives 220 x amplification. the TL072 (TI version; i'm using the ST version) has a GBW of 3 MHz; with 220 x amplification this gives rolloff at 13 kHz. so for the first stage i'm good. Second stage is a band pass filter with 22 x amplification:

  G . . . . ._________
            /.        .\
           / .        . \
          /  .        .  \
            1/t2     1/t1

  t1 = R1 C1 = 100us  -> 10kHz
  t2 = R2 C2 = 1000us -> 1kHz

because f1 > f2 the gain is not R2/R1 but R2/R1 * f2/f1 (i.e. C1/C2). a bit quirky, but it works.. maybe i should try exchanging the time constants so f2 > f1. looks like these changes keep the transfer function the same, with a = sqrt(10):

  R2 -> 1/a R2
  C2 -> 1/a C2
  R1 -> a R1
  C1 -> a C1

so, there's a reason to do it like i did! the capacitors are smaller. so where's the trade-off? maybe noise due to large resistors? however, when f2 > f1 the gain is independent of the capacitors.

let's make this a bit more intuitive. what happens when C1 is made 10x larger, so f1 = 1kHz, and C2 is made 10x smaller, so f2 = 10kHz? the gain is now 10x more, so the gain can be reduced by making R2 10x smaller, which again requires C2 to be 10x larger. so the net effect is:

  R1 -> R1
  C1 -> 10 C1
  C2 -> C2
  R2 -> 1/10 R2

this gives a 1uF capacitor. so alternatively R1 can be made 10x larger, which requires C2 to be made 10x smaller, giving 10K and 470pF respectively. (EDIT: this is what i did. works fine.) makes more sense now.
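to double-check the quirky gain rule, here is the second stage worked out symbolically (input branch Z1 = R1 + 1/(s C1), feedback Z2 = R2 parallel C2; my derivation, not from the original notes):

  \[ H(s) \;=\; -\frac{Z_2}{Z_1}
          \;=\; -\frac{s R_2 C_1}{(1 + s R_1 C_1)(1 + s R_2 C_2)} \]

between the two poles the magnitude flattens out at

  \[ |H|_{\mathrm{mid}} \;\approx\; \frac{C_1}{C_2}
          \;=\; \frac{R_2}{R_1}\cdot\frac{t_1}{t_2}
          \;=\; 220 \cdot \frac{100\,\mu\mathrm{s}}{1000\,\mu\mathrm{s}}
          \;\approx\; 22 \]

which matches the 22 x figure. it also shows why the a = sqrt(10) substitution is neutral: the numerator R2 C1 and the pole product R1 C1 R2 C2 are both left unchanged, so H(s) is literally the same function, just built from smaller capacitors.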
so is there a circuit that has independent frequency and gain? hmm.. i just tried the LM358N again, and it gives good results also. guess the TL022 was just too low bandwidth? yup. 0.5 MHz. hmm, the LM358N is only 1MHz?

Entry: building the first krikit boards
Date: Mon Nov 12 15:41:18 CET 2007

  - not using E2, use serial + icd2 instead
  - xtal 40 MHz operation

pins to determine:

  * opamp + bias power
  * analog input (maybe first stage also)
  * speaker out

and figure out if the opamp can take 5V when it's powered down, otherwise it needs an extra pin to pull the (+) input to ground also.

Entry: new sheepsint default app
Date: Thu Nov 15 20:00:15 CET 2007

something like this. buttons:

  * noise on/off
  * xmod
  * reso
  * reset = silence

take button state from ram, uninitialized, so it survives reset. xmod control uses 2 x 2Hz - 20kHz log. reso needs robustness for reso freq < main osc freq. 3 x frequency knobs, 2 knobs left.. maybe some modulation? osc 2 frequency + modulation index. (formant / noise frequency)

Entry: fake chaos
Date: Sat Nov 17 02:27:51 CET 2007

following the same line as the formant oscillator, a fake chaotic filter could be made. such oscillators (some / all?) contain unstable oscillations that are 'squelched'. the points where such squelching happens are randomly distributed, but the bursts themselves are quite stable, leading to an approximation as randomly spaced fixed bursts. does this work with the current setup? no.. it uses a random period. that's different. so... using the reso algo, it boils down to randomizing p0 with fixed p1 and p2. randomizing could be fixed + variable. the question is when to update the period. the easiest is a fixed rate.. continuous updates seem to work. now p0 is modulated with a uniformly distributed random value, taking care not to over-modulate so p0 doesn't wrap around. everything is moved to prj/CATkit/demo.f

Entry: amplifier noise
Date: Mon Nov 19 21:25:32 CET 2007

for the next iteration of the krikit board, it might be a good idea to improve the amp a bit. there are 2 things to consider:

  * input stage noise (impedance)
  * filter/amp stage capacitor values vs. noise and power consumption

Entry: shopping
Date: Mon Nov 19 23:57:44 CET 2007

AITEC:
  - perfboard?
VOTI:
  - perfboard
RS:
  - perfboard (RS: 206-8648, manuf: RE 200 HP)
  - oscillator
  - 9V battery holders + linear regulators
  - 8pin sockets
  - small signal diodes
  - high ohm resistors
  - blue, black bell wire

Entry: krikit todo
Date: Tue Nov 20 17:58:26 CET 2007

  - determine pins: analog in, opamp enable,
  - output transistor: speaker out pin.
  - debug net: E2 / serial: minimal slave complexity solution

also for catkit: it might be best to connect the E2 bus to the serial TX pin, which is multiplexed with an INT pin. (also for the 2620? no, but it can be connected externally.)

Entry: ditch E2 ?
Date: Tue Nov 20 18:16:27 CET 2007

simple TTL serial with a bit of careful programming to ensure enough 'on' time (basically, a large enough cap or some extra '1' bits in the data) might be a better approach, since it doesn't require a special decoder in the target chip. i could use a 'standard' here: the stereo minijack used in some A/V equipment. ftdi sells them too apparently:

http://www.ftdichip.com/Images/TTL-232R-AJ%20pinout.jpg

or: leave the choice between E2 and serial open.
given a bit of delay on the client side when sending, and a proper 'listen' phase on the host, a serial protocol can be used using the same hardware as the E2 bus:

          1K
   TX o--/\/\/\/\----\
                     |
   RX o----------o---o---o BUS
                 |
        /--|<|--/
        |
  VDD o--o
        |
       === 10uF
        |
  GND o--o---------------o GND

another thing to do would be to make the E2 protocol compatible with the hardware uart. the problem here is the factor 5...

            (+)                   (-)
  SERIAL    client + hw simple    4 wires
  SER+POW   client simple         3 wires
  E2        client complex        2 wires

hmm... the thing which i find most attractive is the possibility to have a POWER socket that can be used as a comm channel. the rectification diode also acts as a protection diode this way, and the diode drop is not really a problem when powered from 3V-5V. so the main question is: how to make SERIAL run over 2 wires, given the setup above? this is a software problem: how to synchronize. the question is whether this extra synchronization effort will lead to more slave complexity, with the bound being the E2 rx/tx.

  POW: works as long as the cap is large enough
  RX:  always works
  TX:  works as long as the host leaves room on the cable..

so the problem is for the host to create a window. this shouldn't be too big, for POW reasons.. to solve the timing issues here, it looks to me that complexity will be a lot higher. i guess it's best to stick to E2, but keep open the possibility of unidirectional serial comm. conclusions:

  (4) serial + separate power over 4 lead telephone wire
  (3) serial, power from data, using stereo audio cable
  (2) E2: 2 wire, power connector

which one for krikit?

Entry: problem chips
Date: Wed Nov 21 20:40:14 CET 2007

18LF2610-I/SP doesn't seem to want to program.. ok. that was stupid. they're not self-writable.

Entry: krikit pins
Date: Wed Nov 21 22:31:09 CET 2007

input works seemingly without problems. output is going to be a bit more problematic.. i think it's not a good idea to drive the speaker with the pin directly, for 2 reasons: 8 ohms is too high a load, and the drive point needs to be tolerant of analog voltages (a CMOS input is not, and i'd like to use the PWM). with the current setup, a PNP switch is probably best. so, design variables:

  * PNP / NPN (cap to ground or Vdd)
  * suppression diode?
  * feed from battery (9V) or Vdd (5V regulated)

EDIT: 1K with PNP on 13/RC2/CCP1
EDIT: ok. make sure the speaker is not full-on, the transistor gets really hot.
EDIT: running into a problem: i'm using high = off, which apparently the PWM doesn't like: it gives a single spike. so i need to explicitly turn off the PWM.
EDIT: the chip resets unexpectedly. trying now with ICD2 attached: seems to be stable. so something's probably wrong with my reset circuit. could be power supply stuff. some spikes..

Entry: standard 16 bit forth
Date: Thu Nov 22 04:53:21 CET 2007

i keep coming back to a standard forth for sheepsint. purrr18 is there to stay as a low level metaprogrammed machine layer, but teaching it is a real pain.. maybe the time is there.. maybe a safe language is not the way to go. maybe a simple forth is more important? maybe standard is important after all? i have a lot of design choices to make, like building the interpreter on top of a unified memory model or not.. [ mostly triggered by ending up at the taygeta site (from e-forth) ]

more questions. if i want to make a standard forth platform, wouldn't it be better to go for the 18f2620 with a resonator and a linear regulator, and add a keyboard in and video out while i'm at it? why not the dspic then? ( because i didn't port to it yet, tiens! )
so possible projects for january:

  - portable forth on top of purrr18
  - linear safe language on top of purrr18
  - dspic assembler + compiler
  - a home computer based on the 18f2620

strategically, portable forth seems to be the best option, since this solves most of the documentation issues.. dspic and the home computer are more of a lab thing. the linear safe language is something i need to figure out how to do first in a PF context. portable forth could use 'code' words to switch to purrr? maybe it's a good exercise all in itself to try to write a standard forth, and not care too much about optimization etc.. i have my non-standard forth now, so it's good to aim for the average.

Entry: the circuit again
Date: Thu Nov 22 18:59:33 CET 2007

because the input impedance is so low, the 10uF cap is really not negligible! in fact, this gives a 10ms time constant with a 1K resistor. that's 100Hz, but the filter cuts off at 1kHz, so it's ok..

it's not ok for what i wanted to do, which is to use only a single transistor to drive the speaker without a capacitor. this might reverse polarize the cap: it's probably best to keep the switching frequency high enough so this doesn't need to happen.. i wanted to replace it with a ceramic one, but that needs at least 1uF. wait a sec.. maybe it's just not possible to drive the cap to a negative voltage? yup.. the + side of the cap will be at saturation voltage.

        Rg 220K
      /-/\/\/\--\
      |   __    |
  Rs 1K  |  \   |
  /--/\/\/\--o-| - \___o              o Vdd
  |           _| + /   |              |
 === Cs      |  |__/   |            >|
  |  10uF    |         |---/\/\/\---o SPK
  |          o Vbias               /|
  |          |                      |   |\ @
  o--------------------------/      |   | ||@
  |                                 |   |/ @
  0 GND                             |

i wonder if it's possible to make the circuit such that the transistor doesn't blow up if SPK is driven low for too long. check this:

http://www.winbond-usa.com/products/isd_products/chipcorder/applicationbriefs/apbr21a.pdf

somehow the DC path needs to be blocked or at least limited. hmm.. i think there's really no better way than to use switching: that produces the least amount of heat in the transistor. just be careful to not drive it too long, and use minimal DC: start the wave 'touching' ground instead of symmetric around 2.5V. i'm ordering some BC640, which are TO92(A), 1A drop-ins for the A1015. i had a BC516 PNP darlington on my list, but the BC639 is 1A. for the switching loads i care about, i don't need high beta.

Entry: crap.. transistor won't switch off
Date: Sat Nov 24 20:50:04 CET 2007

using a 78L05 regulator for the chip, but wanted to drive the speaker straight from the 9V battery. problem is, i can't switch off the transistor if i'm not using an open drain output and a pull-up resistor.. so i guess just stick with connecting the speaker to the regulator output, and up the regulator a bit... though it's 100mA, maybe it can take a bit of peak current? do i have everything now? i guess.. maybe an on/off switch, but that's easy to do later. also, if possible, add a connector for the led, so it can be brought out to the box.

Entry: got the first carrier on the mic amp
Date: Sat Nov 24 13:03:50 CET 2007

sort of a little milestone.. but no time for celebration yet. it's at 600Hz, which seems low.. putting it higher gives less response. and it's quite distorted. higher frequency, more distortion. ok.. so the resonance frequency of the speaker, measured by moving it over the table, is about 625 Hz. which is of course the reason why i get such a good response at 610Hz :) this will be really hard to get out, so why not use it? use either fm or pm at that frequency, and adapt the filter / amp accordingly.
so what about this: make the amp go from 450Hz to 1kHz, and use the resonance frequency of the speaker as carrier wave. 22K 100nF (450Hz) - 1M 1nF (1kHz) - gain = 45 init-analog pwm-on 1 freq ! 2 att ! 0 nwave half of that seems to work fine too.. 305Hz. what about a golden ratio FSK modulation scheme? that way the harmonics of the lower one won't interfere with the higher one.. EDIT: sticking to one carrier seems best in light of this resonant peak. Entry: direct threaded forth Date: Thu Nov 29 11:02:39 CET 2007 a couple of days of rest doing admin stuff.. going to amsterdam today for the final sprint. some things that crossed my mind: DTC FORTH VM * purely concatenative virtual machine code: implement literal as i.e. 8 bit literal, and 8 bit shift + literal: code never accesses IP. same for jumps. * unified memory model is probably more important than speed: it allows for other memory mapped tricks, since memory has a single access point. the real problem is that instruction fetch is built on top of the memory model. maybe this can be optimized somehow? DEMODULATOR * bitrate ~ bandwidth^(-1) this can easily be seen in the response of a sharp filter: it rings a lot, so can't accomodate much time variation. * AM (data=power) -> PM (signal + data=phase). i'd like to stick to a single carrier. the reason is that the resonant peak of the speaker is something that best can be used instead of fought. i didn't measure it, but it at least looks and sounds quite sharp. comparing waveforms on the scope, i would guess 12dB through both sending and receiving speakers. * sampling rate is only dependent on bit rate, not on carrier frequency: aliasing sampling can be used. this means that in order to accomodate more processing power on the same chip, the bitrate can just be lowered. and the combination of both: since this is going to be quite math-intensive, it's probably best to choose for a bit more highlevel approach: construct a couple of decent abstractions, maybe some easy to use fixed point math routines, instead of perfectly optimal ones.. the chip runs at 10MIPS. if i aim at 100bps, that's 100 instructions per bit, if i aim at 10bps, that's 1000.. the idea is to get it to work first. Entry: math routines Date: Fri Nov 30 12:21:49 CET 2007 time for math routines. some design decisions: * signed/unsigned * bit size * saturated/overflow it would be nice to be able to reuse these later in the DTC standard forth as math routines. i do have special need here, in the sense that the input is only 8 bit. the main problem is the multiplication routine. the standard has a 16x16 -> 32 signed multiplication. 2 approaches for the filter: * simple: 2nd order IIR bandpass * matched FIR filter as in PSK31 i have enough memory to perform FIR filtering. let's focus on trying to understand the PSK31 demodulator. till now i only found code examples, no highlevel pseudocode or diagrams. here Peter G3PLX talks about AFC (automatic frequency correction): http://www.ka7oei.com/fsk_transmitter.html#FSK31_Explained with PSK apparently the frequency correction doesn't need to know anything about the data, since the spectrum is symmetric around the carrier. i'm not sure whether AFC is necessary in my scheme: all recievers and transmitters are stationary, and there's no wind. on the scope however i did see some slight variation in period, but this was probably due motion of the speaker/mic (just sticking up by its pair of connecting wires). "To get in sync. 
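for reference, a minimal sketch of the first option -- a 2 pole IIR
resonator -- in scheme. this is my own illustration, not project
code; fc and bw are fractions of the sample rate, and the pole radius
r ~ 1 - pi*bw is the usual narrow-band approximation:

  (define pi 3.141592653589793)

  (define (make-resonator fc bw)
    (let* ((r  (- 1.0 (* pi bw)))            ; pole radius
           (a1 (* 2.0 r (cos (* 2.0 pi fc))))
           (a2 (- (* r r)))
           (y1 0.0)
           (y2 0.0))
      (lambda (x)                            ; one sample in, one out
        (let ((y (+ x (* a1 y1) (* a2 y2)))) ; y = x + a1 y' + a2 y''
          (set! y2 y1)
          (set! y1 y)
          y))))

  ;; e.g. centered on the speaker resonance, 610Hz at a hypothetical
  ;; 4.88kHz sample rate:
  ;;   (define bp (make-resonator (/ 610.0 4880.0) 0.01))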
let's focus on trying to understand the PSK31 demodulator. till now i
only found code examples, no highlevel pseudocode or diagrams. here
Peter G3PLX talks about AFC (automatic frequency correction):

  http://www.ka7oei.com/fsk_transmitter.html#FSK31_Explained

with PSK apparently the frequency correction doesn't need to know
anything about the data, since the spectrum is symmetric around the
carrier. i'm not sure whether AFC is necessary in my scheme: all
receivers and transmitters are stationary, and there's no wind. on
the scope however i did see some slight variation in period, but this
was probably due to motion of the speaker/mic (just sticking up by
its pair of connecting wires).

"To get in sync, the PSK31 receiver derives it's timing from the 31Hz
amplitude modulation on the signal. The Varicode alphabet has been
specially designed to make sure there's always enough AM to keep the
receiver in sync. Notice that we can extract the AM from the incoming
signal even if it's not quite on tune. In PSK31 therefore, the AFC
and the synchronisation are completely independent of each other."

So it's not completely true that the AFC doesn't need to know
anything about the data: the data needs to be 'rich enough'. But the
trick of getting the AM straight from the signal is interesting. This
means i can probably proceed nicely from AM -> PM.

Some alarming notions here:

  http://www.nonstopsystems.com/radio/frank_radio_psk31.htm

"Like the two-tone and unlike FSK, however, if we pass this through a
transmitter, we get intermodulation products if it is not linear, so
we DO need to be careful not to overdrive the audio. However, even
the worst linears will give third-order products of 25dB at +/-47Hz
(3 times the baudrate wide) and fifth-order products of 35dB at
+/-78Hz (5 times the baudrate wide), a considerable improvement over
the hard-keying case. If we infinitely overdrive the linear, we are
back to the same levels as the hard-keyed system."

What i saw on my scope is a strong 2nd harmonic, probably due to
non-linearity caused by the DC bias in the speaker. Using some kind
of feedforward correction based on a measurement it is probably
possible to correct this when it becomes a problem: the transmitter
is simple enough so all kinds of wave shaping corrections could be
introduced there.

"The PSK31 receiver overcomes this (ED: side lobes due to square
window) by filtering the receive signal, or by what amounts to the
same thing, shaping the envelope of the received bit. The shape is
more complex than the cosine shape used in the transmitter: if we
used a cosine in the receiver we end up with some signal from one
received bit "spreading" into the next bit, an inevitable result of
cascading two filters which are each already "spread" by one bit. The
more complex shape in the receiver overcomes this by shaping 4 bits
at a time and compensating for this intersymbol interference, but the
end result is a passband that is at least 64dB down at +/-31Hz and
beyond, and doesn't introduce any inter-symbol-interference when
receiving a cosine-shaped transmission."

"PSK31 is therefore ideally suited to HF use, and would not be
expected to show any advantage over the hard-keyed integrate-and-dump
method in areas where the only thing we are fighting is white noise
and we don't need to worry about interference."

So maybe it's not necessary yet? Since we're using a single frequency
in the first attempt, a demodulator that rejects nearby signals might
not be required. Anyway. Conclusion: i need to have a look at the
exact algorithm used for matching + synchronization.

Entry: demodulator.f
Date: Fri Nov 30 21:07:31 CET 2007

i had some bottom up code (what can be done efficiently) using
8x8->16 unsigned multiplication and 24bit accumulation. this works
well for rectangular windows, but not so much for non-rectangular.
maybe rectangular is enough since we don't have interfering signals?

anyway, it might be wise to look at how to do a windowed one.. i
guess the idea is like this: make the window obey some kind of
average property that can be removed using maybe a separate
accumulation of the signal. it doesn't look that hard: with ** the
inner product,

  [ s(t) + s_0 ] ** [ w(t) + w_0 ]

gives 3 correction terms:

  s(t) ** w_0   == 0
  w(t) ** s_0   == 0
  w_0  ** s_0

which requires the average signal s_0 as the only variable component,
which needs to be scaled with the window DC component (can be 2^...)
and a fixed offset. so i can basically use the same unsigned core
routine for general complex FIR filters: renamed the macros to
mac-u8xu8.f, and added complex-fir.f
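a quick scheme check of that bookkeeping (my own sketch, assuming
samples and coefficients are both stored as u8 centered at #x80; all
names are made up, not from mac-u8xu8.f):

  (define (dot a b)                   ; plain inner product
    (if (null? a) 0
        (+ (* (car a) (car b)) (dot (cdr a) (cdr b)))))

  (define (signed u) (- u 128))       ; u8 -> signed, offset #x80

  (define (corrected-dot us ws)       ; us, ws: lists of u8
    (let ((n (length us)))
      (+ (dot us ws)                  ; raw unsigned MAC
         (- (* 128 (apply + ws)))     ; s_0 ** w
         (- (* 128 (apply + us)))     ; w_0 ** s
         (* n 128 128))))             ; s_0 ** w_0

  ;; (corrected-dot '(200 100 50) '(128 255 0))           => 6428
  ;; (dot (map signed '(200 100 50)) (map signed '(128 255 0)))
  ;;                                                      => 6428

only (apply + us) varies per input block; the other corrections are
constants of the filter, which is the point made above.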
Entry: drop dup
Date: Fri Nov 30 23:02:59 CET 2007

optimization:

  drop dup -> movf INDF0

Entry: implementing the filter loop : complex-fir.f
Date: Sat Dec 1 10:27:37 CET 2007

i have it down to about 31 instructions for an unsigned multiply
accumulate operation 8x8 -> 24, and an accumulation. both can be
combined after the loop to correct the offset. offset compensation is
implemented now, and all sharable code has been moved to macros in
mac-u8xu8.f. routines are tested and seem to work just fine.

so: filter coefficients are centered at #x80, but the accumulator
will shift one position to the left to compensate, so the filter
coefficients behave as s.7 bit fixed point with inverted sign bit if
the accumulator is seen as s.15.8. this means that:

  11111111 -> +0.1111111
  00000000 -> -1.0000000

Entry: -!
Date: Sat Dec 1 11:49:04 CET 2007

  \ value addr --
  : -! >r negate r> +! ;

this subtracts the number on the stack from the variable, not the
other way around. note that this has the argument order of '!', not
of '-'. the reason for doing it like this is that this occurs the
most: subtract a value from an accumulator variable.

Entry: subsampling
Date: Sat Dec 1 13:43:35 CET 2007

A baud rate that sounds like 16th notes would be nice, which is about
8 Hz. With a carrier of 600 Hz, this gives a ratio of 75. The
sampling rate needs to be > 16Hz, so let's take the one in the
neighbourhood of 32Hz. Care needs to be taken though when using
aliasing: the frequency error will amplify. Let's see.. using 10 Mhz,
the subdivisions become:

  2^20  ->  9.5 Hz     baud rate
  2^18  ->  38.1 Hz    sampling frequency
  2^14  ->  610.4 Hz   carrier frequency
  2^12  ->  2.44 kHz   4 x carrier
  2^7   ->  78.1 kHz   PWM frequency

the carrier/baud ratio here is 2^6 = 64. going from carrier ->
sampling frequency is a subdivision of 16. what's the error of the
oscillators? the CSTLS10M0G53 has 0.5 % precision. times 16 that
becomes 8.0 %, which is quite a lot.. so probably it does need
continuous phase compensation??

another reason to not subsample is to get better noise performance
and better frequency rejection due to longer integration time. using
4 x carrier frequency is still only 2.4 kHz which is at 2^12
subdivision, or 4k instructions per sample, which is absolutely no
problem. this gives 2^8 samples per symbol.

another thing to think about: synchronization. this can be
implemented using time shifts or phasor rotation. the latter is
probably not a good idea due to problems with filter matching. so
actually, the carrier needs to be significantly oversampled, or at
least mixed.. i think i need to make a new table with variables..

Entry: synchronization
Date: Sat Dec 1 14:34:11 CET 2007

if there are enough symbol alternations present this causes
significant AM modulation which makes synchronization easy: sync to
the zero crossings. this means the preamble needs to be 01
transitions.. probably best to use simple async with 1 = idle =
transition and 0 = no transition.

next:

  * AM send
  * AM receive
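a tiny scheme sketch of that convention (mine, not project code):
encode maps bits to carrier signs, decode compares consecutive
symbols, and an idle line of all 1s gives the richest sync signal.

  (define (encode bits)                ; bits -> +1/-1 symbol signs
    (let loop ((bs bits) (phase 1) (acc '()))
      (if (null? bs)
          (reverse acc)
          (let ((p (if (= (car bs) 1) (- phase) phase)))
            (loop (cdr bs) p (cons p acc))))))

  (define (decode syms prev)           ; 1 = reversal, 0 = none
    (if (null? syms)
        '()
        (cons (if (= (car syms) prev) 0 1)
              (decode (cdr syms) (car syms)))))

  ;; (decode (encode '(1 0 1 1 0)) 1)  =>  (1 0 1 1 0)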
Entry: multiplication again
Date: Sat Dec 1 15:16:13 CET 2007

funny how this is starting to be an exercise in multiplication
routines :) since i'm using unsigned multiplication for the filter
for efficiency reasons, i have to implement signed multiplication in
cases where correction can't be moved to outside of a loop, which is
the generic case. the 16bit multiplication performs correction using
conditional subtraction. for 8 bit it's probably easier to use
conditional negation, since that doesn't require extra storage.

this sucks.. -1 * -1 overflows to -1.. maybe it's better to use a
representation that can actually encode 1, even if this means giving
up one bit of precision? s1.6

Entry: userfriendly
Date: Sun Dec 2 10:37:42 CET 2007

i need to weed a bit in the user-friendliness and SIMPLIFY the way
some things are used, because it seems as if some combinations cannot
be made. i wanted to create scheme code that generates forth macros,
but it looks like this is not so easy!

another thing is 'splitting' the host and target, so the host can run
some kind of query program in cat or scheme.. maybe the 'current-io'
parameter should be set back again in prj and scheme modes?

Entry: clicks
Date: Sun Dec 2 13:03:06 CET 2007

need ramp-up and ramp-down to prevent clicks. ramp-up time should be
in the order of 20ms = (50Hz)^(-1). after ramp-up, carrier fade in
should be used. this can use the 'attenuation' variable.

OK. using 25ms ramp to bias. now, how to initialize the phase?

  OOK:  can start envelope with -128 (amp = -1),
        carrier with -64 (amp = 0)
  BPSK: needs carrier fade-in.

looks like BPSK sounds smooth enough without envelope fade-in when
starting the carrier at phase = -PI/2. doing the same now for AM, so
there's no problem with envelope frequency = 0.

Entry: transmitter
Date: Sun Dec 2 15:08:19 CET 2007

time to get the transmitter sorted out, so i can make a standalone
device that sends out a known data sequence:

  * combine the framed rx/tx with sending/receiving
  * figure out OOK and BPSK transition based send modes

return to zero, i don't see the point in that, so transition based
seems good enough. let's say 1 = trans, 0 = notrans. this has the
advantage that an idle line is the richest signal, good for sync
purposes. transition based is easiest to implement using the current
code. in case transition based is not desired (i.e. because it
accumulates error), this can still be pre-coded as long as a
transmission starts with a known oscillator phase.
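going back to the -1 * -1 remark in the multiplication entry above, a
small scheme illustration of the wrap (my sketch; only the shift
counts matter here):

  (define (wrap8 x)                 ; wrap to 8 bit two's complement
    (let ((w (bitwise-and x 255)))
      (if (> w 127) (- w 256) w)))

  (define (q* a b frac)             ; fixed point multiply, frac bits
    (wrap8 (arithmetic-shift (* a b) (- frac))))

  ;; s.7:  -1.0 is -128, and (q* -128 -128 7) => -128, i.e. -1.0:
  ;;       +1.0 does not exist in s.7, so the product wraps.
  ;; s1.6: -1.0 is -64, and (q* -64 -64 6) => 64, i.e. +1.0 exactly,
  ;;       at the cost of one bit of precision.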
Entry: signal rates revisited
Date: Sun Dec 2 18:50:57 CET 2007

4 independent frequencies:

  * PWM TX rate: determines high-frequency + aliasing noise
  * carrier:     only important for the path (i.e. speaker reso)
  * baud rate:   bandwidth -> noise sensitivity
  * RX rate:     selectivity (related to FIR length)

(EDIT: carrier and baud rate are not independent wrt data filter
quality. see below)

important for the receiver are:

  - baud rate, which limits the maximal integration time (dependent
    on symbol length).
  - RX rate: enables longer filter lengths, which gives more
    selectivity and noise immunity.

it doesn't make sense, for constant baud rate, to up the RX frequency
but keep the FIR length constant, so:

  FIR = k . (RX / BAUD)
  RX  = Fosc / OPS

where k is the number of symbols the FIR spreads over, probably 1 or
2, and OPS is the amortized number of operations per sample
(processing and acquisition). the filter is 32 in the current
implementation. this gives about 300kHz at 10MIPS. looks like we have
some headroom.. anything more than 8kHz is probably not going to make
much sense.

( I was thinking about noise and dithering, and that at this high
frequency, because of the absence of noise, there will be no 'extra'
sensitivity due to dithering at levels close to the quantisation
step, but there probably will be extra due to pwm effects. So it
looks like all small bits help.. )

EDIT: another variable i forgot to mention is symbol rate
vs. carrier. using a mixer, it is desirable to have large separation
between the two so a simple data filter can be used.

Entry: matched filter
Date: Sun Dec 2 21:22:33 CET 2007

differential BPSK data stream:

  .   .___.   .   .___.
   \ /     \ /   \ /
    X       X     X
  ./ \.___./ \./ \.___.
    1    0    1  1    0

Using cosine crossfading as implemented in modulator.f is effectively
the same as using symbols 2 baud periods wide with a 1 + cos
envelope. this wavelet is the output filter which maps a binary
+1,-1 PCM signal to the shaped BPSK signal. This output filter needs
to be matched in the receiver.

Now, about matched filters.. A matched filter in the presence of
additive white gaussian noise is just the time-reverse of the
wavelet: one projects the observed signal vector onto the 1D space
spanned by the wavelet's vector in signal space. This gets rid of all
the disturbances orthogonal to the wavelet's subspace. When the noise
is not white, the noise statistics are used to compute an optimal
subspace to project onto, such that most of the noise will still
vanish. I don't have noise statistics, and I'm not going to use any
online estimation, which leaves me with plain and simple convolution
with the time-reversed signal. I do wonder what all this talk is
about 'designing' matched filters for PSK31...
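in list form that is just correlation with the wavelet, i.e.
convolution with its time reverse (a sketch of the general idea,
nothing project specific -- slow, for reference only):

  (define (correlate wavelet signal)
    ;; one output per alignment of the wavelet over the signal
    (let ((n (length wavelet)))
      (let slide ((s signal) (out '()))
        (if (< (length s) n)
            (reverse out)
            (slide (cdr s)
                   (cons (let sum ((w wavelet) (x s) (acc 0))
                           (if (null? w)
                               acc
                               (sum (cdr w) (cdr x)
                                    (+ acc (* (car w) (car x))))))
                         out))))))

the output peaks where the signal lines up with the wavelet; sampling
that once per symbol period is the 'integrate and dump' mentioned in
the quote above, for the special case of a rectangular wavelet.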
Entry: phase synchronization
Date: Sun Dec 2 22:27:44 CET 2007

I'm confusing 2 things:

  * bit symbol synchronization
  * carrier phase synchronization

Using a complex matched filter, phase synchronization can be done
entirely by using an extra phase rotation operation: it doesn't
really matter what comes out, as long as:

  - the matched filter's envelope is synchronized
  - we're using I and Q filters

It's clear that mismatching the symbol clock has a lot less effect
than mismatching the carrier phase. When compensating the carrier
phase, we compensate what's aliased down after subsampling at symbol
rate. This still needs 2 separate synchronization methods: bit symbol
synchronization (which sample point to start filtering) and carrier
frequency/phase compensation.

Entry: recording sound
Date: Mon Dec 3 10:35:52 CET 2007

let's go for this: a symbol is 256 samples. This allows easy buffer
management. This fixes the sample rate at 4.88 kHz.

recording seems to work ok: it's at 8x the reso frequency of the
speaker: scratching it over a newspaper gives a nice saturated wave
with period about 8. maybe it's time to add gplot in the debug loop.

Entry: IIR or FIR
Date: Mon Dec 3 15:39:49 CET 2007

IIR:
  * mixer + lowpass
  * sync mixer to carrier
  * start bit detection = zero crossing
  * no messing with blocks
  * only approximate matching

FIR:
  * possible to construct optimal matching filter
  * no phase distortion
  * synchronization more complicated (filter freq is fixed?)

It looks like IIR + PLL is really simpler to implement (same at every
sample block, no buffers necessary).

NEXT: have a look at how to implement a PLL.

( Actually... It should be possible to mix the asymmetric tail of a
stable IIR filter in the transmitter! Though not simple due to
rounding.. Something like that can probably not be computed exactly,
so this would require a bit more expensive transmitter.. )

Using a PLL it's probably best to first try to synchronize to a clean
carrier. Since a mixer is necessary as part of the processing chain,
that can be used to perform the correction.

  -> [ MIX ] -> [ LPF ] ----> I --> [ AGC/HIST ] -> bits
        ^               --o-> Q
        |                 |
        \------[ OSC ]<---/

The quadrature component can be used as an error feedback. This
always works, since it's not present in the signal.

reading this:

  http://rfdesign.com/mag/radio_practical_costas_loop/

two things are mentioned to perform carrier recovery:

  * squaring + division
  * costas loop

note that's about an analog implementation. so not all roses in the
IIR world.. what about taking the best of both? perform carrier
recovery using a mixer + PLL and use a similar approach for the data
sampling recovery.

a nice place to go back to is this paper:

  http://www.argreenhouse.com/society/TacCom/papers98/21_07i.pdf

where the signal is sampled using a 1-bit dac, and the mixer has
values {-1,0,1}. after integration, an adaptive rotation is
performed.

Entry: simplified
Date: Mon Dec 3 16:56:50 CET 2007

  * AGC: OR the absolute values of a symbol buffer + compute shift
    count
  * INTEGRATE: sum the entire buffer (no sideband rejection)
  * sample at say 8 points per symbol

since there's no filter other than the analog 450-2kHz this should
perform pretty bad. but i guess it's time for a fail-safe.. use noise
modulation first :)

  NM -> AM (async) -> PM (sync)

a genuine problem doing this all experimentally is the program
sequencing.. there's a huge difference between being able to do
something per sample and having to store some for later..

EDIT: it doesn't make sense to write an AM demodulator without
thinking about the BPSK that will follow, so i need to do AM with a
separate mixer + LPF. the mixer seems straightforward. the remaining
problem is the LPF. if i can make that work with simple shifts, we're
as good as there..

Entry: triangular window
Date: Mon Dec 3 17:33:12 CET 2007

however.. it is probably possible to use triangle windows and
'recompose' things later, since a triangle window is self-similar!
given a number of sample points, from this construct 2 numbers: one
weighted with ramp up, one with ramp down. these can be easily
combined, so one could shift the center of the window and recompute
easily.

Entry: interrupts
Date: Mon Dec 3 17:46:08 CET 2007

looks like the real question is whether or not to use interrupts.
doing this as state machines leaves too little room for block-based
FIR techniques. i'm also not very convinced about trying the AM
first, because i'm already trying to optimize the layout for that
algo: i need to go for mixer + IIR LPF, and implement AM in that
framework.

Entry: data filter coefficient
Date: Mon Dec 3 18:31:07 CET 2007

The constraint is: we don't care about the delay, but attenuation
shouldn't be too big. What about this: pick the pole at half the bit
rate, and round up to the next power of 2:

  1/sqrt(2) = (1 - 2^(-p)) ^ t

EDIT: how to pick p? it's easier to use this approach, where we
require the decay time to be such that the response will drop below
the 1/2 threshold in one symbol time:

  (1 - 2^(-p)) ^ t < 1/2

where t is the number of samples in a symbol. this is equivalent,
since the t in the previous formula is related to half the baud rate.
if t is large (in our case it's 64), the linear term is the one that
dominates, so the lhs can be approximated by 1 - t 2^(-p), which
gives an expression:

  p = log_2 (2t)

Entry: AM vs PM
Date: Tue Dec 4 11:38:18 CET 2007

something i missed yesterday: demodulating AM with a non-properly
tuned mixer might give trouble. no, this is not the case as long as
both the I and Q components are computed: it only gives a problem for
PHASE (which will rotate on mismatch), not AMPLITUDE.

Entry: data filter implementation
Date: Tue Dec 4 12:03:34 CET 2007

the easiest way to keep precision is to never lose any bits. the data
filter has the form:

  x <- a x + (1 - a) u

where x is state and u is input, and a = 1 - 2^(-p), i.e.

  x += 2^(-p) (u - x)

the current settings give t = 512 (5kHz sample rate and 9Hz symbol
rate), which means p = log_2 (1024) = 10 as the approximation of the
bound. speeding up the filter by a factor of 2 gives p = 9. it might
be worth relaxing it even further to 8, so shifts are eliminated.

( so, just out of curiosity.. is it possible to use unsigned
multiplication? just doing this without thinking introduces a scaled
copy of the original modulated signal in the output. if the lowpass
filter allows, this might be not a problem: requirements are just 2x
as strict. )

problem with signs: it might be simpler to work completely with
unsigned values since signs make multi-byte arithmetic more
complicated (need sign extension). a simple solution is to run the
multiplication as signed (to get rid of the component at the carrier
frequency) but run the filter accumulation as unsigned. the DC
component in the result is completely predictable and can be
subtracted later.

first experiment i measure something: noise is at around 5 and the
maximal measured signal is around 150. that's a significant
difference. now it's time to map the 24bit range to something more
manageable. now to be careful not to overflow the filter input: it
seems reasonable to ignore the lower byte.

looks like i have a bug in the signed 16bit multiplication routine.
EDIT: yep.. typo: TOSL instead of TOSH
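to see what the p rule means in practice, a scheme sketch of the step
response in plain integers, the shift being the only operation, as on
the PIC (test values are mine):

  (define (settle p u threshold)
    ;; steps until x += (u - x) >> p crosses threshold from 0
    (let loop ((x 0) (n 0))
      (if (>= x threshold)
          n
          (loop (+ x (arithmetic-shift (- u x) (- p)))
                (+ n 1)))))

  ;; (settle 10 4096 2048) counts the samples needed to reach half
  ;; scale.  continuous-time answer: ln(2) * 2^p =~ 710 samples;
  ;; integer truncation adds a little.  compare with t = 512.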
Entry: better debugging tools
Date: Tue Dec 4 15:07:04 CET 2007

i need a way to print reports from ram.. before this can be done in a
straightforward way, the interaction language (which will need to be
cat or scheme) needs to be defined properly + some way of adding code
like this to the project needs to be defined.

what i need now is a way to inspect 24 bit numbers.. what about
adding inspectors to the code? these are forth words that send out
data in the form of a buffer. i could then make inspectors for any
kind of thing.

EDIT: yes.. i really have a good excuse to make proper debugging
tools. just fixed the prj console to be able to connect to the
target. was thinking about properly specifying interactive commands
as an 'escaped' layer over the target interaction.. basically every
possible 'island' in the code needs to be extensible.. most
importantly: macros, interactive words, prj words, scheme code, ...

EDIT: considering the amount of time i'm losing to get this thing
going, it might be wise to standardize on a method.. i.e. all 16bit
signed fixed point or something.

Entry: double notation
Date: Tue Dec 4 16:15:02 CET 2007

there are some things to distinguish:

  1 x 1 -> 1    word    (standard words)
  2 x 2 -> 2    _word   (16bit variants of standard words used in DTC)

  1 x 1 -> 2    2word, 3word etc.. nonstandard,
  1 x 2 -> 2    any combination that makes sense
  1 x 3 -> 3
  1     -> 2
  ...

Entry: costas loop
Date: Tue Dec 4 23:09:21 CET 2007

Have a look at the HSP50210 datasheet. It gives a nice general idea
about how a PSK receiver would work: 3 tracking loops
(AGC,carrier,symbol), user selectable threshold, matched filter (RRC
or I&D), soft decisions.

Entry: saturation
Date: Wed Dec 5 12:27:13 CET 2007

It's important to be able to prevent wrap-around distortion. Some
kind of saturation mechanism might make this easiest: it's easier
than carrying around high precision data. So where to saturate? Most
straightforward is the LPF, but at first glance it's better done at
the point where power is calculated, since the LPF seems to have
enough dynamic range.

The properties of a non-saturated word are:

  * the sign word is #x0000 or #xFFFF
  * both words have the same sign bit

This can be reduced to: sign word + lower sign byte == 0. OK

Entry: weird LPF output
Date: Wed Dec 5 15:42:12 CET 2007

i think i need to focus on building some more debug tools today..
something's going wrong and i can't find the cause. the problem is
amplitude modulation in the LPF power output, going from 100 -> 400,
with a period of about 35 = 140Hz. and a component at 4 x that
frequency, not locked, which is probably the carrier. i measure this
with the modulated signal, and with an unmodulated carrier.

go one by one: it's probably best to try to eliminate the DC offset,
so at least that is not drowning the signal component, which is a lot
smaller.. EDIT: this is already happening: sample is converted to
signed then multiplied.

ok.. questions:

  * why is there a 1/8 Hz component in the power output? i would
    expect the power to be smooth.. not modulated -> this is just
    noise. the level is really low, so it's the accumulation of
    (2^(-8) * u).

  * why is there a 1/64 Hz component in the power output? EDIT: the
    frequency is a mixer mismatch = 1/8 - 1/8', where 8' is the not
    quite =8 measured carrier frequency.

  * what does the filter input look like? did some input signal
    measurement, and the first thing i notice is that the carrier
    frequency is quite off. i get 28/4, so T=7 instead of T=8. which
    would give a beat at 1/56. that might explain a lot...

ok.. i get it. the convolution of these 2 spectra:

  |     .     |      cos(w1 t)
  |      .    |      cos(w2 t)
        0

gives:

  |    |.|    |      cos(wd t) cos(ws t)
        0

with wd = w2-w1 and ws = w2+w1. the sinewave that gets folded near 0
will interfere with the signal data! so this approach just doesn't
work without synchronization! it looks like the only way to do this
is to either have proper synchronization, or use a band pass filter,
not a mixer.

  AM: first order lowpass with complex coefficient, followed by
      output power computation.
  PM: requires AGC or cartesian->polar conversion for properly
      scaled Q -> phase feedback.

the quick and dirty way is to just filter the absolute value of the
input, then add a more selective filter. hmm.. i still need to kick
out DC, so better go for the frequency-selective filter.

Entry: 1/8 or 1/4 frequency filter?
Date: Thu Dec 6 17:12:32 CET 2007

it's probably easier to separate the problem in 2 parts: (1 -1 ; -1 1)
with sqrt(2) amplitude, compensated by a single arbitrary
multiplication to get the gain to 1-2^(-8). this requires at least
16bit. incorporating the scaling factor in the matrix seems to lead
to the same precision problem, but requires 4 multiplications instead
of one.

so what about a 1/4 filter? that's even simpler, and doesn't require
a sqrt(2) scaling factor, so the (1-2^-8) scaling can be done without
a multiplier. so.. the lpf filter i had before can be re-used. the
only thing to add is to cross-add the filter states, and add in the
input signal. rotating the signal can be done using a 4-state state
machine, which will add/subtract the signal to/from one of the
states:

  +x +y -x -y

given this approach it's probably also possible to reduce the LPF
state from 24 to 16 bit. check this. in a stable regime, using 2
bytes, the high byte will have the amplitude of the input at the
frequency, so at least for strong signals it would be stable (gain =
1). looks like it only has an effect on noise and rejection.

Entry: too much carrier drift
Date: Thu Dec 6 19:16:16 CET 2007

so, to get a bit of full-circle understanding: why not mix a signal
to DC and filter its absolute value? looks like the thing i did wrong
was not the mixer, but the place where the smoothing is going on. or:

  * mix + filter: isolates a frequency region
  * full-wave rectify + filter

2 filter operations are essential here, so it's probably easier to do
only one, and instead of full-wave use the amplitude/power of a
filter.

But but but... maybe the filter is actually too sharp? I measured the
carrier at 1/7Hz, expecting it at 1/8Hz.. i'm missing a parameter:
bandwidth and time decay are related, but increasing the sample
rate.. look: this is just a shifted one-pole filter: it's the
equivalent of passing the difference signal 1/7-1/8 to the lowpass
filter. that probably won't survive.. so i'm stuck with the same
problem: the carrier shift is much more than the bandwidth of the
signal! it's about 80Hz at 600Hz, while the signalling frequency is
around 9Hz. this means i have to do something about it... it's either
going to be manual tuning, or adaptive tuning.. synchronous demod is
starting to look like the only solution. or i should just use the 2
filter approach of above:

  * wide filter to eliminate noise: it should be wide enough to
    capture the carrier tolerance.
  * narrow filter to perform demodulation after full-wave rectify.

it's starting to look like synchronous is going to be a lot less
hassle.. again.. what do i need? an AGC to normalize the Q output
such that i can use it as feedback to phase offset. go over this
again.. something's wrong.

Entry: cordic
Date: Fri Dec 7 09:36:31 CET 2007

the most elegant solution seems to be to use a cordic I,Q->A,P
transform, so both the AGC and PLL have proper data to work on. For
use in the demodulator, the constant scaling factor is not a problem.

What I would like to do is to perform sequential updates: use Q to
update I and then use the updated I to update Q. With a = s2^(-n)
this amounts to:

  | 1 a |   | 1 0 |   | 1 + a^2  a |
  |     | * |     | = |            |
  | 0 1 |   | a 1 |   | a        1 |

Which is no longer a scaled rotation. Correcting this looks like more
hassle than just performing the update in parallel. I don't need a
lot of phase resolution. 8 bit is definitely enough. Hmm.. this is
going to be a lot of work..

Entry: simplified PLL
Date: Fri Dec 7 11:21:40 CET 2007

What about using a 2 bit phase detector which just detects the
quadrant and accordingly adjusts the frequency?

   -2 | -1
  ----+----
   +2 | +1

With + meaning counterclockwise. Since we're not using the Q
component, both directions of I should be allowed, so a better
approach is:

   +1 | -1
  ----+----
   -1 | +1

Filtering this signal and using it to increment the frequency gives
the right amount of feedback near the lock. In the phase diagram,
what needs to be done is to slow down the oscillator. The design
parameters here are:

  - smoothing of the phase error
  - gain of the phase error

I'm not too sure about oscillations though.. Maybe linear error
response is an essential element? I guess i'm missing some experience
here. Gut feeling says it should be possible to design a PLL by
filtering a 2 bit phase detector. Gut feeling also says that this
will lead to oscillations.

I'm off track again. These are the choices to make:

  - go for CART->POLAR transform with high resolution (i.e. 8 bit)
  - use AGC and the Q component for feedback.

The latter seems simpler. Maybe i should try that first. Cordic isn't
as straightforward as i thought since it needs a barrel shifter.
Which could be implemented using the multiplier, but then why not use
proper coefficients?

So.. AGC. Stick to the mixer algo, but figure out how to perform
variable gain so the error signal used to drive the phase adjustment
is properly scaled. Estimate the gain using a filtered sum of the
absolute values of the I and Q components.

Entry: PLL analysis
Date: Fri Dec 7 13:32:07 CET 2007

Using linear system theory: around the error=0 point, the system is
linear and behaves like a controlled integrator. We control frequency
(velocity) and out comes phase (position), which is the integral of
frequency. Such a system with a proportional controller is stable
because it is first order with negative feedback. It can be sped up
by increasing the gain. However, faster also means more susceptible
to noise on the control signal (in the PLL case the Q signal).

This is in absence of a disturbance signal. This can be modeled by a
signal d which drives the integrator directly. In the PLL case this
is the frequency mismatch. This will result in some permanent error.
The ratio between the 2 is determined by the error amplification.

Questions:

  * add or subtract from rx-carrier-inc?
    -> depends on whether one wants to sync to +I or -I
  * how to prevent mixer drift?
    -> looks like the DC component of the error should not have any
       influence?

Entry: discrete control systems
Date: Fri Dec 7 14:00:14 CET 2007

Looks like the thing i'm confused about is the difference between
analog control systems and digital ones. An analog 1st order
proportional control system can never overshoot, but a naive
discretization of this can! The problem here is instability of
integration methods.

Entry: the problem with the frequency offset
Date: Fri Dec 7 14:48:07 CET 2007

i think i found it: really stupid.. first i thought it was an
oscillator problem. didn't occur to me to try with 2 different boards
to see if that's actually the case. anyways, after trying, i got
exactly the same result. looking at the code i find this:

  : sample> 16 for wait-sample next 0 ad@ ;

which, if the processing takes longer than 1/16 of the clock period,
is wrong of course! the solution is to solve this using the timer, or
perform the sampling in an isr. let's try the postscaler.

OK. need a break.. what i'm doing wrong is to use the integral of the
error to compute the frequency.. frequency should be just F_0 - e.

after the break.. looks like i'm still making too many mistakes: of
course, if i just restart the tracker at a random point of carrier
phase, chances are that there are going to be some transient
phenomena. i just need to run it longer probably.

OK: sync works to plain carrier.

Entry: synchronization to modulated carrier
Date: Sat Dec 8 11:30:05 CET 2007

i tried the following: use the sign of I to steer the direction in
which the feedback works. works ok for a clean carrier, but in full
reversal this leads to problems. looks like a conceptual problem.
maybe the synchronizer should be slowed down? in a sense that a
symbol transition, which moves through a zero feedback point (in
which the carrier is effectively not controlled), has no noticeable
effect on the setting of the tuner, but when this transition is
complete, full feedback is in effect to pull the oscillator in sync
again.

using just Q feedback, the PLL seems to stabilize around Q = -120,
with an amplitude of about 30. say -128, that's -#x0080:

  #2000 -> #1F80

it's 1/64th of the frequency, which is exactly the difference between
symbol rate and carrier: the PLL locks to another attractor.. a
simple solution seems to be to limit the PLL frequency correction.
anyways, the sign stuff is necessary.

Entry: symbol synchronization
Date: Sat Dec 8 12:07:52 CET 2007

because i'm using locked synthesis and no non-synced downmixing, the
symbol synchronization can be derived from the carrier
synchronization. so maybe i should forget about syncing to the
modulated carrier? pulling the oscillator in sync using a plain
carrier might help a lot actually. let's try a 7/1 test tone.

Entry: first packets: pll and reversals
Date: Sat Dec 8 13:25:20 CET 2007

apart from some problems related to gain (probably too much drive
which kicks the PLL out of sync: moving the things apart gives better
results) it seems to work just fine. looking at an I,Q plot i suspect
the slow rise of the I signal is not due to filtering, but due to
loss of sync: Q gets thrown off, and the PLL needs to re-sync. maybe
it's more important to filter the error feedback..

aha! it seems as if the PLL switches to the negative frequency
attractor. indeed. with wide spaced reversals it is clear that Q
moves from around -13 to +13. the problem is that suddenly moving
from subtract to add changes the frequency of the oscillator from
bias+corr to bias-corr. how to solve this?

aha.. maybe it's not necessary to flip the sign? since a phase
reversal in the I plane doesn't change the Q component? actually, it
does. switching off the sign compensation resynchronizes the
oscillator on transition to I = +A.

it looks like i need a controller with a zero error, which
effectively means a PI instead of a P controller. note that i already
had an I controller, but that's unstable. i'm measuring an error of
about #x10 / #x2000 = 0.2 % -- the spec sheet says 0.5 % max. looks
normal.

thinking about this PI controller: P + lowpass can't work, because
there is no zero-error. so i need an integrator. the problem is the
time constant / gain factor. the error (Q) does seem to go to zero
now. however, there is still a transition at the reversal. now that i
have a zero error, it's maybe best to multiply the I and Q to obtain
the error signal?

for after dinner: i'm stuck with yet another scale problem.. fixed
point without a barrel shifter is madness.. it might have been better
to just implement the tools necessary, even if they are inefficient:
it is definitely doable (which is what i wanted to prove really..)
but it's difficult.

NEXT:

  - I * Q
  - AGC

preferably combined such that I * Q and error feedback become simple.
i just saturated the error output to +-127.. i get nice results for I
amplitude around 100-150. but still: the Q component wiggles when the
phase reverses.

reading the costas-loop paper mentioned above: the 3rd multiplier is
called a phase doubler. its only point is to make +-180deg both
stable lock points. so, i'll write up the problem below.

Entry: more questions
Date: Sat Dec 8 15:04:31 CET 2007

why does the PLL response oscillate? the analysis by linear
approximation i made above showed it was first order.. something's
wrong there.
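a toy model of the loop as it stands, useful for playing with the
overshoot question on the host. this is my own floating point
reconstruction with made up constants, not the target code; it uses
the I*Q doubler as phase detector, so both +I and -I are stable lock
points and a BPSK reversal leaves the error invariant:

  (define two-pi 6.283185307179586)

  (define (make-lpf a)                  ; one-pole lowpass
    (let ((y 0.0))
      (lambda (x) (set! y (+ y (* a (- x y)))) y)))

  (define (make-tracker f0 gain)        ; f0 in cycles/sample
    (let ((phase 0.0)
          (freq  f0)
          (lp-i  (make-lpf 0.05))
          (lp-q  (make-lpf 0.05)))
      (lambda (x)                       ; sample in -> (I Q) out
        (let* ((i (lp-i (* x (cos (* two-pi phase)))))
               (q (lp-q (* x (- (sin (* two-pi phase)))))))
          (set! freq (+ freq (* gain i q)))  ; P feedback on I*Q
          (set! phase (+ phase freq))
          (list i q)))))

being P-only, a static frequency offset leaves a static Q error --
the PI argument above -- and too large a gain makes the discrete
loop overshoot, unlike its analog counterpart.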
Entry: generic lowpass filters
Date: Sat Dec 8 16:24:02 CET 2007

it's no longer manageable to have these special 1 - 2^(-8) filters..
i need a special purpose 16bit lowpass filter, with saturation,
operating on proper 16 bit signed values, with possibly 8 bit
coefficients in a decent range. it looks like there's plenty of room
to do it in a proper object-oriented fashion.

not doing it in proper object-oriented fashion, but a macro operating
on 4-byte state: 3 byte signed filter state, and 1 byte unsigned
filter coefficient: .00AA

Entry: AGC
Date: Sat Dec 8 22:24:17 CET 2007

it's not so straightforward, since it needs a division operation..
currently, with the multiplication doubler (also with the sign
doubler) locking seems to work fine around 100-150 amplitude.

Entry: lock problems on transition
Date: Sat Dec 8 22:40:44 CET 2007

i still get the same problem: on transition, the phase is messed up
again. maybe the oscillator phase should rotate too? i'm confused...
at the point where the I component goes into transition, the Q
component gets kicked off. the integrating controller works well:
error goes to zero eventually. i just need to figure out why the
phase bumps..

something strange tho.. the Q spike only happens on a +1 -> -1
transition. the -1 -> +1 transition is clean. this smells like some
kind of wrap around bug.. sending #x01 bytes instead of #x11 bytes
seems to contradict this: spike on every transition.

Entry: emergency solution: AM
Date: Sun Dec 9 00:02:13 CET 2007

tomorrow it looks like the best thing to start with is gain control,
to find an optimal feedback coefficient for the PLL. once this works
i can try to find a bitrate that works with the phase error still
happening. then i could hand it over and try to fix the
sync/transition error.

normalization:

  * agc (division + filtering)
  * arctangent

previous conclusion about cordic arctangent was that it's hard to do
without a barrel shifter.. i can probably unroll most of this by
using the multiplier and double buffering. good thing is that this
can be used for AM also, without the need for quadratics.

EDIT: actually, i should really just do AM by measuring the power.
the previous error (large carrier mismatch) is solved.

Entry: articles
Date: Sun Dec 9 09:15:49 CET 2007

  R De Buda, "Coherent Demodulation of Frequency-Shift Keying with
  Low Deviation Ratio" -- IEEE Transactions, 1972, COM-20 pp 429-435.

  S Pasupathy, "Minimum Shift Keying: A Spectrally Efficient
  Modulation" -- IEEE Communications Society Magazine, July 1979,
  Vol 17, pp 14-22.

Entry: AM
Date: Sun Dec 9 10:13:57 CET 2007

i got very nice reception it looks like. what about the following
algorithm:

  * set threshold to an estimate of the noise threshold (say 50)
  * wait until something comes in: interpret it as a start bit
  * find the max amplitude during the start bit
  * start sampling 9 bytes, by waiting for half a symbol length, and
    compare to half the dynamic threshold

looking at some sampled data of #x55 + start and stop bits, which is
01010101, with 0 = ON, 1 = OFF, it seems that putting the threshold
at half is not a good idea.. also, the time it takes to get from
going above the noise threshold (50) to the peak of the start bit is
exactly the symbol length. maybe it should be compared with a lowpass
envelope? tried this, but looks like LPF delay is going to be a
problem. however, it should be possible to keep the same filter, but
perform the comparison with delayed versions? another possibility is
to just save the sample points, and perform the filtering at a later
stage. or.. it could just be compared to the previous sample point?
if lower it's the reverse? that will probably work just fine: this
might give a problem for stable 0 or 1..

next algo:

  * start sampling s - s_0 after detecting a start bit. s_0 = rise
    time to threshold level.
  * collect 10 samples.
  * postprocess

what i'm doing: s_0 = 0, and watching the output of the sampling with
a #x55 byte. it looks pretty decent. now trying the number station.

next approach:

  * compare with previous (differentiate)
  * maybe hysteresis?

differentiate is no good. i'm probably fighting something else..
maybe the data rate is just too fast? i had to move from 512 samples
to 256.. so looks like something else is going on.. what about this:
change the special purpose lowpass filter so it takes 16 bit
coefficients, and then reduce the filter pole a bit.

Entry: confused
Date: Sun Dec 9 16:22:43 CET 2007

let's see.. there's something wrong with my symbol rate. i thought it
was 512 samples, but it's 256. corrected for this, i can receive
signals. however, it seems the bandwidth is mismatched. so i have 2
calculations that are probably erroneous:

  * necessary bandwidth -> filter coeff
  * symbol rate at the transmitter

doing some manual experiments, i got the filter pole fixed at #x0100,
which gives very nice waveforms. making it bigger only increases the
noise, but doesn't seem to influence the shape too much. so
everything looks pretty good, but it seems there is too much
inter-symbol interference due to the asymmetry of the receive filter.
i could try to hack around this by doubling each bit, but keeping the
envelope constant. got now:

  * halved symbol rate (transition + stable)
  * 3 x bandwidth (100 -> 300)

now at least the filter makes it roundtrip from 0 to max amp. now try
to subtract the startbit from each bit, then use the sign. this
works! reception seems quite robust. at least when it's not receiving
bogus. so i need a way to eliminate the worst kinds of noise, which
are transients that trigger a start bit. these could then be used as
human input. it's not very robust tho.. probably i need to compute
the maximum, and use half of that (or less..) as a threshold. looks
like this is relatively robust: threshold = 1/4 of maximal power;
translated to amplitude this is 1/2.

Entry: next
Date: Sun Dec 9 18:32:53 CET 2007

i think it's ok to forget about synchronous stuff for a while... also
speeding it up is for later maybe. what i need first is:

  * extra stop bit to eliminate transients
  * cleanup code for blocking send & receive.
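the recipe the last entries converge on, transcribed to scheme so
it's written down in one place. this is my reconstruction; all names
are hypothetical and the numbers come from the experiments above:

  (define (start-bit? power max-power)    ; transient gate
    ;; threshold = 1/4 of maximal power
    (> power (quotient max-power 4)))

  (define (decode-bits powers start-level)
    ;; powers: one mid-symbol LPF power sample per data bit;
    ;; subtract the start bit level and take the sign.
    ;; 0 = ON, 1 = OFF in the convention used above.
    (map (lambda (p) (if (< p start-level) 1 0)) powers))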
Entry: krikit -> reflections on Forth and DSP
Date: Sun Dec 9 22:52:16 CET 2007

looking at the code i write, it is full of global variables
(temporary storage for multiple fanout), and inlined early-bound math
ops, operating directly on memory instead of the stack. also macros
that unfold to criss-cross variable access are much more useful here
than compositional forth.

the problem with DSP is that speed matters, and it's easy to get to
order of magnitude savings by early binding. so macros are important.
algorithms are often not terribly complicated. the stress is more on
mapping things to hardware.

now, i realize i'm stretching it trying to do DSP on a PIC18. it
misses essential elements like a barrel shifter, large accumulator,
and rich addressing modes. these things REALLY make a huge
difference. but, keeping data in memory (registers) makes things
relatively fast on a PIC18. if the specs are clear (if the algorithm
doesn't change) implementation can be straightforward, though a
manual endeavor. but, what i've learned: experimentation REQUIRES
more highlevel constructs. i lost too much time and energy in mapping
to hardware before things actually worked.

which leads me to the following strategy: if experimentation on the
hardware is essential, experiment on hardware that's 10x faster, or
use data rates 10x slower such that high level abstractions can be
used. for purrr on the PIC18 and 16/24 bit DSP operations this means:
USE A DTC FORTH! when it's done, core routines can be done in purrr18
or in machine language. what i need is:

  * confidence that 10x speedup is possible
  * confidence that slowing down ACTUALLY WORKS
  * patience and discipline to get it to work FIRST and THEN speed
    it up

what i missed in this project is the availability of an easier to
use 16bit forth, and a policy for doing fixed point math. the former
would have made the latter easier to use.

and second, it is probably a good idea to start looking for a
dataflow language: one that

  * automates the allocation of temporary buffers (variables).
  * enables abstract boxes (made of networks of abstract boxes)
  * automates iterated boxes (+ possible 'folding')
  * separates registers from functions (all feedback = explicit)
  * frp vs. static sequencing ?

so i'm not so sure anymore if forth is really useful for the dsPIC.
maybe in the sense that it should map to the 16bit arch just like
purrr maps to the 8bit arch, but leave the dsp-ishness alone: provide
only an assembler.

Entry: local names
Date: Sun Dec 9 23:15:27 CET 2007

which brings me to macros and local variables.. i'm using the wrong
tool for the job: i can't bind new names to old ones, like in scheme.
for example:

  : bla state | state 3 + ...

the 'state 3 +' can't be bound to a single new name. this really
screams for a new language syntax and semantics. or at least enable
local macro definitions (there's no real reason why not..)

  : bla state |
    : foo state 1 + ;
    : bar state 3 + ;
    foo @ bar ! ;

but.. that's getting ugly. what i want here is some form of
pre-scheme. downward closures.

  (bla : state |
    (foo : state 1 +)
    (bar : state 3 +)
    foo @ bar !)

another thing: when allowing local variables, it makes more sense to
put them in front of the name, to correspond better to how they are
used.

  (square | dup *)
  (x square | x x *)

and i need to figure out how to solve the anonymous function
problem.. i.e. 'define' vs 'lambda'... these are conflicting.. what
about using

  : in a context that requires a named definition,
    i.e. a global definition or a local let

  | in a context that requires an anonymous definition,
    i.e. the argument to ifte

(| ...) is then equivalent to (...)

  (x square : x x *)   vs   (x | x x *)

a function definition can then be something like

  (a b c superword :
    (e : a 1 +)
    (f : b 1 +)
    a b + e +)

where local definitions are possible at the beginning of a
definition.
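for comparison, the first example in plain scheme, where 'state 3 +'
trivially gets a name. memory is faked with a vector so the sketch
runs; fetch and store! are my stand-ins for @ and ! :

  (define mem (make-vector 64 0))
  (define (fetch addr)      (vector-ref  mem addr))      ; @
  (define (store! addr val) (vector-set! mem addr val))  ; !

  (define (bla state)
    (let ((foo (+ state 1))       ; : foo state 1 + ;
          (bar (+ state 3)))      ; : bar state 3 + ;
      (store! bar (fetch foo))))  ; foo @ bar !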
Entry: dtc forth
Date: Mon Dec 10 14:21:35 CET 2007

a unified memory model is not so hard to implement efficiently. but,
a point that could make a huge difference is to use this memory model
inside the interpreter. a trade-off between speed and flexibility. i
can imagine it being interesting to be able to test code in ram
before flashing it.. at the least, the option should be kept open.

Entry: RGB led
Date: Mon Dec 10 17:57:57 CET 2007

trying to figure out where to put the LED.

  * all connected to analog ports
  * one extra digital connector with 220R resistor
  * all connected to pins, so they can be reverse biased for light
    detection

pinouts: (common anode)

    |  ||
    ||||
    4321

  4        3        2        1
  |   R    |   B    |        |
  o--|<|---o--|>|---o        |
           |                 |
           o--|>|------------o
               G

the region that's free is between 21 and 26. the anode needs to be
connected to a pin that can be switched to analog. on the board the
best option here is 23/AN8. leaving pin RB0/INT0 free for debug net
might be a good idea. AN9-10-11 are then all digital to control the
LED cathodes, AN8 is analog to tolerate the analog voltage. this also
won't conflict with the necessary digital outputs already on the
board. the anode resistor could go to.

the RGB led is connected like this:

  26 RB5      o--[220]---o
                         |
  25 RB4      o-----o    |
              |     |    |
  24 RB3      o--o  G    B
                 |       |
  23 RB2/AN8  o--o--o----o   R
  22 RB1      o--o

Entry: dsp language
Date: Tue Dec 11 09:33:07 CET 2007

what is necessary? i could take the PD sound processing as a model.

  * box = primitive | composite
  * composite = box + interconnect
  * things should be parametrizable in grids (from which an iteration
    structure is defined)
  * can we have lexical scope?
  * don't force serialization
  * don't force naming of intermediates, but don't restrict it
    either. (box combinators)
  * allow scheme (expression trees) to be a subset of the language.
    the extension is no more than a way to abstract 'parallel
    scheme'.

it would be nice not to go too far away from lambda abstractions. the
problem is multiple outputs. these could be multiple functions. so
what about common subexpressions? keep it manual for now.. maybe use
scheme-like syntax based on 'values' but called 'output'. the latter
will be more general than values: it can be re-arranged in time. it's
an essential observation.

not forcing the naming of intermediates can be problematic, since
it's the whole point: dsp code is very graph-like, and naming is more
efficient for this.. it looks like naming IS essential.

brings me to composition: a new box consists of 'node' sections which
name nodes. 'lambda' could be replaced with 'in' since it will name
the external inputs. all other nodes have to be named. 'not forcing
naming' can be implemented by special purpose box combinators. nodes
are different from locally created 'specialized' boxes. names can be
replaced by box expressions if they are tree-like (return a single
value), otherwise they need to be named in a 'node'. similarly 'out'
can be discarded in a definition. this allows the use and mixing of
scheme functions.

  (in (a b c)   ;; 'in' is the parallel equiv of 'lambda'
    (box (mula (x) (* a x))  ;; create local specialized box (like 'define')
    (box (mulb (x) (* b x))
    (nodes      ;; naming intermediates
      ((q r) (div/mod a b))
      (out (+ (mula c) (mulb c))
           (- (mula c) (mulb c)))))))

so, concretely:

  'in'    is like 'lambda' but it has parallel outputs
  'nodes' is like 'let-values'
  'box'   is like a local 'define'
  'out'   is like 'values' but defines parallel outputs

so the principles:

  1. the ONLY point of the language is to extend the many->one lambda
     calculus that can create expression TREES to something that can
     create expression GRAPHS.

  2. it is important that the lambda calculus is a subset which uses
     its original lisp tree notation.

  * 'out' is redundant for single outputs
  * intermediates from single output boxes do not need to be named

i'd like to extend this to grid processing: systolic arrays etc: box
compositions that connect boxes in several dimensions, such that
iterators can be derived from a highlevel description.
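since 'nodes' is let-values and 'box' is a local define, the example
above desugars to plain scheme (my sketch; div/mod is assumed given,
and q,r are unused, as in the original):

  (define (div/mod a b)
    (values (quotient a b) (remainder a b)))

  (define (example a b c)       ; 'in' -> lambda, parallel outputs
    (let ((mula (lambda (x) (* a x)))   ; 'box' -> local binding
          (mulb (lambda (x) (* b x))))
      (let-values (((q r) (div/mod a b)))  ; 'nodes' -> let-values
        (values (+ (mula c) (mulb c))      ; 'out' -> values
                (- (mula c) (mulb c))))))

which shows where the extension actually lies: 'values' fixes the
order of outputs in time, while 'out' would leave it open.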
Entry: driving led
Date: Tue Dec 11 11:17:14 CET 2007

driving the led during reception is going to happen at 5kHz, which
when using PWM is probably going to be too little: say 256 steps
gives about 20Hz. so what about using SD (sigma-delta) modulation? i
wanted to try this for a while, maybe now is the time.

yup. works like a charm. since red is less bright, i give it a double
time slot, which leads to a 4 phase state machine. at receive sample
rate there's some noticeable flicker at low intensity, at about 5Hz.
it's easy to avoid by introducing a minimum of 5 or 6 as color
values.
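the first-order sigma-delta in scheme, as a sketch of the idea (not
the pic code): one accumulator per channel, the output bit is the
carry, and the average duty is level/256 with no PWM frame period.

  (define (make-sd)
    (let ((acc 0))
      (lambda (level)             ; level: 0..255, returns 0 or 1
        (set! acc (+ acc level))
        (if (> acc 255)
            (begin (set! acc (- acc 256)) 1)
            0))))

  ;; e.g. (define red (make-sd)); calling (red 128) once per sample
  ;; yields 1 every other sample: 50% duty at half the sample rate.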
Entry: more state machines
Date: Tue Dec 11 12:27:35 CET 2007

the send and receive functionality should also be implemented as
state machines. or.. stick to a single application thread, and run
the other state machines from the blocking operations? maybe that's
easiest. sending and receiving are mutually exclusive. currently
there's only the LED that works in parallel.

Entry: rx/tx interference
Date: Tue Dec 11 16:50:53 CET 2007

there seems to be interference between driving the led and reception.
i added "red blink" in the demo app whenever there is a bad
reception, however, this seems to completely throw it off.. (edit:
not the led but tx) so i need to add some pauses probably. which
brings me to: there is no generic pause word, so i'm going to use
just a double 0 for loop.

the interference seemed to be due to the absence of 'ramp-off':
before switching to rx-mode the speaker was still being driven. i
added those and some pause, now it seems to work.

Entry: project scheme extensions
Date: Wed Dec 12 09:38:21 CET 2007

i need to move away from loading scheme extensions as individual
macros, and towards associating them with a project. they are
different. the distinction to make is:

  * macros from forth code: incremental, can be redefined
  * brood extensions: fixed per project

this of course leaves in the dark brood extensions as libraries..
it's a hodgepodge. what i could try is to keep the target namespace
management intact: typical forth style shadowing for both words and
macros, and allow it to call scheme code. what about a unified
dictionary:

  * macros stored as symbolic code
  * ram addresses stored as macros
  + macros are allowed to postpone expansion if they reduce to single
    constants?

it looks like the seed of the plan is there: it's simple and i can't
see any problems. the main difficulty lies in the difference between
the way the cat namespace works (declarative: no re-definition, all
names defined at once) and the purrr one (shadowing, incremental).

Entry: TODO list cleanup
Date: Wed Dec 12 09:55:06 CET 2007

DONE:

  * fix the assembler: i'm running into word overflows, code is
    getting too big. maybe use a trick: whenever a word overflows,
    just add some new code after the code chunk, jump to there, and
    have that chunk jump to the original word with a far jump. as a
    quick fix: at least print the names of offending symbols so they
    can be manually patched to long jumps.

  * switch the assembler to a mutating algo so proper jump graph opti
    can be performed easily. i see no point for pure algos there..
    asm is a black box anyway.

IMPOSSIBLE:

  * if 'invoke' is a macro anyway, why not combine it with execute/b?
    ANSWER: it's awkward to set the return stack to the word after
    invoke without using a call. that call might as well be execute/b

  * nibble buffer is not interrupt-safe: the R/W thing is shared..
    probably need separate R/W pointers! (FIXED)

REMARKS:

  * make it possible for a macro to create a variable. more
    specifically: make it possible to create any couple of words and
    variables together. (this means a macro can create a macro..
    probably means re-introducing some reflection). if the macro
    dictionary is merely a cache of a linear dictionary, with the
    linear dictionary containing macros, this kind of reflection
    should be possible to introduce without the disadvantage there
    was before: mutation in the dictionary hash.. there would only be
    shadowing, and 'mark' could handle macros too. syncing the cache
    means (lazily) recompiling the macro cache.

Entry: mzscheme slow text
Date: Wed Dec 12 10:34:26 CET 2007

i just tried:

  (define (all)
    (define stuff '())
    (let next ()
      (let ((c (read-char)))
        (if (eof-object? c)
            (reverse! stuff)
            (begin
              (set! stuff (cons c stuff))
              (next))))))

  (printf "~s" (length (all)))

  tom@del:~$ time bash -c 'cat ~/brood/doc/ramblings.txt | mzscheme -r /tmp/text.ss'
  606700
  real    0m0.332s
  user    0m0.319s
  sys     0m0.012s

so it's at least not read-char.. maybe i need to write a fast
tokenizer for forth using just read-char instead of the yacc clone
from mzscheme? probably the same goes for sweb. the tokenizer has 3
states:

  * whitespace
  * comment
  * word

easy enough to just do manually. it could be implemented as a
'read-syntax' word which adds source location information to the
symbols and comments read. a syntax-reader is essential since they
can be plugged into the module loader system.
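a minimal version of that 3-state machine (my sketch: strings only,
backslash comments to end of line, no source locations -- the real
thing would be a read-syntax):

  (define (tokenize port)
    (let loop ((toks '()))
      (let ((c (read-char port)))
        (cond
          ((eof-object? c) (reverse toks))
          ((char-whitespace? c) (loop toks))       ; state: whitespace
          ((char=? c #\\)                          ; state: comment
           (let skip ()
             (let ((c (read-char port)))
               (unless (or (eof-object? c) (char=? c #\newline))
                 (skip))))
           (loop toks))
          (else                                    ; state: word
           (let word ((cs (list c)))
             (let ((c (peek-char port)))
               (if (or (eof-object? c) (char-whitespace? c))
                   (loop (cons (list->string (reverse cs)) toks))
                   (word (cons (read-char port) cs))))))))))

  ;; (tokenize (open-input-string "1 2 + \\ comment\ndup"))
  ;;   => ("1" "2" "+" "dup")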
two bits are reserved to distinguish between data and code, and to implement the return instruction.

now the criticism: maybe it's best to ditch the return bit, since it limits the addressable memory: with 14 bits only 16k words can be addressed. the trade-off needs some thought. i think it's best to ditch the return bit, since it also prevents easy access to primitives by just reading them from code. i'm not sure where this can bite, but using the LSB as tag bit (0=data, 1=code) and making execute ignore the tag bit allows the use of 15 bit numbers, which can represent addresses.

maybe it's not such a good idea.. i'm a bit uncomfortable with not having 16 bit width. statistics: a return bit only makes sense if the words are expected to be short. padding is an option, but awkward, since every label needs to be prepended by a nop if it's not aligned.

rebuttal: tail recursion. this is the thing that's handled with the return bit.. i forget that a lot of thought already went into this thing. tail recursion justifies the inconvenience of handling the extra bit.

remark: a tagged data system can be built on top of this forth. i'm not comfortable with giving up a 16 bit data/return stack in favour of a 14 or 15 bit tagged system.

Entry: signed/unsigned comparisons
Date: Thu Dec 13 12:45:45 CET 2007

two issues: are they the same or not, and what should the default be? they are not the same:

  pos neg >
  * always true in signed
  * always false in unsigned

unsigned: carry
signed: sign of result (might overflow)

it's a bit silly, but i think it's time i admit i don't fully understand it.. carry in addition is simple. carry in subtraction is also not so difficult, since subtraction is addition with a negative. a carry on addition means overflow: the word's not big enough. simple. but what is a carry on subtraction? let's isolate some cases (8 bit):

                 result  carry  sign  overflow
    10    3 -        7      1     0      0
     3   10 -       -7      0     1      0
   100 -100 -      -56      0     1      1
  -100  100 -       56      1     0      1

http://en.wikipedia.org/wiki/Overflow_flag

The overflow flag is usually computed as the xor of the carry into the sign bit and the carry out of the sign bit. In other words: addition adds one extra bit to the representation. In order to not have overflow, for unsigned addition/subtraction this bit needs to be 0, and for signed addition/subtraction it needs to be the same as the sign bit.

So, for a signed comparison, take the sign bit of the result, and assume there is no overflow. For unsigned, take the carry bit.

Entry: dtc remarks
Date: Thu Dec 13 14:34:51 CET 2007

* size or speed? in the end it should run on CATkit, which has little flash memory, so i should really go for size.
* FOR..NEXT is not standard, so i can just make something up?

can't get for..next going.. debugging return stack stuff is hard. wanted to have a quiet simple puzzle day, but it requires 'real work' :)

about size vs. speed: the primitives need to be fast, so they can be used in STC code with the VM eliminated, but the VM needs to be SIMPLE. the return stack really should contain the same stuff as can be found in straight line code. i'm going to eliminate some macros. hmm.. too much thinking because it's already too optimized.. i find it difficult to throw this kind of stuff away.

what to optimize:
* inner interpreter loop
* maybe math primitives (used elsewhere)

not so important:
* enter/leave + RS (once per highlevel word)

Entry: eForth / tail recursion + concatenative VM
Date: Thu Dec 13 16:05:24 CET 2007

why is not optimizing so difficult? i see factors of ten everywhere..
the vm-core.f i have is nice, but i'm still quite stuck at trying to solve multiple problems at the same time:

* interoperability between STC and DTC: both primitives and brood.
* tail recursion

it needs to be simplified a lot.. in the same way that PF needs to be simplified to get to a proper VM architecture: it's the same problem.

i can do with primitives what i want, but all CONTROL FLOW needs to be based on 2 simple instructions: _run and _?run - the duals _execute and _?execute are only for primitives. so what's the definition of _run, such that it can be turned into a jump..

IMPORTANT: conditional run is not the same as conditional branch.. this points to an inconsistency: things that JUMP are incompatible with the exit bit.

another problem is that 'immediate' won't work: no compile time execution: a simplified forth. can i have a macro mode? before i can implement these i really need to take a look at putting back incremental extension in the language, this time without implementing it using mutation.. (it starts to look like this cutting of the reflective wire was a really bad idea..)

Entry: macro code concatenation
Date: Thu Dec 13 19:20:09 CET 2007

what i'd like is to postpone expansion of constants until assembly. but i can't influence the meta functions from forth code.. this is another one of those arbitrary complications. what about:

- putting macros in the project dictionary
- by default, they are expanded
- when present in data positions, they are evaluated

i can't see a reason why this wouldn't work. the only concern is stability: each invocation needs to reduce. i.e. '+' in the meta dict is special because it's different from the '+' in macros (the latter can expand to symbolic code containing '+').

the problem i'm trying to solve is to get a minimal symbolic representation of things that are constants, by delaying their evaluation, or by somehow recombining. i.e. if there is a macro

  : foo 1 + ;

i want the code "123 foo foo" to expand to the machine code (qw (123 foo foo)) instead of (qw (123 1 + 1 +)). the thing that decides what to do here is '+', but can this decision somehow be transformed to the point where 'foo' executes? if every macro inspects its result, and if the result is ONLY the combination of constants->constants, this combination can be made symbolic, since it can be re-computed at assembly time. i.e.

  (qw a) (qw b) foo -> (qw (c d e f))

can be replaced by (qw (a b foo)), because "c d e f" is probably not going to be very helpful to understand where the constant came from. this would enable the unification of:

* constants
* variables
* macros
* meta words
* host code

does the subset of these macros need to be explicitly defined? probably not. they are just macros, and qualify if they map qw's to qw's.

Entry: partial reduction
Date: Fri Dec 14 10:16:22 CET 2007

maybe macros should be made greedy, such that when completely expanded they reduce. what i mean is that "1 2 +" -> (movlw 3) but "abc 1 +" -> (movlw (abc 1 +)). combined with the mechanism described above, this could be the key to unification. as a result, macros will be the only evaluation mechanism, which just needs to be provided with a symbol lookup. there are 2 phases of macro execution:

- phase 1: compile to literals + instructions, names symbolic
- phase 2: compute literal values using resolved names

it looks like making the effect of 'meta' into a local effect is the way to go. it would be nice to find a way to fix the 'postponing' operation first, so at least generated assembly code looks nice.
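to make this concrete, a minimal sketch of the reduce-or-postpone rule for a binary macro like '+' (qw-reduce is a made-up name, not the actual brood code):

  ;; if both literal operands are numbers, the macro can reduce now;
  ;; otherwise it emits a symbolic rpn expression that can be
  ;; re-computed at assembly time, once all names are resolved.
  (define (qw-reduce op name a b)
    (if (and (number? a) (number? b))
        (op a b)            ;; "1 2 +"   -> 3         => (movlw 3)
        (list a b name)))   ;; "abc 1 +" -> (abc 1 +) => (movlw (abc 1 +))

  ;; (qw-reduce + '+ 1 2)    => 3
  ;; (qw-reduce + '+ 'abc 1) => (abc 1 +)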
Entry: meshy finished?
Date: Fri Dec 14 15:59:31 CET 2007

looks like we're at the end. got 8 devices talking to each other. so time to make a "what have i learned?" section..

* for DSP, use a dsPIC instead of a PIC chip, OR write a highlevel (but slow) set of primitives on PIC. i spent too much time writing "fast" code that eventually didn't get used, or got extensively modified, destroying the optimizations. DSP apps have the property that a lot of the code volume needs to be fast, which screams for a SEPARATE algorithm design and implementation/optimization phase. the problem here is on-target debugging. as long as the app scales time-wise (rate reduction without changing other variables) optimization can be postponed.

* get it to work FAST, and start with the most difficult part, even if it means a dirty hacked up proof of concept, then incrementally improve while keeping it working. don't spend time on things that solve needs that are not immediate if there are other immediate needs.

  - debug network: eventually didn't get used
  - the hardware layer: it delayed everything else

  the mistakes had quite severe consequences in the end. i could have gained 2 weeks by not making the debug network. the causes of the mistakes seem to be:

  - mismatch in skill (no analog electronics hands-on experience, and dusty theoretical understanding), but mostly misplaced confidence in non-tested skill.
  - underestimation of the importance of debugging.

* debugging deserves its own bullet. ironically, i lost a lot of time building a debugging tool. building that tool was a good idea, but i forgot a couple of steps:

  - i underestimated the difficulty of getting the debug net working properly. this actually required an intermediate debugging phase to monitor the behaviour of both send and receive. i didn't anticipate these problems, which was a mistake. the lesson to learn is to never underestimate the problems that can arise, even if the application seems really trivial.

  - doing high-bandwidth work (DSP) requires high-bandwidth debugging tools, or at least a large storage space on chip for traces and logs. a solution here would be to make a separate circuit only for logging, or to use a high-bandwidth host connection. an example could be a circuit that records to a flash card, or a USB connection to the host.

  - i need a better host side software extension system for special-purpose debugging tools. it should be the same as the way the host system is written, so that tools can be moved into the main distro when polished. to make this easier, the number of extension points needs to be limited such that they are better accessible. i.e. the consoles need to be programmable.

so, to summarize:

DESIGN then IMPLEMENT

don't optimize and design at the same time if there is a lot of opportunity for optimization (i.e. a DSP app on PIC18 where an order of magnitude of speed gain is easy to find). as long as time-critical cores are small, this is ok, but when the core is all there is, you need to get it to work first using a highlevel approach, and ONLY THEN make it fast.

ELECTRONICS is DEBUGGING

do not underestimate the difficulty of getting something right in reality, even if the logical model is trivial. programming problems seem to be about managing complexity, while electronics problems are about managing external influences, non-ideal behaviour, and tons of exceptions and hacks. these are entirely different. programming = abstraction, electronics = debugging.
Entry: meshy presentation -- technical
Date: Fri Dec 14 16:54:00 CET 2007

hardware goal = as simple as possible
- 40mm speaker used as mic
- input: 2 opamp mic bandpass amplifier + 8 bit A/D
- output: switching transistor (PWM)
- PIC18 @ 10 MIPS
- prototype uses large chip (64kb - 4kb - 28 pin PDIP)
- possible to downscale a lot (8kb - 256b - 18 pins SMD)
- RGB led (single resistor, S/D alternated pulsed)

lowlevel software
- purrr - Forth dialect
  - simple but powerful - bare metal vs. abstraction mechanisms
  - interactive (debugging!)
  - bottom up programming
  - metaprogramming (scheme)
  - emphasis on debugging
- sound modulation:
  - OOK (on-off keying)
  - BPSK (binary phase shift keying)
  - 10 baud framed bytes: 1 start, 8 data, 2 stop
  - 610Hz carrier (speaker resonance)
  - speaker driven with 7 bit PWM @ 78kHz
- demodulator
  - input sampled at 5kHz
  - downmixer (cross modulator) + lowpass filter
  - OOK: asynchronous, power detect
  - BPSK: synchronous costas loop

Entry: simplex LEDs
Date: Sat Dec 15 11:24:21 CET 2007

the most efficient way (wire-wise) to connect a bunch of LEDs is to place them on the midpoints of simplexes, where you connect the simplex points to +/- drive points: this makes it possible to switch on LEDs 1 hop away, while LEDs 2 or more hops away stay off since they will not reach threshold voltage. this structure is also called a "complete graph".

http://mathworld.wolfram.com/Simplex.html
http://mathworld.wolfram.com/CompleteGraph.html

mapping this to a 2D or 3D structure in a nice symmetric way is not that trivial. however, the most symmetrical planar arrangement is: place the points in a circle. if the number of points N is odd, you get (N-1)/2 concentric circles each containing N points, with a criss-cross network below it. an even N works similarly, only one of the circles has half the elements. this structure can be wrapped around half a sphere. wrapping it around a full sphere gives easy access to the control points, and gives a spherical or cylindrical structure. the coverage grows ~ n^2, so taking more points is relatively more efficient. however, the overall connection might get too complicated.

a different approach is to take some kind of 'primitive circle' which can be unfolded into a line, for example the pentagram with 10 LEDs. transport then could be done using a bus, i.e. a ribbon cable. maybe it's possible to use a ribbon cable with pins?

using a linear solution, it might be possible to make something that is composable, i.e. take an N solution, add a wire and some N primitives and make an N+1 solution. this turns out to be just cyclic permutations. for example, starting with the 2-terminal primitive L2, it can be extended to a 3-terminal primitive L3 by means of the primitive 3-permutation P3, and adding an extra wire to P2, so:

  L3 = L2 P3 L2 P3 L2 P3 = (L2 P3)^3
  L4 = (L3 P4)^4

in general:

  L_N = (L_{N-1} P_N)^N

this is probably a lot easier to do than networking, since it's basically braiding. a linear projection is easy to control, but i'm not sure if it's really a good approach for construction.. if i find an easy way to solve the permutation problem, then yes, it's a good thing. simplification: it's probably ok to leave out the last permutation, and compensate for it in software.

now, permutations and braids: they are not the same. transpositions have no direction, and are self-inverting. a twist on the other hand has a sign, and is not self-inverting. braids can implement permutations while giving structural integrity.
for example the most typical 3-strand braid:

  [ascii drawing: a right crossing followed by a left crossing]

implements a 3-element cyclic permutation as a right crossing followed by a left crossing (nomenclature: rotate the image 90 degrees counterclockwise and progress upward: the direction is the strand that passes over the other one). compare this to a double right crossing:

  [ascii drawing: two right crossings, a simple twist]

this is a simple twist and provides no structural integrity, but implements the same permutation. can this somehow be used as a building block for the other cyclic permutations? sure.. as long as you work with twists from left to right, and make sure the twist pattern gives you structural integrity, the same logic applies: the result is just a cyclic permutation.

Entry: interactive mode
Date: Sun Dec 16 10:13:47 CET 2007

from interactive.ss:

  The end goal of Purrr is to have only 'live' and 'macro'
  interactions: the system should be powerful enough so excursions to
  the underlying prj: code are not necessary. This gives a separation
  between 'tool development' and 'tool usage'.

I've come to believe that this is not a good idea in general. It is OK to be able to access the most basic host code, such as compilation, upload and inspection, but for real work you'd want to automate those and have a 'real' programming language behind it. In other words: access to prj or scheme code is necessary.

* it's ok to have a small collection of host words in interaction mode which are hidden using prefix parsing.
* this set of mappings (parsing words) should be extensible: prefix parsing needs a simpler definition form.
* the functionality behind those words should be extensible

Concretely this requires interactive.ss to be adjusted so it can accommodate parsing code in a different way. Maybe it can be made extensible together with the other parsing words.. The problem right now is that it is a single method, and the way it's defined is difficult to make dynamic (it's a scheme macro). Actually, compile mode forth parsers are already registered in the global namespace tree, so making them extensible can be done incrementally by adding some more name spaces.

Entry: extensible interactive parsers
Date: Sun Dec 16 10:52:30 CET 2007

two conflicting views here:

* currently interactive parsers are isolated functions, which is nice and clean.
* what is required is extensibility and re-use.

the solution seems to be to put the components in a global name space, which is used as the unified extension mechanism, and replace the function with a stateful one that refers to the name space. key elements here are 'with-member-predicates' and 'predicates->parsers'. these form a construct that needs to be attached to the global namespace tree. the former creates a collection of membership predicates. the latter creates a map (finite function) from atom -> parser.

the problem with the current approach is the generality of the parsers: they don't just map names to functions, but also create 'classes' with similar behaviour, so there is a level of indirection that needs to be captured. the live parser map is:

* symbol -> parser (parser primitive)
* symbol -> symbol (parser class)

if they are stored in this way, interpretation is quite straightforward. the approach is:

* provide alternatives for 'with-member-predicates' and 'predicates->parsers' so they postpone their behaviour and store it in the global namespace.
* provide an interpreter.

OK. implemented + tested. Some further cleanup.
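the delegation variant boils down to something like this minimal sketch (names made up, using the old mzscheme hash table api):

  ;; an entry is either a parser procedure (primitive) or a symbol
  ;; naming another entry (class); lookup follows the indirection.
  (define parsers (make-hash-table))

  (define (parser-register! name p)
    (hash-table-put! parsers name p))

  (define (parser-find name)
    (let ((p (hash-table-get parsers name (lambda () #f))))
      (if (symbol? p)
          (parser-find p)  ;; class delegates to another entry
          p)))

  ;; (parser-register! 'load (lambda (stream) '...))
  ;; (parser-register! 'load-ss 'load)  ;; load-ss behaves like load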
Maybe it's best to not store symbols in the dictionary, but parsers: use cloning instead of delegation? This way the dictionary IS the finite function. The real problem is that macros have a delegation method (function composition) but parsers (and assemblers for that matter) do not. so:

  Forth syntax parsers (lookahead) have no composition mechanism.
  Therefore cloning is used to give some form of code reuse. It used
  to be delegation, but this gives dynamic behaviour which contrasts
  with the static, declarative intent of the global name space,
  regardless of its implementation as a hash table.

and about ns:

  The global namespace is used as:
  * declarative symbol table (single assignment, mutual refs)
  * cache (forth macros should eventually be defined in state file)

Maybe forth.ss should be separated into generic forth style parser macros and functions, and the definitions of the parser words.

Entry: static composition and extension
Date: Sun Dec 16 11:19:16 CET 2007

i chose a hierarchical dictionary as the main means of program extension. the way it is used is not dynamic binding, but 1. postponed static binding and 2. cache of a linear dictionary. as a consequence, it can probably be completely replaced by mzscheme's module composition approach, together with some means (units?) to solve circular dependencies and plugin behaviour. however, i see no point in changing this until the dependency on the method that implements this linking part can be abstracted away. currently that seems problematic, because the name store is everywhere: it is the backbone of the system.

i find it very difficult to see what is the right thing to do here.

1. i'm not using the abstraction mechanisms provided by mzscheme to do namespace management, which makes me miss some static/dynamic checks, and is in general just a bad idea.
2. my approach is more lowlevel, so flexible enough to shuffle around and find the right abstraction. the thing is i'm not sure yet if i need this flexibility (over the built in functionality).

the only way to really resolve the ignorance is to implement a toy project which doesn't use the global namespace, and only uses mzscheme units and modules.

Entry: future dev
Date: Sun Dec 16 15:48:03 CET 2007

* fix problems in TODO (mostly peval)
* finish 16bit DTC
* dsPIC forth
* lisp-like dsp functional dataflow language for PDP/PF/dsPIC
* CATkit 2
* sheepsint 8-bit synth engine (envelopes + FM)
* E2 debugging
* CATkit midi
* USB

Entry: inspecting macro output
Date: Mon Dec 17 10:07:27 CET 2007

finding a common tail in 2 lists is quadratic when done naively (aligning by length first makes it linear), but i probably don't need that, since i'm looking only for common subtails in substacks. i'm still looking for a good description of the problem.. the problem of finding the common tail seems to be the one to give insight. what about this:

1. split input and output 'qw' atoms off
2. check if the remaining tail is the same

this is the only behaviour that's valid. once this data is obtained, it could be peeled to isolate the behaviour of a macro, at which point it could be decided to 'unevaluate' it. now, what does unevaluate mean?

  ... (qw 1) (qw 2) + -> ... (qw 3)

this could be replaced by (qw (1 2 +)). this is always possible, since the evaluation can be performed again later. the only information that is extracted at this point is whether the macro does anything else.
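to make 'performed again later' concrete, a hedged sketch of assembly-time evaluation of such a postponed literal (eval-lit and its operator set are made up, not brood code):

  ;; evaluate a postfix literal like (abc 1 +) with a little stack
  ;; machine, once 'resolve' can map names to numbers.
  (define (eval-lit expr resolve)
    (let loop ((stack '()) (e expr))
      (cond
        ((null? e) (car stack))
        ((number? (car e))
         (loop (cons (car e) stack) (cdr e)))
        ((assq (car e) `((+ . ,+) (- . ,-) (* . ,*)))
         => (lambda (op)
              (loop (cons ((cdr op) (cadr stack) (car stack))
                          (cddr stack))
                    (cdr e))))
        (else  ;; a name: resolve it to its address/value
         (loop (cons (resolve (car e)) stack) (cdr e))))))

  ;; (eval-lit '(1 2 +)   (lambda (n) 0))   => 3
  ;; (eval-lit '(abc 1 +) (lambda (n) 122)) => 123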
the change in macro code seems to be here:

  (([qw a] [qw b] word) ([qw (wrap: a b 'metafn)]))

the 'wrap:' form needs to be replaced by something that might return a value if the variables contain numbers. running into a small namespace problem.. trying to use scheme names, but it might be better to leave the meta dict in there to do this kind of stuff, but only call it from the macros. basically, the stuff after wrap: should be symbolic if the parameters are symbols, and computed if both are numeric.

Entry: benchmarking
Date: Tue Dec 18 16:21:16 CET 2007

the current reader is problematic.. it's slow, and i don't understand the reason. i don't think it's the usage of streams, since it was slow before, and it's not read-char, since i tried that.. so...

1. make a test for the current reader
2. replace it with a new reader
3. build 'read-syntax'

first test: the problem seems to be somewhere else..

  (define f (forth-load-in-path
             "monitor.f"
             '("prj/CATkit" "pic18")))

is virtually instantaneous like it should be.. so where did i get the idea that this is slow? indeed:

  '(file monitor) prjfile prj-path forth-load-in-path

is instantaneous also. otoh, 'forth->code/macro' isn't instantaneous at all.. compiling the code with 'code/macro!' is instantaneous also. i think i got it. why is the code/macro splitter so slow? tracking it down to forth.ss: forth->macro.code, which uses @forth->macro/code, which uses @moses. it can't be @moses since that's just a filter.. so it's probably down the stream in the macro processor. need to test that separately. running into some inconsistencies.. probably best to switch everything to syntax objects, including a syntax-reader.

Entry: read-syntax
Date: Tue Dec 18 17:54:50 CET 2007

from http://download.plt-scheme.org/doc/371/html/mzscheme/mzscheme-Z-H-12.html#node_chap_12

  (datum->syntax-object ctxt-stx v [src-stx-or-list prop-stx cert-stx])

  converts the S-expression v to a syntax object, using syntax objects
  already in v in the result. Converted objects in v are given the
  lexical context information of ctxt-stx and the source-location
  information of src-stx-or-list. If v is not already a syntax object,
  then the resulting immediate syntax object is given the properties
  (see section 12.6.2) of prop-stx and the inactive certificates (see
  section 12.6.3) of cert-stx. Any of ctxt-stx, src-stx-or-list,
  prop-stx, or cert-stx can be #f, in which case the resulting syntax
  has no lexical context, source information, new properties, and/or
  certificates.

  If src-stx-or-list is not #f or a syntax object, it must be a list
  of five elements:

    (list source-name-v line-k column-k position-k span-k)

  where source-name-v is an arbitrary value for the source name;
  line-k is a positive, exact integer for the source line, or #f;
  column-k is a non-negative, exact integer for the source column, or
  #f; position-k is a positive, exact integer for the source position,
  or #f; and span-k is a non-negative, exact integer for the source
  span, or #f. The line-k and column-k values must both be numbers or
  both be #f, otherwise the exn:fail exception is raised.

so:

  (datum->syntax-object #f word
                        (list source-name line column position span)
                        #f #f)

EDIT: why do i run into the need to have a port object that can put back a character? scheme needs this too, so maybe the port objects need to support putback? it's the other way around: scheme ports support a peek operation.

looks like it works now, and the code looks clean. next: create syntax objects.
this seems to be rather straightforward by using 'port-count-lines-enabled' and 'port-next-location'. ok. seems to work now.

Entry: syntax cleanups
Date: Sun Dec 23 13:57:33 CET 2007

what about the '|' character for lexical variables? things to be aware of: don't break code, or break it verbosely. again, i want to write a state machine.. i need to think a bit about the abstractions used in forth.ss. 'parser-rules' works well. the rest is hard to read. the problem seems to be parsers that segment data, instead of taking a fixed amount of data from the stream. these need state machines. let's rewrite the def: parser as an example. basically this is forth-lex.ss, but then recursive.

OK. i've got a definition parser working which produces a name, formals list and body. now this needs to be passed upstream somehow. looks like that is the next part to clean up: macros can have formals, and they need a symbolic representation for this, i.e. in the state file. now the question is: should this be the (a b | a b +) syntax, which requires another lexing step, or should it be an s-expression with an explicit formals list? what about this: make lexing steps easier, and just use more lexing steps. forth handles parsing (recursive) at a later stage than lexing.

Entry: regular grammar
Date: Sun Dec 23 22:00:27 CET 2007

the essential property of a regular grammar is that each production rule produces at most one non-terminal. intuitively, this means there is no "recursive" tree structure, only a sequential one: there is no "replication gain". so it looks like i need a way to express some of the state machine parsers as simple regular expressions based on membership functions, instead of the more specialized character classes.

(vaguely related: note that the Y combinator is essentially a copy operation)

Entry: regular expressions
Date: Tue Jan 1 15:48:20 CET 2008

the data is a stream of tokens, so regular expressions can be constructed in terms of membership functions and modifiers like '*' or '+'. symbols can be converted to membership functions. that should be enough? not really. i need some form of abstraction: a pattern can be a composition of patterns. so maybe it is better to stick with the lexer language in mzscheme? since what i am going to re-invent is ultimately going to be a generic regexp tool. EDIT: looks like it's really character-oriented.

maybe it is a good exercise to try to write a lexer generator? can't be that hard.. also, i run into this problem so many times with low-level bit protocols that it might be a good idea to take a closer look: white space is essentially the 'stop bit' in async comm. which brings me to the question: i think i read on wikipedia (i'm offline now) that regular expressions and FSMs are somehow equivalent. how is this?

how about forth-lex.ss: a specification not as production rules of a regular language, but as regular matching patterns? what is the problem i am trying to solve? find a function (or macro) that maps:

  lex-compiler : language-spec -> token-reader

  stream = token stream | EOF
  token  = word | comment | white

at the same time, i am trying to stay true to the forth syntax: simple read-ahead (keyword + fixed number of tokens) or delimited read (keyword + tokens + terminator). note: there seems to be a difference between reading UPTO a token, and reading UPTO AND INCLUDING a token. is standard forth always of the latter form? to answer the question partially: the current forth-lex.ss performs segmentation, and thus is not of that form: it cuts INBETWEEN tokens.
but forth is. can i learn something from this? yes: cutting AT a token makes the automaton simpler, since it doesn't require peek. let's call that 'delimited' until i know the technical term. i think the important lesson is that:

1) forth should be delimited: this simplifies on-target lexing
2) exception: the first stage tokenizer in brood = segmentation

the latter is an extension to make source processing in an editor (like emacs) easier by preserving whitespace and delimiting characters. BUT, it should not introduce structures that the delimited lexer of 1) can't interpret.

it looks to me that before fixing the higher level compiler and macro stuff, the lexer should be fixed such that it can be replaced by a simple, reflective state machine (true parsing words). looking at forth, there are 2 reading modes:

- read upto and excluding a character
- read the next word (= upto and excluding whitespace)

by fixing some of the syntax (comments and strings), editor tools can be made exact: a list of DELIMITED words will read upto and including a delimiter.

Entry: rethinking forth-lex.ss
Date: Tue Jan 1 18:00:25 CET 2008

a proper markup language is necessary: one that will not throw away information, but gives perfect parsing of source code. note that in order to transform source code to markup, a tokenizer is necessary. the tokenizer is a form of 'unrolled' parser: it describes a segmentation that CAN be parsed by a reflective delimited parser. ('reflective' means words have access to the input stream and can thus influence the grammar).

in order to make the right decision, it is necessary to have a look at the standard word ." which quotes a string up to but excluding the " character and prints it: this word interprets the first whitespace as the delimiter, and any subsequent whitespace is part of the string. in order to properly segment code, this behaviour needs to be respected. instead of (pre word post) a different segmentation is necessary which can properly encode eof. what about a word/white distinction?

  (word pos string delimiter)
  (comment pos string delimiter)
  (white pos string)

another question: is EOF an error or not, when it follows a word? i think the answer should be YES: otherwise it violates concatenation of files = file.

got forth-lex.ss simplified now.. it looks really familiar ;) i need to give it the standard names, but this looks like it. NEXT: add delimited parsing to parser-rules. this should capture all parsing needs, since there are no more non-delimited constructs. i.e.

  (parser-rules ()
    ((_ macro : name words ; forth) ---------))

Entry: declaration mode
Date: Wed Jan 2 19:15:54 CET 2008

embedded in standard Forth syntax is a "declaration mode" where all definitions are interpreted as macro definitions instead of instantiations of words. i'd like to express the state machine that implements this mode using an extension of the 'parser-rules' syntax, one that implements (a limited set of) regular expressions. let's start with a summary of current constructs (-> means "depends on"):

  parser-rules -> @syntax-case -> @unroll-stx + syntax-case

where 'parser-rules' creates a function with parser prototype (stream -> stream,stream) and @syntax-case is like 'syntax-case' but applicable to the head of streams. most of the real action is in forth.ss, where i'd like to eliminate a number of constructs. the current way to collect a number of definitions is using 'def-parser', which creates a definition parser parameterized by a type tag. recently i wrote this as a straight state machine.
this i'd like to replace now with some regexp based matching approach. the key elements in a def parser are:

* a definition is of the form : (optional | ... |) ... ;
* a list of definitions is terminated by the word 'forth'

previously i came to the conclusion to only allow delimited constructs, which are clearly marked with a start and stop marker. these constructs require no lookahead, and thus have a simpler automaton implementation. i'd like to use the '...' construct to indicate zero or more, just like the syntax-case macro, but necessarily limited by a fixed marker symbol. a '...' at the end of a match means pattern recursion. optional constructs can be handled by multiple match rules. this makes a def parser look like:

  (parser-rules (: ; | forth)
    ((: name | formal ... | word ... ; ...)
     ((def name (formal ...) (word ...))))
    ((: name word ... ; ...)
     ((def name () (word ...))))
    ((forth)
     (())))

can this form of ellipsis be mapped to the default meaning of multiple occurrences? this looks like an important question: a core difference between tree and sequence matching. question: what is better?

* a special meaning of '...' at the end of a sequence (self-recursion)
* explicit recursion?

the def parser could be constructed as a 2-phase machine: one that dispatches between staying in the mode (calling a single def parser) and exiting the mode, and the def parser itself. '...' could vaguely mean "multiple times", but there's a difference between: multiple times upto XXX, and infinitely many. it looks like explicit recursion is better than looping, so i'm going to drop the special meaning. this brings a single def parser to:

  (parser-rules (: ; |)
    ((: name | formal ... | word ... ;)
     ((def name (formal ...) (word ...))))
    ((: name word ... ;)
     ((def name () (word ...)))))

now, what i can use is this:

  (syntax-case #'(a b c end bla) (end)
    ((stuff ... end r) #'(r stuff ...)))
  => (bla a b c)

yep.. it looks like there's a fundamental difference between the tree matching and sequence matching problems. maybe i need to give it a special symbol. let's take *** to mean: collect upto the following terminator, so ... can still be used for tree matching.

  (parser-rules (: ; |)
    ((: name | formal *** | word *** ;)
     ((def name (formal ***) (word ***))))
    ((: name word *** ;)
     ((def name () (word ***)))))

what about a simpler approach? the only thing that needs to be done is to collect syntax objects between marks into lists. these lists are easy to process with a @syntax-case parser later on. so the thing that's necessary is a way to construct a stream parser that collects up to a certain predicate. sounds familiar?

ok.. this leads to simpler code. i could use the current 'def-parser' as a template for a more general delimited parser expression. i think i can ditch '@split' now: it leads to convoluted code. ok. 'mode-parser' is now written as an explicit recursion. this probably means i can start throwing out some stream processing code. wait.. need to check the macros-with-arguments thing.. OK. fixed. commented out a lot of code from stream.ss that was related to chunking/splitting.

so.. the lesson:

* linear streams: use explicit delimiters for embedded sequences: this simplifies parsing: no lookahead necessary.
* convert delimited sequences to lists + use scheme's tree matchers

Entry: next?
Date: Thu Jan 3 00:48:28 CET 2008

* connect the syntax reader to the parsing/loading code.
* unify all evaluation to execution of macros + manage evaluation time

Entry: moving to stx objects
Date: Thu Jan 31 12:49:59 CET 2008

what needs to be done now is to:

* replace all compile words so they accept syntax objects in addition to lists.
* convert all generators to syntax generators
* add print routines for them

so.. start in badnop.ss: string->code/macro (for compile mode, which i can test now). i'm replacing forth-string->list with forth-string->syntax. got the string->syntax stuff working. now trying the path/file loader. this needs @syntax-case instead of @match. except for the weird problem below, which i worked around, it seems to work now. printing works out of the box (snot).

Entry: weird @syntax-case problem
Date: Thu Jan 31 13:54:43 CET 2008

the 'load' symbol in this doesn't want to work. if i replace it with a different name, it does.. what's that about?

  (@syntax-case stream tail (load-ss load)
    ;; Inline forth file
    ((load name)
     (begin
       (printf "load\n")
       (@append (@flatten (f->atoms (stx->string #'name)))
                (@flatten tail))))
    ....

Entry: possible cleanups
Date: Thu Jan 31 15:56:06 CET 2008

* asm buffer from tagged list -> abstract type? there's a lot of room for improvement in that department. it would allow some kind of instruction annotation that's not possible right now. i think were i to start from scratch, i would build it around this..

* macro unification (from the TODO): unify dictionaries: put macros in the main dict as lists, store ram addresses as variables, and find a way to postpone compilation of macros to their corresponding values if they reduce to values (are constants/variables/labels...)

the former is cosmetics (atm), the latter is a tough problem, but can lead to a gigantic simplification.

Entry: target name space unification
Date: Thu Jan 31 16:01:12 CET 2008

name space unification would mean that the dictionary stored in the .state file contains not only addresses, but also macros (in a form that's specific enough to recompile). this form needs to include lexical variables. so a dictionary item is either a number, or a macro. target words are then just macros:

  ((abc 123)          ;; literal / constant / ram variable / ...
   (go 3235 execute)  ;; code
   (bla abc def))     ;; any macro code

taking into account lexical variables, this can be simplified to a single format:

  ((abc () (123))
   (go  () (3235 execute))
   (bla () (abc def))
   (arg (a b) (a b +)))

where the first parens are the macro's lexical variables. code that has no lexical variables is purely concatenative. this requires quite a deep cut, but should lead to great simplification. fork point is here.

Entry: declarative namespace + cached linear dictionary
Date: Thu Jan 31 16:53:39 CET 2008

make the dictionary abstract? maybe the most important point to ensure is cache consistency. on one end there is a symbolic representation of a dictionary, on the other end there is a compiled version, which resides in the NS (macro) part. how to ensure these are never out of sync?

so the next step is to define what the NS object actually is. it is a collection of namespaces, where each element is STATIC. the IMPLEMENTATION allows mutation, but the use should be restricted to single assignment. otherwise the cache is invalid. the main function the NS object provides is PLUGIN behaviour: late binding of some identifiers to allow the system to be composed of several individual pieces, without needing the strict tree-based structure of mzscheme's module system.
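a minimal sketch of that single-assignment discipline (names are hypothetical, and it assumes stored values are never #f):

  ;; the implementation mutates a hash table, but a name may only be
  ;; assigned once, so anything compiled against it never goes stale.
  (define ns (make-hash-table))

  (define (ns-define! name value)
    (if (hash-table-get ns name (lambda () #f))
        (error 'ns-define! "already defined: ~a" name)
        (hash-table-put! ns name value)))

  (define (ns-ref name)
    (hash-table-get ns name
      (lambda () (error 'ns-ref "undefined: ~a" name))))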
maybe units are the right way out, but right now i'm stuck with this more lowlevel model. what's necessary is to define some proper interfaces to this:

1) NS as graph binding (single assignment)
2) NS as cache object for target macros

i made this remark before. the first access pattern is easily enforced: never overwrite anything. the second one is more difficult. need to google a bit, looks like a popular pattern: cache an association list with a hash table.

Entry: caching an association list
Date: Thu Jan 31 16:54:05 CET 2008

the problem can be solved by making the operations abstract. association list:

* push
* pop
* find

as long as the access pattern contains no pops, the caching mechanism is quite simple. on pop, one could re-generate. this is effectively what i'm already doing, however, it's not guaranteed to be synchronized. so.. the elements: 2 dictionaries:

  (macro)        ;; defined in core, and untouched by prj
  (macro-cache)  ;; cache of prj macros

for this to work, the code in (macro) should NOT depend on the code in (macro-cache). this means the core macros are not allowed to have pluggable code. this is only allowed in the static load part. let's rephrase: macros are subdivided in 2 parts:

1) declarative with cross-resolve (pluggable components)
2) linear dictionary extension on top of this

does this in any way interfere with local name re-definitions? i think i just need to try it out.. re-iterating the model from the forth side: each compilation unit has a name space that can shadow/extend the previous one. all extensions in one unit need to be unique. this model resembles incremental compilation per word (strict early binding), but allows for cross-reference within one unit.

path:

* get rid of constants.
* get rid of the ram dictionary.
* move macros to the target dictionary.

constants were already eliminated. they can still occur in rewrite macros that generate asm code though. the ram dictionary is more problematic. it's probably best now to move to abstract access methods for the dictionary. it does look like that's the way out. pulling those changes through the assembler will shuffle things quite a bit. macros can follow quite easily from there.

maybe it looks like this: in assembler.ss -> 'label 'word 'allot represent the points where the dictionary is augmented. what will happen here is that macros can be defined also, no? there seems to be a conflict between allowing the definition of labels (ram or flash) and allowing those of macros, when they are all unified.. there is a difference however: as long as the thing which creates a new macro definition only dumps it in the assembler buffer, there is no problem.. the entire buffer will be assembled with the current macro definitions.. wait, there's something warped about that.

pushing through some changes, i arrive at the assembler. it might be best to turn the running variables (rom and ram top pointers) into real variables, and use the dictionary as a stack. going to try to do some things at once:

- allot needs to be rewritten in terms of ptr@ ptr!
- adding new dictionaries won't work any more

fading out.. next = (code . 0) (data . 0) etc.. data is missing. ok, cleaned that up a bit.. also made the running pointers mutable.

Entry: macros in dictionary
Date: Fri Feb 1 12:28:01 CET 2008

that's the next step. now i need to think hard about where this can go wrong, with the semi-separation i have. basically, the preprocessing step SORTS all names, to make sure macros are active before the rest of the code is compiled.
this shouldn't give any trouble. the thing to look at next is the path macro definitions travel. probably it's best to parse everything in one go: formals list (empty for concatenative macros). forth.ss is again the place to be. looks like make-def-parser is the function to modify.

that modification seems to work. now adjusting badnop.ss and macro-lambda-tx.ss to build a compiler function that uses the parsed representation to build a macro. the problem here is that it doesn't really fit in the rpn-compile framework. so.. i made it fit. the "body" for macro-lex: compilers consists of 2 elements: a list of formals and a body. this is the standard format used in the state file. md5 sum still checks.

NEXT: move the 'macro dict into the normal dict. ouch.. can't have "123 execute" as a macro.. or can we? maybe that's one that should be delayed.. i need sleep. this smells like the beginning of something new.. a proper way to organize the code.

a question to answer: why did i violate source concatenation by introducing locals? the answer is of course convenience, but is there a real disadvantage? the macros themselves are still compositional.. this is just about source.

Entry: name change
Date: Sat Feb 2 12:12:30 CET 2008

it's time to start thinking about a name change for the cat language.. problem is of course cat-language.com. i have 2 alternatives: KAT and SCAT. the problem with KAT is that it sounds the same as CAT. the problem with SCAT is the same as the problem with SNOT.. do i really care though? programming in scat could then become scatology. i still think that's humor ;)

Entry: reflection
Date: Sun Feb 3 10:43:19 CET 2008

i was thinking yesterday about macro unification, and wondered whether it might be better to go back to the accumulative model for name resolution / redefinition. the main problem before was that compilation of code had side-effects (definition of new macros in the NS hash), which made it impossible to evaluate code for its value only. however, there is probably a way to put this accumulative behaviour back, by taking the assembler into the loop: let the asm 'register' the macros.

the REAL problem i'm trying to solve is still macro generating macros, and the generation of parsing words. both are opposed to a declarative code model, but in the end, the model isn't declarative at all.. it's a bit of a mess in my head now.

GOAL: i need macro generating macros: limiting the reflective tower in any way will always feel artificial. how to do that?

* accumulative (image model) is the simplest, and the original way of dealing with this problem. however, it doesn't give a static language.
* declarative (language layer model) is the cleanest way of doing this, but requires some overhead that might look like overkill.

can we have both? the declarative approach needs s-expr syntax to be manageable. it won't be Forth any more.. let's see..

  image model: simplest, highly reflective forth paradigm.
  declarative: cleanest for metaprogramming purposes.

i guess i need to isolate the exact location of the paradigm conflict. what do i want, really?

GOALS:

* generating new names (macros) should be possible within forth code. currently, the only ways are the words ':' and 'variable'.
* cross reference should be possible. this currently works for macros, because they use a two-pass algorithm (gather macros first, then compile the code), and it works for procedure words, also because of a two-pass algorithm (an ordinary assembler).
* linearity in chunks should be possible, which is the current model.

questions from this:

- is it possible to unify the 2 different ways of employing a 2-pass algorithm for cross-references?
- how to move from a fixed 2-layer architecture (macros + words) to an n-layer architecture. is this doable without a language tower? is it desirable? (is reflection really that bad? does it conflict with automatic cross-reference?)

the more i let this roll around, the more attractive this solution becomes: split the problem into 2 languages. use a reflective forth which 'unrolls' into a layered language description, and a static layered s-expression based language that uses the same macro core. this gives the convenience of forth syntax and the reflective paradigm, and at the same time the flexibility to use the language tower when reflection is too difficult to get right, or the automatic layering doesn't work.. so the current question becomes: can the GOALS be kept by moving back to a completely reflective machine (including the parser!) which unrolls automatically?

remark: it looks as if i really need the equivalent of 'define', which would really be 'let'.. it all seems to boil down to scope (Scope is everything!). a forth file should be transformable into a collection of definitions and macro definitions. it probably makes a lot more sense to see the dictionary as an environment which implements the name . value map of a nested lambda expression. let's see.. the current model (macros are compositional functions) is really good. the remaining problem is scope: when to nest (let*) and when to cross-ref (letrec).

another idea.. instead of looking from the leaf nodes and building a dependency tree, what about starting from the root (kernel) node, and building an inverse dependency tree? the linear model is the intersection between the two.

Entry: future CATkit
Date: Wed Feb 6 13:43:32 CET 2008

some possible roads to travel with CATkit, and associated problems:

* boot loader programmer: instead of going with the USB TTL cable, it might be more interesting to create a complete solution for programming with brood: one that can program any of the target chips straight from the factory. it's pretty clear to me now that freezing the bootloader spec is going to be really problematic: they are project-specific. building a single all-in-one programmer/debugger solution is the way to go. maybe the E2 ideas can be unified with this too?

* to make the programmer doable, it might be wise to start using available Microchip C code: which means being able to link Purrr code to an MPLAB or Piklab project. also for ethernet based pics this might be wise. time to get a bit less radical if i want to get things done..

* a fairly standard 16bit Forth language. i'm far removed from this if i first want to fix the internal representation back to a more reflective approach with automatic unrolling into nested namespaces, and integrated parsing.. (EDIT: not true.. since the Purrr18 language should remain fairly stable, writing the Forth while doing the macro changes might work out just fine.)

* pre-assembled kits for Forth-only workshops. what is necessary there is to work for minimal cost: basically shrink and eliminate through-hole components. however.. the big cost is really not the board if it has pots on it. the deal is: there's no point in competing with arduino.

Entry: overall design changes
Date: Fri Feb 8 11:45:31 CET 2008

assembler

it's been fun, but it might be good to start outsourcing code assembly.
especially regarding the future use of different architectures, and interfacing with object code formats. it fits better with C code generation too.

interaction

this needs some thought, but at this point an abstract interface between the compiler and the target system is necessary. the road towards this consists of writing a double backend: one for PIC18, and one for ARM (philips) or MIPS (microchip 32bit). i'm thinking about moving most of it back to scheme, and phasing out the cat code in prj.ss and badnop.ss.

forth language

i'm a bit in a ditch here.. the current attempt to unify the namespaces into a single nested macro name space brings up questions about maybe unifying the parser too.. however, looking at radical forth changes like colorForth, a move towards a rather fixed parser can be observed. in my approach, the parser takes out a lot of dirty forth-isms while at the same time keeping the syntactic convenience they bring, at the price of not being so extensible..

the core idea is still: the current functional macro approach is good, i just need to figure out how to organize the name space and keep everything as declarative as possible (relationships, not state changes).

Entry: CATkit 2
Date: Fri Feb 8 16:53:48 CET 2008

Keeping the current code in Purrr18 as the implementation language, moving to an on-target interpreter seems like the only sane way to decouple the CATkit community project from the evolution of BROOD. The CATkit/Sheep core could still be done in Purrr18, but the availability of a straight no-hassle Forth would make things a lot simpler. Clear separation of kernel / user also serves as a good psychological barrier.

This has huge implications for the architecture. The 18F1320 won't be enough. Probably a move to the 18F2620 is necessary because of memory requirements. Using the current architecture though, there is a possibility to take the following path:

* create a different debug bus over the ICD2 connector
* use the serial port for the Forth console

Actually, that's not really necessary.. All this can be multiplexed over serial. Another question is: does it make sense to have an intermediate dtc layer like i have now, which essentially uses a double implementation of the compiler (macros): one in brood and one on the target? Really, the only thing to do is to replace machine code with Purrr18 and for the rest build a standard console based Forth machine.

Entry: stand-alone Forth
Date: Fri Feb 8 17:14:43 CET 2008

rationale:

* more standard (documentation)
* no dependency on Brood (decoupled from scheme + emacs)
* no double implementation of the compiler (host + target)

roadmap:

- look at Flashforth and Retro Forth.
- start building: dictionary -> interpret mode -> compile mode
- possible on 18f1320?
- macro/immediate?
- tail recursion?

Entry: goals
Date: Sat Feb 9 10:24:59 CET 2008

to prevent ending up in a random walk, it's time to clearly state some goals on the PIC18 front.

BROOD core + PURRR18: target audience is mostly myself, or people with an assembler/electronics background. the most important features are flexibility (focus on macros and code generation), speed and code size. BROOD is a tool for the "kleine zelfstandige" (the small one-man shop).

stand-alone PURRR: target audience is much broader. less emphasis on absolute control, more on simplicity, language stability and compatibility across platforms. it's the "configuration language". i'm thinking ANS + tail recursion + concatenative VM.

non-PIC18 things are quite open still.
the core needs more modularity (see entry://20080208-114531)

Entry: pragmatics of macro namespaces
Date: Sat Feb 9 15:25:56 CET 2008

what about this:

* design an s-expression syntax that has all the desired properties.
* make the name-value binding explicit and unique: this gives problems with multiple entry and exit points.
* write a translator from forth syntax
* regenerate the macro cache each time the language nesting level changes.

  (language )
  (language ((a () 1 2 3)
             (b () 4 5 6))
    ((help a b)
     (broem b b b)))

nested syntax: at each point the current language sees the enclosing macros. a compilation step compiles code into macros containing the addresses. -> each macro block begins a new language layer.

the time is not right yet. maybe i should do the forth first? no.. i need to start breaking things and building them back up to get more insight in how to disentangle, before changing the current code.

Entry: breaking macro storage
Date: Sat Feb 9 17:40:04 CET 2008

simply replacing '(macro) with '(dict) now.. secondary: prj.ss is really hard to understand. maybe more of the cat code should be moved to scheme? or at least to a more functional approach.. the state management is still difficult to understand.

looks like this just works for the monitor. now why is that? i expected it to break somewhere.. it indeed breaks somewhere: interactive mode. looking up words doesn't work. time to move that to a more abstract implementation in target.ss. the next thing that broke is 'mark'. prj.ss is so dirty because there's a lot of mutation going on, and the naming of words is really inconsistent. this really needs cleanup. another hidden assumption about "org" in bin->chunk: the problem seems to be that the absence of 'org' leads to problematic asm blocks.

what about structured asm? i read something about this in olin shivers' comments about a summer job he did implementing a scheme compiler.. maybe that's where i need to go? anyways.. there's a lot lot lot of work cleaning up data representations. the whole ifte/s and run/s business is a bit ridiculous.. it doesn't feel natural, and requires deep thought each time. i think it's time to ditch the way state access works, and move most code to functional programming with prj.ss doing nothing but state management (no control logic!)

Entry: state management / the point of prj>
Date: Sun Feb 10 14:50:08 CET 2008

something really smelly about it. i think i'm better off with true mutation in the scheme sense, instead of working around it the way it is done in prj.ss. the base line is: this prj> mode should be usable for DRIVING THE TARGET. the whole functional state business is overkill: most code can really be made functional, and possibly more understandably written in scheme. whenever state recovery is necessary, it can be moved to the functional domain (i.e. assemble and compile as they are now..)

the problem i'm trying to solve is discipline: not gratuitously using global state. maybe i should read some haskell tips, since this is the way haskell programs seem to be written: a bulk of pure functions and a central state management module through monads. let's see some important properties:

- the interactive forth layer translates to prj scat code
- the macro code is purely functional code with a threaded asm state
- staying close to scheme keeps things simple

other remarks:

- base and prj are different. this is clumsy.
- there are 2 namespaces: NS and the prj state namespace.
- prj already behaves as true mutable state. is permanence necessary?
- atomic failures

preliminary conclusion: scat code is important as an intermediate layer between scheme and forth, both for interactive and compile time use. the compile time part needs to be functional because it makes computations easier: compilations should really be just functions. the interactive part however is intrinsically stateful: ultimately it manages the state of the target and the current view (debug UI). the only place where the current scat/state approach is useful is atomic state updates. these however can be replaced by purely functional code and a transaction based approach: each command is a state transaction and either fails or succeeds. compositions of transactions should maintain that property.

aha, holy grail identified: COMPOSABLE TRANSACTIONS. maybe i just need to start reading again. this is very related to COLA (combined object lambda architecture) and the recent transactional memory stuff in haskell.

Entry: transactions
Date: Mon Feb 11 09:35:00 CET 2008

the way it works now: every console command that updates the state store in snot.ss is a transaction. if it fails, the previous state is maintained. something like that can be implemented differently. what i'd like to avoid is having to copy NS in the current implementation. a possibility is to transparently replace part of the NS tree with an association list. then parameters can be used to make a copy.

it looks like the let* / letrec problem wants to propagate deep into the structure of the entire program.. why is that? maybe i should start using a persistent object model for the store?

ok.. this is shaking up the roadmap again. TODO:

- fix the problems with macro unification
- implement reverse macro lookup properly
- think about making evaluation time concrete (entry://20071217-100727)
- work towards a cleaner state representation

about haskell and monads: looking at state management, monads somehow solve the bookkeeping of 'current' data. this can take many forms, but two crystallized constructs are global and dynamic environments, which in scheme would solve most problems involving the passing of data outside of function arguments. thanks to the type system in haskell, the red tape can be hidden, and all is implemented using just functions.

EDIT: being able to use state restore on failure at the command line level is really nice. this should not be given up. however, once the target is being modified, errors can't be fully recovered.

Entry: variables
Date: Mon Feb 11 10:17:08 CET 2008

running into trouble with recursive variable expansion. the problem is that a variable is this:

  #`((extension: name () 'name)  ;; macro quotes name
     'name #,n buffer)

which uses:

  (([qw name] [qw size] buffer)
   ([variable name] [allot 'data size]))

and this in the assembler:

  (define (variable symbol value)
    ;; FIXME: no phase error logging?
    (dict-shadow-data (dict) symbol value))

so eventually, the name will get shadowed. the problem now seems to be that there's some recursive lookup that messes things up? let's try a test case.

  variable broem
  broem   \ <- infinite loop

ok.. conceptual error or just a small bug? just a small bug: forgot the parens around 'name in (extension: name () ('name)), which gave (quote name) -> recursive call.

Entry: intermezzo -> snot + interrupt
Date: Mon Feb 11 10:22:48 CET 2008

this is getting on my nerves. it's been fixed a while ago in mzscheme cvs, but maybe i should just go for 3.99 atm? see if it breaks things.. went pretty well. had to replace some reverse!
Entry: variables
Date: Mon Feb 11 10:17:08 CET 2008

running into trouble with recursive variable expansion. the problem is
that a variable is this:

  #`((extension: name () 'name)  ;; macro quotes name
     'name #,n buffer)

which uses:

  (([qw name] [qw size] buffer)
   ([variable name] [allot 'data size]))

and this in the assembler:

  (define (variable symbol value)
    ;; FIXME: no phase error logging?
    (dict-shadow-data (dict) symbol value))

so eventually, the name will get shadowed. the problem now seems to be
that there's some recursive lookup that messes things up? let's try a
test case.

  variable broem
  broem     \ <- infinite loop

ok.. conceptual error or just small bug? just a small bug: forgot
parens around 'name in (extension: name () ('name)) which gave
(quote name) -> recursive call

Entry: intermezzo -> snot + interrupt
Date: Mon Feb 11 10:22:48 CET 2008

this is getting on my nerves. it's been fixed a while ago in mzscheme
cvs, but maybe i should just go for 3.99 atm? see if it breaks
things.. went pretty well. had to replace some reverse! by reverse,
and use mutable pairs in decoder.ss

another thing that changed is manual expansion of user paths (tilde).
this is a bit more problematic. another thing that gets on my nerves
is the absence of stack traces.. what am i supposed to do with this:

  ERROR: car: expects argument of type <pair>; given {#f . #}

ok.. it is pretty deep: the srfi-45 promise uses mutable pairs. fixed
+ fixed the plt sandbox code and sent mail to the plt-scheme list.
fixed break stuff in brood + snot. breaks work now.

Entry: more fixes
Date: Mon Feb 11 16:14:53 CET 2008

the 'empty' needs to be fixed. something wrong there. doing reverse
asm would be an interesting next step + moving some code to hex
printing.

Entry: moving more code to scheme in tethered.ss
Date: Tue Feb 12 13:57:00 CET 2008

* mzscheme with modules is quite a nice namespace management tool to
  write nontrivial programs. the big flat namespace with late-binding
  plugin behaviour in brood is a bit messy. maybe i do need the extra
  bit of mz handholding, and move plugins to parameterized code?

* i really miss closures when writing cat code. names and nested
  scopes are important, and trading in a bit of conciseness for names
  (and absence of stack juggling!) is a good idea. with closures and
  macros, scheme is malleable enough to reduce red tape where
  necessary. my personal preference is moving: cat is not a good
  implementation language compared to scheme.

* the cat intermediate language is interesting to simulate interactive
  forth: translation is really straightforward. gluing scheme and
  forth together, this layer serves well: adding scheme functionality
  to cat is straightforward + translating forth to cat is too.

this leaves me with the following problem to fix: ts-stack is a word
that is used to plug in the target stack bottom + pointer location. do
i keep it like that? it looks like these things are best solved using
parameters: that way the scheme code will work too. maybe i should
make a list:

* connection (lazy-connect.ss)

candidates:

* stack location
* flash program/erase size

Entry: porting to mz v4
Date: Fri Feb 15 10:27:21 CET 2008

yeah, reading docs can bring clarity ;)

  doc/release-notes/mzscheme/MzScheme_4.txt

i got a bit confused about the whole scheme and scheme/base thing
while reading some web server docs. the biggest change seems to be the
use of optional and keyword arguments in lambda expressions.

do i make a full port? probably best to not keep too much legacy in
the brood core.. i need the upgrade for the sandbox.ss fixes, so maybe
it's time to jump to 4 completely. as expressed in the release notes,
the keyword arguments can be problematic for legacy code..

Entry: big changes
Date: Fri Feb 15 10:52:12 CET 2008

OK.. i think i know what i need to do, but it's a big job: i need to
get rid of the NS namespace, and split the code into:

* purely functional
* parameterized

the line between the two isn't clear-cut. parameters are things that
are "mostly constant", e.g. communication ports, file paths, ... to me
it looks like this is the most important line of name space management
in scheme code. (in haskell, the problem of code parameterization as
automatic threading of data is solved using monads)

the problem with parameters is that they break referential
transparency, which is a great property for testing.. i think in most
cases, a transparent function can be wrapped in a parameterized one.
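a tiny sketch of such wrapping (names made up): the core stays
referentially transparent and testable, while a thin shell picks up
the "mostly constant" context from a parameter:

  ;; pure core: all inputs are explicit arguments, easy to test
  (define (greet/pure port-name)
    (format "connected to ~a" port-name))

  ;; the "mostly constant" context lives in a parameter
  (define current-target-port (make-parameter "/dev/ttyUSB0"))

  ;; thin parameterized wrapper around the transparent core
  (define (greet) (greet/pure (current-target-port)))

  (parameterize ((current-target-port "/dev/ttyS0"))
    (greet)) ; => "connected to /dev/ttyS0"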
i just need some moderation here: every use of a parameter deep in the
code (like 'here' in the assembler) makes things more specific, but
might be the right thing to do.

so, basically, code can be dynamically layered: the assembler, for
example, doesn't USE the target dictionary as a parameter, but gets it
passed as an argument by the interaction system (which does have it as
a parameter). in contrast, the assembler, internally, might use the
dictionary as a parameter, but the code outside of the assembler
doesn't need to know that.

getting rid of the NS namespace, and moving to module name management
instead means:

* more code is static (tree dependencies)
* plugin behaviour (graph dependencies) needs to be solved explicitly
* simpler: map everything straight to scheme compilation, with names:
  - lexical
  - module-local (with prefix to separate from scheme)
  - toplevel (might be used for plugin behaviour / units?)

that looks significant enough to call it brood-5

Entry: eliminating the state dialect
Date: Fri Feb 15 11:20:19 CET 2008

anything that can be done on brood-4 before making the jump to
abolishing NS? yes: moving to parameterized project data while keeping
the transaction-like workflow intact + solve the transaction thing
with target memory maps.

ROADMAP:

- move more compiler code to scheme.
- eliminate the prj <- state implementation, but make sure transaction
  behaviour is maintained (association lists or hash tables?)
- move assembler and parser to separate dictionaries (or keep them in
  NS till later?)
- move CAT code to module based namespace.

Entry: plt scheme study
Date: Tue Feb 19 16:44:25 CET 2008

maybe it's best to look a bit closer at the plt scheme language now
that V4 is coming out. some things i'd like to know more about are:

* mixin class system
* delimited control

mentioned on http://en.wikipedia.org/wiki/Plt_scheme

in addition, it would be nice to get more of the drscheme
functionality in snot, such as proper stack traces, module browser,
syntax-level refactoring. i'll take http://zwizwa.be/darcs/sweb as the
case study for this. brood's a bit too hairy atm.

trying to make sense of: http://www.cs.utah.edu/plt/delim-cont/

it looks like understanding this will bring me closer to understanding
the problem in brood with "undo" at the console, and the transaction
based model i'm chasing after. yeah, vague.. reading the paper.
chapter 2: the operators: shift, control, reset. hmm.. i'm missing a
lot of muscle to read that one.. ltu to the rescue:

http://lambda-the-ultimate.org/node/606
http://lambda-the-ultimate.org/node/297

"Good stuff! But keep in mind that, as the cartoon in the slide says,
control operators can make your head hurt..."

no shit.. to summarize vaguely what the 2 points are about:

- delimited control: partial continuations: don't jump outside of
  context.
- mixins: somewhat related to generic functions.

about the delimited continuations, it might be best to read the plt
doc on "prompt" and some related things on continuation marks and
stack traces. for mixins, i'm reading this:

http://www.cs.utah.edu/plt/publications/aplas06-fff.pdf

from a quick skim i don't see how it's related to generic functions
though.. mixins seem interesting, though i don't see the difference
from multiple inheritance. maybe that inheritance is linear instead of
tree-structured?
Entry: expression problem
Date: Thu Feb 21 23:23:46 CET 2008

http://groups.google.com/group/plt-scheme/browse_thread/thread/3aaacdc5169e5889

Mark's reply was pretty clear, and this:

  The PLT folks have used the expression problem as a springboard for
  thinking about big issues like, what does it mean to be a software
  component, and what are appropriate ways for reusing and extending a
  software component.

the answer is then modules/units/classes/mixins.. Swindle might indeed
be a good thing to have a look at next. the whole deal of multiple
dispatch, so central to PF, is in the end something i need to
understand better. about multimethods: cecil is mentioned here:

http://tunes.org/~eihrul/ecoop.pdf
http://citeseer.ist.psu.edu/219067.html

compression of dispatch tables? (about PF: there's probably a way out
using a small number of types or compile time type inference..)

I'm reading ``Modular Object-Oriented Programming with Units and
Mixins'' now. The slogans make a lot of sense:

* UNITS: Separate a module's linking specification from its
  encapsulated definitions.
* MIXINS: Separate a class's superclass definition from its extending
  definitions.

Maybe i should give it a try?

Entry: units
Date: Fri Feb 22 01:06:40 CET 2008

looks like units + modules are going to be enough to organize brood
without the need for a NS hash table. how to exactly chop it up is
still a bit of a mystery. maybe start with the plain CAT code, then
organize the macros in a similar way, then find a way to translate
forth code straight to s-expressions.

what if i start with separating out the assembler as a unit? in the
end i'd like to be able to use externally provided assemblers / C
compilers. in doing so, abstracting the data types that are passed
between assembler and linker might be necessary. these are assembly
opcodes, dictionary and compiled target words + linker data.

Entry: call by need
Date: Fri Feb 22 12:18:56 CET 2008

was trying to quickly hack up a solution in scheme that emulates
makefiles and i realized it's actually call-by-need, which is again
the same as the dataflow serialization problem (pd). which can be
extended to early reuse by transforming it into a linear language
(i.e. forth).
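the make analogy fits in a few lines of scheme: delay/force is exactly
build-on-demand with memoization (the 'targets' here are just
symbols):

  (require scheme/promise) ; delay/force

  (define obj.o (delay (begin (printf "compiling\n") 'obj.o)))
  (define prog  (delay (begin (force obj.o)
                              (printf "linking\n")
                              'prog)))

  (force prog) ; prints: compiling, linking
  (force prog) ; prints nothing: both targets are already built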
(file "/tmp/test.ss")) [loading /tmp/test.ss] [loading /usr/local/mz-3.99.0.13/collects/scheme/base/lang/compiled/reader_ss.zo] [loading /usr/local/mz-3.99.0.13/collects/syntax/compiled/module-reader_ss.zo] > (a) error: bla /tmp/test.ss:8:12: (error (quote bla)) /tmp/test.ss:6:12: (b) === context === /tmp/test.ss:7:0: b /tmp/test.ss:6:0: a /usr/local/mz-3.99.0.13/collects/scheme/private/misc.ss:63:7 the file /tmp/test.ss is: #lang scheme/base (provide a) (define (x) #f) (define (a) (b) (x)) (define (b) (c) (x)) (define (c) (error 'bla)) now, to incorporate it in snot, it looks like there's a combination needed with prompts. indeed.. the error printing works fine when wrapped in 'prompt', and execution continues thereafter. http://pre.plt-scheme.org/docs/html/reference/cont.html#(mod-path~20scheme~2fcontrol) first thing to note: 'prompt' and 'abort' i can add those in sweb instead of the current combination of parameters and call/ec. second: prompt is readily applied in the repl in brood, at run/error in host/base.ss it works for host/purrr.ss by replacing the toplevel error printer by a prompt. probably can do the same in snot. hmm.. it's not in snot that the prompt should be. i did add some marking to the code that prints 'language-rep-error' in case the underlying rep (provided by the program!) doesn't print the error itself. so in brood the error should be printed, and preferably INSIDE the box context. "console.ss" is loaded in the snot context from "snot.ss". the latter file registers the different languages using the 'register-language' snot function present in snot's toplevel. ("snot.ss" is not 'require'd but 'load'ed) what i'm interested in is frames that run up to the sandboxed evaluator, so maybe it should be implemented in snot/box.ss ? see snot ramblings for more.. Entry: continuation marks Date: Thu Feb 28 17:20:17 CET 2008 http://www.cs.utah.edu/plt/publications/icfp07-fyff.pdf currently continuation marks are used to make some kind of scat language trace through the code. basicly, i can put anything there i want. it's reassuring that the basic mechanism is available. (also, this idea is very related to some dynamic variable hack i tried in PF.. don't remember if it's still there..) something strange that i didn't know about exceptions: apparently the handler is executed in the context of the 'raise' call! that explains a lot. no.. this is not the case: (define param (make-parameter 123)) (with-handlers (((lambda (ex) #t) (lambda (ex) (printf "E: param = ~s\n" (param))))) (parameterize ((param 456)) (begin (printf "B: param = ~s\n" (param)) (raise 'boo)))) gives: B: param = 456 E: param = 123 ok: i'm confusing the lowlevel 'handle' with the highlevel 'catch'. the paper mentions how to implement 'catch' on top of 'abort', but also talks about interference of prompts, and the use of tagged prompts to work around that. so the bottom line: exceptions and prompts do not collide, because the prompt tag used to implement exceptions is not accessible. this does mean that an exception can jump past any arbitrary prompt. question: how does this work in sandbox? apparenlty sandbox re-raises exceptions: see the internal function 'user-eval' in 'make-evaluator*' in scheme/sandbox.ss : the value that comes from the channel is raised if it's an exception. something is still don't understand about mixing of prompts and exceptions. 
Entry: roadmap
Date: Thu Feb 28 14:45:25 CET 2008

adjusted roadmap:

* get the base language working without NS + put it in a separate
  module.
* figure out how to use units for plugin behaviour

then follow up with entry://20080215-112019

it looks like the key element is understanding the namespace issue by
first moving the core component to a more native namespace management
system. the rest should then be mere disentanglement.

TODO:
- separate SCAT as a different project
- separate it from NS

Entry: eval vs. require
Date: Sat Mar 1 19:56:18 CET 2008

the key insight (finally) seems to be that the current 'eval' based
approach needs to be replaced by 'require', or an underlying mechanism
that allows module based namespace management. everything that now
goes through the NS hash can be done with module namespaces.

Entry: module namespaces
Date: Mon Mar 3 00:27:20 CET 2008

everything reduces to scheme code in modules, which makes things
easier to extend. (also for parsers?)

  (define increment
    (lambda s (apply base.+ (cons 1 s))))

the idea is that 'increment' can be imported as 'base.increment', or
anything else, using prefix imports. there's no need to specify the
target namespace unless there are clashes between scheme and the
functions defined in the module, which can be avoided by not importing
scheme bindings, and separating definition of base. primitives (which
has scheme available) from definition of composites. composite modules
then only contain definitions which map some namespace ->
(un)prefixed.

to this end, a similar approach can be used as the 'find' plugin in
the rpn syntax currently used for NS linking. the 3 elements: syntax,
source ns and dest ns can be specified like before. (just make a
namespace translator?)

problem solved? probably only units left: plugin behaviour needs to be
handled explicitly.
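a toy version of the prefixed import idea, evaluated at a toplevel
repl (module and names made up):

  (module base scheme/base
    (provide increment)
    ;; a scat word in the style above: the whole stack is a rest arg
    (define (increment . s) (cons (+ 1 (car s)) (cdr s))))

  ;; the client brings the words in under a 'base.' prefix, so they
  ;; can never clash with scheme's own bindings.
  (require (prefix-in base. 'base))
  (base.increment 41) ; => (42)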
Entry: language tower
Date: Mon Mar 3 00:38:37 CET 2008

  scheme
  base          snarfed scheme, functional rpn
  state
  macro         primitives
  macro forth   machine wrappers
  forth

why so many? they all solve a single problem in a very straightforward
way. base snarfs functionality from scheme, state is a lifted base +
threaded state, and macro implements the greedy machine map + peephole
optimizer using a threaded state model.

misc hacks from plane notes:

- auto snarf through contracts
- use #lang scat/base for base->base maps (is a purely declarative
  language possible?)
- decouple module as unit to speed up compilation during incremental
  dev. (fake image based dev)
- get rid of @ stx for streams (scribble) / find standard streams lib
  / use lazy scheme. (brood is pure FP so why not)
- use parameters for compiler object (also for NS stuff?)

Entry: parameterized transformer
Date: Wed Mar 5 14:30:34 CET 2008

instead of using a compilation object, it might be more convenient to
use parameters in the transformer environment to define functionality
for the basic syntax operations. maybe best to write the rpn code from
scratch in scat/rpn/

Entry: scat ready
Date: Thu Mar 20 08:38:42 EDT 2008

looks like the lowest layer of rpn code + namespace management is
done. made a nice extension that allows parsers to be written as
syntax transformers (like it should!). until the representation part
is finished and ready to be ported to brood, the process is documented
in the dev log at http://zwizwa.be/ramblings/scat

Entry: BROOD-5: initial move from BROOD-4
Date: Fri Feb 29 12:39:38 CET 2008

This ramblings file is a merge between BROOD-4 and BROOD-5. The new
version is codenamed SCAT, and is a complete rewrite of the core
representation and name space handling code. The darcs archive has
been flushed as has happened before. The old histories are still
available at:

http://zwizwa.be/darcs/brood-4
http://zwizwa.be/darcs/brood-2
http://zwizwa.be/darcs/brood-1

(brood-3 didn't have a history flush, and is present in brood-4)

Entry: utilities = language ?
Date: Fri Feb 29 13:50:02 CET 2008

Splitting brood into 3 components: brood, scat and zwizwa brings up
the problem of code bundling. there are 2 views of modules:

- what they provide. this is the most important form of organization.
  there's a spectrum with 2 extremes: one object, and everything. the
  latter is a utility module, which is akin to a language. the former
  is a component module: an abstracted collection of code with a very
  limited interface.

- how they are used. using component modules is straightforward: since
  they are often highly specialized, dependencies between components
  can be clean and understandable. using utility modules is not:
  granularity is much finer, and they behave more like "background
  noise": stuff you need to know about, but can assume to just "be
  there".

therefore, when using utilities in a project, like scat, it's maybe
best to take a single file and make sure it exports a non-colliding
set of tools. so the purpose of that single file is to be a decoupling
point, providing a language to the client, and importing small
utilities and components from all kinds of different sources.

so the approach i take is to have one collection of utilities
(zwizwa-plt), and have a single file in each project that uses a base
language with (a subset of) these utilities present.

an organic analogy:

  GRASS = all permeating language (base lang + utilities)
  TREES = specialized program components

Entry: scat without ns
Date: Fri Feb 29 14:26:39 CET 2008

how to proceed? this needs abstraction of definitions in the
'composite' macro and abstraction of 'find' in code bodies. the latter
is already worked out.

TODO: make base.ss independent of ns.ss

might be a good opportunity to start documenting. maybe try out
scribble? scribble is quite nice.

disentangling ns is not going to be simple though. there's a problem
that i didn't think about: BROOD is fraught with occurrences of
defining one language in terms of another one (i.e. primitive macros).
will this still work? i do need different namespaces. should they also
be just prefixed? => this is a core problem and needs a proper
interface!

also.. why not use real objects for the rpn-tx.ss plugin behaviour?
maybe it is overkill.
Entry: for-template and scheme/base
Date: Wed Mar 5 16:44:26 CET 2008

setting: 2 modules

  test.ss    (require (for-syntax "rep.ss"))
  rpn-tx.ss  (require (for-template mzscheme))

when test.ss is #lang mzscheme, or #lang scheme, this works. however,
for #lang scheme/base i get an error:

  /home/tom/scat/scat/rpn/test.ss:8:2: compile: bad syntax; function
  application is not allowed, because no #%app syntax transformer is
  bound in: (lambda (stx) (syntax-case stx () ((_ . code)
  ((rpn-represent) (syntax code)))))

after adding a 'for-syntax mzscheme' or 'for-syntax scheme/base' in
test.ss it works. what works is to add (for-template scheme/base) in
rep-tx.ss and (for-syntax scheme/base) in test.ss

looks like there are different phase 1 bindings for mzscheme and
scheme/base.. or something.. i don't really understand. i'll try to
explain:

  tom@del:~/phase-test$ cat tx.ss
  #lang scheme/base
  (provide gen-code)
  (require (for-template scheme/base))
  (define (gen-code) #'(+ 1 2))

  tom@del:~/phase-test$ cat use.ss
  #lang scheme/base
  (require (for-syntax "tx.ss" scheme/base))
  (define-syntax gen (lambda (stx) (gen-code)))

why are both requires of scheme/base needed?

EDIT: take a look at these expands:

  box> (expand-syntax #'(module broem scheme/base (define foo 123)))
  (module broem scheme/base
    (#%module-begin (define-values (foo) '123) (#%app void)))
  box> (expand-syntax #'(module broem mzscheme (define foo 123)))
  (module broem mzscheme
    (#%plain-module-begin
     (#%require (for-syntax scheme/mzscheme))
     (define-values (foo) '123)))

mzscheme has mzscheme included in phase +1, while scheme/base does
not. (what's the difference between #%plain-module-begin and
#%module-begin ?)

Entry: snarfing
Date: Wed Mar 5 18:27:53 CET 2008

got the syntax part working. next = snarfing from scheme to scat
names. got compositions working too. i do need to take a good look at
repl toplevel vs. module local names. how to distinguish between
undefined names in module context, and use of a toplevel name in repl
context?

EDIT: actually, it's not necessary. it might be easier to just map to
a name when it's not lexical if the distinction between module-local
and other isn't necessary, and let the scheme name resolver take care
of it.

Entry: namespaces
Date: Thu Mar 6 13:22:21 CET 2008

it's a lot simpler now. The dispatch routine interprets two kinds of
identifiers:

  lexical -> used as is
  other   -> prefixed

if finer grained control is necessary, the rpn-global (and possibly
rpn-lexical) parameters can be overridden.

NOTE: it indeed makes a lot of sense to do this with parameters: it
falls into the "printing" design pattern: code that transforms a
description into a "document" according to a set of global parameters.

Entry: language names
Date: Thu Mar 6 16:02:48 CET 2008

is that still necessary? the only reason it's there is to interpret
the source code, but if that interpretation isn't always possible, why
keep it? it's also pretty awkward to fill the parameters everywhere..
with the risk of breaking things, i'm going to take out the language
names. probably best to have the source code field represent an
expression that evals to the object, and let the debug code that uses
the source field interpret the whole expression.

Entry: state syntax
Date: Thu Mar 6 18:25:20 CET 2008

scat/state doesn't do anything else than switch the namespace for
quoted programs. symbols need to be imported explicitly. so the
question is: how to import a bulk at once?

EDIT: forgot immediates! ok.. seems to work now.
maybe this is it:

  (namespace-mapped-symbols
   (module->namespace
    '(file "/home/tom/scat/scat/base/base.ss")))

not so difficult: see module->names in ns.ss

now the question is, how to map this to a 'require' form? this doesn't
seem to work, i get some strange errors that might be due to the fact
that i'm using dynamic-require into the current namespace. what really
needs to be done is just determine the exports of a module: the module
then needs to be compiled, but not instantiated. maybe
'module-compiled-exports' is a better way to get to the exports? from
scribble/search.ss:

  (module-compiled-exports
   (get-module-code (resolved-module-path-name rmp)))])

ok.. from syntax/modcode i can use get-module-code to do this:

  (define (get-exports path)
    (map car
         (car (call-with-values
               (lambda ()
                 (module-compiled-exports
                  (get-module-code
                   (resolved-module-path-name
                    (make-resolved-module-path path)))))
               list))))

which works on:

  (get-exports (string->path "/home/tom/scat/scat/base/ns-tx.ss"))

now i just need a way to make this work on ordinary require specs.
+ solve some problem in ns.ss (%app)

looks like i've got something that works from the toplevel.. now need
to get it going in modules. almost.. the rest is for later.

Entry: modules
Date: Sat Mar 8 17:12:01 EDT 2008

got stuck yesterday at translating require specs -> module file
location. going to leave it, and try to get the symbol snarfing
working.

  box> (define-lifted (state) (base) state-lift "base/base.ss")
  compile: bad syntax; reference to top-level identifier is not
  allowed, because no #%top syntax transformer is bound in: module

i have no idea what's going on here.. what this means is that:

* the compiler maps undefined symbols -> toplevel refs
* there's an undefined symbol mapped to toplevel refs
* there's no toplevel

Entry: start from scratch
Date: Sun Mar 9 14:43:10 EDT 2008

probably need to read a bit about modules, namespaces and compilation.
for example, what does this code actually do?

  (module test mzscheme
    (provide foo boo)
    (define-syntax (boo stx)
      (syntax-case stx ()
        ((_ . args)
         (begin
           (printf "compiling\n")
           #`(+ 1 2)))))
    (define foo (boo)))

when evaluated, it declares a module, compiling its code. before i can
understand things, i need to see the relation between:

- compilation handlers
- load/use-compiled
- get-module-code
- namespace
- namespace's module registry (namespace-attach-module)
- code inspectors

the get-module-code approach works, but somehow is context-dependent
(current namespace / module registry?). it would be interesting to
find a method independent of context.

so... a namespace is something that maps names -> things, to be used
by 'eval'. it's a generalization of the standard scheme toplevel. each
namespace has a module registry. modules declared in a namespace will
attach to this registry, to be referenced by an identifier.

Entry: getting at the names..
Date: Mon Mar 10 23:06:32 EDT 2008

got something that works:

  ;; Get to the exported names by requiring the module into an empty
  ;; namespace which has the base module attached to its registry.
  (define (get-names path)
    (let ((n (make-base-empty-namespace)))
      (parameterize ((current-namespace n))
        (namespace-require/expansion-time path))
      (namespace-mapped-symbols n)))

still stuck at some #%app problem further down the line, but the names
come out. wait.. what a mess! the previous one did work, and the
current one doesn't (needs absolute paths).. get-module-code is ok.
this seems to be problematic:

  (define (define-ns-tx stx)
    (syntax-case stx ()
      ((_ ns name val)
       (let ((mapped (ns-prefixed #'ns #'name)))
         #`(define #,mapped val)))))

the name created here, 'mapped', is not recognized as a module-local
one.

  ----------- broem.ss
  #lang scheme/base
  (require (for-syntax scheme/base "broem-tx.ss"))
  (provide foo)
  (define-syntax (foo stx) (foo-tx stx))

  ----------- broem-tx.ss
  #lang scheme/base
  (provide foo-tx)
  (define (foo-tx stx) #`(+ 1 2))

this gives:

  box> (require "broem.ss")
  box> (foo)
  broem-tx.ss:4:4: compile: bad syntax; function application is not
  allowed, because no #%app syntax transformer is bound in: (+ 1 2)
  === context ===
  /usr/local/plt-3.99.0.12/collects/scheme/sandbox.ss:445:4: loop

the thing that's missing here is the (require (for-template
scheme/base)) in broem-tx.ss

OK.. so it seems that now the remaining problems are because some
names are expanded to toplevel form because they are not visible
somehow?

  ------- lala.ss
  #lang scheme/base
  (require (for-syntax scheme/base))
  (define foo 123)
  (define-syntax (broem stx)
    (printf "compiling broem\n")
    #`(define lala #,'foo))
  (broem)
  (provide foo broem lala)

that doesn't create the lala symbol.. why?

Entry: more confusion
Date: Tue Mar 11 12:27:28 CET 2008

The module-path as passed to require is resolved as:

  ((current-module-name-resolver) '(file "asdf") #f #f #f)

actually, the docs explain it quite well: the name resolver also loads
the module and places the name into the current registry when the 4th
arg is #t.. so we have the connection:

  current-module-name-resolver -> current-load/use-compiled

EDIT: also look at 'expand-import' from scheme/require-transform
12.4.1

Entry: the problem
Date: Wed Mar 12 09:16:25 EDT 2008

so.. what about making a test case for the actual problems?

  module a: exports a couple of names
  module b: reads module a's source, extracts names, and uses them

why can't you export a name created by a macro? i think i did this
before, but the experiment above doesn't work.. what's up?

Entry: again
Date: Thu Mar 13 09:07:52 EDT 2008

why is this so confusing? what i try to do is symbol capture. or not?
2nd question first. let's look at how structures are implemented: they
create new names. maybe 'syntax-introduce' is necessary?

i clearly have to stop tackling this in a half-assed way. what's going
on here seems to be me being confused about the actual inner workings
of the module system, syntactic forms and namespaces. it's at the very
core of the language, nothing to try to random-walk around.. time for
some discipline. revised questions:

* how to export a name generated by a macro?
* is my approach (snarfing names from modules) 'the right one'?

roadmap: read about core syntax and hygiene. 2.3 Expansion (Parsing)
-> very useful ;)

the #%app error usually means that an identifier is not syntax, but a
function application. if phase 1 doesn't have a language instantiated
(which defines #%app) this is an error. the #%top error probably means
that an identifier is not defined. in modules, the #%top identifier is
not defined?
3.3 #%top can be used to override lexical bindings -> "Such references
are disallowed anywhere within a module form"

evaluating undefined abc as (#%top . abc) in the toplevel gives:

  reference to undefined identifier: abc

in module level:

  reference to an identifier before its definition: abc in module:
  "/tmp/test.ss"

Entry: defining names from macro
Date: Thu Mar 13 12:27:21 EDT 2008

  #lang scheme/base
  (require (for-syntax scheme/base))
  (define-syntax (foo stx)
    #`(define bar 123))
  (define-syntax (broem stx)
    (syntax-case stx ()
      ((_ name)
       (printf "expanding (broem)\n")
       #`(define name 123))))
  (broem lalala)
  (foo)
  ;;---

this defines 'lalala' as a symbol, but 'bar' is not accessible. looks
like a hygiene issue where hygiene has to be broken explicitly?

a better approach is maybe to expand to a 'module' form, so all
symbols are introduced in the same place, and no capturing is
necessary? something like:

  (module state-snarfs "snarfer-lang.ss"
    (snarf-from "../base/base.ss" (state) (base) lift-state))

where 'snarf-from' is a macro provided by snarfer-lang.ss which
expands to a #%plain-module-begin form.

rationale: what is a module? it is a finite map (name -> stx/value).
in that sense, a snarf is maybe indeed better exposed as a module
transformer, mapping modules -> module instead of modules ->
expressions.

again: the alternatives are:

* variable capture for define forms (non-hygienic)
* whole module expression generation

Entry: datum->syntax
Date: Thu Mar 13 14:59:33 EDT 2008

this worked:

  (define-syntax (foo stx)
    (datum->syntax stx '(define bar 123)))

it defines 'bar'. replacing stx with #f doesn't work (no #%app error).
the same thing with #' doesn't work.. apparently, the latter creates
some lexical bindings.. so, the proper way to break hygiene is using
'datum->syntax'. the thing to look at is build-struct-names from
syntax/struct

this seems to work for the current state.ss

update: using (->syntax stx ) with stx the stx object that's passed to
define-lifted-tx, it seems to work fine. looks like the problem was
(->syntax #f )

in short: the stx object seems to have knowledge of the module's
namespace, and can tell whether names are module-local?
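a minimal contrast to pin this down (throwaway test module): the only
difference between the two macros is the lexical context attached to
the introduced name:

  #lang scheme/base
  (require (for-syntax scheme/base))

  (define-syntax (def-hygienic stx)
    #'(define bar 123))            ; 'bar' carries the macro's context

  (define-syntax (def-capturing stx)
    (datum->syntax stx '(define bar 123))) ; 'bar' gets the use site's context

  (def-hygienic)   ; introduces a 'bar' this module cannot refer to
  (def-capturing)  ; introduces the 'bar' used below
  bar              ; => 123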
Entry: module path
Date: Thu Mar 13 17:41:54 EDT 2008

more confusion: resolve-module-path could maybe be used in combination
with (syntax-source-module) to resolve relative references?

* have a look at module index :

EDIT:

  ----- test.ss:
  #lang scheme/base
  (require (for-syntax scheme/base))
  (define-syntax (foo stx)
    (printf "~a\n" (module-path-index-resolve
                    (syntax-source-module stx)))
    (datum->syntax stx ''bla))
  (foo)

gives:

  image> (require (file "/tmp/test.ss"))
  'test
  bla

so it returns a symbol instead of a path..

EDIT: incredible.. this also works, but only in the dynamic extent of
syntax transformers (why?) :

  (define (module-mapped-symbols module-path)
    (let-values
        (((base-phase template-phase label-phase)
          (syntax-local-module-exports module-path)))
      base-phase))

Entry: what's in a stx object?
Date: Fri Mar 14 06:50:58 EDT 2008

looking at 2.3 expansion, it seems that some information at least is
context-dependent. the question is: is this context stored in the stx
object, or set by parameters? (looks like that implementation is
internal : accessible by 'syntax-local-context') another: how is the
lexical info stored?

Entry: carefully tuned api for the compiler
Date: Fri Mar 14 07:33:18 EDT 2008

all-in-all, it's quite nice. once the basic design elements are
understood, and all over-simplifying assumptions are cleared out by
actually reading the docs (!). there is a huge emphasis on macros and
modules (static), instead of "building new interpreters" as in SICP.

Entry: on with the real work
Date: Sat Mar 15 07:38:50 EDT 2008

though this made me think a bit.. this snarfing business, is it really
necessary? or will it work at all? currently it uses some form of
delegation to implement functionality: if it's not in (state), take it
from (base). what is necessary is to delay the 'fill up' to the last
stage, after declaration of the (state) functions.

Entry: the design choices
Date: Sat Mar 15 09:09:15 EDT 2008

* use syntax objects for everything that represents code. this gives
  the best match to PLT scheme: it allows most of the underlying
  machinery to be used in a way that integrates well.

  - representation: lambda + redefinable language structure
  - namespaces: modules and units
  - compile/interpret: use the scheme expander instead

  in retrospect, this took a long time for me to understand, but it
  looks like i'm finally getting it. identifier scope management
  (lexical, dynamic, module, unit) is hard. if somebody does it for
  you, then use it!

* syntax streams: are these really necessary? they complicate things
  due to having to deal with both lists and streams. maybe, when
  everything is syntax and usable as scheme macros, streams are no
  longer necessary?

it looks like the more important choice is going to be to have a
complete map from text -> binary code / assembly code that lives in
the syntax domain.

Entry: quoted programs
Date: Sat Mar 15 10:01:51 EDT 2008

in a code quotation, it should be possible to override the language.
if the first identifier in a program is a syntactic form, then use it
to transform the expression, otherwise use the default. not so
straightforward. actually, it is: using

  (define (transformer? id-stx)
    (syntax-local-value id-stx (lambda () #f)))

maybe the possibility to do this recursively would be nice? that way
all kinds of syntax extensions can be implemented, and forth could be
represented as-is. this can probably be handled in dispatch?

almost. it's a partial fold, while currently represent is a full
(left) fold. fold up to encountering a transformer, then pass the
remainder of the expression to the transformer (which might call
represent again).

first: clean up some things. move the symbol mapping to a single
function. second: having syntax in the middle of an expansion isn't
really sound, right? what would (a _b c) do, with _b syntax?

  (represent (a _b c))
  -> (lambda s (represent/step (a _b c) s))
  -> (represent/step (a _b c) s)
  -> (lambda s (represent/step (_b c) (dispatch a s)))

well.. it does make sense. if a is not syntax, the result is an
expression.

ok.. i got something working: string things together with
'rpn-compile' until a transformer is encountered, upon which the rest
of the code is handed off. however: still can't use '(lambda )'
because there's already a lambda wrapped.

i wonder however: why not allow true parsing words? the loop over a
code body could be made more explicit: give words access to the syntax
objects. the traditional 'immediate' words could be used. no.. they
need to be macros: they need to be available at expand time, and
function bindings aren't. so i'm on the right track.

Entry: lexical tricks
Date: Sun Mar 16 14:51:04 EDT 2008

maybe these lexical tricks are more of a nuisance than anything else..
if lexical capture from scheme is required, why not just prefix those
names? what is lost here is an abstract way to access the namespace,
but what's gained is more clarity of mechanism + readability of code.
maybe that's worth more?
in that case, a simpler non-intrusive prefix might be desirable, or a
let-form that abstracts the prefixes.

ok: took out the automatic lexical tricks:

* all identifiers in rpn code are now mapped in the compiler
* unquote works in
  - quoted () code -> unquote value interpreted as a function
  - quoted ' data  -> unquote value placed in the data structure

looks better this way.

Entry: typed writer monad
Date: Sun Mar 16 15:58:10 EDT 2008

the state extension used in brood works well for macros, but is very
hard to use with the dictionary in prj. why is this? the conflict is
about what the meaning of quotations should be: if they're state
functions, all code that takes programs should be redefined such that
they pass on the state. this makes sense for state functions, but is
infeasible unless it can be done automatically. passing state code to
non-state code doesn't work. so what if code were typed? the problem
is that i'm trying to implement some half-assed monad thing without
having a type system to make it convenient..

so... base can't run state code, but state code can be converted to
base code if it doesn't contain any real state actions. probably state
needs a 'run' that can distinguish between the two types. can the
functionality of run be somehow 'replaced' when 'inside' state code?
in order to answer, need to define what 'run' means.

Entry: run
Date: Sun Mar 16 16:38:27 EDT 2008

since 'run' is the ONLY PRIMITIVE word that accepts code, any code can
be made to run by overriding the behaviour of it. is this leading
anywhere? if there are 2 languages (base and state), there should also
be two language types when code in these languages is represented as
data AND two STATE types. once these things are in place, it should be
clear what the behaviour of run should be: dispatch on code and state
type and do the right thing:

                code
  data          STACK        STATE

  STACK         apply        error
  STATE         apply/lift   apply

in order to not have to change everything, the base stack type can be
just a list. anything built on top though, needs to be typed. let's
start with the state type.

OK. seems to work. this looks like a generic 1-arg language (in
scat-state.ss). looks like it's factored in the right way now: didn't
take any changes to core to change the rep completely.

next: word structures. maybe take out source rep first: if it's
syntax, i can probably find source rep from source text instead of
storing it.. DONE.

the fact that SCAT words are procedures, is it a feature? or is a more
abstract interface required? now base and state have different reps,
maybe 'apply' (run) should be made abstract? or.. each function object
should carry a run method, which accepts different
states/stacks/arguments ? i'm being pushed towards a statically typed
language..
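the table as code, with made-up struct names: 'run' dispatches on the
type of both the quoted code and the current state:

  (define-struct state-code (fn))          ; code in the state language
  (define-struct state-data (stack store)) ; stack + threaded store

  (define (run code data)
    (cond
      ;; base code on a plain stack: just apply
      ((and (procedure? code) (list? data)) (code data))
      ;; base code on state data: lift it over the store
      ((and (procedure? code) (state-data? data))
       (make-state-data (code (state-data-stack data))
                        (state-data-store data)))
      ;; state code on state data: apply
      ((and (state-code? code) (state-data? data))
       ((state-code-fn code) data))
      ;; state code on a plain stack: no store to act on
      (else (error 'run "state code needs a state"))))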
Entry: lifting applicators
Date: Mon Mar 17 10:45:46 EDT 2008

so.. splitting the stack/state objects themselves into different types
is straightforward. now, how should a language type be implemented? as
a run method? this is probably the most straightforward way: the
default could be apply, and anything smarter than the base language
simply overrides it.

this is confusing. if a word has an explicit 'apply' method, it's no
longer a procedure (which has an implicit, universal apply method).
dead end? yes.. a word structure is either a procedure, or not. why
should it be an abstract type? ->

* to add annotation
* to distinguish from other procedures

let's reach for the bottle: can i solve this with dynamic binding?
no.. all the words that somehow are to operate on the state need to be
lifted. it's not that the functions deep inside some dynamic extent
need to be updated, it's that the state itself needs to be accessible.
it's probably possible to solve part of it with dynamic binding, but
that will not play nice with closures..

so.. to come back to a comment in brood: the problem is not that the
prj state language is a bad abstraction, it's that the lifting of
operators from base -> state is not so straightforward: anything that
passes or duplicates the whole state needs to be handled. eventually,
this boils down to 'run' and anything built from it. maybe the
compilation should handle this?

the next step to the solution is: 'run' should be isolated: nowhere
should there be 'apply'. i could keep the base representation for
debugging purposes, but everywhere else it should be abstract. maybe
'base-control' should be separated out again, like it was in BROOD-2?
am i going in circles? not really: but w.r.t. lifting, control
operations are different. next: split off control.ss which contains
all the code tainted with 'run'.

so, why is dynamic binding for state so bad? suppose there's a
transition from scat/state -> scat/base, and something needs to access
the state inside the scat/base extent. if this is allowed, the whole
base language is no longer referentially transparent.

let's see.. ifte : (choose run) is it possible to lift 'ifte' if the
source is not available? maybe i just need continuations? there's
something right in my face here i just don't see..

Entry: lifting
Date: Tue Mar 18 08:46:21 EDT 2008

the thing that bugs me is that currently, the only way to do the
lifting is to manually update and duplicate all the code: i found an
operation that doesn't compose easily. this means that the whole of
'control.ss' needs to be parameterized, so it can be included in the
state language. what this control needs is:

* a way to apply a function to the abstract state
* a way to push/pop to the stack in this state

should i give up the concreteness of the representation? currently,
the only place where i use it is in debugging (and that's ok) or in
lexical binding of functions (which can be dealt with using macros).
when what currently is 'apply' becomes abstract, a genuine
compositional language can emerge.

OK: got the stack abstract now. the change wasn't so deep, which means
i'm getting better at choosing the right abstractions.. to summarise:
the base language has the following components:

* RPN TRANSFORMER
* STACK DATA TYPE
* NAMESPACE IDENTIFIER MAPPER

this is extended easily to STATE DATA TYPE

Entry: representation
Date: Tue Mar 18 11:33:20 EDT 2008

so... what about: represent everything with

- procedures (closures)
- structure types
- tagged lists

tagged lists can be used to implement structure types, but closures
can too (the example of 'cons' in SICP). iirc, in Haskell, structure
types are unevaluated functions. basically, these are all
interchangeable, and are merely about implementation. however, in PLT
scheme, using structure types seems to be more efficient.

i see myself moving more from the "wow, everything is a list!" and
"eval is amazing!" attitude towards the static undertone in PLT
scheme, which seems to be more based on the language tower model
(macros are no longer 'accumulative' -> everything is unrolled into a
tree structure) and structure types: more abstract than tagged lists.
everything is more static without losing any power, except for a
little bit of quick-hack power..
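for reference, the SICP trick alluded to above: a pair as nothing but
a closure:

  (define (kons a d) (lambda (pick) (pick a d)))
  (define (kar p) (p (lambda (a d) a)))
  (define (kdr p) (p (lambda (a d) d)))

  (kar (kons 1 2)) ; => 1
  (kdr (kons 1 2)) ; => 2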
Entry: sidestepping the problem
Date: Tue Mar 18 12:17:33 EDT 2008

what if i sidestep the whole issue, and define base to have a void
state? that way, the skeleton can be kept, but arbitrary data can be
threaded through code. all control code just passes on the state,
without being able to modify it. how is this really different from an
'environment' value to be passed along?

if you look at how 'invisible state data' is used, it is quite related
to lexical variables. for example in (lambda (x) ), the 'x' might be
used very deep inside , so for all the nesting in between, this 'x' is
not used. the equivalent of a deeply nested expression in an rpn
language is a long composition.

let's give it a try.. seems to be the right thing to do. next problem
= to abstract control.

Entry: control
Date: Tue Mar 18 13:29:05 EDT 2008

control operations are things that access the data stack, and are able
to apply functions to state, collect state, etc.. without having
access to the state themselves. maybe this is a nice occasion to start
using structure types and a modified match with

  (struct ( ...)) -> ( ...)

Entry: dynamic trick
Date: Tue Mar 18 15:55:40 EDT 2008

instead of making a lot of control words for each dynamic invocation,
there's a mechanism now that takes a function which accepts a thunk,
and evaluates the thunk in the dynamic environment. the thunk
represents the rpn continuation.

Entry: comprehensions
Date: Wed Mar 19 10:00:21 EDT 2008

today's excursion into plt land is about comprehensions. the main
reasons:

* for-each: the concatenative program interpreter
* for: number -> list maps, for low-level ops

hmm.. leave it for later. i fixed it like it was, but turned
interpret-list into a function (was syntax).

looks like the language is ok now. it's simpler, and hidden state is
easier to implement. purity is guaranteed by just not passing in any
state. so can the whole state namespace be ditched then? namespaces
are still necessary for the brood macro language, but not any more for
prj. the important thing is that there's no need for lifting, since
the base and state language syntaxes are the same now. let's call it
all scat. looks like this solves a lot of problems.

Entry: the scat machine model
Date: Wed Mar 19 12:22:24 EDT 2008

all scat functions take a data stack, and a hidden parameter used to
implement any hidden data that bypasses the computation. the reason
for implementing it like this is that the code operating on the data
stack (the SCAT language) is orthogonal to extensions that implement
hidden state. the reason that the pure functions ALSO pass this state
around is to avoid the problem of lifting pure functions to state
passing functions. if purity is desired: simply do not pass any
interpretable state into a computation. to check purity: pass
something that can only be interpreted locally (checks read), and
check if it's the same when it comes out (checks write). a sketch of
this convention follows below.

why is hidden state necessary? anything can be solved with
combinators, but for some problems, bypass can be a tremendous
simplification. relate this to:

* lexical variables (bypass through random access in the environment)
* monads (bypass bookkeeping implemented by 'bind' and 'return')
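a minimal sketch of that calling convention (the representation is
made up): every word maps (stack, hidden) -> (stack, hidden), pure
words are lifted so they can't touch the hidden value, and the purity
check is just threading an opaque token through:

  ;; lift a pure stack function: it simply passes 'hidden' along
  (define ((lift-pure f) stack hidden)
    (values (f stack) hidden))

  (define dup (lift-pure (lambda (s) (cons (car s) s))))

  ;; purity check: the token must come out untouched
  (define (pure? word stack)
    (let ((token (gensym)))
      (let-values (((s h) (word stack token)))
        (eq? h token))))

  (pure? dup '(1 2 3)) ; => #t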
functions can then be classified into 3 groups:

1) do not know about state (define-word).
2) know about state, but merely pass it on (define-control,
   make-state)
3) modifications on the state data (like 2, but with data accessors)

i'd like to clearly separate 2 and 3, but see no way to do that now
other than giving group 2 no means to interpret the data, and
requiring functions in group 2 to never replace the data. the latter
could be guaranteed with an assert, the former is namespace
management. i'm done ;)

Entry: from here
Date: Wed Mar 19 13:47:33 EDT 2008

i guess it's time to start rebuilding brood on top of this structure.
the core changes are:

* create the macro language
* rewrite prj.ss

going to keep macro.ss in scat for now. the default semantics seems to
work fine, but name space mapping needs to make the distinction
between defined macros, and undefined ones which default to calls.
maybe this is a nice point to lift target namespace management to
scheme by requiring all words to be defined as macros?

  box> (define-ns (macro) abc (postponed-word 'abc))

next? core is working fine. the remainder is namespace management and
rewriting parsing words.

Entry: parsing words
Date: Wed Mar 19 16:40:59 EDT 2008

what about this: all scheme transformers are unary functions. nothing
prevents me though from adding binary functions to the transformer
environment. these could then be reserved for calls from within
'next', because they don't fit the scheme transformer type. this is so
because it's illegal to call them directly anyway: they use dynamic
state set by the compiler macros. this seems to work pretty well.

Entry: forth / macro mode
Date: Wed Mar 19 18:41:12 EDT 2008

the next thing to implement is the forth / macro mode. this is a key
issue, since i'd like to change this to a set of LHS = RHS expressions
where names are clearly defined. part of this is delimited parsing.
(EDIT: solved)

to make this transition as smooth as possible, the basic syntactic
form which defines both macros and words needs to be defined. the core
problem is a tough one: allowing identifiers to be overridden requires
some lexical structure. the second problem is to retain the 'inner'
names after a compilation. maybe the enclosing structure can be saved
and incrementally extended? what i'm talking about is this:

  (let-ns (macro) ((foo (macro: 1 2 3))
                   (bar (macro: 456)))
    )

  ->

  (let-ns (macro) ((foo (macro: 1 2 3))
                   (bar (macro: 456)))
    (let-ns (macro) ((word (target-word 123))
                     (burp (target-word 567)))
      ))

basically, the target dictionary is a nested lexical structure like
the above, which has a 'hole' in it. on the next compilation step,
this hole can be extended. this way all name management can be
delegated to the scheme expander. to make this workable, the state
should be stored in a form that's like the above, but flat. see
target-tx.ss for the resulting code. to summarize:

* target state = dictionary = listof listof (name . code)
  each element in the top dictionary list consists of an association
  list for one compilation level. names within a level need to be
  unique and can be used recursively.
* each forth <-> macro transition in forth code introduces a new
  nesting level.

remaining q: how to represent non-macros? what about using 'forth:' to
mean something that generates a macro which compiles an address. what
about treating words as macros, but mark them as 'instantiated?' this
might be interesting, since it allows some flexibility for code
processing (i.e. code inlining).

one thing at a time.. what's next? need to define some interfaces.
most likely the assembler. where does it fit in? what does a forth
file represent?
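a self-contained miniature of this scheme (all names made up, not
brood's actual code): ordinary words compose as unary stack functions,
and an identifier whose transformer binding is a binary procedure acts
as a parsing word that receives the rest of the body:

  #lang scheme/base
  (require (for-syntax scheme/base))

  (define-for-syntax (rpn-body code stack)
    (syntax-case code ()
      (() stack)
      ((w . ws)
       (let ((tx (and (identifier? #'w)
                      (syntax-local-value #'w (lambda () #f)))))
         (if (and (procedure? tx) (procedure-arity-includes? tx 2))
             (tx #'ws stack)                     ; parsing word takes over
             (rpn-body #'ws #`(w #,stack)))))))  ; ordinary word: compose

  (define-syntax (rpn stx)
    (syntax-case stx ()
      ((_ . code) (rpn-body #'code #''()))))

  ;; ordinary words are just stack -> stack functions
  (define (one s)  (cons 1 s))
  (define (plus s) (cons (+ (car s) (cadr s)) (cddr s)))

  ;; a parsing word: a BINARY transformer that quotes the next token
  (define-syntax lit
    (lambda (ws stack)
      (syntax-case ws ()
        ((x . rest) (rpn-body #'rest #`(cons 'x #,stack))))))

  (rpn one one plus lit hello) ; => (hello 2)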
Entry: representation
Date: Thu Mar 20 08:40:18 EDT 2008

with scat ready, the next real problem is representation of forth
code. time for some "what is" exercises. macros are not really the
problem, but what is instantiated forth code?

  (let-ns (macro) ((foo (macro: a bar))
                   (bar (macro: c d e foo)))
    (let-ns (macro) ((baz (forth: foo bar))
                     (shma (forth: wikki wakki)))))

in the end, after compilation, this leads to:

  (let-ns (macro) ((foo (macro: a bar))
                   (bar (macro: c d e foo)))
    (let-ns (macro) ((baz (macro: 123 compile))
                     (shma (macro: 456 compile)))
      ))

so: a dictionary is a nested let-ns expression with a hole in it.
compilation is filling this hole to get a new nested let-ns
expression. it looks like the key extension is to move the assembler
to the syntax level also.

so what does the 'forth:' form do?

* it creates a macro which compiles a reference, and saves code for
  later compilation in that lexical environment (closure) bound to
  that reference.

so the essential part is to separate the creation of new names from
evaluation of the right-hand sides + complete unification of forth
words and macros. (a forth word has a macro associated which compiles
its body).

testing this, but i'm missing an essential part of brood (compilation
stack pattern matching) to get this working. EDIT: worked around it
using the base language. what i got now:

  (define empty-state (make-state '() '()))
  (define (run-macro macro)
    (state-data (macro empty-state)))
  (define-syntax forth:
    (syntax-rules ()
      ((_ . code)
       (let ((word (delay (run-macro (macro: . code)))))
         (base: ',word compile)))))

  (define xxx
    (letrec-ns (macro)
      ((abc   (macro: 1 2 3))
       (def   (macro: 4 5 6))
       (broem (forth: 123))
       (lala  (forth: abc def))
       (shama (forth: lala)))
      (macro: lala)))

looks like a pretty good implementation. it even has a means to only
compile what's necessary by adding an 'export' word: determine which
functions get exported into the namespace. this structure creates a
graph in which the nodes are forth words (instantiated macros).

Entry: code structure: line or graph?
Date: Thu Mar 20 10:43:43 EDT 2008

now.. look into the 'structured assembler' (something from an Olin
Shivers writeup about a project he worked on doing dataflow
optimization for..) if assembly is just a serialized expression graph,
why not keep it a graph longer? there are some features used in purrr
that assume serialized code (fallthrough). is this really necessary?
or should such code be grouped somehow as a single function with
multiple entry points? it's interesting as low-level control, but a
pain to do code transformations.. the rep mentioned above already has
this. now, with this structured asm thing, maybe all local label
issues can be solved that way too?

NEXT:

* unify macros and forth words (meaning of ';')
* defining LHS/RHS and ':'
* variables

Entry: definitions
Date: Thu Mar 20 16:59:55 EDT 2008

a problem popped up: either ':' (up to) is a terminator, or ';' is a
terminator (including). the real problem: if ':' is a macro, it should
be able to introduce new names. how to do that? macros are parsed
inside the body of a toplevel 'represent'. instead of doing it like
that, rpn-next could be called directly.

ok. got it working by abusing the compiler a bit and storing continue
points in a dynamic variable. i tried without side-effects (using
prompts), but can't get that to work. now with compilation mode built
in.

Entry: forth rep
Date: Fri Mar 21 15:33:11 EDT 2008

seems to be fixed now.
see the files forth.ss forth-tx.ss and forth-rep.ss : the result of
evaluating a (definitions . ) form is an assembly code graph. now, how
to save the macro definitions so incremental compilation can be
performed? the 'definitions' macro should be extended with an input
assoc-list of defined words (and macros?).

Entry: incremental compilation
Date: Sat Mar 22 09:12:25 EDT 2008

an intermediate representation is necessary which preserves the macros
in some form so they can be re-instantiated, but also preserves the
forth words in some form. is it possible to somehow grab the source
code (after lexing) of each word? that way original macro source can
be preserved, and forth source can be translated to an abstract rep
(only addresses).

basic idea: can't use source code to save the current language state.
maybe save the environment functions together with the forth structs
that are generated? i.e. if the address is filled in, return that,
otherwise return the word struct. maybe the macro from yesterday needs
to be factored a bit?

OK. incremental updates are implemented as a simple nesting of
letrec-ns forms. so, how to represent the target state? i prefer to
have this in a readable form, preferably one with macro source intact,
and forth words resolved to numbers.

first: separated out some things: there's a 2-level nesting with 'old'
not being collected. need some factoring to reset parameters. or.. i
could make the macros ephemeral, so they have to be included
explicitly in each source file? or.. the whole thing runs at
stx-transform time and expands into a module that defines a number of
forth macros and a binary code chunk?

maybe it's best to collect per level. then later, code that's not
necessary can be ignored. can collection be done statically? the thing
is: macros need to be saved in some syntactic form, not as an object.
from this, it's not necessary to evaluate macros, except for things
that generate code. instantiated live target code is represented by
macros. so, ground rules:

* evaluation of code gives a list of nodes in a code graph
* saving of state = saving of macros as syntax.

problem = how to save macros? each word evaluated needs to be
evaluated in the lexical context.

Entry: what is compilation of forth words?
Date: Sat Mar 22 19:20:59 EDT 2008

essentially, in the current context, it's a source code map which
translates all forth words to macros that compile a call:

  macro : abc 123 ;
  forth : xyz abc abc ;

  ->

  macro : abc 123 ;
  : xyz #x0010 execute ;

so what is the essence? it's source transformation. i probably need to
perform some operation twice: once to construct some syntax, and
another time to evaluate functions. maybe first separate syntax and
semantics better? let's try to catch the code first. it's probably
easier to move from using a 'continue' thunk to just recording the
point of the next definition.

catching code is problematic: there might be a macro that generates
several words or macros.. this cannot be cleanly cut out by cutting
out each definition. the only real thing is what comes out in the end:
that's what needs saving: the fully expanded nested let expression.
looking at this:

  box-macro> (forth-words : abc 1 2 3 def : def 4 5 6 abc)
  (forth-words-incremental (: abc 1 2 3 def : def 4 5 6 abc))
  box-macro> (forth-definitions (lambda (collect) (letrec-ns (macro) ((abc (mode-forth 'abc (lambda (state) (macro/def ((literal 3) ((literal 2) ((literal 1) state))))))) (def (mode-forth 'def (lambda (state) (macro/abc ((literal 6) ((literal 5) ((literal 4) state)))))))) (collect))))

actually gives the solution. instead of expanding to something which includes dynamic binding, just have them passed in as arguments. in this case: ditch the 'forth-definitions' and make 'mode-forth' and 'mode-macro' lambda parameters. this gives a very clean separation of function and structure: the structure is just the expansion of all macros. function can be plugged in later. yep. works like a charm.

  box-macro> (forth-rep (macro : abc 123 forth : def 345))
  (lambda (forth macro collect) (letrec-ns (macro) ((abc (macro 'abc (lambda (x) ((lit 123) x)))) (def (forth 'def (lambda (x) ((lit 345) x))))) (collect)))
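to make the "plugged in later" part concrete, here's a self-contained toy using plain letrec instead of scat's letrec-ns (everything here is hypothetical illustration, not the real code):

  ;; a rep, parameterized by its mode wrappers and a collector.
  (define (rep forth macro collect)
    (letrec ((abc (macro 'abc (lambda (x) (+ x 123))))
             (def (forth 'def (lambda (x) (abc x)))))
      (collect)))

  ;; one possible interpretation: record (name . mode) pairs.
  (define (instantiate rep)
    (let ((dict '()))
      (define ((mode tag) name fn)
        (set! dict (cons (cons name tag) dict))
        fn)
      (rep (mode 'forth) (mode 'macro)
           (lambda () (reverse dict)))))

  ;; (instantiate rep) => ((abc . macro) (def . forth))

a different triple of functions plugged into the same rep could build word structs or compile code instead.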
Entry: syntax: going further
Date: Sun Mar 23 09:38:32 EDT 2008

what i'd like to save is not only this structure, but a means to transform it into something that
  * has addresses bound
  * can be extended

made the 'lambda' part parametric.. is this necessary? so.. maybe move away from storing the delayed computation to something more concrete? like the rest of the syntax stream? kept it, but cleaned up the access a bit. so.. it's essential to have 2 functions:
  * the syntax transformer
  * the asm graph evaluator

now, when the assembly step has finished, the original syntax needs to be updated (words replaced with address refs) and saved such that it can be reused later to build a new syntax expression. so what is what? a dictionary is a collection of frames from 'compilation units'. (a CU is a single level in the final nested letrec expression.) composition of CUs is composition of syntax transformers. ok.. separated out the core form: code->reps. made an extra macro called 'dictionary' which enables a slightly lighter concrete s-expression representation, so it is easier to edit when replacing words with macros compiling addresses. maybe add another called 'word' which transforms the names so expanded names don't have the prefix? this will probably clash with macros.. nope: there's a middle way: i need this only in the representation, which is after macro expansion. EDIT: had to update rpn-tx.ss to expand forms returned by rpn-map-identifier before determining if the resulting identifier is a macro. so.. we get this:

  box-macro> (forth-rep dictionary (: foo 123 : bar 4) (: baz foo bar))
  (lambda (forth macro collect) (dictionary ((foo forth (lambda (x) ((lit 123) x))) (bar forth (lambda (x) ((lit 4) x)))) (dictionary ((baz forth (lambda (x) ((word bar) ((word foo) x))))) (collect))))

where 'dictionary' and 'word' are macros that make this code a bit more readable. updating a dictionary to replace words with references goes like this:

  (baz forth (lambda (x) ((word bar) ((word foo) x))))
  ->
  (baz macro (base: #x0123 compile))

maybe this deserves a little macro of its own to turn it into

  (baz macro (address #x0123))

the next step is to turn this dead representation into a live function that can be composed to generate new syntax transformers. so why a raw lambda, and not some form that has concatenative code? the problem is that this requires uncompiling: the lambda is the result of all syntax that might be defined in forth code, which can contain arbitrary scheme forms: there might be no 'simple' concatenative form. on to extension transformation: rep + name.address -> rep

Entry: not currently transforming
Date: Sun Mar 23 13:52:01 EDT 2008

PROBLEM: not being able to run the transformers because they depend on some expander environment is problematic. i'm using some functions that use context only available in `official' transformers. fix that. EDIT: this is not just 'some expander environment'. it's the lexical environment of the expression being transformed. also see the entry about calling macros directly: entry://20080325-144330

in other words: is it possible to turn a function into a transformer? this requires some level-shifting voodoo i don't see how to perform..

  (define-syntax transformer (lift tx-fun)) ;; ?? too simple to see ??

EDIT: looking at plt/src/mzscheme/src/env.c -> now_transforming (syntax-transforming?), this predicate is derived from the presence of a scheme_current_thread->current_local_env value. in eval.c -> expand, the function _expand is called with a new expander environment. hmm.. too complicated to quickly browse. what i did see is that the environment during an expansion is either dynamic, or attached to the stx objects. the latter makes sense, but i can't see any obvious refs.. the magic is done in resolve_env, which is quite complicated. it looks as if the info is not tied to the objects.. so guesswork: the expander has a dynamic environment which contains the lexical environment of an expression.

Entry: what is dict?
Date: Sun Mar 23 18:36:38 EDT 2008

the incremental compilation has type (DICT,SRC) -> (DICT,BIN). so what type is DICT? It needs to be something that can be serialized, so what about plain s-expressions? Is there information that cannot be captured? One problem is that arbitrary extensions might depend on external code. Representation should probably be a 'module' form which explicitly states its dependencies. Problem solved. Module forms are neatly representable by s-expressions since they have no free variables. Now how to go about this? A dictionary is a module that exports a function 'update' which takes forth source code, and outputs the expression of another module plus binary code. This is interesting: it involves writing a "module quine" ;) EDIT: when you can modify the language in which to write a quine, the problem is trivial. the reason why some quines are interesting is the length it can take to express one in some language / system, or the extent to which detours can be taken.. It almost works, except for some constant redefinition errors. OK. got it working with arbitrary payload and update expression. some interplay between macros and s-expression code generation: on each iteration:
  * apply the update expression to the state
  * pass on the update expression
  * send the output

a very straightforward iterative system once the boilerplate generator is in place. this can be minimised by re-defining %plain-module-begin or something.. updates: can probably standardize the module name, since only one is necessary and it can be loaded into a sandbox. stylized the reflection loop a bit: now using a #%module-begin macro and a minimalistic module spec which also carries over the body code.
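a tiny self-contained model of one turn of this reflection loop, with a dummy tick function (hypothetical names, no real module system involved):

  ;; tick : state x input -> state+ x output
  (define (tick state input)
    (values (cons input state)          ; accumulate into the new state
            (list 'compiled input)))    ; output for this turn

  ;; one turn: run tick, emit the output plus the source of the next
  ;; module, which carries the updated state (and, via its language,
  ;; the tick code) along.
  (define (turn state input)
    (let-values (((state+ output) (tick state input)))
      (values output `(module dict forth-dict (tick ,state+)))))

  ;; (turn '() 'foo)
  ;;   => (compiled foo)
  ;;      (module dict forth-dict (tick (foo)))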
cleaned it up to a single macro:

  (define-syntax (module-begin stx)
    (syntax-case stx ()
      ((_ (tick state) . forms)
       (let ((name (syntax-property stx 'enclosing-module-name)))
         #`(#%plain-module-begin
            (provide update)
            (define (update input)
              (let-values (((state+ output) (tick 'state input)))
                (values output
                        `(module #,name scat/forth-dict (tick ,state+) . forms))))
            . forms)))))

so state update / storage is solved.

Entry: dictionary update
Date: Mon Mar 24 09:21:37 EDT 2008

the incremental compilation has type (DICT,SRC) -> (DICT,BIN). so, what does the module storage part need to implement?
  - name binding (i.e. add special require forms)
  - implement a transformer binding

it will be used as a syntax include.

Entry: lexical syntax annotation
Date: Mon Mar 24 10:33:01 EDT 2008

still i don't fully grasp the notion of lexical information in syntax. the macros 'let' and 'lambda' annotate syntax with lexical properties when they are expanding. so does this work?

  (define (make-expr body) #`(lambda (x) #,body))
  ((eval (make-expr #'x)) 123) => 123

so it looks like indeed, building syntax like that does perform 'capture'. my problem is this:

  (define (wrap stuff) #`(lambda (x) #,(stuff #'x)))
  (define (stuff stx) #`(let ((x 123)) #,stx))
  (eval ((wrap stuff) 1)) => 123

so the inner let captures the #'x. there's nothing special about defining the formal parameter x and the reference x in the same function 'wrap'.

  (define (test-stx stx)
    (syntax-case stx ()
      ((_ (a) b) (bound-identifier=? #'a #'b))))
  (test-stx #'(lambda (x) x)) => #t

so 'bound-identifier=?' does do some lookup: it doesn't need to expand the syntax? EDIT: what i wonder is: is it possible to construct a function 'wrap' as above which can guarantee that the 'x doesn't get captured? this requires some kind of symbol rename. looks like it's a legitimate question:

  (define (wrap stuff)
    (let ((x (car (generate-temporaries #'(x)))))
      #`(lambda (#,x) #,(stuff x))))
  (define (stuff stx) #`(let ((x 123)) #,stx))

these are constructed using interned symbols, not gensyms. let's update the lambda code. so what about syntax marks? (section 2.3.5)

  (define-for-syntax (wrap stuff)
    #`(lambda (x)
        #,(syntax-local-introduce (stuff (syntax-local-introduce #'x)))))
  (define-for-syntax (stuff stx) #`(let ((x 123)) #,stx))
  (define-syntax (test stx) (wrap stuff))
  ((test) 456) => 456

works.. so, if i understand: marking is an on/off operation: the net result is that syntax introduced by the transformer is marked, and thus not the same as syntax introduced elsewhere. EDIT: this does give problems with names introduced by 'letrec-ns' separately: they are no longer caught. so.. in the end, am i better off doing everything non-hygienically? doesn't sound right..

Entry: cleanup
Date: Mon Mar 24 11:58:57 EDT 2008

i'm not in a great design mode today, so going to do some maintenance and simplification. changed some names in forth-tx + took out the mode symbols and hardcoded them to 'forth and 'macro -> lambda will capture them. (see the remarks about bound-identifier=? and the local transformer environment). the question that arises here is: can forth code shadow the 'forth', 'macro' and 'collect' names? yes: but only locally within one forth file. maybe it's also better to flatten the dictionary representation that's stored in the modules? let's postpone this a bit.. TODO: 'with-forth' and the scat macro: collapse syntax and namespace. i tried to add a similar 'word' macro but apparently it doesn't do that.. maybe add syntax-parameters? break.
Entry: dictionary serialization
Date: Mon Mar 24 18:38:34 EDT 2008

but.. with these things going on, the representation is no longer serializable as an s-expression. so maybe i should look into flattening out everything to a simple associative list.. there's a bunch of problems there that need to be ironed out. maybe it's time to take a couple of days break from it? from what i've been reading today, serialization, if not in compiled format, is going to be a problem due to hygiene: converting expansions to s-expressions is not a good idea.. compared to the previous implementation, what do we have here:
  * no ambiguities for names
  * can obtain all macros by loading source code

i guess it's time to start writing down some requirements and work from there..

Entry: basic structure
Date: Tue Mar 25 08:44:07 EDT 2008

maybe i should just stick to using an opaque representation, and try to use this to gain access to the data. there are 2 models for using this:
  * all code is available in source form: the intermediate representation doesn't matter much, since it can be redefined.
  * some code is closed. in that case, the representation does matter, because it becomes an interface.

it's the latter problem i'm trying to solve. if this could be almost human-readable, but mostly independent of mzscheme's binary representation, that would be great. requirements:
  * a dictionary needs to contain binary code + macro code + a reference to the compiler version / library it was made with.
  * a dictionary needs to be opaque, and at the same level as scheme modules.

what about separating the incremental model from the library model? the incremental model is mostly for developing. they are two different namespace models.
  * kernel: uses mzscheme's module name management system.
  * incremental: extends the flat public interface.

maybe the next question is: what is a forth file? how to make "forth file" == "scheme module" and build the incremental compiler on top of that. it's very straightforward for macros. but what about words? every module that's compiled contains forth words the same way as the nested let. yes. it's better this way.

Entry: the ':' macro
Date: Tue Mar 25 13:00:05 EDT 2008

whenever forth->definitions is called, it needs to be done in an environment where level -1 has the same definition macro (':') defined. this can be ensured by requiring it for-template. hmm.. something's wrong here. got some dependencies tangled up.. can it be made automatic? separate things: forth-lang.ss now gives a macro 'forth:' that produces toplevel forms from forth code. this can then be wrapped for module usage. the module works, but it doesn't want to export the name symbols. calling the transformer directly should solve this: then no marks are added. alternatively, we could mark ourselves? aha: the 'provide' statement needs to have the same lexical context as the names, then it works.

  (define-syntax (module-begin stx)
    (syntax-case stx ()
      ((_ code ...)
       (let ((name (syntax-property stx 'enclosing-module-name)))
         #`(#%plain-module-begin
            #,(datum->syntax stx '(provide (all-defined-out)))
            (printf "FORTH:\n")
            (forth: code ...))))))

gathering forth code works too now: everything is dumped in a list which is named according to the module name.

Entry: calling transformers directly
Date: Tue Mar 25 14:43:30 EDT 2008

isn't really good style. why? they all run in the same lexical context as the transformer they are called by. need to have a better look at local-expand and friends to see if there's no better way to handle this. in other words: check where it is (not) necessary to have the same lexical context.
i guess it's ok for the RPN code representation: a piece of RPN code produces a single lambda expression, which is a single lexical environment anyway.. but i don't see if it's always harmless. maybe i should just try to break it? because of the way that early expressions are on the deepest nesting levels, introducing new names into the expansion only influences code BEFORE a certain point, so i see no reason to do so.

Entry: syntax-case implementation
Date: Tue Mar 25 19:59:59 EDT 2008

let's see: http://www.cs.indiana.edu/~dyb/pubs/tr356.pdf

two identifiers are bound-identifier=? only if they have the same name and are present in the original program or are introduced by the same macro. free-identifier=? determines if two identifiers WOULD refer to the same binding. generate-temporaries creates a list of temporary names: not because renaming is necessary (it never is!) but because it might be convenient to insert (a list of) names. macro implementation: for each expansion, a mark is created. input and output syntax is marked, and double marks are dropped. the net result is that identifiers introduced by the macro are marked once.

Entry: onward
Date: Tue Mar 25 20:29:12 EDT 2008

with the basic representation intact, i'm going to leave the incremental dictionary stuff for later and concentrate on bringing over other constructs from brood.
  - reader syntax OK
  - the pattern matcher OK
  - variables and constants OK
  - code chunks / anonymous words OK
  - exit / jump-to-end OK
  - local variables OK

Entry: reader
Date: Wed Mar 26 14:58:39 EDT 2008

stole syntax/module-reader and adapted it for reading forth code.

Entry: splitting scat and forth
Date: Wed Mar 26 15:14:20 EDT 2008

let's call 'scat' the rpn language with functionality and namespace management, and grow this project into brood-5 instead of trying to port it 'into' brood. i see no reason to release scat as a separate project yet, but there are good reasons to separate the scat code into a single collection that's accessible through the scat.ss and scat-tx.ss interface files. maybe time to combine some files.. there are a lot.

Entry: pattern matcher
Date: Wed Mar 26 17:38:14 EDT 2008

looks like it's working: had to correct some hygiene: names lost their lexical context in name-stx->symbol. time for more porting. next: constants and variables. trying to port some pattern macros to scat, and i run into the use of macro-find/false. this is the first occasion of dynamic namespace access, which requires a bit of thought to solve.. the deeper question is: are quoted symbols still allowed? or do they always refer to macros? i guess the answer is no. that's what making things static is all about. it could still be added later, but let's not do so until there's a compelling reason. ok.. it's getting a bit more serious now: true cleanup. no more hiding behind reflection ;)
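a quick REPL check of the two predicates from the syntax-case entry above (behavior as i understand it in (mz)scheme; #'x is unbound here, and make-syntax-introducer stands in for the mark a macro expansion would add):

  (bound-identifier=? #'x #'x)                              ; => #t
  (bound-identifier=? #'x ((make-syntax-introducer) #'x))   ; => #f, marked
  (free-identifier=? #'x ((make-syntax-introducer) #'x))    ; => #t, both
                                                            ;    (un)bound alike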
Entry: name generating words
Date: Wed Mar 26 19:58:11 EDT 2008

it's all about names now.. the next things to tackle are name generating words. let's start with 'constant'. this needs to create a macro eventually. i eliminated this before because of the awkward reflection loop. is that still a problem? yes. it's a phase mix that can't be resolved, because "free-range" code is no longer allowed. so, no constant: use macros instead. then 'variable'. there's little trouble here, except that it requires address resolution, so it qualifies as a word. so.. what does variable expand into? maybe a forth-word? let's see if forth-rep.ss can be extended to represent variables. maybe it should get its own representation, next to forth and macro? a variable is a macro that compiles a reference (as a literal) to a word structure. so in some sense, it is a forth word: the defining property of a forth word being the need for allocation of memory. so what does a variable look like?

  (make-word (scat: ',var-struct literal))

this var-struct should also be a word struct, so it can be registered in the same way as forth words. however, the normal 'forth' wrapper expression isn't really useful here. so it might be necessary to change that a bit.

Entry: delimited words
Date: Wed Mar 26 21:31:10 EDT 2008

if there's a 1-1 relation between names and words, some care needs to be taken to solve conditional jumps etc.. this is exactly the kind of trouble i got into when trying to make a purely concatenative VM for the catkit on-target forth dialect.

Entry: forth-tx.ss and macro-lang.ss
Date: Thu Mar 27 08:27:14 EDT 2008

note that 'macro:' is really for brood-style macros (postponed target words), but the forth-tx in itself is more general. maybe move it to separate directories? in short: those 2 need to be bound at the forth-lang.ss level (which should be purrr), because they are orthogonal up to there. yes, it's a good time to decide how to make behaviour pluggable. for microcontroller targets, most of the forth code can be shared. forth uses 'compile' and 'literal' from the underlying target. so should it be a unit? it's only 2 names, let's first pass them in through function/macro. let's reach for the bottle: dynamic binding. it solves all your problems! ;) but, in this case it might make sense. units are really overkill, and having some default is interesting for testing. i guess if the bindings themselves are isolated, they are easy to change later. so what to separate:
  - macro (representation of postponed code + pattern matcher)
  - forth (forth syntax on top of macro)
  - purrr18 (code specific to PIC18)

so.. got macro separated. now trying to make a Forth layer on top of SCAT. note that here 'imperative' code should be possible! that's something to fix later. until then, only declarative code. OK. scat-forth.ss is working!

Entry: duplicate module instances
Date: Thu Mar 27 11:22:56 EDT 2008

i ran into a problem where the rep.ss module is loaded twice when requiring test/fafa.f and scat.ss (the latter to get at the scat: symbol)

Entry: dependencies between subprojects
Date: Thu Mar 27 12:12:27 EDT 2008

maybe it's best to just have one file for both the main and tx words. did that. looks a lot cleaner now. also made the test cases pull in all code.

Entry: variables
Date: Thu Mar 27 13:17:08 EDT 2008

So, let's represent variables by

  (define (wrap-variable name size)
    (let ((word (make-target-word name #f size)))
      (values (scat: ',word literal) word)))

This probably requires a compiler extension, since it's different from the macro and forth modes. Got it working: the trick was to add a special variable mode that evaluates macros as literals, and a 'buffer' word that behaves like ':' to define that macro. This then leads to the substitution macros:

  (substitutions (macro)
    ((variable name)  (buffer name 1))
    ((2variable name) (buffer name 2)))

see macro/target-rep.ss and forth/forth-tx.ss for implementation.

Entry: control structures / conditional jump
Date: Thu Mar 27 16:17:00 EDT 2008

there's an opportunity now to write the forth-style control words in terms of higher-order abstractions. is this possible? or are the more low-level forth constructs necessary? probably things like for ..
next are going to lead to trouble. basically, i need the equivalent of 'label' and a way to emulate fall-through. the problematic part is the conditional jump. conceptually, it joins 3 parts: the part before, and the 2 branches. let's just do if:

  : bla if do-it then go-on ;
  ->
  : bla ' l0 ift go-on ;
  : l0 do-it ;

ha! forth-tx.ss has a 1-element stack. turn this into an arbitrary-length stack and all branches can be postponed and compiled after the word is done. that's the mechanism: making this work so the current 'inline' branches are still used should be straightforward. maybe it's just a 'swap' on that stack? again: the idea is to make a temporary word at each branch point. let's try the compilation stack thing. OK. implemented. now what does it mean if there is more than 1 continuation waiting? ok. i know what this is! quoted code ;) expression nesting should be part of the parser.. but the stack's not a stack but a queue: quoted defs come AFTER the current def. i've got a bit of a paradigm clash here: the scheme lexer has support for s-expressions, so it is a better candidate for building a syntax for a language that supports code quotations. however, this is not compatible with forth syntax. the question is how to map

  if <1> else <2> then  ->  [ <1> ] [ <2> ] ifte
  begin <1> again       ->  [ <1> ] forever

doesn't look like it's a good idea to write a second recursive expression parser.. really. stick to s-expressions for that. part of the purrr kernel could be written in this different syntax: all the machinery to manage that is available now. rephrase the question: how can we keep the illusion of straight-line code? it's very convenient to have as a low-level tool, but for optimizations it's better to have a graph structure. maybe forth chunks should just be lined up? what about using an assembler instruction for this? fallthrough? problem is that this doesn't collect all the variables.. maybe variables should have them too? or variables represented as an 'allot' opcode? so, what about:
  * forth words have their asm code stored reversed
  * the head of the list (last instruction) is 'fallthrough', which points to the next word.
  * the compilation interface exposes 2 words: register and compile.

Entry: variables again
Date: Thu Mar 27 18:01:01 EDT 2008

if variables are represented by the opcode 'allot', then they can probably be generalized to quoted words: this is basically 'create'. do i need to distinguish between ram/rom/eeprom allot? for now, let's keep it simple.
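a minimal sketch of the branch-splitting idea from the control structures entry above, on a toy list representation (hypothetical names, not the staapl code):

  ;; compiling "if" closes the current chunk with a conditional jump to
  ;; a fresh anonymous word, and emits the postponed branch as a word
  ;; of its own, compiled after the current one.
  (define (compile-if name before branch after)
    (let ((l0 (gensym "l")))
      (list `(word ,name ,@before (ift ,l0) ,@after)
            `(word ,l0 ,@branch))))

  ;; (compile-if 'bla '((cw test?)) '((cw do-it)) '((cw go-on)))
  ;;   => ((word bla (cw test?) (ift l123) (cw go-on))
  ;;       (word l123 (cw do-it)))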
Entry: semicolon
Date: Thu Mar 27 19:22:22 EDT 2008

let's say that 'exit' always means return, even in macros. but ';' means:
  macro: jump to end
  forth: exit

how to implement "jump to end"? this requires labels. so.. what is a label? an entry point. a word. let's see.. there are 2 kinds of fork points:
  * conditional goto (not call!)
  * entry / label

so.. a macro can split a word?

  x x x then y y y

this turns the ys into a different word. the place where this would happen is in target-rep.ss in macro->code: that call might return multiple words, the first one being the original word, but the other ones made of chunks, where each chunk is an entry or fork point. note that this is almost the same as the splitter for parsing forth definitions. why are they not completely the same? the problem is that forth parsing requires a function return because of 'rpn-next', while the code splitter doesn't.. so.. this needs to be brought to the level of words (which have names), not code lists. OK, done that. so.. what needs to be a word? jump targets. the reason is simple: jump sources are clearly visible (ideally, a word would be ONLY jump sources), but targets are not. maybe in a later step also eliminate sources? anyways. there are 2 kinds of jump targets: forward references (if .. then) and backward references (loops). OK: it's basically like it was before, but labels are now references to word structures (created when the label is created) and created by the 'label' macro. these are used by the 'split' word to start filling the word structure with code.

Entry: hindsight
Date: Sat Mar 29 09:39:50 EDT 2008

Making BROOD more static is a way to bring the early exploratory phase into a more fixed structure. It seems the overall design is good enough to be stabilized; from that perspective it makes sense to cast it in stone. What did actually change over the last couple of weeks? Basically, symbols are disappearing. The only place where they are still left is the assembler, but that's easily changed. Symbols are replaced with identifiers, and are a (scheme) compile time object. The Forth compiler is now implemented using Scheme's exposed compiler API (the macro system). What this change did for the structure of the program is to point out places where reflection was used without justification. It's now using PLT Scheme's approach of 'unrolled reflection'. As a result, more things can be checked at compile time + name handling is completely handed off to scheme. This is the natural extension of 'lamb / brood 3'. The step through brood 4 was necessary to get familiar with the language layer approach. Giving up the NS hash table and run time evaluation hacks is the final step in trusting this layered module system: when names are identifiers, scheme can do the management. so to restate one of the goals of brood: to do Forth the PLT Scheme way: built on top of its hierarchical macro/module system. things i'm adding:
  - RPN: rpn syntax for scheme
  - SCAT: an embedded rpn language with scheme semantics
  - PAT: a typed concatenative pattern substitution language
  - MACRO: a language for expressing postponed forth semantics
  - FORTH: building forth syntax on top of scat
  - PURRR: macro language with forth syntax

MACRO has postponed semantics because the output of brood is assembly language. If there were a simulator, it would make sense to bring the static implementation down to that level. Instead, that part is still interpreted. This also allows easier integration with external assemblers. Central to brood is the structure of the MACRO language. It's a concatenative language that operates on 2 stacks: the SCAT data stack, which is used as the MACRO compilation state stack, and the 2nd stack, which contains the assembly code and is interpreted by the PAT language as a typed concatenative language: the one which implements partial evaluation in PURRR. PAT's matching can be translated to compile time if at some point it is decided to give up on the symbolic asm representation and instead use a (typed) abstract one. This probably needs some cleanup in the assembler first.

Entry: exit: jump to end
Date: Mon Mar 31 08:59:52 EDT 2008

now for 'exit'. what i want to support is words/macros like this:

  ... if ; then ...

the ';' is exit for ordinary words, but it's jump-to-end for macros. the 'end' is defined in the environment that executes a single macro. there's no way to do this except by wrapping each macro. ok. so, split it in 2 parts:
  * the meaning of ';': for macros it means 'exit-macro', for words it means 'exit'.
  * implement 'exit' and 'exit-macro'.
note that local macro exits always occur within some conditional construct which ALSO introduces splits. can this be used somehow? or is it better to optimize this away when eliminating empty words? aha. the label is for the code AFTER the exit, not the code it exits TO! they're different! seems to work. note: this allows dead code elimination: anything present in a word's code body AFTER exit or jw can be eliminated. this allows turning jw/nop -> jw, making code re-arrangement possible. dead words are then eliminated by simply not being reachable. the next problem that pops up is interference between 'exit' and how it is used in optimizations. basically, 'exit' needs to be a parameter. seems to work. one more problem: if the last word in a macro is ';', it doesn't need to split: that would create too many spurious words. maybe the ';' thing can be checked at (scheme) compile time? the problem is that this really needs semantics.. so, no.. maybe propagate source location information to at least give a proper error message? ok. wasn't so difficult: srcloc is now passed from compile time to be stored in the target word structure, or used in macro error reporting.

Entry: eliminating the meta language
Date: Mon Mar 31 11:11:48 EDT 2008

maybe it's possible to eliminate the scat: meta layer altogether? it would maybe make things simpler, but could lead to some circular refs. scat is really only necessary to implement 'm>' and '>m'; the rest can be implemented with the pattern language. hmm.. let's keep it in there for a bit.

Entry: local variables
Date: Mon Mar 31 13:54:41 EDT 2008

.. and then i'm done.

  : foo | a b c | c b a ;

the first step is to make an anonymous version of the pattern language. took some time, but got it. needed to clean up the pattern language transformer's intermediate state. the interesting thing now is that we basically get multiple occurrences for free. if 'locals' captures the 'state' argument of (lambda (stack) (expr stack)) this is pretty straightforward:
  * close the expression collected up to that point
  * apply it to the input state
  * collect locals from this state
  * bind locals to variables
  * bind wrapper macros
  * expand the rest of the code in this augmented lexical env.

EDIT: yes, but it took some detours ;) it's working in the most generic version, with as much as possible in the form of runtime support.

Entry: interpolation in ellipsis
Date: Tue Apr 1 08:51:51 EDT 2008

var ... -> (1 2 3 var #,(bla) 4 5) ... doesn't seem to work. trying to work around that with local syntax (let-syntax). this works, but it made me run into an interesting problem: the local environment of the transformer is lost when doing something like

  (let-syntax ((rep: (lambda (stx) ((rpn-represent) (stx-cdr stx))))) ...)

now, i tried to capture that environment, but apparently that doesn't work because the RHS is evaluated in a different phase. basically, the transformer in which the let-syntax expression occurs and the RHS are independent. i don't see a way around this, but it might be interesting to think more about it. i had a small bad feeling about parameters, and this is where it goes wrong..

Entry: time to start porting brood
Date: Tue Apr 1 12:09:15 EDT 2008

looks like all the machinery is in place, except for incremental compilation. time to drag library stuff over, then think about incremental stuff. the next problem might be 'load', which has to be replaced by a module-based interface. the thing to solve here is 'require' in forth. this probably requires the function that moves forms to the toplevel..
maybe 'begin-lifted' from (lib "etc.ss")? i got it working by manually collecting requires in the purrr-lang.ss wrapper, but that's not the best way..

  /usr/local/plt-3.99.0.12/collects/scat/purrr/purrr-syntax.ss:41:10: require: not at module level or top level in: (require "purrr-bla.f")

now.. how to collect code from different requires? looks like i need to get at the require before names are expanded. OK. fixed it by expanding straight to a #%plain-module-begin

Entry: collecting words / incremental compilation.
Date: Tue Apr 1 14:55:15 EDT 2008

suppose we're building a kernel. that kernel is represented by a single module. when instantiating that module, we get access to the exported words. these are linked to structures that might not be provided explicitly. all dependencies are handled, and the required target code can be computed by flattening the call graph given the entry points. this means the problem of getting a linked kernel with limited entry points is separate from building a library of macros accessible in other programs. maybe time to start working on incremental compilation?

Entry: purrr18 / redefine
Date: Wed Apr 2 09:15:33 EDT 2008

maybe leave the incremental compilation bit till later, and try to port the core purrr18 language first, then figure out how to modify the assembler. maybe the latter should really be kept separate so i can target external assemblers. ok.. the next problem is the use of a lot of undefined bindings in the previous pic18 spec. it needs a proper mechanism for target plugin behaviour. one way to solve it is to sweep it under the carpet and move it to the assembler, since that's symbolic. but this probably won't work for everything. i.e. if a DUP is defined in the kernel, it needs to be replaced in all the code that uses it. it's true late binding. there are 2 paths to take, the static one (units + explicit linking) and the dynamic one (just redefine the macro structs). since the core language is macros only, that's ok. there is one problem though: the core language's code should be independent of the target. if some target decides that a core macro needs to change, core shouldn't have to know that. that means it has to provide an exact specification of all words that can be redefined. this looks like too much of a hassle, so let's go for redefine using mutation of word structs + some mechanism to at least keep track of changes. what about this: allow macros to be redefined by checking for the availability of the identifier. if they are defined, bind the old functionality to a name SUPER, so one could do things like:

  (([drop] dup) ([movf 'INDF0 0 0]))
  ((dup)        (macro: SUPER))

this makes sure that words are at least defined, and it also makes the hierarchy between redefines clear (= same as the module hierarchy). what model is this? it's not late binding (at least one binding needs to be present). TODO:
  * fix 'insert' syntax to something simpler. OK
  * add redefines to the macro syntax

think more about why this is a bad idea. redefine can be a postprocess step by having it refer to itself first, then swapping the old and new implementations. why are assignments bad? no sharing is possible. that's where parameters are better in some cases: at least the extent of the side effect is limited. so should i just make each word a parameter? if the name exists, it doesn't get redefined, but the word that's returned by the wrapper is swapped with the one defined:

  (define (letrec ((macro/super )) (swap-word! macro/super))

can this be handled by define-ns? no..
it needs to be at the module language level: that's where the define-ns macro is inserted. nope.. it needs to be deeper than that. got it working with this:

  (define (define/swap!-ns-tx define stx)
    (syntax-case stx ()
      ((swap! ns name val)
       (let ((id (ns-prefixed #'ns #'name)))
         (if (identifier-binding id)
             ;; introduce 'super' as temporary self-ref
             (let ((super (ns-prefixed #'ns (datum->syntax #'val 'super))))
               #`(letrec ((#,super val))
                   ;; swap to undo self-ref
                   (swap! #,super #,id)))
             #`(#,define #,id val))))))

tested with:

  (define/swap!-ns word-swap! (macro) dup (macro: super super))

-> expands to

  (letrec-values (((macro/super) (macro: super super)))
    (word-swap! macro/super macro/dup))

  (macro/dup (make-state '() '((qw 123))))
  -> #(struct:state () ((qw 123) (qw 123) (qw 123)))

the module-level thing didn't work because the 'require' statements were not expanded yet, so the bindings were not there. now the test case with the forth lang doesn't work. FIXED: the problem was the introduction of 'super' from a syntax context != source.

Entry: parameters
Date: Wed Apr 2 20:39:46 EDT 2008

some words about 1) bottom-up VS. late binding, and 2) augmentation (permanent) or default + specialization (temporary redefine). (the interesting thing about writing brood/scat is that it brings up issues that seem quite important at the level of larger-scale program organization.) so, is this specialization of deep components just a dirty hack?

  + bottom-up design with static binding at least solves the 'undefined' errors: the lowest layer's linking is statically checked. having a core that's bottom-up makes it easier to develop and test: there are a lot of macros and transformers in there. the defaults for low-level components could be just for testing partial eval.

  + allowing vectored code (making everything a parameter) avoids having to explicitly _declare_ things as a parameter: some pluggable behaviour is necessary in the current approach.

are parameters necessary in the core language? they make sense for rpn: used in scat/macro/interactive. do the macros need a similar approach? is this all a consequence of the solution, or of the problem? it seems the most important question is: is it OK to let higher level modules modify parameters, instead of setting them with 'parameterize'? what about making re-definition explicit in the 'compositions' macro?

Entry: standardizing interface names
Date: Thu Apr 3 07:25:57 EDT 2008

The goal is to make the macros used to define rpn words handle redefinitions. The previous approach of mutation will be replaced by parameters. Before that, let's give appropriate names to the toplevel macros: 'compositions' and 'patterns'.

Entry: every macro a parameter?
Date: Thu Apr 3 07:39:22 EDT 2008

What about defining extensions as functions that install environment modifications and run a thunk? This allows keeping at least the most basic functionality intact, and allows specific extensions to be represented by an object. To come back to yesterday's remarks: i'm a bit uncomfortable with extension of low-level components without being able to undo those. Carefully building a bottom-up structure and then starting to poke around in its innards without an 'undo' mechanism seems wrong.. So what is the real reason for needing this poking? The macros employ target-specific optimizations. This should solve it:

  (define (make-macro . args)
    (make-parameter (apply make-word args)))

Hmm.. where to introduce it? The first naive replacement does violate something.. a lot of code assumes macros are functions. Maybe in 'make-word'?
Maybe using dynamic-wind is still the best approach. The problem is that my representation type isn't abstract enough: i really would like it to be a procedure mapping state->state. So.. is dynamic-wind thread-local? Maybe that's the big difference. So what about this: keep the word interface like it is, but provide a words-parameterize form: if it can't be solved with straight parameters because they are procedures, solve it on the other side of the interface. EDIT: some auto-upgrade is added so it is not NECESSARY to specify that words are parameters when the auto-upgrade words are available. however, it is possible to PREVENT words from becoming parameters by not exporting the auto-upgrade functions. this looks like it's flexible enough. now to adjust the compositions and patterns macros to use this. got 'with-compositions' working, but it needs a 'super' too. got 'super' working too. so now the mechanism for extending the compiler is in place: every word can be replaced in a dynamic context + a mechanism for at least limiting some replacement can be easily installed. EDIT: got pic18 compiling with redefined core words. now.. get rid of the parameters in target.ss and make those into re-definable words too. then it should be mostly done. todo:
  * remove target-postpone-* parameters in macro-lang.ss and replace them by parameterized words. OK
  * same for split / label? -> NO: special api

Entry: uneasy feeling
Date: Thu Apr 3 16:28:44 EDT 2008

something doesn't feel right with this parameterization. however, it does look simpler.. maybe some mode should be added to automatically extract which parameters are redefined? but then, 'super' might not do anything.. i need to think a bit about this. EDIT: so why is this parametrization necessary? it's a cross-cutting concern: OPTIMIZATION. it's used not (necessarily) to change functionality, but to change implementation. the thing which makes it a bit half-assed is the way in which the responsibilities of the core and the extension (pic18) are distributed. is there a sound abstraction hidden here? OK.. so what about automatically collecting all the extensions at compile time, but leaving them unspecified in the code? what about doing it the other way around: specifying ALL target-specific macros as an extension, and having them define the parameter if it doesn't exist yet?
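for reference, a self-contained toy of the words-as-parameters idea (hypothetical names; the real words-parameterize machinery is more involved):

  ;; a word keeps a stable procedure identity (state -> state), but its
  ;; behavior lives in a parameter, so it can be redefined within a
  ;; dynamic extent and automatically reverts afterwards.
  (define (make-pword fn)
    (let ((p (make-parameter fn)))
      (values (lambda (state) ((p) state))  ; the word itself
              p)))                          ; handle for redefinition

  (define-values (dup dup-behavior)
    (make-pword (lambda (s) (cons (car s) s))))

  ;; (dup '(1 2))  => (1 1 2)
  ;; (parameterize ((dup-behavior (lambda (s) s)))
  ;;   (dup '(1 2)))  => (1 2), and outside the extent dup is unchanged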
Entry: simplifying ns-tx
Date: Fri Apr 4 10:48:55 EDT 2008

a lot of syntax-case macros get simpler by using let-syntax: this allows ellipsis to be used instead of explicit list manipulations. the core 'ns' functionality is really to transform the names in a symbol-introducing macro like let. abstracting this now.

  box> (letrec-ns (macro) ((a 123) (b 345)) #f)
  scat/ns-tx.ss:94:19: compile: access from an uncertified context to unexported variable from module: "/home/tom/scat/scat/ns-tx.ss" in: make-let-ns-prefixer

after exporting the variable it worked..

Entry: what is an extension?
Date: Fri Apr 4 12:09:02 EDT 2008

so.. i'd like to keep the hierarchical module approach, which means that several language modules can be built on top of the same code, without the need for different module instances. this requires at least that the FUNCTIONALITY of the module's data structures (words) is not modified. in that respect the current approach with parameters is OK. so, an extension is a PARAMETERIZATION of the core compiler, such that it generates target-specific code using abstractions provided by the core. currently, each extension needs to know which words are to be extended, using the 'with-patterns' or 'with-compositions' macros. now, does it make sense to have 'with-xxx' automatically define names if they do not exist yet, with 'super' bound to an exception? or should i just forget about all this nonsense and go back to multiple instances of modules, with permanent mutation of word structures? let's read the doc about module instances first.

Entry: module instances
Date: Fri Apr 4 12:27:26 EDT 2008

looks like i'm stuck at a fundamental misunderstanding.. each time 'require' is called, the RHS of define expressions is evaluated. so ALL the expressions are instantiated! this happens AGAIN every time the code is required. however, this process is fast if the code is already COMPILED. how did i work around this with the ns hash before? it looked as if there was only a single instance.. maybe because i was working in a single module environment? in light of the re-definitions and parameters mentioned before, what i'm trying to do doesn't really make sense: suppose i want a PIC18 and a PIC30 forth. if i want both of them in the same module, parameters would be a good approach. but having them in separate modules would work just as well. they would represent different specialized instances of the core compiler, but would have no sharing of data. they could be explicitly put in different namespaces. in such an approach, modifying the word structures in-place is a perfectly legitimate approach. the parameter words weren't for nothing however: they can still be used for local parameterizations, like ";". OK. 'compositions' and 'patterns' will now re-define words, with 'super' bound to the previous implementation. i guess now it's time to see how to compute code instances? i tried this:

  tom@del:/tmp$ cat A.ss
  #lang scheme/base
  (require "B.ss" "C.ss")
  (printf "A\n")
  tom@del:/tmp$ cat B.ss
  #lang scheme/base
  (require "C.ss")
  (printf "B\n")
  tom@del:/tmp$ cat C.ss
  #lang scheme/base
  (printf "C\n")

  box> (require "A.ss")
  C
  B
  A
  box> (define NS (make-namespace))
  box> (parameterize ((current-namespace NS)) (namespace-require "A.ss"))
  C
  B
  A

so, what happens is: each toplevel require pulls in what is necessary, but instantiates the phase 0 expressions only once. the same require in the same namespace will not do anything, but in a different namespace it will instantiate again: values are not shared between namespaces, but compiled expressions are shared (the global module registry). The manual has this to say:

  In a top-level context, require instantiates modules (see Modules and
  Module-Level Variables). In a module context, require visits modules
  (see Module Phases). In both contexts, require introduces bindings
  into a namespace or a module (see Introducing Bindings).

So what's the difference between 'instantiate' and 'visit'? Visit just looks at the phase 0 names, and evaluates any phase 1 expressions, but doesn't evaluate phase 0 expressions. Got it. This sheds new light on collecting code. All modules should require the central code registry, which then defines a *code* variable. Got every module's code annotated by its name too: this gives all the code compiled + the possibility to take only what's necessary.
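the registry itself can be tiny; a toy version (hypothetical names):

  ;; central code registry: every compiled module pushes its
  ;; (name . code) pair into one shared module-level variable.
  (define *code* '())
  (define (register-code! name code)
    (set! *code* (cons (cons name code) *code*)))

  ;; (register-code! 'kernel '((dup) (drop)))
  ;; (assq 'kernel *code*) => (kernel (dup) (drop))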
Entry: next
Date: Fri Apr 4 14:36:09 EDT 2008

so what's next? get the monitor to compile with 'require' instead of 'load' + provision of symbols. this might introduce some trouble with undefined symbols. also, the assembler needs some fixing. that's a bit boring.. what's the most interesting thing to do next?

Entry: beautiful vs. interesting
Date: Fri Apr 4 17:16:35 EDT 2008

i was thinking about beautiful vs. interesting today. if beautiful means simple on a superficial level, then interesting means simple on a deeper, hidden level. in programming, things are usually interesting before they become beautiful. boring then means non-compressed complexity: that you don't understand the problem.. it's either simple or interesting. boring can always be made interesting on a meta-level, right?

Entry: poke
Date: Fri Apr 4 17:22:15 EDT 2008

what about starting poke, as a temporary relief from the pic18 guts? poke should fit nicely on top of macro. it might also be a way to improve on what 'assembler' means. let's port the C code generator first. the cgen has a special-purpose transformer. let's map that to macros, starting from the bottom. hmm.. doesn't really work that well since this prints to strings, not s-expressions. OK.. got indentation working as parameters. now.. go from s-expressions -> syntax objects so 'syntax-case' can be used. before that, it first needs to be defined what an expression transformer is: stx -> string. ok. got it: syntax-case instead of match, but not using the scheme expander.

Entry: compiler's NEXT
Date: Sun Apr 6 11:27:33 EDT 2008

this should be (source, exp) -> (source+, exp+) instead of returning exp only when done: that would eliminate the hoop-jumping in forth-tx.ss. ha: the continuation thunk used doesn't work well with the current continuation-passing approach. -> fixed it by returning a thunk instead of just the syntax: some extra information was needed (a ':' means 2 things: 1. end the previous def, 2. start a new one). makes me wonder if i can fix the splitting in target-rep.ss in a similar way.

Entry: the assembler
Date: Sun Apr 6 14:43:02 EDT 2008

keep the road open to have assembler opcodes as syntax, to carry over source code information. so what does the assembler do?
  * provides chunks of (fully linked) target code representation, given a namespace of primitive assemblers.
  * resolves target code/data addresses (using a multi-pass algo)

what does the assembler not do?
  * symbol resolution: all symbols should be resolved by the compiler, so there is no need to do any namespace management.
  * code order is determined by the compiler: the assembler gets a list of word structures.

should it be called 'assemble!', and be seen as a graph update function? not boring at all: there's a problem that needs to be solved, and quite a deep one: what about symbols? well actually, they might evaluate straight to word structures, which are accessible! Ha! evaluation of expressions in the assembler depends on 2 different contexts: whether they are part of call instructions, or part of literal loads. this seems to solve a big problem, but i can't quite express it yet... TODO:
  * evaluation of symbolic code
  * think about side effects OK/not ?
  * where to store assembler opcodes?
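back to the poke entry above for a second: a minimal sketch of such an stx -> string expression transformer, using syntax-case for matching without invoking the scheme expander (toy code, not the poke cgen itself):

  (define (expr->c stx)
    (syntax-case stx (+ *)
      ((+ a b) (format "(~a + ~a)" (expr->c #'a) (expr->c #'b)))
      ((* a b) (format "(~a * ~a)" (expr->c #'a) (expr->c #'b)))
      (x       (format "~a" (syntax->datum #'x)))))

  ;; (expr->c #'(+ 1 (* x 2))) => "(1 + (x * 2))"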
Entry: quoting and meta-code
Date: Mon Apr 7 07:36:13 EDT 2008

the problem: quoted labels. somewhere down the line, quoted words lose their quoted tag, such that macro evaluation doesn't give what's needed.

  ' abc 1 + jump

what about getting rid of the symbolic part, and starting with clean semantics?

  (([qw a] [qw b] +) ([qw (macro: ',a ',b +)]))

this doesn't work of course, because it recurses.. so what would work? should it be scat code?

  (([qw a] [qw b] +) ([qw (scat: ',a ',b +)]))

what is the semantics of a quoted scat word? can it be just a thunk?

  (([qw a] [qw b] +) ([qw (lambda () (+ a b))]))

thunks are the most flexible, but would it be good to limit the semantics somehow to scat words or macros? let's restate the goals:
  * obtain a value at assembly time.
  * allow easy composition of meta-code at compile time
  * allow meta code inspection
  * simplify definition of meta-ops (snarfing)

  (([qw a] [qw b] +) ([qw (scat: ',a ',b +)]))

maybe i need to give up on inspection, and solve that later. concentrate on semantics first. [qw v] : what is v? it's any VALUE that makes sense at compile time, to be passed around between macros, but eventually it should end up as a number. what is composition of meta code? in the previous approach, this was done syntactically: just concatenate lists. is this still a good approach? isn't a more general abstract approach better?

  (([qw a] [qw b] +) ([qw (meta: a b +)]))

now, what does 'meta:' do?
  * produces a single (delayed) numerical value
  * the '+' comes from the scat namespace
  * the 'a' and 'b' are lexical parameters.

ok. got it in macro/meta.ss : a simple layer on top of scat code which appropriately quotes lexical variables, and wraps results in a meta structure to chain evaluation. seems to work with the pic18 stuff too. now: meta annotation. something that might come in handy is to figure out where assembler literals come from. in the old brood, code was just symbolic. here it needs to be annotated explicitly, because that information is lost. problem: meta code has to be thunks: the value can depend on numerical addresses of words, which might change during the relaxation phase of the assembler.

Entry: tick
Date: Mon Apr 7 10:57:55 EDT 2008

so, with meta quoting out of the way, the real problem can come back now: computing with word labels. this probably boils down to giving TICK the proper semantics.

  ' foo

this will produce a literal with a quoted macro. all symbols in macro/forth code need to be macros, and quoting symbols needs a different tick. what does it mean to quote a name? it produces a literal value that supports an 'unquote' operation. in addition: it MIGHT support POINTER MANIP if it is a macro that wraps a call to a word. so: the previous approach of treating symbol names as word addresses IMPLICITLY dequotes them to yield a numeric address value. anonymous macros might be convenient. anonymous words also. what's the difference? let's see:

  ' foo compile == foo

this is an important issue, and needs some more thought. the difference between "execute" and "compile" should be cleared up also. looks like i really need to be careful with AUTOMATIC changes between macros and words. NOTE: macros can't survive to the assembly phase, so everything that used to be a symbol now needs to be a target-word struct. does this solve it?

  ;; Get the address from the macro that wraps a postponed
  ;; word. Perform the macro->data part immediately (as a type check)
  ;; and postpone the address evaluation.
  (([qw macro] address)
   ([qw (let ((word (macro->data macro 'cw)))
          (meta-delay
           (let ((pointer (target-word-address word)))
             (unless pointer
               (error 'unresolved-word "~s" (target-word-name word)))
             pointer)))]))

looks like it: 'run' and 'address' are now separate. 'run' doesn't need to know if the quoted macro represents a target word. 'address' does need to know that, and fails if it is not.
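a toy model of the meta / meta-delay idea (hypothetical names): a meta value is just a tagged thunk, forced at assembly time when addresses are finally known, and composable before that.

  (define (meta thunk) (list 'meta thunk))
  (define (meta? v) (and (pair? v) (eq? (car v) 'meta)))
  (define (meta-force v) (if (meta? v) ((cadr v)) v))  ; numbers pass through
  (define (meta+ a b)
    (meta (lambda () (+ (meta-force a) (meta-force b)))))

  ;; a literal and a not-yet-resolved address compose without evaluation:
  ;; (meta-force (meta+ 1 (meta (lambda () #x0100)))) => 257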
Entry: optional library code
Date: Mon Apr 7 12:19:22 EDT 2008

Then, the more general problem of requiring runtime words only if necessary. For example:

  (([qw macro] run) macro)
  ((run) ([cw 'runtime-run]))

Since symbols are no longer allowed, this form of late binding needs to be handled differently. The most straightforward solution is to have the default throw an exception, and rely on targets to implement the word.

Entry: bugs
Date: Mon Apr 7 13:03:29 EDT 2008

something went wrong with forth / macro mode: check test/purrr-broem.f -> fixed: current-mode was evaluated at the wrong time.

Entry: assembler
Date: Mon Apr 7 20:08:01 EDT 2008

using a prompt tag to abort from meta-force: this somehow needs to give the smallest or the largest instruction, depending on whether assembly uses a growing or shrinking relaxation.

Entry: more assembler
Date: Tue Apr 8 09:12:35 EDT 2008

About the relaxation algorithm: as far as i understand, it is necessary that individual instruction sizes move only in one direction (grow/shrink) to prevent oscillations in the relaxation phase. 2 questions:
  * is this correct?
  * how to ensure it?

ok.. i'm going to assume it's necessary to limit size changes. how to implement this? for this to work, individual instructions need to be tagged somehow. let's put the responsibility at the machine assembler end, and provide only a mechanism to record the previous result. on the other hand, we could pad with nops. OK. got it.

Entry: relative addressing
Date: Tue Apr 8 13:41:28 EDT 2008

so.. the relative addressing is a bit of a hack. is it possible to move address resolution down to the assembler opcodes? sure. just have them depend on 'pointer-get'. let's port the pic18 assembler, and see if the generation can be improved a bit. porting asmgen and trying to get relative addressing, which already has overflow detection, to use absolute input. Maybe meta-catch-undefined can be eliminated by setting undefined words to 'here', so they compile to a small relative jump instruction. Maybe just leave that out: it's an optimization, not essential. OK. pic18 seems to work too.

Entry: .f -> .bin
Date: Tue Apr 8 17:11:08 EDT 2008

Time to define the purrr18 language, which assembles straight to binary. Maybe target-word structures should have a 'bin' slot? ok.. now for assembling on the spot. or is that not a good idea?

  (assemble! (apply append (map cdr *code*)))

maybe it's time to start doing the "load in namespace" thing?

Entry: workspace
Date: Wed Apr 9 13:37:21 EDT 2008

so.. all the static stuff seems to be in place, now for the dynamic workspace. there are still some issues with multiple instances. so, why would one want to use a namespace? to use 'require', 'eval' and 'compile' in a controlled fashion.

Entry: outstanding issues
Date: Wed Apr 9 13:44:48 EDT 2008

  - dead code elimination OK
  - opti-save / pseudo OK
  - variable allocation OK: words got realms now
  - splitting: OK
  - jump chaining
  - org
  - code serialization

Entry: assembler bugs
Date: Wed Apr 9 13:53:52 EDT 2008

something wrong with error handling on eval..

  (allot data 123)

-> the 'data part is not allowed: only numbers and meta promises that evaluate to numbers

Entry: multiple compiler passes
Date: Wed Apr 9 14:45:03 EDT 2008

something i forgot about: there are the 'pseudo' and 'opti-save' passes that go over the code after the first pass. it might be a good idea to formalize this a bit. the real question: why not postpone all optimisations till later, and have the core language be as simple as possible?
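a toy version of the one-directional relaxation from the 'more assembler' entry above (hypothetical representation: an instruction is a recorded-size box paired with a size-estimating thunk):

  ;; iterate until sizes stabilize. since a recorded size never shrinks
  ;; and encodings have a largest size, the sizes are monotonically
  ;; non-decreasing and bounded, so the loop cannot oscillate.
  (define (relax! instrs)
    (let loop ()
      (let ((changed #f))
        (for-each
         (lambda (i)
           (let* ((b (car i))                        ; box: last size
                  (new (max (unbox b) ((cdr i)))))   ; never shrink
             (unless (= new (unbox b))
               (set-box! b new)
               (set! changed #t))))
         instrs)
        (when changed (loop)))))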
Entry: words vs macros
Date: Wed Apr 9 16:21:56 EDT 2008

now that i've got the target word data structure in my hands, it's easy to see that these are completely separate from the macros that generated them. they can be serialized with evaluated meta expressions, possibly augmented with some annotations as to where the computations came from. to get at macros, simply load all the source code, but discard the *code* variable. so the next question: how to serialize a graph of structs? and while we're at it: what about graphs and functional programming?

Entry: conditional jumps -> more static assembly rep
Date: Wed Apr 9 18:13:49 EDT 2008

something to think about is whether it is possible to find a common primitive for for .. next and other loops. ha.. something else: assembler constants need to be bound: no more symbolic magic. maybe it's also a good time to require assembler opcodes to be bound names + perform arity checks? also added an opcode check: it's probably best to just replace that with module name bindings in the (asm) namespace though, so all checks can be automated. however, that does require either moving the assembler to compile time, or using namespaces + eval. a big hurdle is the implementation of the pattern matcher: it uses symbols. since the assembler namespace is flat, not so big, and quite constant, it doesn't really need to be managed.. let's keep it like it is, but add an arity check. it would be nice to have some things available at compile time though, like arity checks. maybe combine both the symbolic matching AND some static name binding?

Entry: static vs. dynamic
Date: Wed Apr 9 19:57:49 EDT 2008

i don't know whether this is mostly bias, but it seems that using a bottom-up approach instead of an ad-hoc, late-bound approach makes things easier to understand. i.e.: i didn't realize that the "delay evaluation until assembly time" is really ONLY about values of target code and data addresses: everything else can be evaluated earlier. previously this was handled in a sort of ad-hoc way with evaluation of macros and assembler labels. so, what about the assembler? should it be static (which needs some magic in the pattern transformer) or stick to the symbolic approach? maybe some middle road: names handled statically, and the rest done with structure instances.

Entry: matcher
Date: Thu Apr 10 01:14:34 EDT 2008

it's probably best to:
  * represent assembler instructions with structs + explicit type info
  * write a special-purpose matcher that takes bitfield widths into account.

the idea is this: the asmgen goes from the textual rep -> symbolic rep -> procedure rep. what about stopping that compilation somewhere and leaving a little bit of interpretation of the proto?

  (define (proto->assembler . proto)
    (match proto
      ((name formals . operands)
       #`(make-asm
          (lambda args
            (parameterize ((asm-error (proto->asm-error-handler '#,proto args)))
              (apply (lambda #,formals (list #,@(map assembler-body operands)))
                     args)))
          '#,proto))))

a proto looks like this:

  (movlw (k) ((14 . 8) (k . 8)))

moved the asm-error parameterization to assembler.ss resolve/assemble. ok. arity check works.
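to make the proto format concrete: a toy encoder that folds such a bitfield list into an opcode word (hypothetical helper, not the staapl assembler):

  ;; fields: list of (value-or-name . width), msb first; names are
  ;; looked up in an operand environment.
  (define (encode-fields fields env)
    (let loop ((fs fields) (word 0))
      (if (null? fs)
          word
          (let* ((f (car fs))
                 (w (cdr f))
                 (v (if (symbol? (car f))
                        (cdr (assq (car f) env))
                        (car f))))
            (loop (cdr fs)
                  (+ (arithmetic-shift word w)                ; make room
                     (bitwise-and v (- (arithmetic-shift 1 w) 1))))))))

  ;; (encode-fields '((14 . 8) (k . 8)) '((k . #x55))) => #x0E55,
  ;; i.e. the pic18 encoding of movlw 0x55.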
rest) -> rest however, in the alternative syntax this becomes: #`(list-rest #,@'() rest) -> (list-rest rest) which isn't the same: box> (match '(a b c) ((list-rest bla) bla)) match: no matching clause for (a b c) tricky business ok, so now the plt matcher is in place, and it should be possible to start matching struct instances instead of symbols. on the other hand, it's not so essential: got plenty of checking implemented now.. maybe move on to real work? Entry: mature optimizer Date: Thu Apr 10 13:28:53 EDT 2008 in order to keep the optimizer tractable, it has to be factored a bit. lets see what we got now: (1) compilation + optimization of non-jump instructions (2) jump optimizations on intermediate code (3) elimination of intermediate reps + save opti the last step is target specific atm. Entry: meta eval annotation Date: Thu Apr 10 13:32:08 EDT 2008 annotation is easily made by replacing undefined addresses with symbolic references.. ok, not easy, but at least straightforward: target values now have 2 thunks: one that produces real values, and one that gives expressions referring the target word names. maybe.. it's better to just use an s-expression language here? targeting external assemblers' expression languages will probably be easier using a nested format, instead of a linear one.. for the built-in assembler, there's no need, since the evaluation mechanism is abstract. Entry: conditional jumps Date: Thu Apr 10 17:22:33 EDT 2008 these are special.. but how exactly? one of the things i'd like to try is to isolate loop bodies so they can be optimized. the previous 'amb' based approach (for .. next) is a bit of a dirty hack, and doesn't work very well with non-flat code as before.. the for..next opti checks to see which of these produces the smallest loop body. for .. next dup for drop .. save next drop so, what is the pattern here? * execute a couple of simultaneous paths * choose the best one this probably needs a purely functional split loop, so continuations can be used without trouble. let's try that. what does 'split' do? it calls 'next' and then continues. so it needs a true continuation. remarks: * split doesn't need call/cc: it just produces a value, and that value isn't all that interesting. * in the loop body, no assignments can be made: when a split occurs, just cons the word and code list together, and perform mutation AFTER everything is done. OK, i think i got it written down.. not sure whether it will work though: afraid that the continuations in the macro evaluation will somehow interfere with the update loop.. it should really be seen as 2 tasks communicating.. anyhow. more later. i can't get this to work.. probably i'm discarding things i'm collecting when calling the continuation. maybe composable continuations work here? but i don't really understand them yet.. it's like stuff pushed to the return stack.. so why can't this be solved in a monadic way? actually, this is the same problem as the one i'm trying to solve with passing data alongside the normal stack: because there's no room besides 'data' i can't just tuck away more stuff.. looks like this is getting me closer to how to implement the core mechanism for monadic threading... probably going to learn a thing or two here. let's concentrate. - a macro is a map (stack,asm) -> (stack,asm) - i'd like to extend this to a map (words,stack,asm) -> (words,stack,asm) which has a single mixing operator 'split', and all the other operations are lifted. how to do this? 
http://community.schemewiki.org/?composable-continuations-tutorial

from the plt manual:

  (reset val) => val
  (reset E[(shift k expr)]) =>
    (reset ((lambda (k) expr)
            (lambda (v) (reset E[v]))))   ; where E has no reset

similar:

  (prompt val) => val
  (prompt E[(control k expr)]) =>
    (prompt ((lambda (k) expr)
             (lambda (v) E[v])))          ; where E has no prompt

applied to the monadic bladibla: suppose 'D' performs some mixing
with other state y, but all the small caps operate only on x.

  (lambda (x y) (a (b (c (D (e (f x)))))))
  =
  (lambda (x y)
    (let-values (((y+ x+)
                  (reset
                   (a (b (c (shift abc
                              (let ((x+ (e (f x))))
                                (values (merge2 x+ y)
                                        (abc (merge x+ y)))))))))))
      ...))

this way, the extra data 'y+' can be passed sideways, not going
through the 'abc' chain. so.. as long as an expression is wrapped in
a reset, 'shift' can get in between. how to use this for implementing
lifting? one prompt tag per lifted thing?

Entry: composable continuations
Date: Thu Apr 10 21:33:38 EDT 2008

some simpler example is needed. suppose we're doing one prompt tag
per threader.

  (define (one stack) (cons 1 stack))
  (define (drop stack) (cdr stack))
  (define (word: . fns) (apply compose (reverse fns)))

  (define (broem stack extra)
    (define (mix s)
      (shift post
        (values (+ extra 1)
                (post (cons extra s)))))
    (reset ((word: one mix one one) stack)))

getting tired.. what i want to do is basically: create a mechanism to
chop the code up in chunks, in a way that is compatible with
continuations for non-deterministic programming, so optimization can
be implemented using 'amb'.

what i sort of see is that it is possible to use shift and control to
chop up a program into different parts, and recompose them. in the
light of writing an RPN program (a b c) as

  (lambda (state) (c (b (a state))))

this makes sense: shift can capture what happens AFTER a certain
point, up to where the result is needed.

again.. i think i get what reset/shift do, but can't make the
connection to sidestepping threading. maybe i should try to translate
it to RPN first?

Entry: hiding more stuff in 'data'
Date: Thu Apr 10 23:06:10 EDT 2008

1. there is no difference in trying to extend stack -> (stack,data)
   -> more, so it should work with just a number.

2. in a loop which has extra internal state, compute a composition of
   functions where one of the functions is special, in that it can
   refer to the enclosing state.

can't this be hidden in 'make-state?': let that function perform all
the necessary combinations of state. what i wonder is how to relate
this to real monads: the operation of "flattening" 2 monadic layers?

  bind : (M t) -> (t -> M u) -> (M u)
  map  : (t -> u) -> (M t -> M u)
  join : M (M t) -> M t

map seems really trivial, but join?

Entry: lifting with shift
Date: Thu Apr 10 23:19:44 EDT 2008

  (a (b (C (d (e x)))))

  ab : x -> x
  de : x -> x
  C  : x.y -> x.y

now make abCde : x.y -> x.y

let's try again:

  add1 : x -> x
  swap : (x . x) -> (x . x)

  (define (swap x) (cons (cdr x) (car x)))

  (define (lift fn)
    (shift post                    ;; capture postproc
      (lambda (xy)                 ;; create new function
        (let ((xy+ (fn xy)))       ;; apply fn to its input
          (cons (post (car xy+))   ;; apply post to one of the components
                (cdr xy+))))))     ;; .. and join again

  * capture the stuff that postprocesses the x component
  * apply the function, post-process one component, and join again

Entry: struct matcher
Date: Fri Apr 11 10:17:44 EDT 2008

in order to make the monad thing work, i'm going to use structure
types only, and write a special purpose matcher that handles nested
structure types with a simpler syntax.
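something like this is what i'm after, sketched with plt's match on
hypothetical instruction structs (the real matcher will add bitfield
checks and a simpler syntax):

  (define-struct qw (value))    ;; literal push
  (define-struct cw (target))   ;; procedure call

  ;; code list, most recent instruction first:
  ;; fold two literals + a 'plus call into one literal.
  (define (peephole code)
    (match code
      ((list-rest (struct cw ('plus)) (struct qw (b)) (struct qw (a)) rest)
       (cons (make-qw (+ a b)) rest))
      (_ code)))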
Entry: broken functional compiler
Date: Fri Apr 11 15:03:07 EDT 2008

paste it here, so i can get the imperative version back online:

  ;; To the macro layer, code and labels are distinct entities
  ;; represented by abstract target-word data type and reversed assembly
  ;; code lists respectively. After compilation, the code lists are
  ;; permanently attached to the word structs. During compilation, no
  ;; side effects are made, so continuations can be used for
  ;; optimizations.

  (define (compile-word input-word)

    ;; Label generation is stateful, but that's ok since we don't care
    ;; much about the counter values. They are just for readability.
    (define next (make-counter 0))
    (define (label (name (format "_L~a" (next))))
      (make-simple-target-word (string->symbol (format "~a" name))))

    ;; Get the macro code, and create a start thunk and set up the
    ;; grabber parameter.
    (define macro (target-word-code input-word))
    (define name  (target-word-name input-word))
    (define grab-words (make-parameter #f))
    (define (go) ((grab-words) (macro->code macro name)))

    ;; Split needs to be purely functional so continuations can be used
    ;; freely when compiling code, discarding the split word state if
    ;; necessary.
    (define word/code
      (let ((tag (make-continuation-prompt-tag 'compile-word)))
        (set-target-word-code! input-word #f)
        (parameterize-words-ns! (macro)
          ((semicolon (ns (scat) postpone-exit)))
          (parameterize ((target-make-label label)
                         (target-split #f))
            ;; Ensure side-effects are local.
            ;; State updates directed by calls to 'split'.
            (let update ((words '())              ;; listof (word . code)
                         (current-word input-word)
                         (continue go))           ;; continuation thunk
              ;; Split needs to be purely functional so continuations
              ;; can be used freely when compiling code.
              (target-split
               (lambda (state new-word)
                 (let ((code  (state-data state))
                       (stack (state-stack state)))
                   (shift-at tag chunk
                     (update (cons (list current-word chunk) ;; no assignments!
                                   words)
                             new-word
                             ;; ('k' is not bound anywhere here -- one
                             ;; reason this version is broken.)
                             (lambda () (k (make-state stack '())))))
                   tag)))
              ;; After 'macro->code' we end up here to record the last bit
              ;; of code, collect everything and exit from 'go' and thus
              ;; the 'update' loop.
              (grab-words
               (lambda (final-code)
                 (cons (list current-word final-code) words)))
              ;; Continue computation
              (reset-at tag (continue)))))))

    ;; Link up structures, and return a list of words.
    (map* (lambda (word code)
            (set-target-word-code! word code)
            word)
          word/code))

Entry: next
Date: Sat Apr 12 00:30:14 EDT 2008

got a bit confused by the control operators yesterday. might look at
this link, and some more about cursors..

http://blog.plt-scheme.org/2007/07/callcc-and-self-modifying-code.html

Entry: monads
Date: Sat Apr 12 13:03:46 EDT 2008

so.. from the point of view of 'map' and 'join', which i think are
easier to understand:

  map:  take a function f : u -> t, and turn it into Mu -> Mt
  join: take MMt to Mt: undo a 'double wrap'

the key insight is that how many times f is used, and in what order,
is not specified. and for join, it doesn't matter what the wrapping
does, as long as it can be flattened: wrapping can contain multiple
base type instances, in whatever structure.

maybe 'bind' isn't that hard to understand after all: it takes a
monad Mt and a function that produces a monad Mu from a value t,
unwraps Mt, applies t->Mu as many ways as necessary, and combines all
the Mu into a single Mu. in the map/join version, the ORDER of
wrapping is very important.
  ((map f) m) == (bind m (lambda (x) (return (f x))))
  (join m)    == (bind m (lambda (x) x))
  (bind m f)  == (join ((map f) m))

in order to understand this better, i'm trying to implement it
(without looking at other implementations.) see monad.ss

i'd like to make 'map' and 'join' polymorphic, but that's not quite
possible because of the absence of typing information. functions
could be annotated however (do contracts help here?)

(something in the back of my head: in haskell, one can dispatch on
the return type of a function. i'm not sure if that's going to be a
problem here.. EDIT: it's about the unit operation.)

Trying to implement a monad that carries around just an extra scheme
value. This is the simplest thing i can think of.

  (define-struct extra-monad (value extra))

  (define (extra-map t->u)
    (struct-match-lambda
     ((extra-monad value extra)
      (make-extra-monad (t->u value) extra))))

  (define extra-join
    (struct-match-lambda
     ((extra-monad (extra-monad value extra-inner) extra-outer)
      (make-extra-monad value ???))))

The problem seems to be in the join operator. Map is simple: just
pass it on. But what does the combination do? An option is to simply
pick one of the 2.

http://groups.google.com/group/comp.lang.functional/msg/2fde5545c6657c81

  "You can also turn programs in continuation passing style into
  monadic form. In fact, it's a significant result (due to Andrezj
  Filinski) that all imperative programs using call/cc and state can
  be translated into a monadic style, and vice versa. So exceptions,
  nondeterminism, coroutines, and so on all have a monadic
  expression."

maybe time to formulate my question: since call/cc seems to be more
'native' to scheme, why don't i use that instead of monads?

ok.. am i allowed to try again with reset/shift ??

Entry: reset / shift
Date: Sat Apr 12 14:00:25 EDT 2008

trying to do this:

  (abcZdef) -> (ABCZDEF)

need to do this dynamically, without changing the small caps.. wrt
state, the diagram should illustrate it:

  -a-b-c-Z-d-e-f-
  -------+-------

so i try to use 'shift' to collect the remaining computation, and
turn it into a lifted function. what i want is this:

  (lambda (_) (f (e (d (Z (c (b (a _))))))))
  ->
  (lambda (_)
    (cons (cons (lambda (x) (f (e (d x)))) z)
          (c (b (a _)))))

the problem is really termination (the 'null' of the list if you
want). ok.. i get something, but not what i expect.. time to go to a
simpler version.

hmm.. very confusing stuff: i understand what happens if there's one
shift, but every next one gives results i don't understand. probably
best to try to write out some examples using the reduction rules.

Entry: static
Date: Sat Apr 12 15:23:48 EDT 2008

if i can't get it to work dynamically, why not provide the
information statically? the only thing that matters is the binding
for the function 'split' in the compiler loop. isn't there a way to
make the forth macros accept this word? the problem is: there are
words defined on top of split, so i'd have to make all those
dependencies static too..

Entry: state.ss / 2stack
Date: Sat Apr 12 17:13:04 EDT 2008

main problem: the 'data' part in state is the thing that's passed
around by all control words. this cannot be the 2nd stack: data needs
to be a stack of 'wrapped things'.. i'm not sure what that means yet,
but it's 'stuff' that gets threaded through computations. BROOD 6 is
probably going to be about doing this with composable continuations..

i'm going to try to shield access to this data atom. i think i still
don't understand why scat-control.ss needs to have this atom clearly
visible.
trying to define these stack update functions, i find a need to make
the WHOLE state representation explicit again. ok.. it's the right
path, but i'm using the wrong abstraction. i need a mechanism to just
stick something on state to be retrieved later. in other words, the
layers of wrapping need to be made abstract. i.e. in:

  a b c D e f g H i j

if D and H interact with the threaded state in some way, they need to
be able to do that without the lower case functions knowing about the
existence of these things. different types of things need to work
independently. or not? i.e. asm access only makes sense if the type
is actually extended with such information. i'm missing some crucial
insight here..

Entry: automatic lifting
Date: Sat Apr 12 18:05:29 EDT 2008

i'm looking at this the wrong way.. this all makes so much more sense
taking the stance of "automatically lifting" a procedure whenever it
is applied to a certain input type. that's really the only thing
necessary.. so what about turning this around and seeing 'state' as
an object with a method 'apply me'?

imagine a conversation between STATE and FUNC.

  STATE: dude, i want to apply you. what's your game?
  FUNC:  i take A to A
  STATE: hey, i got some A here, i'm going to use you and move on.

so, 'state' should really be a function: STATE : FN -> STATE

so.. it's the responsibility of the state to interpret the functions
applied to it, and the responsibility of functions to identify
themselves.

  (define (make-stack data)
    (lambda (fn)
      (if (stack-proc? fn)
          (make-stack (fn data))
          (error 'type-error))))

this changes the representation from

  (lambda (x) (a (b (c x))))

to

  (lambda (x) (((x c) b) a))

which really looks like RPN code :) the 'dumb' state would be

  (define (dumb data)
    (lambda (fn)
      (dumb (fn data))))

so.. is this an interpreter? looks like it.. note that in order to
optimize things, some could be unrolled:

  (lambda (x) ((x c) (lambda (stack) (a (b stack)))))

this is basically an implementation of the 'map' function: the
function implementing the state object is the monad wrapper M which
contains a type t, and it maps an incoming t->u to Mt->Mu.

summarized:

  * all functions are typed, and do not need to be aware of state.
  * state is completely abstract

maybe this can do all kinds of lifting automatically? i.e. scheme
functions -> stack functions etc.. how hard is it to change this?
where do control words fit in, since state is no longer passed
automatically? maybe control words are just another type that takes a
continuation argument?

  (define (stack lst)
    (lambda (fn)
      (cond ((stack/control? fn)
             (stack (call/cc (lambda (k) (fn k lst)))))
            ((stack/data? fn)
             (stack (fn lst))))))

well.. it's an interpreter for sure. can these conditionals be
eliminated? well yes, if at compile time the type can be determined..
so is that possible? can functions be typed statically? there's one
problem though: composition. what type does this have? state ->
state, where state is a fn. so there's a difference between
'primitives' and 'composites'.

  (lambda (x) (((x a) b) c))

EDIT: something's chicken and egg here tho: primitive types and
extensible types. looks like i slammed into the "expression problem",
since i want to extend both the type and the methods.

Entry: questions
Date: Tue Apr 15 19:37:42 EDT 2008

* extensible types: is the inverted approach of the previous post a
  good idea? (sketch below)
* reset/shift: there has to be a way to 'split' functions at points
  where other data is injected.
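a quick sketch of that inverted approach, with functions identifying
themselves through an explicit tag (all names made up):

  (define-struct typed-fn (type proc))

  (define (state data)
    (lambda (tfn)
      (case (typed-fn-type tfn)
        ((data)    (state ((typed-fn-proc tfn) data)))
        ((control) (state (call/cc (lambda (k) ((typed-fn-proc tfn) k data)))))
        (else      (error 'type-error)))))

still an interpreter: the dispatch only goes away if the types can be
determined at compile time, which is exactly the expression problem
again.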
Entry: shift/reset breakpoint draft
Date: Tue Apr 15 21:10:42 EDT 2008

  (define tag (make-continuation-prompt-tag 'tag))

  (define (make-split [more #t])
    (lambda (inner)
      (shift-at tag rest
        (values (and more rest) inner))))

  (define x add1)
  (define y (make-split))
  (define stop (make-split #f))

  (define (make-composition . fns)
    (apply compose (reverse fns)))

  (define (test fn input)
    (let next ((thunk (lambda () (reset-at tag (fn input)))))
      (let-values (((k v) (thunk)))
        (printf "v = ~s\n" v)
        (if k
            (next (lambda () (k v)))
            v))))

  box> (test (make-composition x x y x x x y x x x x stop) 0)
  v = 2
  v = 5
  v = 9
  9

EDIT: i get it.. nested shifts will always return the deepest
shift-free expression.

Entry: multiple compilation paths + memoization
Date: Wed Apr 16 09:31:56 EDT 2008

the reset/control is about implementing the forth compile loop
without side-effects; currently it uses a stack (push!). once that is
done, there should be a way to use extensions to compile some
sequences multiple times, and pick the best one. one of those is
for/next. however, with nested loops, care should be taken not to
make the algorithm quadratic. i'm not sure whether memoization is
necessary: explicitly using 2-path execution might be more
interesting. in the 'test' loop before, this amounts to running one
compilation multiple times, one with code wrapped around the loop,
and picking the best one.

Entry: breakpoints
Date: Wed Apr 16 09:54:05 EDT 2008

the reset/shift approach has the semantics of breakpoints. let's just
call it that, and make the abstraction complete. the players:

  * (make-breakpoint tag mix [more #t])
  * (with-breakpoint tag fn state0 value0) -> state,value
  * (mix state value) -> state,value

this seems to work well. my only worry is composition: what happens
if there is more than one tag involved? the way to look at this might
be from the outside: a tagged shift only makes sense if it's captured
by a tagged reset, so combinations of tags would be properly
dynamically nested. in that case, i see no problem.

Entry: compiler with breakpoints
Date: Wed Apr 16 14:19:48 EDT 2008

looks like it's working. now: is this really necessary? it would be
nice to understand if it can be done using parameters and
side-effects. the true test here is of course to try something with
continuations, see where it goes. maybe have a go at for .. next?

Entry: postpone-exit
Date: Fri Apr 18 11:05:58 EDT 2008

hmm.. something wrong with -broem -bla tests, they seem to
hang. problem with mexit. they were calling each other:

  (compositions (macro) macro-prim: (exit postpone-exit))
  (define-ns (scat) postpone-exit (ns (macro) exit))

renamed to 'compile-exit': goes better with 'compile'.

Entry: cps forth
Date: Fri Apr 18 11:31:43 EDT 2008

is there any meat in cps forth? or is this just a way of
interpreting? probably.. cps replaces "CALL" and "RETURN" with "GOTO
with parameters". it does need first class functions though.

Entry: parsing C
Date: Fri Apr 18 12:50:15 EDT 2008

http://eli.thegreenplace.net/2007/11/24/the-context-sensitivity-of-cs-grammar/

of things to do.. i need to have a look at piumarta's packrat
parser. that would be a very interesting addition to brood.

Entry: scat progress
Date: Fri Apr 18 14:07:35 EDT 2008

is going really well. i'm as good as done, except for the interactive
part which needs a bit of re-org. the name space management is a lot
better now. making things a bit more static didn't really hurt.

Entry: new name for purrr
Date: Fri Apr 18 14:24:01 EDT 2008

everybody keeps calling it picforth, but that's already used.
what about PRICFOTH? it already sounds obscene in dutch..

Entry: tethering
Date: Fri Apr 18 14:44:10 EDT 2008

  * compile the monitor
  * port interactive code

maybe it's possible to get rid of interpret/compile mode in console
interaction. maybe some 'auto tether' can be made: not running
certain optimizations so macros can be more easily simulated? that's
quite a challenge.. the problem at first hand seems to be the use of
platform-dependent constructs.. translating forth to pseudo code is
trivial, but some of the language is defined ONLY in terms of
assembly code.

the reason to have an interpret mode is to not have to touch the
flash rom. ram-based forths should really just compile and execute,
but for rom-based forths there is room for a separate interpret mode
language. it's also the right spot to introduce tethered commands
from the target's perspective.

Entry: compiling the monitor code
Date: Fri Apr 18 15:06:57 EDT 2008

things that are going to pop up:

  - handling the namespace
  - compilation, assembly + serialization of word struct.
  - org

Entry: for .. next
Date: Fri Apr 18 15:45:13 EDT 2008

maybe i need to test this first: compile 2 branches + save the best
in memoized form such that nested loops are computed inside out in
linear time.

  for body next
  dup for drop body save next drop

so, at the time 'for' executes, it needs to know which of the 2 is
shortest: (body) or (drop body save). let's call the above (for0 body
next) and (for1 body next1), and reserve (for) and (next) as the
macros that set up the evaluation.

this leads to the following control logic: if 'for' can capture
'body', it can try several strategies and pick the best one. can this
be done using composable continuations? it would be the first testing
point to see if they mix well. if so, it can probably be generalized
to a lot more control structures.

i worry about nesting:

  for_o for_i .. next_i next_o

would lead to something like:

  (lambda (state)
    (reset
     (next_o
      (reset
       (next_i
        (body
         (shift i     ;; for_i
           (shift o   ;; for_o
             state))))))))

maybe they need different prompt tags? looks like it: the inner shift
won't see the outer reset. let's give it a try. this needs to go
deeper: since the rest of the code explicitly needs to be called
inside a dynamic extent.. confused now.

  stack: next_o next_i body for_i for_o

it's probably best to bring shift/reset to scat.

  ;; Installs a reset and saves the prompt tag on the stack.
  (define-ns (macro) reset/tag
    (lambda (state)
      (let ((tag (make-continuation-prompt-tag 'reset)))
        (reset tag

Entry: composable continuations
Date: Fri Apr 18 17:16:26 EDT 2008

http://schemekeys.blogspot.com/2006/12/delimited-continuations-in-mzscheme.html

  ... four classes of delimited continuation operators ... are
  referred to as -F-, -F+, +F- and +F+. Dybvig et al. describe them
  as "a classification of control operators in terms of four variants
  of F that differ according to whether the continuation-capture
  operator (a) leaves behind the prompt on the stack after capturing
  the continuation and (b) includes the prompt at the base of the
  captured subcontinuation."

that makes things a lot easier to understand.

Entry: tools + check
Date: Fri Apr 18 20:55:41 EDT 2008

moved code used from zwizwa-plt back to the tools/ directory.
granularity is too fine. if i need it in other projects, maybe best
to copy/paste.. most is too specific. should also clean up sweb to
get rid of the stream stuff, and use something standard.
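back to for .. next: the selection logic itself is simple. compile
both variants and keep the shortest ('compile-macro' and
'code-length' are stand-ins for whatever the real interface becomes):

  (define (compile-for/next body)
    (let ((v0 (compile-macro body))                           ;; for0: plain body
          (v1 (compile-macro (append '(drop) body '(save))))) ;; for1 variant
      (if (<= (code-length v0) (code-length v1)) v0 v1)))

the hard part is capturing 'body' in the first place, and making sure
that compiling it twice has no side effects.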
Entry: serialization
Date: Fri Apr 18 21:44:26 EDT 2008

using scheme/serialize and define-serializable-struct should give
serializable object code if the target-value structs are
evaluated. maybe some annotation should be left instead of the value?

Entry: graphs and FP
Date: Sat Apr 19 10:28:53 EDT 2008

i never quite understood how to deal with graphs in FP. in EOPL
there's a point where circular reference is avoided by delaying
linkage, i think at the point where environments are implemented. at
the time it struck me as odd..

so, what is a graph? it's a map :: node -> (listof node). whether
children are ordered or not, listof can be setof. the problem with
graphs is that nodes refer to one another. let's first try to
represent a graph as a tree. see also the zipper:

http://www.st.cs.uni-sb.de/edu/seminare/2005/advanced-fp/docs/huet-zipper.pdf

using lazy data structures, self-reference is easy, and can be
represented by lambda terms, which eventually boil down to the
Y-combinator.

EDIT: the idea seems to be to represent the graph as a lazy structure
that can generate a 'local tree expansion' or something.. a bit like
manifolds and R^n patches.

EDIT: what i'm looking for is called circular programming.

http://www.csse.monash.edu.au/~lloyd/tildeFP/1989SPE/
http://www.haskell.org/sitewiki/images/1/14/TMR-Issue6.pdf

basically: you need lazy evaluation to build graph structures: a
pointer to a structure can be available while the structure itself is
as of yet unevaluated, and as such can reference itself.

Entry: spread the word
Date: Mon Apr 21 19:48:29 EDT 2008

http://www.forthfreak.net/index.cgi?WikiNode

Purrr is mentioned there. Maybe i should go around and edit some
wikis?

Entry: string -> language
Date: Mon Apr 21 21:26:18 EDT 2008

How to create forth code from a string? i forgot how the logic
works.. pic18/lang/reader.ss:

  (module reader scat/forth/module-reader
    scat/pic18/purrr18-module-language)

the generic forth reader uses #%plain-module-begin from the specified
module. to declare and instantiate a module body:

  (module test "pic18/purrr18-module-language.ss"
    : abc 1 2 3)
  (require 'test)
  (print-all-code)

  abc:
      [dup]
      [movlw 1]
      [dup]
      [movlw 2]
      [dup]
      [movlw 3]

now, from a string: open the reader module with a prefix:

  (require (prefix-in forth- "pic18/lang/reader.ss"))

The answer seems to be: forth code lives in a namespace, so in order
to load a file, create a new namespace.

EDIT: got it to work by using:

  (parameterize ((current-namespace ns))
    (eval form)
    (eval `(require scat/macro/code ',name)))

now i can instantiate multiple namespaces, each with their own
language. one problem though: the word structures are not accessible,
because the instances are different. anyways, this gives a nice
border to create the "badnop interface". now, this takes noticeable
time with all modules compiled:

  (ns-print-code (purrr18->namespace ": abc 1 2 3"))

which means something is running during instantiation of the
modules.. maybe it's the tests? maybe it's possible to keep a
namespace around with an instantiated compiler, and re-evaluate forth
code?

TODO: split instantiation of compiler, and compilation, to make way
for incremental compilation. looks like this is the next step: make
this easy to use.

Entry: repl during compilation
Date: Tue Apr 22 10:24:22 EDT 2008

in order to have the same debug 'compile' mode as in brood-4, some
access to the asm state is necessary. this needs to be implemented as
a breakpoint word, one which prints out the whole state in a
meaningful way.
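a sketch of such a word: dump the assembly buffer and pass the state
through unchanged (accessor names assumed; the code list is stored in
reverse):

  (define (print-asm state)
    (for-each (lambda (ins) (printf "~s\n" ins))
              (reverse (2stack-asm state)))
    state)

hooked up as a breakpoint, this would give the brood-4 style 'compile
mode' view of the state.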
Entry: flashforth
Date: Thu Apr 24 16:08:58 EDT 2008

going through the flashforth tutorial, and it seems mikael has been
busy, with some optimizations here and there. it's nice to have an
example like that. this does bring me to the optimization
vs. simplicity trade-off. it seems difficult to stay at either
extreme.

Entry: state extensions through shift/reset
Date: Fri Apr 25 10:22:36 EDT 2008

with this shift/reset thing working for augmenting the compile state
from one straight code list to an assoc list of such, i think it
might be better to do the same with the 'data' element in the core:
everything that uses the 2stack state should use this dynamic
extension mechanism.

to extend state:

  * define mixer words using 'make-breakpoint', referring to a prompt
    tag
  * wrap such code in 'with-breakpoints'

the place to start is the control words. this needs to be made
independent of the state rep anyhow.

CONTROL INTERFACE:

  - state
  - state-cons
  - state-stack

the only one that's really problematic is 'apply', because it
performs both function application on an isolated stack + state
merging. maybe this should ignore state effects? solve later: might
disappear when state threading is done using composable
continuations.

NOTE: the business of 'merging state' in the monads/JOIN operator
seems to be a generalization of assignment.

ok. so i got all explicit state references removed. time to wrap the
whole thing in 'with-breakpoints', and turn the low level rep of scat
functions to stack only.

so.. made the change. the remaining question is "where to wrap"?
should all macros have the SCAT prototype, or are they converted to a
2stack -> 2stack mapper at some point?

Entry: closed or open?
Date: Fri Apr 25 12:25:27 EDT 2008

the remaining question for the compiler is: how to represent
macros. are they open SCAT functions, or closed 2stack -> 2stack
functions? the problem with the latter is that it can't be composed
in the scat way. so, let's do this: if composition in SCAT is
necessary, prototypes remain stack -> stack + open dynamic refs;
otherwise the expressions are closed using 'scat->2stack'.

now, this has implications for the pattern transformers:
pattern-tx->macro will represent a transformer as an open SCAT
function. ok.. got that sorted out. see k/asm->scat in scat-2stack.ss

now it's time to get in trouble: make-split-update requires access to
the asm state, so it needs to operate on closed macros.

  open macro   = scat function (stack -> stack)
  closed macro = 2stack -> 2stack

so, the solution is to not operate on open macros, but on closed
ones: that way the compile mixer has access to the asm buffer. (i
need some terminology cleanup)

next problem: 'with-exit' requires access to the asm buffer. maybe
time for a break.. can't wrap my head around this: m-exit uses a
parameter, so can i create a mix function that references this
parameter? probably not, because the dynamic context of the mix
function is likely outside the scope of the with-mexit. maybe the
macro exit status should be hidden in the overall compile
state. alternatively: m-exit could close over the asm-buffer? i don't
really understand yet.. summarized: how do parameters and delimited
continuations interact?

the problem with 'with-exit' as it is now is that without turning it
into a 2stack mixer function, there's no way to access the state. so,
basically, i need a mixer that calls with-breakpoints. does that
work? it really should work. what is this? local closure. ok..
i'm going to try first to mix parameters with the breakpoint control
structure, then possibly make an abstraction for this. got too much
in my head again..

macro->postprocessor operates on closed macros. ok. i have the
impression i'm on the right track: this way of composing does need a
better abstracted api. instead of using 'values' in the state update
function, use a structure type: the type has 2 values.

Entry: better abstraction
Date: Fri Apr 25 19:19:23 EDT 2008

1. the low level sequencer api:
   * make-breakpoint
   * with-breakpoints

2. high level lifting api: consists of adaptor functions constructed
   from state wrap/unwrap functions.

i'm REALLY close to figuring out the relation to monads, but can't
wrap my head around it yet. instead of applying the continuation in
'with-breakpoints', that operation should be abstracted.

Entry: weird bug
Date: Fri Apr 25 22:27:35 EDT 2008

ok. almost working, except that i get 'stack' instead of '2stack' in
the mexit-update function, while 2stack-mexit gets a '2stack'. i hope
this is not a conceptual error with the order of prompt tags.. looks
like it is.. i should check if it's possible to mix the open/close
constructs. basically, what i'm doing is this: create a shift with a
reset in it:

  (shift (E (reset ...)))

the normal operation is nested shifts:

  (reset (E1 (shift (E2 (shift ...)))))

it really should not be a problem: shift can only see the inner
reset. the only way a reset can disappear in a reduction is when it
exits (returns a value).

  (reset val) => val
  (reset E[(shift k expr)]) =>
    (reset ((lambda (k) expr)
            (lambda (v) (reset E[v]))))   ;; where E has no reset

or equivalently:

  (reset val) => val
  (reset E[(shift k expr)]) =>
    (reset (define (k v) (reset E[v])) expr)

i can't juggle with it yet.. maybe make some test cases?

EDIT: i really don't see it.. put in some logging tags, and i don't
understand that order either.. looks like the 'nested closing'
doesn't work.

Entry: in words: mexit
Date: Fri Apr 25 22:28:15 EDT 2008

during the execution of a macro, the word ';' will compile a jump to
the end of the execution, except when it occurs at the end. the word
will be split if necessary. => 'except': remove the last jump.

the problem i'm facing is where to store the state necessary to
implement this. it was implemented using parameters + side effects,
but i want it in terms of shift/control to give a purely functional
implementation.

Entry: giving up?
Date: Sat Apr 26 09:15:22 EDT 2008

maybe time to see if this can be installed in the compile state. what
i need there is a stack to trace the dynamic context of macros. it
feels wrong though, to not have this coincide with the dynamic call
stack.. but maybe it is because i'm messing that up that i need it
stored explicitly somewhere else? the problem is: what if macros
don't exit? am i allowed to jump outside of the context? does it need
dynamic-wind? hmm.. if a macro doesn't exit, it probably also doesn't
produce code. on the other hand: having the mdyn stack available
might be interesting.

Entry: next try
Date: Mon Apr 28 14:08:47 EDT 2008

let's make a simpler example first: 2 level nesting of scat->2stack
and 2stack->scat. works just fine..
  (define-ns (macro) test-b
    (lambda (s)
      (printf "test-b ~a\n" s)
      s))

  (define-ns (macro) test-a
    (2stack->scat
     (match-lambda
      ((struct 2stack (asm ctrl))
       (printf "test-a ~a\n" asm)
       (let ((out ((scat->2stack macro/test-b)
                   (make-2stack asm ctrl))))
         out)))))

  box> ,scat (require "macro.ss") (macro:: 1 test-b 2 test-a 3)
  toplevel in /home/tom/scat/
  with-breakpoints:init # # #
  with-breakpoints:next # # #
  test-b #
  with-breakpoints:next # # #
  with-breakpoints:next # # #
  test-a ((qw 2) (qw 1))
  with-breakpoints:init # # #
  test-b #
  with-breakpoints:next # # #
  with-breakpoints:next # # #
  with-breakpoints:next # # #
  (qw 1) (qw 2) (qw 3)
  box>

so it's somewhere else..

EDIT: i ran into a segfault somewhere..
EDIT: that bug is fixed, maybe try to see if this code now works?
EDIT: ok.. my bug is still there, i'm going to let it go.

Entry: destructive assignments
Date: Mon Apr 28 15:38:55 EDT 2008

so.. why am i doing this? suppose one takes a partial continuation
which has state, does it hang on to this?

  (define ((integrate state) in)
    (set! state (+ state in))
    state)

  (define k
    (reset
     (let ((x (integrate 0)))
       (x (shift k k)))))

  box> (k 1)
  1
  box> (k 1)
  2
  box> (k 1)
  3
  box> (k 1)
  4

yes: multiple executions of the partial continuation keep their
state. why would they do otherwise? that's the dragon i'm fighting: i
need an abstraction where the partial continuations are pure
functions.

the important remark here, also related to finding a decent
abstraction instead of the breakpoint one: what i'm doing is
'splitting' a composition. maybe i should go back to that, instead of
the mixer/update abstraction? basically, something like:

  (a b c | d e f | h i j)  with-split ->  (abc def hij)

the funny thing is, trying to just

  split = (lambda (x) (shift k (cons k x)))

doesn't give a list, but a pair with value and continuation, which in
turn produces a pair with value and continuation.

Entry: simplified sequencer
Date: Mon Apr 28 18:16:47 EDT 2008

maybe an explicit sequencer isn't necessary..

  a b c >asm d e f

what should '>asm' do? obtain the continuation (d e f) and obtain the
state from somewhere. so what it should pass to the driver is a
procedure that takes a state and a continuation and produces a state.

  mix : (state, k : type -> state) -> state

this looks like 'bind' : (M a, a -> M b) -> M b

Entry: is it possible to implement mexit as a parameter?
Date: Mon Apr 28 21:58:10 EDT 2008

as long as the parameters are retrieved inside the proper dynamic
context (not inside a mixer!) there should be no problem mixing
_immutable_ parameters with the stitch mechanism. _mutable_
parameters are a problem when multiple executions are desired. for
mexit, this includes the exit label reference count (number of exit
points in a macro). i.e.: suppose some macro executes multiple times,
and on each execution it calls mexit: the reference counts will add
up. but, if this effect can be kept local, there is really no
problem: if macros are wrapped, the state is only visible during
execution of that macro. i can imagine cases where this is violated:

  : bla ... ( ... ; ... ) ... ;

the code between parens might be grabbed to be compiled in multiple
variants as part of an optimization: here ';' really shouldn't have
any side effects except for the result produced by the variant
used. so, in order to keep the design of the compiler simple, the
following requirements for macros are a good idea:

  * side-effect free wrt. code produced.
  * read-only parameterization allowed: not necessary to be
    referentially transparent.

as an exception, side effects ARE allowed if they do not influence
the compilation results (i.e. logging). the reference count tracking
of exit label references violates this.

Entry: practical mexit
Date: Mon Apr 28 22:14:36 EDT 2008

so, can i sidestep the issue and eliminate unnecessary splits at the
end of macros? probably not: that will mess up optimizations. local
exits for macros are an exception. is it possible to somehow
automatically ignore the last one, or to scan the code for
references? no: if a split occurs during the execution of a macro by
any other means, some jumps to exit might not be visible:

  a b c ; d e f ;

so, what are the tasks:

  * maintain an exit label (dynamic parameter)
  * figure out whether to split or not at the end of the macro

the problem is that ALWAYS splitting is bad, because it interferes
with optimization. checking if the label is reachable should not be
too difficult if the start of compilation can be marked somehow.

let's go back to what i'm really trying to do here: to _emulate_ a
return stack. why don't i just have such a thing instantiated
explicitly in the compiler state, so other return stack operations
can be emulated also?

Entry: struct macros
Date: Tue Apr 29 11:41:00 EDT 2008

it would be nice to abstract away the details of update pattern
matching. this requires some access to struct layout. basically,
generate this from struct names:

  (define-sr (compile-update (icurrent iwordlist iasm ictrl)
                             (ocurrent owordlist oasm octrl))
    (match-lambda*
     ((list (struct compile-state (icurrent iwordlist))
            (struct 2stack (iasm ictrl)))
      (values (make-compile-state ocurrent owordlist)
              (make-2stack oasm octrl)))))

actually, it's much more straightforward to do this with structure
type inheritance. this requires a deep change though.. first: switch
the order of fields in 2stack.

Entry: structure types and inheritance
Date: Tue Apr 29 14:35:02 EDT 2008

now i feel stupid: delimited control isn't necessary at all
here.. simple inheritance will do the trick just fine. one thing i
didn't get though: inheritance works nicely for reads, but what's
needed is to construct the right output type, so the update function
needs to be abstracted somewhere..

-> all derived structs now have an 'update' function in the first
field, and a direct constructor, as in:

  (define update-compilation-state
    (case-lambda
      ((state ctrl)
       (update-compilation-state state ctrl
                                 (2stack-asm-list state)))
      ((state ctrl asm)
       (update-compilation-state state ctrl asm
                                 (compilation-state-current state)
                                 (compilation-state-words state)))
      ((state ctrl asm current words)
       (driver-make-compilation-state ctrl asm current words))))

  (define (driver-make-compilation-state ctrl asm current words)
    (make-compilation-state update-compilation-state
                            ctrl asm current words))

ok.. done feeling stupid. works, and is a lot easier to
understand. this can be implemented more efficiently using lists:
less copying, more sharing. not important atm. this abstraction makes
it a bit easier to use:

  ;; state matcher which introduces 'update'
  (define-syntax (state-lambda stx)
    (syntax-case stx ()
      ((_ type (var ...) . expr)
       #`(lambda (state)
           (match state
             ((struct type (update var ...))
              (let ((#,(datum->syntax #'type 'update)
                     (lambda args (apply update state args))))
                . expr)))))))

maybe use syntax parameters instead of introducing a symbol?

Entry: summary
Date: Tue Apr 29 19:19:57 EDT 2008

the last couple of weeks were dark. what came out?
  * using inheritance + abstract factory gives a much simpler
    solution to hidden state threading than composable
    continuations. inheritance solves state read, while the abstract
    factory solves functional state update.

  * read-only parameters are ok for macros, but mutable parameters or
    mutable closed-over variables used for determining code output
    are not: if continuations are to be used to perform certain
    optimizations by trial and error, it's best to stick to pure
    functions. (i don't care so much about referential transparency
    as i care about macro side effects.)

  * i've got a little more intuitive understanding of monads, and am
    now of the opinion that they are too general for what i'm trying
    to do. also: the absence of polymorphism makes them hard to use
    in scheme. and, what i'm trying here might fit better under the
    arrow abstraction, but i'm unfamiliar with that.

Entry: shift / reset and for .. next ?
Date: Tue Apr 29 20:36:56 EDT 2008

what is the reduction rule in rpn?

  shift ... reset => ( ... ) reset

the problem here is twofold:

  - what prompt tag to use?
  - how to pass the continuation?

it's probably better to use 'call-with-continuation-prompt'.

Entry: brood is
Date: Wed Apr 30 18:34:27 EDT 2008

entry://20080329-093950

functionality: RPN SCAT PAT+MACRO FORTH PURRR

types:

  SCAT:  stack
  MACRO: + asm stack
  FORTH: + dictionary / current word / macro rs

implemented using functional data structures.

Entry: next action
Date: Thu May 1 01:49:04 EDT 2008

time is running out, what needs to be done next?

  - fix the for .. next optimization as a pilot for other partial
    continuation based optimizations. (-> delimited control + lazy
    code)
  - port the monitor code.
  - port the interaction code.
  - port catkit / sheepsint.
  - find an easy bootstrap for catkit from within brood.
  - serial port interface (PLANET PACKAGE)
  - simulator interface

Entry: simulator and partial evaluation
Date: Wed May 7 12:48:26 CEST 2008

  * interaction should really be partial evaluation of machine
    instructions.
  * the assembler should be specified as a simulator (functionality)
    plus state dependencies (for data flow analysis).

this should be implemented as a separate macro language. (TODO: look
at the plane notes again)

Entry: wire protocol
Date: Wed May 7 13:02:14 CEST 2008

it's important to have a look at what exactly goes over the wire. the
minimalistic monitor that's there now is nice for minimal complexity,
but something that can be inspected directly has its advantages. i'm
thinking about the prefix notation from pltix.

Entry: namespace stuff
Date: Wed May 7 13:26:58 CEST 2008

it might speed up compilation a bit to separate phase +1 code from
the core routines: right now, the whole scat module gets instantiated
during compilation. this is back to how it was before moving all
phase level 0 and 1 code into one module for convenience. i've tagged
the modules that instantiate stuff (they print their name) so it is
clear that compilation does not spuriously instantiate any code that
only makes sense at run time, and that run-time instantiation happens
only once per namespace.

so, that works pretty well: an instantiated compiler + code
dictionary is represented by a namespace. this is a "synchronous
late-bound object": it takes messages that can alter its state, and
returns values. what is missing is a way to serialize the state as
object code that can be imported again without compilation.
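the message-passing view can be made literal: a namespace plus an
eval wrapper (sketch):

  (define (ns-send ns form)
    (parameterize ((current-namespace ns))
      (eval form)))

  ;; e.g. (ns-send compiler-ns '(require scat/macro/code 'app))

serializing the object then means serializing whatever state those
messages have built up inside the namespace.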
Entry: machine model / partial evaluation and state management
Date: Wed May 7 17:52:08 CEST 2008

the idea is to be able to evaluate (simulate) code off-target, as
long as it only depends on MACHINE state.

  [movlw 123]

can be translated into read,modify,write with the update happening
off target.

  [movwf LATA]

is a border case: it can be split into read,modify,write but it also
affects external physical state. what is required is a clear
definition of what simulation means: is it completely isolated from
the 'real' world, or does it just simulate the computation part of
the target? does [movwf LATA] alter the output of pins, or does it
modify some internal model? it would be best to make this behaviour
pluggable: the amount of 'realness' should be configurable. the modes
are:

                   |  STATE      COMPUTATION
  -----------------|------------------------
  (1) stand-alone  |  real       real
  (2) tethered     |  real       emulated
  (3) simulator    |  emulated   emulated
  (4) test         |  emulated   real

and, really, you need only the first 3. does the 4th one make sense
during application development? actually not: the CPU is a functional
unit, and can be exactly emulated (in principle; it might not always
be necessary: partial emulation can be good enough). this mode DOES
make sense during emulator testing though. (emulating STATE
completely might be impossible since it depends on the external
world)

the place to introduce emulated state is in the partial evaluator of
machine code. so.. what you want is to be able to modify the meaning
of code depending on the level of simulation. i.e. [movwf LATA] might
mean:

  (1) execute the instruction on the target
  (2) simulate the instruction as a passive (memory only) machine
      state update on the host + write the state to the target
  (3) simulate the state update as an active machine state update; do
      not involve the target. (i.e. writing to the latch might set
      the state on input ports during next instructions.)
  (4) compare the state update simulated on the host and executed on
      the target

probably i should generalize brood as a framework for pluggable
simulation. this is more general than the previous emphasis on
tethered development, and potentially a _LOT_ more powerful. it's
probably best to focus on memory mapped i/o and synchronous
execution: get it to work for the PIC18 first, then generalize the
architecture. each functional unit can be implemented as a thread.

what you want basically is fine-grained control over what exactly is
executed on the target, and what is not. there is an order relation
hidden here: it's impossible to simulate a state update when
executing code on the target. this means there's a directed graph of
'realness' that can be used as a guide to building a code/data
structure to implement this.

given the program source, it can be compiled for:

  (1) running completely on the target
  (2) running partly on the host + target state update. the latter
      could be plain code execution.
  (3) complete simulation

some remarks here:

  * time-critical software needs to run on-target, so it is important
    to design programs such that they can be tested by virtualizing
    the stimulus (slowing down time): make everything synchronous;
    that way time is an integer and can be abstracted. simulate
    non-synchronicity on top of this.

  * the application domain is massively parallel, so the basic unit
    of simulation is a task. PLT scheme has all the necessary tools
    to build this kind of thing. it would be interesting to equip
    purrr18 with some libraries to implement state machines and tasks
    in a way that works well with the simulator.
  * program compilation = partial evaluation of simulators. i.e.
    [movwf LATA] can be compiled to machine code and executed on the
    machine only if LATA is real. an application will compile to 2
    things:

      1. supporting machine code to run on target (i.e. the monitor)
      2. a host side entry point, which might sequence simulation

  * not so much related, but can 'incremental dev' be used here? only
    recompile the parts of target support code that are necessary?
    this is an optimization problem which only needs proper
    dependency management (memoization) and can probably be solved
    separately.

Entry: simulator problem definition: generalized interaction mode
Date: Wed May 7 18:43:54 CEST 2008

given:

  1. (assembler) source code
  2. cpu (functional) and memory/port (state) model

generate:

  1. binary support code to upload, possibly incrementally
  2. a toplevel driver function that starts the simulation

work with assembler source code to keep the machine model simple:
source code simulation is never going to be accurate enough to be
generalized: you want the nitty gritty. this also enables the
decoupling of the compiler and the simulator: external compilers can
be used.

so, should the memory model be destructive or not? this boils down to
the question: what is more important, speed or the ability to have
non-straight-line execution? what about going for the cleanest
solution, and having EACH memory location be a port, with memory
being a simple loopback port? then the only state related to the
memory model would be the configuration "patch", which is static for
a certain simulation. all other state could be task-local.

Entry: base machine: real or virtual
Date: Wed May 7 20:20:36 CEST 2008

there's no need to do work twice, so what machine should be used for
the old interaction mode? the 3-instruction forth? the problem is
primitives: currently, the primitives are the machine
instructions. so accurate simulation means simulation of those. on
the other hand, it might be more flexible to allow a higher level
simulator to test algorithms. the thing that gets in the way is
premature optimization: part of the problem to solve is manual
machine mapping: currently, purrr18 is more PIC18 than Forth. maybe
something intermediate can be constructed: a VM that implements a
subset of instructions without the optimizer?

so, problem: find an intermediate language that is easier to simulate
than the target (too complex / target specific) or the language (too
language specific, underspecified).

EDIT: this is a serious problem. 2 stances:

  * forget about the target simulator: concentrate on the programming
    language, and make it clean. that way simulation is easy because
    the primitives can be simple.
  * forget about the language simulator: what you want is a tiny
    layer on top of the real machine, and possibly use multiple
    languages and binary code.

if i have to choose, the 2nd one is really the only practical
solution. the only disadvantage is that it's target specific.. is
there some trick to be able to have both?

Entry: metaprogramming in the real world
Date: Thu May 8 12:35:56 CEST 2008

talking to axel yesterday, and he was saying that he's doing nothing
but writing scripts that write scripts. what does that really
signify? why is metaprogramming so effective? there's a selling point
hidden here.. i'd say: it's so incredibly difficult to build an
interface to extremely parametric code, that it's better to just turn
it into a proper language with its own composition mechanism, such
that it is complete.
the metaprogramming then eliminates the tedious step of making
compositions that can't be composed in the base language. or, it
removes the necessity to extend the base language for one specific
problem.

Entry: simulator generator + specification
Date: Thu May 8 13:16:10 CEST 2008

another thing that popped up in the discussion: how fast is the
simulator? can it be specialized? maybe this is an essential point
also: concentrate on creating a simulator generator. this means the
simulator needs a specification language, so it can be compiled to
fast specialized code later.

Entry: loop bodies and delimited control
Date: Thu May 8 14:34:32 CEST 2008

maybe today is not the day.. tired and stupid, i'm not worth
much. but i run into very strange results when trying
shift/reset. why don't these work?

  (define-word shift stack (shift k (cons k stack)))
  (define-word reset stack (reset stack))

reset actually doesn't install a prompt around a computation, because
'stack' will be evaluated before it is reached. it expands to

  (call-with-continuation-prompt (lambda () stack))

what i need is a macro that does this. that is unfortunate, because
all code that uses it will need to be macros too. is that really
true? yes, looks like it is: otherwise the code gets evaluated before
it's passed to reset.

looks like i need to play with evaluation order a bit: instead of
using strict evaluation, it might be easier to use lazy. what about:
every scat function takes a delayed computation? it shouldn't be too
difficult to change this in one place only: wrapping each strict
function so it becomes a lazy one.

  (define (lazy-apply fn thunk)
    (lambda () (fn (thunk))))

or

  (delay (fn (force thunk)))

Entry: strict vs lazy
Date: Thu May 8 16:57:49 CEST 2008

so, i run into a point where evaluation order does matter: reset
needs to delay its argument.. is it worth it to modify the entire
representation to a lazy one? the thing is: modifying evaluation
order requires macros.. but macros are viral: any composition of a
macro is again a macro. on the other hand, why isn't EVERY function a
macro, with the bulk using strict application? do i really need to
access functions directly, or is (scat: bla) enough? or.. why can't
composition automatically be macrofied, or, why can't i have
unbalanced parentheses?

it looks like there is really no way around this: in order to capture
dynamic compositions, code needs to be delayed so a prompt can be
inserted BEFORE evaluation.

EDIT: so.. lazy eval. does that give problems with sequenced code?
no: since a scat program is already sequenced (a composition of unary
functions) there is no problem here. when this is done with the
datatypes themselves, it should be fairly straightforward: states are
thunks.

Entry: concatenative family (Cat language)
Date: Fri May 9 18:12:11 CEST 2008

it's time to dive into Joy, Factor and Cat again to see where things
are different. especially Cat, since Christopher and i have been
doing similar things for 2 years now with little interaction, and
with a slightly different focus. in SCAT:

  * ties to scheme are important. my goal is not to write a
    stand-alone language. hence the choice of PLT Scheme, which is
    pretty big..
  * SCAT is dynamically typed.
  * SCAT is not linear.
  * MetaCat: i use term rewriting, but in a different part: i see no
    need for SCAT metaprogramming other than introducing
    non-concatenative language elements to support Forth. otoh,
    rewriting is _very_ important in the PIC18 code generator.
    however, the code that is rewritten is symbolic assembly code,
    not SCAT code.
  * SCAT is only used to support MACRO. it's probably not general
    enough as a full programming language. however, things are easily
    snarfed from Scheme. (with Dave's move to PLT Scheme for Fluxus,
    there is an interesting road to travel there though..)

TODO: relation to Factor and Joy.

Entry: peephole optimizer
Date: Fri May 9 18:25:40 CEST 2008

maybe it's better to separate the machine specific optimizer from the
code generation step? that way the peephole optimizer can be reused
with different languages, and probably be tested using a machine
model.

Entry: assembler expression language
Date: Fri May 9 18:35:17 CEST 2008

what currently is 'target:' might better be written in s-expr syntax,
so that it's easily converted to concatenative syntax (the other way
is more difficult). while right now it's kind of cute to have this
concatenative language map to a concatenative assembler expression
language, later, when external assemblers need to be supported, this
might become a nuisance.

also, and probably more important: a distinction needs to be made
about data types and partial evaluation:

  * use scheme's infinite precision types
  * use only target types (i.e. accurately SIMULATE the computation)

Entry: documentation + presentation
Date: Sun May 11 11:42:37 CEST 2008

introduction:

Brood is a metaprogramming environment for deeply embedded
programming, starting with the idea: "How to modernize the tethered
Forth approach?". Forth is appropriate for programming small
computers, but too low-level for a host-side metaprogramming
framework. Scheme is ideal for this. The second objective is to
generalize this to special-purpose problem description languages.

what is metaprogramming?

  - use language A to generate code in language B
  - A = B is possible, but more likely A > B (more high level)
  - partial evaluation of the usual language tower, to limit the
    complexity of on-target support code. (give up some generality)
  - overall idea: use high level constructs where possible, but
    specialize to low-level where necessary.

why are macros important?

  - aren't functions enough?
  - partial evaluation: separate compile and run time. (get extra
    cake at compile time without giving up the possibility to use
    highly specific code.)

why these weird languages?

  - scheme = clean lisp. lisp's strength:
    * metacircular interpretation (language defined in itself)
    * leads to easy metaprogramming (lisp macros)
    * scheme: based on untyped lambda calculus = functional
      programming with imperative extensions (environment model)

  - forth
    * due to the concatenative composition model, the language is
      quite powerful in itself, despite its simplicity. (even without
      dynamic memory management or garbage collection; related to
      innate 'linearity')
    * base language = static, suited for real-time applications
    * efficient: thin machine model for simple sequential chips (less
      efficient for pipelined number crunching processors: a dataflow
      language would be better suited there).
    * simple metaprogramming
    * it has a purely functional + purely concatenative subset

different brood layers:

  - PLT scheme module system + module languages
  - SCAT: purely functional intermediate language implemented as
    Scheme macros.
  - MACRO: purely functional metalanguage on top of SCAT. a MACRO
    program generates (symbolic assembly) code. it includes PAT which
    combines code generation and peephole optimization.
- FORTH: syntax on top of SCAT or MACRO to provide the non-concatenative part of Forth (parsing words like ':').
- ASM:
  * target specific assembler generator
  * target address expression language (= SCAT)
  * standard n-pass branch instruction code relaxation
- LIVE: live target interaction / simulation framework.

why from scratch?
- to gain deeper understanding
- to find a natural modularity without tool-specific idiosyncrasies

can it use external tools?
- yes, but the design is optimized for internal tools. (i.e. the compiler -> assembler interface uses structured data instead of text)

can it use different languages?
- interfacing on the object level: no problem (not implemented yet though)
- since custom languages are the core business, i see little advantage in supporting standard languages (like "C") directly.
- however, purrr is a core component.

why not OO?
- FP is natural: a compiler = a function. it maps source code to object code.
- stateless code generation makes different code generation paths easy to implement (output feedback without environment setup)
- it's easier to do OO in FP than vice versa.

different implementation language?
- Forth/C/C++: been there. too low-level, while the performance payoff is not so important.
- Perl: i tried before, but i prefer structured data to strings
- Java: too clumsy.
- Haskell: i'm tempted, but probably too little wiggle space to evolve a design. the final implementation however might work well in Haskell. Scheme's approach is conceptually closer to metaprogramming, also wrt. ML.
- other dynamic OO languages (Python, Ruby, Smalltalk, ...): i'm not particularly convinced they are better than an FP oriented language.

Entry: inheritance for state threading
Date: Sun May 11 14:14:23 CEST 2008

What is the practical reason for using threaded state instead of real state? to simplify composition of code generators, primarily to allow multiple applications without causing side-effects. It makes the life of the optimization implementer easier.

This is probably a spot where multiple inheritance might be appropriate: if there's no clear hierarchy to state extensions, forcing one might not be a good idea.

Entry: target expression language (TEL)
Date: Sun May 11 14:46:50 CEST 2008

1. what is it?

The target expression language is the vehicle for expressions that depend on target labels (static memory addresses), and are passed to the assembler to be evaluated after static target memory allocation. In the integrated compiler + assembler architecture in Brood, these expressions are computations closed over (initially unresolved) target word structures. For external assemblers, they need to be translated into strings that represent target assembler expressions. Because of the need to support external tools, this language benefits from an intermediate form. (currently, and only for illustration, this is symbolic SCAT code, but it will probably be replaced by s-expression code later).

2. where do these expressions come from?

The expressions are generated by the peephole optimizing code generator, mostly as partially evaluated target code. I.e. the Purrr code

    ' main 1 +

compiles to the expression that adds 1 to the address of the "main" procedure word. At compile time it can be determined that the value can be obtained at assembly/link time, so literal instructions can be generated. However, at compile time only the computation can be stored, due to possible dependency on (as of yet undefined) target label values.
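a sketch of that idea (invented names, not the actual target-value api): a target expression is just a closure over a word structure whose address gets patched in by the assembler, and forcing it before allocation is an error.

    ;; a word struct: name + address, address unresolved (#f) until the
    ;; assembler allocates memory.
    (define (make-word name) (vector name #f))
    (define (word-address w)
      (or (vector-ref w 1)
          (error "unresolved label:" (vector-ref w 0))))
    (define (word-set-address! w a) (vector-set! w 1 a))

    ;; ' main 1 +  would compile to something like this delayed computation:
    (define main (make-word 'main))
    (define expr (lambda () (+ (word-address main) 1)))

    ;; after allocation the assembler can evaluate it:
    (word-set-address! main #x0200)
    (expr)   ;; => 513 = #x0201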
Entry: purrr compile time expressions
Date: Sun May 11 15:01:22 CEST 2008

One of the cool core features of Purrr is the ability to inline compile time computation without extra annotation. (In Forth, traditionally the words '[' and ']L' are used.) This allows for a very flexible macro composition mechanism. However, the computations are performed in infinite precision, and do not give the same results as the same code would give without these compile-time computations. I.e. in Purrr18:

    1000 30 /

makes no sense on the target due to the 8-bit limitation (and possibly the non-availability of the / operator), but makes sense in Purrr18 because the end result is truncated only at the end. What is compiled is

    [DUP] [MOVLW (1000 30 /)]

because the target is only 8-bit, this sort of computation is very useful, and really should be the default (over annotated meta-computations).

Now the question is: i'd like to build a simulator that can execute the generated PIC18 code. is it possible to generate intermediate code that's easier to simulate than PIC18 code, but produces the same results? This looks like a very hairy problem: PURRR18 is a 'dirty low level' language where constructs are defined as-is, and you should be aware of bit-depth limitations and flag effects etc. The stay-out-of-trouble part of me says i should stick to a genuine PIC18 simulator. It can simulate code generated by other means. Following that approach probably also makes it easier to plug in external simulators.

Entry: external tool interface
Date: Sun May 11 15:16:15 CEST 2008

To increase the commercial usefulness of Brood, external tool interfaces are absolutely essential. As a test case, these things should be present:

* gpasm + its meta language
* gpsim

Instead of writing a simulator, it would be a better exercise to skip the interpreted part and create a simulator generator: this would allow testing of the C-code generation facility, for 2 reasons:

* an external, possibly C-based interface will be necessary
* simulators need to be as FAST as possible

Entry: gpasm / mpasm expression syntax
Date: Mon May 12 11:33:28 CEST 2008

(see MPASM user guide, chapter 8 for expression syntax)
http://gputils.sourceforge.net/33014g.pdf

something i didn't know: this language is apparently stateful. there are accumulating expressions like '+='. i'm not sure whether state accumulation is really necessary though: most of what it would be useful for can probably be captured by the compiler, unless it depends on target code addresses. usage will tell..

Entry: electronics engineers should learn scheme
Date: Tue May 13 00:13:36 CEST 2008

I don't think I know anyone who has written code in some language without at some point realizing that the language is not powerful enough to express a certain pattern, and then moving on to writing some script that actually generates code for that particular language from a more high-level description (or simply a set of parameters). The idea is: it's difficult to create a language that will allow one to describe all possible applications. However, it's not so incredibly difficult to create a SIMPLE language that's aimed at being easily EXTENSIBLE using a MACRO language. So, if you know it's going to happen at some point, why not embrace it from the start and call yourself a language designer instead of an application programmer. This happens especially in domains where hand-assembly is still important: deeply embedded software.
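going back to the compile time expression entry above: a toy version of that kind of literal folding, just to illustrate the mechanism (names and representation are made up; the real rules live in the PAT pattern matcher). code is a list of pseudo assembly instructions, most recent first; a binary operator folds when the top two instructions are literals (qw), and falls back to emitting a real opcode otherwise. all arithmetic happens in scheme's infinite precision; truncation to 8 bit happens only when a value is consumed by a real instruction like MOVLW.

    (define (fold-binop op opcode code)
      (if (and (pair? code) (pair? (cdr code))
               (eq? 'qw (car (car code)))        ; top of the code stack
               (eq? 'qw (car (cadr code))))      ; the one below it
          (cons (list 'qw (op (cadr (cadr code))   ; fold: replace both
                              (cadr (car code))))  ; literals by the result
                (cddr code))
          (cons (list opcode) code)))             ; no match: real instruction

    ;; 1000 30 /  -->  a single literal, computed at compile time:
    (fold-binop quotient 'div '((qw 30) (qw 1000)))   ;; => ((qw 33))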
Entry: documentation
Date: Tue May 13 12:56:32 CEST 2008

Time to start documenting. Let's make it a literate program with proper online cross-referencing + a way to reference the ramblings. Let's make this into a tool to structure code for refactoring purposes. Starting with scat.ss

2 kinds of comments:

* paragraph:

    ;; blabla

* column:

    (+ 1 2)  ;; add it

comments are attached to an expression. maybe it's best to avoid column comments entirely?

http://groups.google.com/group/plt-scheme/browse_thread/thread/1e2cae24ec84b70a/b59b55e3990da368?lnk=gst&q=scribble#b59b55e3990da368

That thread has an interesting comment on source code documentation: you need BOTH reference documentation (per function) and a general overview / the meaning of a bunch of functions together.

Entry: load vs. require
Date: Tue May 13 19:06:25 CEST 2008

the next problem is load vs. require.. should i keep load? in the current .f files a lot is done using late binding: require the includer to specify some words.. this goes against the bottom-up module approach. how to solve it?

finding a decent solution for this late binding is quite important: code generation is heavily parameterized: it's inconvenient to have to specify.. i'm already using this trick in the core, so why not in the libs.. the only problem is: it can generate run-time errors.

Entry: org (non-declarative code)
Date: Tue May 13 20:33:58 CEST 2008

boot.f contains non-declarative code that calls 'org'. so.. how to fix org? the result should be a word struct that has an assigned address. the problem however is that the assembler forces addresses. so this needs a fix in both the assembler and the compiler. the problem with org is that a word can start at a certain location, but be split after that. the previous mechanism might not be so bad actually.. switching back to pointers = a collection of shallow binding stacks. ok.

now how to solve this in forth? there needs to be room for assembler directives OUTSIDE of code definitions. nope.. this violates some entry point stuff.. the org shouldn't be an assembler directive, but some command attached to the list of words passed to the assembler. ok, implemented in the compiler: per word, there's one instruction that can be passed to the assembler about where to assemble the code. how to specify this in the language? it's really like ':', but different.. it would benefit from some kind of parameterization.. this is a tough one: it jeopardizes the clean per-word forth defs..

Entry: org
Date: Thu May 15 12:02:03 CEST 2008

so..

    : bla 1 2 3 ;

but what about

    : 123 1 2 3 ;

where '123' is the address? the problem is, code outside of a definition is no longer allowed. the only way parameters can be passed to the assembler is through instructions inside a target-word instance. maybe the same route should be followed as for variables? add some pseudo asm.. let's go to the root problem:

* allow creation of words/macros from within macros
* allow setting the address of these words

fuck i'm doing language design again.. actually, it's not so bad.. the trick is to make 'org' operate on the current label. the code to compile a jump at a certain location then goes:

    macro
    : install-vec
        ` VEC label   \ create new label
        #x200 >org    \ set current label's org
        do-vec exit   \ compile its code
        org> drop     \ restore org
        ;

looks like it's working.. the idea: to allow creation of WORDS within MACROS. note that to create macros within macros a different mechanism is necessary: introduction of names needs to be done on the Scheme macro level, so words created as such are not accessible to the bulk of the code by name.

it's still not optimal.. it gets in the way of straight-line code..
maybe i should add this concept: code that comes from the compiler needs to be assembled in a straight line, but the compiler can ask to dump some code somewhere else too. this should make anonymous code possible too.. argh

maybe this is good enough: the only place where it will get in the way is the re-arranging of code locations by the assembler (or an intermediate step). i.e. the connect-words! function in target-compile.ss won't work. the real problem is: whenever an org-pop happens, compilation can continue at the word where the corresponding org-push happened. this might be a clue about how to implement it. the compiler doesn't need to provide a list of words, but a list of lists of words, where the inner lists have fall-through, and the outer ones are independent.

Entry: org and fallthrough
Date: Thu May 15 14:15:51 CEST 2008

Another example where a collision of two or more seemingly trivial but annoying problems that resist an elegant solution in the current paradigm leads to a better paradigm.

Because of fallthrough, which is a low-level property of assembly code i don't like to give up in the PURRR language, the order of words is important. However, words that ORG at a different address are independent of those that came before, and words that EXIT are independent of those that come after. This can easily be reflected by adding a 2nd level of nesting in the representation of a target word collection:

    (deque-of (stack-of target-word?))

The operations are:

* EXIT/ORG: create a new current-fallthrough list (with a possible associated address)
* QUEUE: move the current fallthrough list to the end

What data structure is this?

* access top element :: (stack-of target-word?)
* add new element to the top
* move top element to bottom

So, it's a combination of a stack and a set, implemented using an asymmetric deque. Stuff popped off the stack is recorded in the set (and loses its order). Actually, it might be implemented as 2 stacks directly. (see the sketch below)

Now, there's a deeper problem: this accumulation needs to span across words, so the point where words are already packaged needs to be modified to allow accumulation.
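the 2-stack version, as a sketch (hypothetical names, mutable for brevity): the current fallthrough chain is one stack, finished chains pile up on a second one. 'split' corresponds to EXIT/ORG, closing off the current chain.

    (define current '())     ; the current fallthrough chain, top = last word
    (define chains  '())     ; finished chains; their mutual order is gone

    (define (add-word! w)    ; compile a word into the current chain
      (set! current (cons w current)))

    (define (split!)         ; EXIT or ORG: close off the current chain
      (unless (null? current)
        (set! chains (cons (reverse current) chains))
        (set! current '())))

    (define (all-chains)     ; hand everything to the assembler
      (split!)
      (reverse chains))

    (add-word! 'default-bla)
    (add-word! 'bla-it)      ; falls through from default-bla
    (split!)
    (add-word! 'main)
    (all-chains)             ;; => ((default-bla bla-it) (main))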
Entry: labels
Date: Thu May 15 18:44:27 CEST 2008

Every name that occurs in a source file corresponds to a "define" in the scheme expansion, and is associated to a macro: a function that generates code for the named construct. For target words, this creates a reference to a word structure. The problem with target words is that they have fallthrough, or multiple entry points.

Entry: the Forth parser
Date: Fri May 16 10:32:27 CEST 2008

Turns out that collecting Forth into separate definitions isn't a good idea, because there is no 1-1 correspondence between names that start with ':' and eventual word structures:

* there are multiple entry / exit points: words can fall through and are thus connected
* it's possible to generate words on the fly: each _label_ should be captured into a corresponding scheme definition for the macro that generates it, but code in between can be accumulated.

Maybe it's easier to just accumulate everything into one giant function? Simply using 'compose' on the current structure is probably enough. So.. instead of

    (define (wrap-macro/postponed-word name loc macro)
      (let ((w (new-target-word #:name name
                                #:realm 'code
                                #:code macro
                                #:srcloc loc)))
        (values (macro-prim: ',w compile)
                (lambda () (compile-word w)))))

we can have

    (define (wrap-macro/postponed-word name loc macro)
      (let ((w (new-target-word #:name name
                                #:realm 'code
                                #:srcloc loc)))
        (values (macro-prim: ',w compile)
                (compose macro (make-target-split w)))))

where everything is dumped into a single macro that generates the code, to be executed later by 'compile-word'. this seems to work on the first try.. let's clean it up a bit.

Entry: labels and multiple entry points
Date: Fri May 16 15:48:58 CEST 2008

code with multiple entry points like this

    : default-bla 123
    : bla-it 1 + ;

is useful to have, but difficult to handle. the problems happen when default-bla is accessed in isolation, i.e. when moving code around. this needs a proper data structure, or at least assembler support..

* split the code into fallthrough chunks = a list of words where only the last one is terminated.
* add assembler support for a 'fallthrough' opcode.

the point is: make the data structure such that optimization and code migration become easier to do. fallthrough code should be treated as a single entity, even if it contains multiple entry points. operations on word w:

(A) does w fall into some w' ?
(B) does some w' fall into w ?

this is an extra level of linking between words. (A) is essential knowledge, but (B) doesn't matter much for the word w itself.. so, each target word has a possible fallthrough word. what about this: pass the assembler a list of words, where each word is the head of a fallthrough word chain. the extra compiler state this requires is a set of independent chains.

Entry: compiler state update
Date: Fri May 16 16:18:48 CEST 2008

state is growing larger, so pattern matching isn't the best way of updating it. maybe best to use local mutation: copy the state on entry + perform imperative updates. hmm.. that looks even uglier due to long names and explicit get/set. this just needs to be factored.

ok, with some factoring (struct dict) it's all a bit more readable: compilation generates a list of lists of (word code) inside a dict struct, which will be collected into a list of head words that have the code fallthrough structure recorded. next: the 'org' operations + fallthrough disconnect on jumps.

Entry: redefining words + compiler build log.
Date: Sat May 17 11:39:19 CEST 2008

I need a proper explanation about why it's good or bad to redefine words. This is about installing 'hooks'. The problem with hooks is that they can get difficult to understand. Let's add some warning to this redefine process.

Entry: implementing 'exit' chunk splitting
Date: Sat May 17 13:23:52 CEST 2008

basically, this needs to:

* split with a new label = dead code
* collect the current chunk

ok.. this, together with factoring out the target-post code and dead code elimination, which is now as good as free, seems to work fine. next: org

small fix: instead of eliminating dead code, it's better not to generate it in the first place: the compiler will drop code that is not associated to a label.

Entry: conditional assembly
Date: Sat May 17 17:15:31 CEST 2008

Something that's not implemented yet: elimination of " if", which reduces to an elimination of an "or-jump". This requires some thought, but shouldn't be too difficult to do.

Entry: Jump chaining
Date: Sat May 17 17:18:16 CEST 2008

This needs to be performed at assembly time due to delayed computations.
It's straightforward though: examine the opcode at the start of the target word, and check if it's an unconditional jump. This optimization might introduce new dead code.. Is it possible to move it somewhere else?
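a sketch of the chasing step (representation invented for illustration): a word whose code starts with an unconditional jump is transparent, so a reference to it can be redirected to the jump's target, transitively.

    ;; a word = (name . instructions); a jump instruction holds the word
    ;; struct it jumps to, since addresses are not resolved yet.
    (define (chase word)
      (let ((code (cdr word)))
        (if (and (pair? code)
                 (eq? 'jmp (car (car code))))
            (chase (cadr (car code)))   ; follow (jmp <word>)
            word)))

    (define c (cons 'c '((movlw 1) (return))))
    (define b (cons 'b (list (list 'jmp c))))
    (define a (cons 'a (list (list 'jmp b))))
    (car (chase a))   ;; => c : both a and b reduce to a jump to c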
Entry: implementing org using new data structures
Date: Sat May 17 17:25:38 CEST 2008

org is a property that can be attached to a word chain. what i'd like is a way to "inline" code that creates different code chains, without affecting any optimizations. this would also be handy for anonymous words.

Entry: org again
Date: Sun May 18 12:23:03 CEST 2008

It's not so simple, because there is no way to guarantee there's only a single chunk going to be compiled: the problem is 'org-pop', which needs to restore the current/code/chunk state. The right way to solve this is to dump the whole structure on the control stack, and restore it on org-pop.

OK: push-chain and pop-chain work. it's now possible to create new word chains while compiling another one.

It's funny how Forth syntax's inherent lack of nested structures makes you appreciate the simplicity of s-expressions. However, it's easily solved by introducing balanced tokens. So, what about this: if a label's name is anything other than a symbol, it will be evaluated and used as the code location.

Some more changes: forth/forth-tx.ss will now save the prelude code under a #f label. scat/ns-tx.ss is changed so #f labels do not have an associated define, but DO evaluate their expressions for side effect (which will define a target #f word). Changed wrap-macro/postponed-word to not create a target word struct if there's no word name.

Entry: delimited control again
Date: Sun May 18 17:11:49 CEST 2008

Now.. hold that thought. Maybe it is better to introduce a proper nesting structure to get at macro code.. observe:

* 'reset' needs to be a macro
* 'shift' can be plain code

what about making reset ']' and putting the logic of shift in the balancing word? i.e.

    for[ 1 2 ]

Maybe it's best to return to shift/reset from the semantics point of view, and not from the particular scat->scheme implementation i'm using.

Entry: Labels and code
Date: Sun May 18 18:48:18 CEST 2008

so.. maybe it's time to forget about splitting the target code into words? is it a false abstraction? not really.. but the current fallthrough mechanism does look a bit clumsy. the problem wrap-macro/postponed-word solves is the creation of wrapper macros, which is quite essential as it allows ALL names to be handled by the PLT module + lexical scoping.. whatever the representation of the code that generates the assembler is, is moot. currently it's this:

    (#f . prelude-macro)
    (word0 . macro0)
    (word1 . macro1)
    ...

the macros are then wrapped with a split (label) and concatenated again. a different, possibly simpler implementation would be to collect all names separately, and define a single big macro that generates the module's code. the problem there is that inline macros need to be handled differently... so let's stick to the current implementation.

* macro definitions are clearly delimited: one name for one macro, no multiple entry points.
* forth definitions have multiple entry points + there's an unnamed prelude at the beginning of the file. (names merely interleave one big macro that generates the code body)

Entry: org again
Date: Sun May 18 19:57:01 CEST 2008

ok.. org-push and org-pop now work: they will compile a single chain of words.. however it's not what it should be!

* it's still impossible to SET org permanently.
* multiple chains will 'org-pop' by themselves..

why is this so difficult to get right? probably because i'm trying to keep the effect local, while org is really a global effect on the state of the assembler. so... can org-pop somehow guarantee there's only a single chunk compiled? no.. we'll get there eventually.. just need to find the right abstraction.

another problem: compiling a jump table will look like a bunch of unreachable code.. (a jump table is a bit of a hack) the jump table is easily solved by using a different word to separate entries, which could enable some extra checking..

so, to look at this from the bright side: requiring a restricted bondage-style structure for the compiler exposes a lot of corner cases that exploit side-effects of low-level constructs. such side effects need to be eliminated: the core needs semantic simplicity, where the semantics is close to machine semantics for data operations, but closer to abstract semantics for control structures.

Entry: jump tables
Date: Sun May 18 20:40:00 CEST 2008

Up till now these have been an abuse of ';' which breaks with the new dead code eliminator. So, how to fix that?

    : dispatch route
        read ; write ; help ; reboot ;

the 3 last jumps will be eliminated. so a different macro is necessary. something like

    : dispatch route
        read , write , help , reboot ;

with a bit of abuse of notation: comma is polymorphic and operates both on CW and on QW. on CW it compiles a jump without an exit.

Entry: quoted macros (the 'address' word)
Date: Sun May 18 22:40:40 CEST 2008

There's a problem with

    ' abc

This really should produce [qw #], where # is the SCAT word representing the macro that postpones the compilation of the word, such that

    (' abc compile) == (abc)

Note: choose 'run' instead of 'compile'. But there's one problem. What is this?

    ' abc ,

On QW values, comma will always produce a [dw], so this should compile the address of a function, and fail if it's a generic macro. Ok, i solved it before, it's "address". So, i wonder.. Can't this be done automatically, as part of a postprocessing step before things are handed to the assembler? or as part of the target evaluation code?

What's the idea here: postpone the conversion from macro -> target-value as long as possible, because the former is more general, but cannot survive the assembler. the problem is the inclusion of such values in assembler expressions: in that case the expression evaluator needs to be aware of them. target-rep can't know about macros (to simplify the design), so either/or:

* catch all macro instances before they go into a (target: ...) expression or end up as a plain macro in the assembly code.
* use an explicit 'address' after using the tick operator.
* give target-rep a means to evaluate macros.

the middle one might be best.. that way representations of words (quoted macros) are different from addresses. essentially, they are..
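the distinction, sketched (constructors made up for illustration, not the target-rep api): a quoted word travels as a macro, which is the general form; 'address' coerces it to a target value, the only form the assembler's expression evaluator accepts, and it fails for macros that have no instantiated word behind them.

    (define (quoted-macro word fn) (list 'macro word fn)) ; word = #f for pure macros
    (define (target-value word)    (list 'tv word))       ; survives the assembler

    (define (address qm)
      (let ((word (cadr qm)))
        (if word
            (target-value word)    ; instantiated: take the word's label
            (error "address: not an instantiated word"))))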
Entry: i need this to be done
Date: Sun May 18 23:27:34 CEST 2008

I'm a bit fed up with mucking about in the low level architecture. Apparently, a sane combination of high level constructs (i.e. code graphs) and low level features such as fallthrough makes things complicated, and leads to some tough choices. Anyway, it does look like I'm at some kind of end point with this. It's still quite elegant and powerful. One point needs some more exercise: the construction of anonymous macros. This probably needs a move to a lazy architecture for macro representation. Maybe instead of concentrating on for .. next and dynamic macro creation, i should really concentrate on static anonymous macro defs first..

The words [ and ] are not used yet. Let's turn them into static anonymous macro creators.

EDIT: see, it's getting big before it's documented. this facility is already there, but using the s-expression syntax:

    box> (macro:: 1 2 (3 4 5) run 6)
    (qw 1)
    (qw 2)
    (qw 3)
    (qw 4)
    (qw 5)
    (qw 6)

ok, that was straightforward: (forth/forth-tx.ss)

    (define (open-paren-tx code exp)
      (let-values (((code+ rep) ((rpn-represent) (stx-cdr code))))
        ((rpn-next) code+ ((rpn-immediate) rep exp))))

    (define (close-paren-tx code exp)
      (values (stx-cdr code) exp))

NOTE: this opens the road for a lot of functions expressed as hofs, i.e. ifte.

Entry: code annotations
Date: Mon May 19 11:54:18 CEST 2008

it's really not working well to derive a symbolic representation from the code: sometimes there just isn't any, due to the effect of macro transformation. it's probably best to just store the source location, since it's only for documentation purposes.

Entry: purrr
Date: Mon May 19 12:08:46 CEST 2008

so what's special about purrr? if i'm to explain what this is about, the purrr language itself is rather central to the idea.

-> partial evaluation
-> extensive use of macros
-> functional metalanguage

Entry: instantiate left-over macros
Date: Mon May 19 12:38:22 CEST 2008

Maybe it's possible to leave quoted macros in the code and instantiate them? this would be a really powerful extension. Can be combined with turning local exit points back into return/jump ops.

Entry: assembler directives
Date: Mon May 19 13:29:15 CEST 2008

The brood assembler has relatively few assembler directives. This is intentional: the assembler performs ONLY linking and relaxation (and in the future possibly related operations that optimize these processes, such as code reordering.) However, in the PURRR language, some control over code location is desired. How to satisfy

- control over address location
- chained code to facilitate re-ordering

Yes, it's 'org' again.. Maybe it's best to let 'org-push' save the chain list too; that way 'org-pop' can ensure there is only one chain, which is what we want.. (maybe this should just push everything, making the internal compiler state accessible to some macros?)

Trying: pop-chain will save the recorded chains as only one chain. This works: it ensures at least compilation at the correct address. So, this fixes the chain bug for org-push/pop, but still doesn't provide an 'org'. Maybe this needs to be specified somewhere else?

Entry: next
Date: Mon May 19 21:09:55 CEST 2008

look at the plane notes, and entry://20080501-014904

two deep problems remaining:

- how to solve 'org' (or: is assembler state access allowed?)
- dynamically decompose macros (loop optimizations, lazy code)

the rest should be straightforward. i don't have the energy to tackle either of them atm.. can they be ignored, and postponed until after porting of the interaction code?

Entry: strict/lazy and macros
Date: Tue May 20 02:31:56 CEST 2008

so.. in a lazy language, fewer macros are necessary because evaluation order isn't much of an issue. in a strict language, the existence of 'if' and 'lambda' as special forms infects certain constructs (they have to be special forms also). now, where would a lazy language need macros? there is template Haskell, so i guess there is some need for metaprogramming..
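to illustrate the first point: with explicit thunks standing in for lazy arguments, 'if' can be an ordinary function instead of a special form, because nothing is evaluated before it is selected. (plain scheme sketch.)

    (define (my-if c then-thunk else-thunk)
      (if c (then-thunk) (else-thunk)))

    (my-if (= 1 1)
           (lambda () 'yes)
           (lambda () (error "never evaluated")))   ;; => yes

in a lazy language every argument behaves like those thunks, so this kind of control construct needs no macro; what's left for metaprogramming is manipulating names and definitions, which is where template Haskell comes in.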
Entry: use of monads in dsl implementation
Date: Tue May 20 11:24:27 CEST 2008

http://www.cs.yale.edu/homes/hudak-paul/hudak-dir/ACM-WS/position.html

Entry: constants and the 'parameter' word
Date: Tue May 20 13:39:41 CEST 2008

Constants are a sort of typed macro: they represent literal values, but are not necessarily completely defined in the core compiler: (re)definition of constants is necessary to obtain a specialized compiler that can generate code. The problem i run into is where to generate the error for undefined constants. Currently, target-value evaluation uses target-value-abort to signal that a value is not available. However, this should really be reserved for labels only.

-> fixed: partial evaluation of target-values NEVER calls the actual evaluation: this means make-constant can pass code straight to the assembler (currently: just the constant name)

fixed another problem: wrap-macro/mexit requires the state to be a compilation-state, while macro->code from 2stack.ss works only on a 2stack. added a mechanism to temporarily wrap the 2stack state in a compilation-state object.

so.. this can be moved to forth. simply add a word 'parameter' which will create a constant that's later to be redefined. a parameter is something more than a mere macro: it has a guarantee to produce only a single value. (i'm thinking about things like 'fosc' and 'baud')

    parameter baud
    parameter fosc

i can't call it 'constant' because of confusion with the way that word works in standard forth. so:

A parameter is a stub macro that produces a single literal value. These serve to parameterize low-level code without resorting to more explicit parameterization. (i.e. 'fosc' might influence a lot of timing related constants.) A parameter that is actually used to generate code needs to be (re)defined as a macro that produces a single value. Otherwise an undefined-parameter exception will be thrown at assembly time. Parameters are thus a somewhat controlled violation of the overall bottom-up structure of Purrr code.

small prob: the 2stack / compile-state code gave problems again. what i'm doing now is to avoid that problem and require parameters to be defined in forth code with mexit support.

EDIT: another problem with parameters: redefining them with another parameter is not a good idea: basically, there's a sequential element here (load).
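a sketch of the behaviour (invented names, no connection to the actual target-value code): a parameter starts out as a stub; referencing it compiles a delayed lookup, and forcing that lookup before the parameter is (re)defined is exactly the undefined-parameter error at assembly time.

    (define (make-parameter* name)
      (let ((value #f) (defined #f))
        (lambda (msg . args)
          (case msg
            ((define!) (set! value (car args)) (set! defined #t))
            ((ref)     (if defined
                           value
                           (error "undefined parameter:" name)))))))

    (define fosc (make-parameter* 'fosc))

    ;; code that uses fosc stores a delayed computation:
    (define baud-div (lambda () (quotient (fosc 'ref) 9600)))

    ;; the project file later defines the parameter; only now may the
    ;; assembler force the expression:
    (fosc 'define! 40000000)
    (baud-div)   ;; => 4166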
Entry: bored
Date: Tue May 20 19:53:21 CEST 2008

i'm getting thoroughly bored with this.. need to find some new tricks or get something going, because i'm losing motivation.

Entry: hooks / late binding and kernel modularity
Date: Wed May 21 15:06:03 CEST 2008

this problem is more serious than i thought. it's not only used for constants, but for generic macros. maybe add something like a 'macro-hook' which is like a parameter, but doesn't guarantee anything about code generation.

i need to think a bit deeper about linking and modularity.. the pure bottom-up approach won't work well. maybe the 'unit' approach is really better? is it possible to import a module called 'link.f' that implements the cyclic name resolution? am i going to be anal about names? i'm already going quite far with early binding.. consistency counts.. the 2 uses are:

* compiler extension by redefining some core macros (i.e. 'dup')
* code parameterization (both constants and generic macros)

wait a minute.. if code generation can be postponed until all the macros have been loaded, then simply adding stub macros that are redefined later would work just fine.

maybe best to take the inelegance and get the damn monitor to run.. in essence, it's a problem with the .f source code. any abstraction necessary to make that code more modular can be added later.

EDIT: it really gets in the way.. 'route' gives problems. but that can be imported? this problem is solvable, but requires some thought..

Entry: another layer?
Date: Thu May 22 11:24:05 CEST 2008

I was thinking about putting target-compile.ss in forth/ because it's mostly about extending the macro/ stuff with features necessary for code instantiation and target label management. Or, it should be placed in compile/

Can macro-lang.ss be made independent of target-compile.ss? Yes, when macro-lang.ss splits off a target-label specific part. This code should move to compiler/ (which is badnop/ now -> get rid of that name)

Entry: units
Date: Thu May 22 11:38:11 CEST 2008

I need separate compilation with clearly defined interfaces for some components. One would be the logger: since it cuts through everything, changing its code requires recompilation of the whole codebase. A nice excuse to try to understand units, and then to move on to using this for .f files too.

Entry: another bug in redefine
Date: Thu May 22 14:22:19 CEST 2008

whenever a word is created, it creates a replacement macro. this macro should have redefine enabled also. (i enable mutation and things start going wrong..)

Entry: no more juice
Date: Thu May 22 21:13:30 CEST 2008

Looks like i need to take some time off of the project, do some other things. Looking at what i did in the last 2 weeks:

* delimited continuations for loop body optimization: strict vs. lazy
* trying to fix org; it's still not fixed (language design issue: i have no way to annotate this in the current structured representation)
* struggling with specialization (redefine + super) and plugin behaviour.
* trying to write documentation for the project
* thinking about simulators, and simulator generators

The good things that happened: cleaned up the compiler data structures + separated the postprocessing optimizations. Those look nice now. The rest was a random walk. However, the EXTEND and LINK problems are quite important, and as far as i can see the only real hurdle.

Entry: more juice
Date: Fri May 23 09:30:49 CEST 2008

got a good night's sleep + some ideas about writing documentation today:

- write more docs: create a reference doc extractor
- separate some code and change names to make the module hierarchy more clear
- write something about forth and closures

Entry: introduction documentation
Date: Fri May 23 09:31:17 CEST 2008

- It is about language:
  * Lisp (more specifically PLT Scheme)
  * Forth (the Purrr dialect)

- It is about Meta-language: Macros
  * S-expressions (Lisp) and concatenative syntax (Forth) are easy to process. It's possible to make an all syrup Squishee.
  * Forth, viewed as a functional language, has an arbitrary evaluation order. This presents an opportunity for generating static, specialized low-level code from high-level templates by employing implicit compile-time evaluation. The Purrr experiment is about making Forth more declarative.

- Design is accessible.
  * Unit of composition = Scheme module.
  * Forth source files are Scheme modules.
  * Design is layered: Scat, Macro, Compiler, Forth syntax, ...

- Goal
  * Small business and enthusiasts first
  * Test in an industry setting (needs a specific problem to solve)

Now, this should be elaborated in a couple of chapters, with lots of examples.
Entry: Community bootstrap
Date: Fri May 23 11:23:18 CEST 2008

A plan to attract developers. What is necessary?

- it should work relatively flawlessly for 1 target
- it should be well-documented
- extensions should have a clear API

The first two are mostly perspiration. The real challenge is to standardize some APIs. I am hesitant though about standardizing too much: the aim of the project remains the construction of a tool for 'from scratch' development.

- Purrr language extensions

  These are at the library level, and can be developed separately from the core project. Purrr standardization is mostly ad-hoc, but due to PLT's module system, glue layers are fairly straightforward to maintain, and standardization can be made the responsibility of the system designer.

- Processor extensions:

  Target-specific extensions can be separated entirely from the Staapl core. The PIC18 architecture specification is an example of this. It boils down to:

  * Creating an assembler. This can be a layer on top of an external assembler, or a specification in the assembler generator language. (see pic18/asm.ss)
  * Creating a set of macros (compiler extension). Map the Purrr language to the machine structure (data stack, return stack) and implement the primitives. (see pic18/macro.ss)

  The interfaces for these operations are now fairly standard.

- Internal compiler extensions:

  Changes to Scat, Macro or the compiler would best be incorporated as core changes. Such changes can be temporarily forked, and later merged into the main distribution. However, I expect this part of the program to be relatively stable.

- Simulator extensions:

  As an addon to the interactive part of Staapl, a simulator (generator) would be nice. The interface for this still needs to be developed. This could be written as a processor extension.

- Extra languages:

  A future goal is to construct a static dataflow language to supplement Forth code for building DSP applications. How to structure this isn't clear yet.

- Application spin-off communities

  I'm thinking about Sheep, CATkit and KRIkit: the applications that have been used to battle-test the Purrr language. This would involve some kind of Purrr library standard. I'm thinking about writing something a bit closer to ANS Forth, but there is insufficient project pull to do it atm.

Entry: redefining names
Date: Fri May 23 12:37:48 CEST 2008

Let's re-iterate the design choices:

CASE A: compiler specialization
CASE B: parameterized kernel code

for CASE A the choices are: explicit renaming vs. implicit redefinition. for CASE B: linking (using the mechanism from CASE A) vs. explicit instantiation.

EDIT: it might be best not to lose too much time trying to fix this now, and 'emulate' the classic load-style redefine behaviour until a better abstraction mechanism pops up.

Entry: namespaces and parameterization
Date: Fri May 23 13:11:54 CEST 2008

Instead of trying to use late binding, it might be more interesting to use explicit parameterization where possible. This is one of the problems with Brood 4 which caused a lot of pain. Let's fix that first. Taking the interpreter code as an example, what i'd like to do is to make instantiation explicit:

    : interpreter ' io make-interpreter compile ;

There's an interesting problem here with the choice of words. Maybe here 'compile' should be used to indicate that there's an instantiation going on. It's not always clear whether something is dynamic or static.
There's a difference between:

    : interpreter int-body ;

and

    ` interpreter create-interpreter

The latter is preferred. It is more general. But it is not possible in the current implementation. Labels can be defined, but they are only for annotation, and visible AFTER compilation. Can this be simplified? What is desired is something like

    ` bla

where at that moment the current context has a macro called bla, which is not yet associated to any code. So how to associate a macro to code? Maybe instead of mapping Forth code directly to 'define' it should be mapped to a 2-step procedure of undefined macro creation + single assignment.

I'm touching some core of reflection here.. Something that doesn't really work well together with the way the Purrr language is laid out. There's no problem with doing this in s-expression syntax, but expressing this in Forth syntax is not so easy.. Let's see.. Quick hacking it:

    \ create the macros
    declare read
    declare write

    \ define them
    ' read ' write make-io!

But this requires make-io! to have side-effects + read and write not to appear in any code before make-io! is executed. That is definitely not desirable: it would be back to the old imperative style. So, what about using this as the primary syntax:

    [ 2 3 4 ] define bla

and writing ':' as a substitution macro?

Continuing the random walk. It is possible to do something like:

    : read io 2nd ;
    : write io 1st ;

But that's exactly the thing that's not flexible enough when a lot of macros need to be created. Is it really a good idea to try to solve this in Forth syntax, since it obviously has some shortcomings.. It looks like the only way to solve this is to use 'parsing word' preprocessors. Maybe the goal here is to figure out a way to create such substitution macros in .f files? It's certainly possible to create scheme files which have the 3 levels:

- parsing words (bound to scheme level 1)
- macros (scheme level 0)
- forth words: consequence of hitting 'compile'

The problem really is how to map the scheme s-expression flexibility to Forth code. Can't be done? This is a dead end.. Solve it with substitutions and be done with it?

    ((io read write)
     (:macro read io drop compile
      :macro write swap drop compile))

Entry: Explicit instantiation and macro assignment.
Date: Fri May 23 14:04:35 CEST 2008

EDIT: comes from the blog, now degraded to a rambling because it's confusing given i'm embracing 3-level code now (word, macro, parser)

There is one pattern that is cumbersome to express at this moment:

* Write logic as macros without specifying concrete names.
* Provide names during instantiation.

If this is about creating a single function or macro, it's straightforward:

    : interpreter ' receive ' transmit compile-interpreter ;

Because the .f syntax at this moment assumes that all macros occur in isolated (macro . code) pairs it is not possible to define a collection of functions/macros. What is desired is something like:

    ` read ` write create-io

which requires some form of mutation, even if it's single assignment. Alternatively, parsing word preprocessors could be used:

    create-io read write

to expand to

    :macro read make-io-read ;
    :macro write make-io-write ;

This requires a special purpose syntax, i.e. something like

    parser create-io read write ==
      :macro read make-io-read ;
      :macro write make-io-write ;
    end

The problem with this is that parser-generating parsers are hard to express. Is it possible to create a single abstraction that does not limit the number of orders?
It looks like programmable parsers are necessary anyway when macros can't deal with identifiers. The good thing about using the previous approach is that it maps very well to Scheme's define and define-syntax.

TODO: place this in the framework of compilation phases and figure out a better syntax for defining parser macros.

Entry: substitutions
Date: Fri May 23 16:11:41 CEST 2008

Why are parsing words necessary? Because modified semantics are allowed for symbol definition (':').

What about a Forth syntax for substitution macros?

    parser variable name == create name 1 allot ;

This would solve virtually all problems, since it gives access to named substitutions, but it adds a level of inelegance to the language. Well, sort of: they are already there, so why not make them available.. These would make sense in interactive commands too. (What about ditching all this crap and creating a flexible alternative s-expression based syntax, i think..)

So, considering that I don't want to lose the good things about Forth syntax (prefix syntax to eliminate parentheses) I guess I have to learn to live with the bad things about Forth syntax (the necessity for an extra composition mechanism due to prefix syntax). It's not all bad, just some trade-offs..

So, prefix substitutions. They are not like parsing words, but can be used to emulate them. They are the 'last resort' composition mechanism, used to capture prefix patterns. If I'm going to embrace them as one of the features of Forth syntax, it might be wise to make ':' not a primitive. More generally, it might be wise to have this as a layer on top of a simpler, single assignment preprocessor.

Entry: Single assignment base language
Date: Fri May 23 19:23:32 CEST 2008

So, I'm going full circle. Back to postfix notation with a (set! name value) component? Is it actually possible to do so? Prefix primitives:

    [ ]        macro composition
    '          macro variable dereference (+ creation?)
    macro!     single assignment definition
    forth!
    variable!

This would require a separate state machine (that resembles a forth) to run during the preprocessing step. This looks like a nice solution, but i can't help thinking: why is this DIFFERENT from the Scat state machine? Can this be written in Scat? And don't i introduce composition problems again? A scat machine with state (stack, in, out) should be able to parse this without problems. Now I'm confused.. caffeine poisoning..

Entry: More standard forth syntax
Date: Sat May 24 11:49:11 CEST 2008

Maybe it's a good idea to have an interpret mode anyway, at least during the parsing phase. This would make it a lot easier to deal with standard forth syntax like 'constant'. Looking at what the parser does now, this is already the case. Only there's just a single mode: compile. Adding an interpret mode, the language that's active could be plain scat.

Ha, i'm using the [ and ] words for code quotation. Looks like that's exactly the opposite of how they would be used in Forth.

Questions:

* is it possible to solve this without interpret mode?
* using substitutions only, does the hygienic system provide enough freedom?

By the latter i mean

    (io read write) ==
    (:macro io 1 2 3 make-io-object ;
     :macro read ` read io ;
     :macro write ` write io ;)

The 'io' name is not visible in code since it's introduced by the macro. What can be seen here is that pattern replacement macros need a special terminator.

    parser io read write ==
      m: io 1 2 3 make-io-object ;
      m: read ` read io ;
      m: write ` write io ;

    parser bla ... == ...
    end

Going that route, why wouldn't you write everything as a parser instead of macros? The problem is: parsers allow you to deal with NAMES, while macros allow you to deal with CODE. The fact that macros are associated to names is a practical matter, but they are not allowed to modify or create names. Parser words are necessary because prefix syntax is the ONLY way to modify the semantics of names, other than to reference the macro that is bound to them in the current environment.

This might be a bit confusing. There are 3 things that can be bound to names:

* parser extensions -> non-concatenative source code preprocessor
* compiler macros -> concatenative (compositional) code generation
* forth words -> macro instantiation

It's probably possible to design a language that doesn't need the first step, but technically even scheme has this: the reader, which has special features. Having to do it this way is a FUNDAMENTAL limitation/feature of Forth. This is DIFFERENT from Forth because it is pure input word stream substitution. Forth's parsing words operate on the input stream directly. This is one of the reflective properties of Forth that is eliminated in Purrr.

EDIT: But, adding this extra level, does it stop there? How to create parser macro creating parser macros? The problem is easily solved with s-expressions, but hard to do with the current implementation: what is necessary is to add semantics AFTER collecting tree-structured code. This is what s-expressions can do: the parse step follows the read step. Is it possible to bring this to Forth?

TODO: write something intelligent about this problem. it's a deep one, related to syntax and reflection: the difficulty of unrolling Forth.
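"pure input word stream substitution" can be made concrete with a toy preprocessor (sketch; names and token representation invented): a substitution rule consumes its arguments from the token stream and splices its expansion back in, without ever touching the reader.

    ;; variable <name>  ==>  create <name> 1 allot
    (define (preprocess tokens)
      (cond
        ((null? tokens) '())
        ((eq? 'variable (car tokens))
         (append (list 'create (cadr tokens) 1 'allot)
                 (preprocess (cddr tokens))))
        (else (cons (car tokens)
                    (preprocess (cdr tokens))))))

    (preprocess '(variable x variable y))
    ;; => (create x 1 allot create y 1 allot)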
Entry: parser cleanup
Date: Tue May 27 11:06:38 CEST 2008

EDIT: this is a mess.. i've reached the conclusion that the 3 layers are necessary, so it might be best to think a bit about the best way to represent the highest layer..

gut feeling says the current problems with the parser (the non-concatenative part of Forth syntax) are rooted in the way it's implemented.

phase 1: what is ':' ? this, in addition to creating a new label, will terminate the previous definition. following the colorForth model, there is no interpret mode, so definitions run up to the next word. those 2 behaviours need to be split up. (simplifying: why not have macros with multiple entry points? this would not be too hard to solve really.)

so, let's have a look at the single-assignment language for Purrr parsing. the problem to solve is: How to implement a Forth-style syntax (without interpret mode) on top of the purely concatenative Macro syntax used by Purrr. To re-iterate why this is a problem:

* I'm convinced purely functional and purely compositional macros are a good idea: they behave well as a data structure for program representation and code generation (compiler structure).

* Forth is only a thin layer on top of this, mostly to solve the definition of names in a source file using a familiar syntax. Forth syntax in itself is a good user interface. However, the design of the Forth language is firmly rooted in reflectivity, employing an image-based incrementally extensible word dictionary, something i'm trying to unroll into a language tower.

* There is already a base syntax using s-expressions which translates directly to Scheme.

The problem now is to find a way to translate (linear) forth into tree-structured Macro expressions. One solution is to

* create an intermediate language syntax that has all the features needed in Forth, but uses s-expressions and single assignment.
* map this language to an explicit list of definitions
* find a mapper from linear forth -> s-expressions

This should work for the 3 levels of code:

- macros
- words (instantiated macros)
- parser (Forth source transform patterns)

First attempt in forth/single.ss

Maybe the real focus should be to embed s-expressions in the Forth language. This gives parse-time lists. Care should be taken not to create yet another level that's hard to metaprogram. This needs to sink in a bit.

Entry: parser idea
Date: Tue May 27 14:21:56 CEST 2008

To further distill the actual idea. There are 3 levels of concatenative code, each with its own interpreter. The real question is: can't they be unified into one?

Macros

  These are Scat words that create target (assembly) code. They are the simplest kind, being built directly on top of Scheme (module) name spaces and an RPN parser. This parser reads from an input stream, and accumulates an output Scheme syntax.

  IN = stream of identifiers
  OUT = accumulation of nested Scheme expressions

Target words

  Obtained as instantiated Macros. In macro/target-compile.ss

  IN = stream of macros
  OUT = accumulation of target code words

Forth preprocessor

  This associates names to Macro or Target code.

  IN = stream of Forth syntax
  OUT = list of (name . value) pairs.

Are they all really necessary? Macros are the core programming construct and serve to describe programs. Target words determine the instantiation level of code. The Forth preprocessor allows naming of macros and target words.

The target words layer is a consequence of manual code instantiation. This is a feature: i want to have this level of control. In principle this could be eliminated when the language is made a bit more high-level (i.e. elimination of return stack access). The preprocessor layer is necessary because of limited reflection: if macros cannot create named code, constructs that abstract this cannot be macros, so they need to be preprocessors.

Whether all 3 layers can be implemented in more of the same way is an interesting but not so urgent question. Whether they have to remain in existence is easy to answer: yes. They are a consequence of two important design choices:

* explicit instantiation (macro vs. forth)
* non-reflectivity to simplify code processing and namespace handling

Entry: macro instantiation is memoization
Date: Tue May 27 14:51:34 CEST 2008

What about looking at the code instantiation problem as a form of memoization? A macro that is inlined twice can be replaced by a single instantiation and an indirection. Doing this automatically could lead to a simpler (beginner) language that does not need a programmer-specified distinction between Macro and Forth modes.

Entry: next
Date: Tue May 27 16:14:53 CEST 2008

- fix the parser = a base language + an extension syntax.
- add a temporary 'load' function to supplement 'require' for old-style Forth: it might be better to keep the mechanism in there, instead of forcing the use of bottom-up modules.

Entry: embedding s-expressions in Forth code
Date: Tue May 27 16:27:19 CEST 2008

In order for the parser code to work properly, new parsers should be able to work from within a file. This creates a problem because not all forms can be identified before the parsers are created. This problem can be bypassed by allowing genuine s-expression syntax for the parsers. That would also allow parsers to be nested.
    { parsers
      { { variable name }  { create name 1 allot } }
      { { 2variable name } { create name 2 allot } } }

Why not make this go a bit further and allow generic scheme code? This would solve basically all other syntax problems with forth files. Maybe this is also the right way to map Forth syntax to scheme..

A quick hack works fine, except for one small problem: it doesn't get evaluated at the right time. Maybe moving everything to toplevel by using this as primitive syntax can help?

Entry: rethinking forth parsing
Date: Wed May 28 12:43:27 CEST 2008

The problem to solve is to identify parser commands before code is parsed. This is only possible when all s-expressions are collected before linear source code is parsed. This means that it's not possible to define parser words that expand to s-expressions. There's a chicken and egg problem there that deserves some attention. The problem is that identifiers can't really be identified as in scheme. -> look into how this expansion stuff works.

This works fine in scheme:

    #lang scheme/base
    (define broem (bla))
    (define-syntax make-bla
      (syntax-rules ()
        ((_ bla) (define-syntax bla
                   (syntax-rules ()
                     ((_) (+ 1 2)))))))
    (make-bla bla)

So, just expand everything to this; it should work fine. Maybe the RPN compiler should be simplified a bit, using syntax parameters instead of compile time parameters.. Makes sense?

Entry: today
Date: Fri May 30 00:42:34 CEST 2008

was a day of writing 6 paragraphs of introduction. i think i sort of got it going: the reasons for brood:

* lisp is cool, especially for metaprogramming
* create a small language to metaprogram from within lisp

the problem is that to really leverage this, some knowledge about languages and implementations is necessary. i'm still not sure about how to sell this to electronics engineers that don't see the point of lisp..

about the code: the parser patterns seem to be important as a final 'highest level of metaprogramming'. i'm still not convinced it is a good idea, but it seems to be a consequence of wanting to keep old forth syntax instead of s-expressions. i need to spend a day writing and thinking about this.

Entry: the parser again
Date: Fri May 30 10:11:18 CEST 2008

the problem is that parser macros defined in an .f file should have immediate effect. can this be local-expanded or something? i ran into this deeper problem: parsing isn't really factored when being inside a single rpn macro: big reliance on dynamic variables. can this be replaced by something else? probably not so easy.. let's re-iterate over the expansion algorithm in PLT Scheme.

done.. basically, it can fish out 'define-syntax' before expanding the value expressions in 'define'. so, maybe it should be a true preprocessing step like it was before: one that converts the forth language to 'compositions' forms. the thing is, inside a 'macro:' you really don't want any forth prefix syntax. this should be s-expression syntax only. the only exception is locals. -> probably best to separate the changes into those that introduce new global names, and those that don't.

EDIT: not necessary. both data and code quotations are expressible in s-expressions and are part of the RPN syntax. local variables can be added by using definitions like ((name param ...) body ...) instead of (name body ...) (locals are really special: they make sense only when parameterizing bigger code chunks, with lots of parameter reuse..)

hmm... some things are not really very well disentangled yet.. maybe i should accept that 1. macro is clean == enough, and 2.
the forth on top of that has some hacks to support standard forth syntax on a system that doesn't have forth's kind of reflection.. damn, this is complicated..

one remark about my method though: forth-tx is too complicated. i find it difficult to make modifications because it is a hack on top of rpn body creation. at this moment i don't really see how to change that without introducing more than one abstraction layer.. maybe it needs a couple of days rest, i'm not coming up with anything exciting here..

Entry: syntax parameters
Date: Fri May 30 11:17:57 CEST 2008

is this true? using syntax parameters, i can get rid of the hack that calls the transformers directly. need to be careful there: every time the expander is called again, names are marked. because i'm using this mechanism to build a single lambda expression, names might make more sense unmarked..

Entry: Forth syntax, philosophical approach
Date: Fri May 30 15:35:18 CEST 2008

meaning, through natural language.. i do this too little with the tough problems that turn out to be huge time sinks..

The problem: It should be possible to _define_ new Forth substitution words, which is implemented by define-syntax, _before_ the expansion of body code. In Scheme, due to the use of s-expressions, this is easy. In Forth however, the names are buried inside a muck of words: expanding all substitution words to expose those words that might yield the definition of _new_ substitution words is (probably) not possible.

Question 1: is it at all possible to fish out these macro definitions? If so, how?

Question 2: if it's not possible, can we formally acknowledge it as a shortcoming of Forth syntax and work around it?

Entry: fix it later?
Date: Sat May 31 10:11:37 CEST 2008

since this is more a pride issue than anything else, can it be fixed later? probably.. it's just about

* syntax for parsing words
* allowing 'load'

'load' can be implemented using include/reader: specifying the reader is essential since it needs to expand to some form that can be included in a file. this means i have to construct

1. a scheme syntax to define forth files
2. a reader that gives this scheme syntax
3. a module-reader in terms of those 2

the point to start is purrr/forth.ss: this file contains all the logic necessary to expand module syntax.

Entry: apologies
Date: Sun Jun 1 00:14:32 CEST 2008

explain: why (forth) macros are actually (scheme) functions, and code needs to be compiled at (scheme) runtime. (i.e. why is there one level (actually 2 if you count the assembler) that has manual compilation?) -> derive from this a proper instantiation syntax for forth code.

EDIT: the explanation is simple: a significant part of the program is a long-lived target code juggler. that part cannot be just syntax. the proper instantiation for forth code is of course 'define-ns' in the (target) namespace. then all words are accessible through reflective operations.

Entry: module system
Date: Sun Jun 1 00:56:05 CEST 2008

http://calculist.blogspot.com/2008/04/dynamic-languages-need-modules.html

    StoneCypher said...
    It is important that you learn a well established language that has
    already successfully grappled with these problems before deciding on
    your own mechanism.

very true..

Entry: rethinking code instantiation
Date: Sun Jun 1 10:54:22 CEST 2008

it's frankly too complicated and ad-hoc atm.. i lost oversight.
these features/choices make it complicated:

* forth words have associated macros
* multiple entry and exit points
* forth parser is a single macro, but uses factored macros

macros are really simple (declarative), but forth syntax +
instantiation makes it a lot more difficult. code instantiation
produces a single macro that runs with compilation-state to produce a
collection of fallthrough words. this is the remainder after lifting
out all macro and parser definitions.

what about making instantiation an operation on macros? i.e. replace
a macro with a wrapper, and collect the body instantiation somewhere.
this doesn't work for multiple entry points though.. so: multiple
exit is easy: it's simple to fake in macros using a 'macro return
stack' in compilation-state. multiple entry points however are quite
difficult. is it possible to do this?

1. bring the representation back to single entry point
2. write multi-entry point code / fallthrough as an optimization

i don't think so.. there is too much code that relies on multiple
entry points. i need a simpler way to represent it.

Entry: again
Date: Sun Jun 1 11:13:26 CEST 2008

try the meta level here: i'm losing oversight because it's not
working: i can't make small changes to see how they propagate through
a working system. the real problem that started this is the inability
to define parsing macros, which led to the realization that these
need to be instantiated before the rest of the code is parsed
(chicken/egg problem), which led me to think that this is impossible
unless some form of partial expansion is used, which is then made
difficult by the way parsers are implemented: by directly calling
them. in short:

problem A: it's difficult in the previous setting to get to a
structure where the 'define-syntax' occurrences can be isolated
before they are used. it's easy to solve this by requiring them to
occur in different files.

problem B: it's currently not possible to include a forth file
because the module expander is not factored properly.

let's tackle B first. what's wrong with the current
forth-module-begin-tx macro? the whole register-code! business is not
so good. a forth: macro should take an extra argument to represent
the instantiated macro. 'register' is used in the macro expansion to
store the word struct produced by the wrap operation. 'compile'
produces the code graph. the problem is: i'd like these to be
composable from different sources, so a chunk of forth code needs to
produce something that can be accumulated later into something else.
the idea is correct, only the implementation is clumsy: instantiation
needs to be solved in a single place, then forth syntax needs to be
built on top of it.

Entry: nobody uses frameworks
Date: Sun Jun 1 12:49:28 CEST 2008

what are the entry points? the ui? brood needs to be api'd as a
library. it needs to be a straightforward set of macros on top of
scheme, no callback nonsense.

Entry: forth syntax / code instantiation
Date: Sun Jun 1 14:12:34 CEST 2008

split the forth macros in 2 parts: those that create new names, and
those that do not. (the latter contains locals, quote and code
quotation). note that quoting doesn't need forth syntax: there is a
corresponding s-exp syntax. if the same is done for locals, then
forth syntax can be used exclusively in .f files or a forth: macro
where it introduces names.
so: the (forth-toplevel form compile (defs ...)) form does:

* expand to a special form (i.e. begin or #%module-begin)
* bind 'compile' to a function that generates words
* preferably have no side effects

the idea is that those forms can be composed (i.e. using 'load') and
that the toplevel module namespace initiates all code compilation.
the problem is not the expansion to definitions (either 'define' for
macros or 'define-syntax' for parsers): it uses the toplevel 'begin'
form. the problem is registration of the forth words / instantiated
macro. binding it to a given name is not necessarily a good thing. i
really need to think a bit about how this is used in toplevel project
namespaces, both for one-shot and incremental code compilation.

maybe it is best to put all word instances in the toplevel namespace
AND allow for a mechanism to collect them (maybe from the namespace
using reflective operations?) this could be an algorithm akin to
garbage collection: only compile the code reachable from the roots =
exported (forth) namespace names. it only needs a way to take care of
the recursive definitions: the word instances are defined in terms of
macro names, and can only be evaluated AFTER all macro bindings are
evaluated. because of the level split (forth macros are scheme
functions) this has to be done manually. it's easiest by just making
all forth code into promises though.

the only problem remaining is fallthrough: how to guarantee the
correct order of compilation? this is one of the main reasons why a
simple mapping from name -> datum isn't really possible: the order is
important. so, the problems:

* forth code is ordered (supports multiple entry points)
* evaluation of forth code requires all macros to be bound, so it has
  to be done after evaluation of macro body expressions. this is
  solved atm once per module, but due to 'load' this operation needs
  to be composable:
  -> compose the definition of macros
  -> compose the forth instantiation macros

it's probably ok to bind the labels to names so they can be accessed
through reflective operations later, but it's essential to also
somehow orchestrate the compilation of the code. the remaining
question: when does the code need to be compiled? also, answer this
in light of incremental compilation (keep the namespace active, just
add in more code/macros). is it ok to assume some context?
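to make the promise idea concrete, a quick stand-alone sketch
(made-up names, not the actual staapl code): wrap each word body in a
promise at definition time, keep definition order for fallthrough,
and force only after all macro bindings are evaluated.

  #lang scheme/base

  ;; made-up registry: (name . promise) pairs, kept in definition
  ;; order because fallthrough makes order significant.
  (define *words* '())
  (define (define-word! name thunk)
    (set! *words*
          (append *words* (list (cons name (delay (thunk)))))))

  ;; a word body can refer to macros (= scheme functions) that are
  ;; only defined further down in the module:
  (define-word! 'foo (lambda () (dup) (drop)))
  (define (dup)  (printf "compiling dup~n"))
  (define (drop) (printf "compiling drop~n"))

  ;; only here are the bodies evaluated, in order:
  (define (compile-all!)
    (for-each (lambda (w) (force (cdr w))) *words*))
  (compile-all!)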
Entry: partial evaluation literature
Date: Sun Jun 1 14:49:19 CEST 2008

enough dabbling, i think i'm ready for reading:
http://www.dina.kvl.dk/~sestoft/pebook/

* pe = an operation on program text

Entry: namespace woes
Date: Mon Jun 2 19:35:21 CEST 2008

file:///plt/doc/guide/reflection.html

  Modules not attached to a new namespace will be loaded and
  instantiated afresh if they are demanded by evaluation. For
  example, scheme/base does not include scheme/class, and loading
  scheme/class again will create a distinct class datatype:

This is important for the global registry used for recording Forth
instantiations, and datatypes that are shared by the host meta system
(i.e. target and macro word structs).

Entry: fixing forth instantiation
Date: Tue Jun 3 11:14:34 CEST 2008

made the first changes to purrr/forth.ss so it generates just syntax.
need to change:

- handling of toplevel require forms using parameter
- compilation

maybe i should just give up the production of forth dictionary
'records' and just use the environment to dump stuff, making
compilation of a .f file a side-effectful operation.

SOLUTION: move everything that's not part of the instantiation macro
to toplevel using the forth-toplevel-forms parameter. this includes
macros (and maybe variable definitions?) and the definition of the
word structs. the remaining running state is just a single big macro
which inlines the word structs in the macro code using the 'label'
forth word. i.e:

  (define-ns (target) bla (make-target-word #:name bla))
  ... ,(ns (target) bla) label ...

this way, forth compilation is what it is: construction of a macro
that after instantiation gives a code graph. all OTHER stuff that
happens in a .f file (definition of macros, variables, imports, ...)
has a straight meaning as scheme module components and can be
recorded in a side channel, implemented by a parameter.

( i can't help but think about writing the parser words as scat
words.. this is yet another threaded state problem.. )

stubbornness: i'm just going to keep things as they were. cleaned up
purrr/forth.ss a bit + finally understood how namespaces can share
code, and it looks like this is enough to build the necessary
abstractions. NEXT: load

Entry: load
Date: Thu Jun 5 10:58:26 CEST 2008

got it working, at least with absolute paths. the trick is to call
the forth syntax reader directly inside parser-tx.ss, and to combine
source location info with the proper lexical information. next: fix
path + convert kernel's 'require' stuff back to 'load' so it can be
modularized later, and so that most hacks around late binding can be
simply replaced by loading stuff into the same namespace.

Entry: Simulator
Date: Thu Jun 5 14:05:22 CEST 2008

http://citeseer.ist.psu.edu/119550.html

What I need is a language for simulator design, or more specifically,
a strategy for compiling target code + some semantics spec to host
executable C code for optimal performance. An advantage is that what
needs to be simulated is usually quite lowlevel code with fairly
specific space and time characteristics. So basically, a state
machine description language is necessary. Something that can be
compiled to a massively parallel C program. If there is one place in
Staapl where partial evaluation is going to make a big difference,
it's there.

Instead of writing a simulator, building a partially evaluated
simulator might be a better idea, since for simulation, speed is very
important.

What is an instruction? It's a state update. state is memory. Memory
is a number of registers, with variable bit size. So an instruction
is something with the following properties:

* an endomap for (a subset of) the machine state
* timing information
* encoding (for instruction interpreter)

Maybe i should take a step back towards pure s-expressions for
instruction set spec, since these are a bit hard to compose (write
macros that expand to them). composition would help to define some
instruction classes.

  (addwf (f d a) "0010 01da ffff ffff")
  ((addwf f d a) ((#b001001 6) (d 1) (a 1) (f 8)))

Actually, the simulator descriptor language is as good as the same as
the dsp dataflow language. maybe i should do the latter first, then
generalize.
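a strawman record for those three properties (made-up names; not the
actual staapl representation):

  #lang scheme/base

  (define-struct instruction
    (name        ; symbol
     semantics   ; endomap over (a subset of) the machine state
     cycles      ; timing information
     encoding))  ; list of (opcode-bits-or-field . width) pairs

  ;; addwf, following the s-expression spec above; the state
  ;; update itself is left as a stub.
  (define addwf
    (make-instruction
     'addwf
     (lambda (state f d a) state) ; stub: add WREG into f here
     1
     '((#b001001 . 6) (d . 1) (a . 1) (f . 8))))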
Entry: The Dataflow Language
Date: Thu Jun 5 18:10:53 CEST 2008

see entry://20071211-093307

From the KRIkit project it's quite clear that some kind of dataflow
language makes sense for deeply embedded DSP applications. The main
problem I ran into during the project is a too low abstraction level
to write the modem code. Making the code more abstract at the Forth
level would have cost about 10x execution speed. Solving this at the
macro level is probably a better idea.

Some declarative dataflow or array language might be a good addition
to Staapl and a better match to RISC architectures with more direct
2/3-operand register file addressing modes than the 1-operand 8-bit
PICs. But what syntax should that language have?

The main problem that makes a straightforward dataflow language
impractical is expressing the linking between code modules whenever a
module has more than one output value. Using a scheme-like expression
syntax, some form of 'let-values' is necessary to name output values
when nested expressions are no longer sufficient due to the move from
a tree-structured data flow graph to a directed acyclic data flow
graph.

Why not use a concatenative/Forth syntax? Instead of using RPN
notation as a direct specification of control flow, which would fix
the execution order of operations as specified, it could be used only
to build the (parameterized) dataflow graph, which could then be
compiled and re-serialized specifically for the target architecture.
When this is combined with higher order functions (or the macro
version: combining forms) this gives the original Packet Forth idea:
APL with Forth syntax.

I have argued before that writing DSP code in Forth is difficult, but
this problem can be simplified using higher order (list/array)
combinators. What usually deters is the inefficiency of such
composition mechanisms when they are implemented at run time.
However, when these combinators can be eliminated at compile time (a
first-order language with higher order macros) it might be feasible
to have both highlevel programming AND efficient code.

A place to look for inspiration is probably Factor's data flow
analysis.

Entry: Array Processing
Date: Fri Jun 6 13:25:05 CEST 2008

Programming in an array processing language can be factored in two
steps:

* construction of primitive, pure many -> many functions
* mapping these over tensors

Entry: next
Date: Fri Jun 6 16:10:22 CEST 2008

paths in load.

* find a file in path.

now, allowing files with undefined symbols might be a convenient
notational device, but it makes it hard to test them individually
because they need context.

Entry: command line
Date: Fri Jun 6 16:56:16 CEST 2008

it's time to start using a forth command line + code store. then,
there is only boring stuff left:

* fix 'org', or think about how such a direct assembler-state control
  statement can be allowed in the language.
* fix the undefined symbol problem introduced by the switch to module
  languages -> maybe add some toplevel undefined symbol handler in
  the badnop namespace management code.

make sure some toplevel equivalent of 'load' works. (why can't load
be used for import actually?)

Entry: namespaces
Date: Sat Jun 7 11:17:15 CEST 2008

trying to figure out exactly where to put things.

1. support system = toplevel application namespace
2. one namespace per compiler / project.

the parser and lexer for the REPL obviously should be in 2. so it
needs an interface. also, it's probably best to make the interface to
the repl a macro. until now, there were only modules. each module
brings its own lexer and parser. the result is Scheme definitions. to
add repls, each repl needs to be attached to a lexer and a parser.

the problem i run into now is that purrr/repl.ss pulls in too many
dependencies, mostly because of purrr/forth.ss. the latter should be
factored a bit more.. fixed.. ok.. think i got it working: purrr.ss
imports the whole purrr base layer with forth syntax (parser words)
AND a repl macro. next: fix a problem with module loading..
infinite loop when requiring an .f file. hmm... looks like i have a
problem. increased the limit to 100mb and now it works.. it runs in
26 mb too..

Entry: redefine
Date: Sat Jun 7 18:40:52 CEST 2008

so.. toplevel stuff is working now. so why can't these toplevel
definitions be used to change the implementation of core macros like
'dup'? the idea is that yes, i'd like to keep the current module
system for managing names, but no, i don't want to prevent
modification of macros. basically, they are used to change aspects.
merely putting them in toplevel to be able to upgrade them would get
rid of other advantages of the module system.

Entry: require + toplevel
Date: Sat Jun 7 19:03:25 CEST 2008

  box> ,purrr
  toplevel in /home/tom/scat/
  ;; scat
  extend: (macro) jump
  ;; macro
  ;; forth
  ;; purrr
  box> (repl "require test/purrr18-test.f")
  ;; scat
  extend: (macro) jump
  ;; macro
  ;; asm
  extend: (macro) +
  extend: (macro) dup
  extend: (macro) drop
  extend: (macro) swap
  extend: (macro) or-jump
  extend: (macro) not
  extend: (macro) then
  ;; forth
  ;; purrr
  ;; pic18
  ;; dead: ((jw #))
  ;; dead: ((exit))
  ;; dead: ((exit))
  ;; dead: ((exit) (qw 6) (qw 5) (qw 4))
  box>

Same for (require "test/purrr18-test.f"). This isn't right: it should
reuse the scat stuff.. How to do that? Does the namespace need to be
a module namespace? Maybe it is.. require cannot depend on context,
so when a module requires another one, that last one needs to be
re-instantiated. I guess this is a chance to finally figure out what
i'm doing with this namespace / compiler instance business ;)

EDIT:
* there is one instance of the compiler for interactive use.
* each module that is required into the toplevel instantiates a
  compiler.

the latter makes sense: loading an application without a toplevel is
possible as long as it has its own compiler associated. one question
though: isn't using a toplevel terribly inefficient then? probably,
for things not used after instantiation, garbage collection kicks in?
The compiler is simply discarded?

ok, i think i got it fleshed out now.. the only remaining things to
figure out are to remove the dependencies of the data structures on
the scat code (make dependency on the badnop side optional) and
figure out where to put the assembler (probably best in the target
namespace)

Entry: compiling purrr to C
Date: Sat Jun 7 19:39:42 CEST 2008

it shouldn't be too difficult to add a C frontend for purrr.
basically, every word instance is a function + the stack pointer is
passed as a parameter + tail calls are forced.

Entry: basic
Date: Sat Jun 7 23:16:41 CEST 2008

i was wondering how difficult it would be to compile one of the BASIC
dialects for PIC or AVR to purrr.

Entry: new names
Date: Sun Jun 8 17:39:24 CEST 2008

It's difficult to pick good names. The current ones: brood, purrr and
scat are a bit difficult to google because they are all common terms.
I was thinking about STAAPL, which is a creative spelling of stapel,
the dutch word for stack. Maybe retrogrammed as STAck and Array
Programming Language. Another one is Staprola, stack programming
language. No google hits on that. Or something completely
meaningless? Wurzon/Kamizi? What about calling the whole system
Staapl, calling the pure language Wurzon and the Forth layer on top
Kamizi? Hmm.. the most important thing is the name of the project.
Let's try staapl for a while.

EDIT: main project is now called staapl.

Entry: about that stack
Date: Sun Jun 8 18:00:38 CEST 2008

so.. are we going to stick with stacks or not?
i'd like to give the concatenative language as specification for a
dataflow language some thought. in that case, the system is as good
as complete.

Entry: factoring
Date: Mon Jun 9 10:50:22 CEST 2008

I'd like to factor macro and target:

  macro:  just the functional macro metalanguage, no instantiation
  target: only instantiation (compilation) and optimization

Is this a good use of time? Probably not.. The macro language is
never useful without instantiation.. it really is just composition of
unary functions, which after all isn't terribly interesting if you
never evaluate them. So ditch this.. What does need to happen is to
trim dependencies of the target representation structure. Currently,
it needs 'scat' for some evaluation stuff. Fixed.

Entry: concatenative dataflow language
Date: Mon Jun 9 11:15:29 CEST 2008

it compiles to a dataflow graph. in order to do this incrementally, i
need to think about state. state, as represented in purrr, is the
current output of the network, so compilation is just adding a node.

  (... [N1] [N2] +)  ->  (... [N3])

where

  [N1] [N2]
    |   |
    (+)
     |
   [N3]

next decision: does this need a functional state, or can mutation be
used directly to build the graph? the tricky point is multiple
outputs:

  (... [N1] [N2] div/mod)  ->  (... [N3] [N4])

  [N1] [N2]
    |   |
  (div/mod)
    |   |
  [N3] [N4]

this could be represented as:

  [N3] = (div/mod [N1] [N2])
  [N4] = (shift (div/mod [N1] [N2]))

with data structure sharing. this completely avoids the problem of
having to name intermediates. a more symmetric rep would be

  [N3] = ((div/mod [N1] [N2]) 0)
  [N4] = ((div/mod [N1] [N2]) 1)

this even has a representation in straight scheme in the form of
memoized procedures. let's try to build one on top of the pattern
matcher. the first notational problem i run into is specification of
primitives with multiple outputs, which is the problem i'm trying to
solve! so.. let's stop going in circles.

;; Some important points:
;;
;; * Dataflow macros have a different representation. They have an
;;   entirely different compilation mechanism: one which involves
;;   register allocation and instruction scheduling. This
;;   representation should be made solid.
;;
;; * Given the dataflow macro rep, writing an automatic convertor to
;;   concatenative syntax is trivial.
;;
;; As a result, the macro/pattern.ss mechanism is only needed as a
;; building block, not as a ui front end.

the composition mechanism should just build the graph, but
represented in such a way that 'executing' it is simplified.. this
boils down to how to do the binding, whether to use 2-way links,
whether to represent subgraph inputs by 2 nodes etc.. it looks like
going for an explicit data structure that is later interpreted or
compiled might be the best approach. it's easiest to understand. (the
other way is to map it directly to scheme code, which is also a DAG).

in hardware, all functions are many to one.. the only place where
many -> many functions come from is abstraction. can this fact be
used to simplify the problem? a subgraph is basically a list of
(named) expressions expressed in terms of (named) inputs.
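the memoized procedure representation is simple enough to sketch
directly (plain scheme, made-up names): a multi-output primitive
becomes one memoized thunk computing all outputs, and each output
node is a projection of it, so sharing comes for free.

  #lang scheme/base

  ;; memoize a thunk: the node computes only once, no matter how
  ;; many outputs project from it.
  (define (memo thunk)
    (let ((computed #f) (value #f))
      (lambda ()
        (unless computed
          (set! value (thunk))
          (set! computed #t))
        value)))

  ;; a two-output node: div/mod of two input nodes (thunks).
  (define (div/mod-node n1 n2)
    (let ((node (memo (lambda ()
                        (list (quotient  (n1) (n2))
                              (remainder (n1) (n2)))))))
      (values (lambda () (car  (node)))     ; [N3]
              (lambda () (cadr (node))))))  ; [N4]

  ;; inputs are thunks too, so the whole DAG is demand-driven:
  (define-values (q r) (div/mod-node (lambda () 17) (lambda () 5)))
  (list (q) (r)) ; => (3 2)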
Entry: monads and computation
Date: Mon Jun 9 12:28:05 CEST 2008

the philosophical idea behind monads starts to dawn on me.. in any
programming language, there are 2 things to consider:

* a composition mechanism, which takes multiple language elements and
  turns them into one (or more) composite elements.
* primitive elements.

this is 'bind' and 'return'.

Entry: strategic overview
Date: Mon Jun 9 15:04:48 CEST 2008

About how i'm going to tackle the simulator generator problem.
Hardware is best modeled starting from a description of its interior,
which is registers + logic. Functional/dataflow descriptions are thus
a good base language. Using the simulator as a pull for implementing
the core dataflow representation seems like a good idea.

Entry: representing DAGs
Date: Mon Jun 9 15:46:14 CEST 2008

It's better to separate the 2 concepts of many->one functions and
grouping, than to work with many->many functions and permute/connect
their outputs. What this does for representing graphs is the ability
to use simple nested (scheme) expressions. In this view, mapping a
concatenative language to an expression based syntax is completely
trivial. Representing one is completely trivial also. So what problem
am I solving?

( Some idea is itching in the back of my head telling me that partial
evaluation for functional dataflow analysis is really trivial as long
as there is only a single type: compilation is nothing more than
evaluating the graph while adding postponed semantics to the code.
What makes it hard is the presence of higher order constructs. I'd
like to get a handle on this.. move it from the philo level to
concrete code.. Is it all the same thing? Is partial evaluation
REALLY better viewed from a compositional pov, as an intermediate
form to get the evaluation right, and then to transform it back to a
graph for optimizing the register allocation / sequencing? That can't
be the case really, since both are easily related to each other.
Probably i'm forgetting about associativity here.. )

Entry: base semantics
Date: Mon Jun 9 16:06:41 CEST 2008

Looks like i need a representation of the base semantics of stack
manipulation operators. This can then be used to generate
substitution rules for the pattern language, and perform dataflow
analysis. I've added a file stackop/stackop.ss, not used yet, to put
this info.

Entry: coming out
Date: Mon Jun 9 16:11:08 CEST 2008

when i start combining data flow analysis with a concatenative
specification syntax, it's time to admit that yes, this is about
syntax! so, rationalization:

* for target implementation, a stack based language is nice
* anything that can be analyzed before it's placed on the target
  might benefit from being transformed into a data flow graph, to get
  rid of the explicit serialization in concatenative code.

Entry: base language for simulator description
Date: Mon Jun 9 17:07:06 CEST 2008

Let's see if this route makes sense: create a scheme language level
for an expression serializer. In = an expression graph, out =
serialized graph. This is a pure scheduling compiler, mainly serving
as a front-end to a C-code generator.

First: what about names. If there are no scoping issues, it's best to
work on symbols instead of identifiers. This seems to be the case.

Entry: enough dabbling
Date: Mon Jun 9 17:27:19 CEST 2008

next:

* fix 'load' to perform source relative include and figure out a way
  to perform temporary code generation with undefined symbols (i.e.
  assuming they are constants or something).
* fix 'org'
* port the target interaction code

Entry: fixing load
Date: Mon Jun 9 17:55:53 CEST 2008

This is not entirely trivial: the environment in which the code is
expanded needs to be modified so the load statements inside the code
know where to get the code. Currently, it's simply inlined so context
can't be tracked. OK.
with the control flow out of the way, it's probably easier to
override current-load than to try to re-implement that part..

Q: is it possible to use require in a loaded file? if yes, is it then
a problem to replace the load handler also for requires?

hmm.. the parser atm is really confusing.. too much juggling with
return values and continuation thunks.. this needs to be solved
without a driver routine.. maybe a single dynamic variable to
accumulate code is better.. there is already one for toplevel defs..

Entry: new parser driver
Date: Mon Jun 9 21:22:40 CEST 2008

It's a mess. There have been several occasions where i tried to
understand it but couldn't. So, how to fix this. There are 2 things
to arrange:

* whenever a definition starts, the name, srcloc, and mode need to be
  recorded. -> implement as thunk.
* whenever a definition ends, the current expression needs to be
  combined with the stored header information and collected as a
  record.

the basic driver seems to work. it's a lot simpler to understand now:

  (define (definer mode)
    (lambda (code expr)
      ;; close off the previous definition first
      ((finalize-current) expr)
      (syntax-case code ()
        ((_ name . code+)
         ;; record the header info: name, mode, source location
         (new-record #'name (mode) (stx-srcloc #'name))
         ;; continue parsing the rest of the input
         (collect-next #'code+)))))

now need to adapt all the other macros to this new way of doing
things.. should be straightforward. nesting can be implemented with
dynamic scope and an exit continuation like for load.

ok, with some minor shuffling in what to return to the continuation
(which is now implemented as a prompt) it seems to work. now locals:
maybe i can get back to using (rpn-represent)? this requires the
function to return.. is this possible? ok: rpn-next can _only_ return
when all input is parsed. this way parsing can still be nested
locally without the driver loop needing to restart parsing.

ok, got some generic nesting working, now do the same for load so all
nestings can compose. something seems to be wrong with 'load' though:
probably a continuation barrier.. nope: the procedure embedded in the
syntax was of course wrapped as a syntax object, and my printer
routine automatically unwraps it..

locals: the problem with not allowing rpn-next to return is of course
that it is now no longer possible to modify the closing expression
(the lambda wrapper): this used to happen by returning. the solution
is to add yet another parameter that represents expression closure.
it's actually already there in the form of 'rpn-lambda', but this
makes it a bit complicated.. the following modification should do it:
allow 'locals-tx' to modify rpn-lambda, and reset rpn-lambda in the
forth parser so every definition can start from a clear wrapper.

ok.. the thing is this: building an expression, one wants to be able
to insert nesting expressions above and below: passing on just the
inner expression is a bad idea. maybe this needs to change? 3-value
parser state? hmm.. alternatively, write the parser in terms of scat
threaded state updates, but that might go too far and lead to
bootstrap problems. the problem is now to make sure that wrappings
are only used once. this needs an interface: pfff... i'm getting
myself into lowlevel mess again because one feature doesn't fit into
the simple abstraction. what about making 'expr' an expression
generator: a list of functions that can be composed and evaluated.
OR: expr = a cursor inbetween: (outer . inner). this should really
solve all parsing needs.

ok. got it working with a bit of juggling with rpn-lambda:

* at every rpn-compile, the current expression wrapper is set from
  rpn-lambda.
* expressions are allowed to override the current expression wrapper
* if entry is not through rpn-compile, you need to initialize the
  wrapper!

pff.. next problem: the locals macro seems to have a problem with
non-2stack states. solved: wasn't fixed after the abstract state
update was changed.

Entry: compiling monitor
Date: Tue Jun 10 17:32:22 CEST 2008

without 'org' and some things disabled here and there. but it does
seem to compile, at least in toplevel namespace with 'load'. it
compiles, but doesn't assemble. some 2stack problem in
target-value->number

Entry: local variables
Date: Tue Jun 10 19:13:29 CEST 2008

A side-effect of the way locals are implemented is that they can
occur anywhere in a macro definition or code word, and will bind
literals.

  box> (repl ": foo 1 2 | a b | a b a | d e f | e ")
  box> (print-all-code)
  foo:
      [dup]
      [movlw 2]
  box>

Entry: next
Date: Tue Jun 10 19:56:35 CEST 2008

org: let's see.. the real problem with org is that it permanently
changes assembly state. currently it's possible to set the org of a
chain of code, which is a local effect only.

asm: monitor doesn't assemble.. problem with 2stack <->
compilation-state confusion in evaluation of target values. the error
happens in

  /home/tom/staapl/macro/instantiate.ss:80:3 wrap-macro/mexit

which means a .f generated macro is evaluated. ok, it's a constant
that's evaluated with macro->data. it should be possible to convert
macro->data so it runs on a dummy compiler state, but the real
problem here seems to be: is it somehow possible to not wrap macros
with local exit if they don't need it? well.. i can always run it
once, then decide on how to encode it. a macro can throw an error,
but it is not allowed to have other side effects.

problem: the m-exit mechanism makes it impossible to evaluate macros
on a 2stack state, which is possible for most 'clean' macros. should
the concept of 2stack macro be discarded? or is the concept of clean
macro important? (don't you just love dynamic typing!) intuition goes
towards: keep 2 classes because the compilation-state class deals
with 'lowlevel' features like multiple entry and exit points. is it
possible to more clearly separate these 2 classes?

Entry: meta
Date: Tue Jun 10 23:44:47 CEST 2008

last two months have been, well, long.. i didn't get so much done
really. mostly reorganizing, fixing bugs and thinking about the new
features.. some topics:

* load vs. require, module expander and redefining words
* org and labels + multiple entry/exit points
* namespace juggling (badnop)
* parser cleanups + new syntax for scheme expr + code quotations
* documentation
* simulator ideas + dataflow language

so i did get things done, they were just more difficult than
anticipated.. all of them involved significant choices and
backtracking on dead ends, not much straightforward coding as i
expected. maybe that's a good thing in the end.. it's just that now
i'm a bit drained on the creative front. once compilation works
(maybe tomorrow?) the road onward should really be straightforward:
port the interaction code, and solve bugs in the compiler that are
exposed. the goal should be to move to a working 'ping' beginning
next week.

Entry: serialization for incremental dev
Date: Wed Jun 11 00:03:33 CEST 2008

it does look like the serialization problem for incremental
development is relatively easy to solve: save the forth words from
the namespace, and rebuild it later by loading macros from source
code, moving them to a new namespace, and augmenting them with
compiler macros for the serialized words.
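a rough sketch of the save side, assuming a word can be reduced to a
(name . address) pair; all-words, word-name and word-address are
hypothetical accessors, and the interesting part (turning the pairs
back into compiler macros) is exactly the part that depends on staapl
internals:

  #lang scheme/base

  ;; save: dump (name . address) pairs, one per line. the accessors
  ;; are passed in because they are hypothetical here.
  (define (save-dictionary file all-words word-name word-address)
    (with-output-to-file file
      (lambda ()
        (for-each (lambda (w)
                    (write (cons (word-name w) (word-address w)))
                    (newline))
                  (all-words)))))

  ;; restore: read the pairs back in.
  (define (load-dictionary file)
    (with-input-from-file file
      (lambda ()
        (let loop ((acc '()))
          (let ((datum (read)))
            (if (eof-object? datum)
                (reverse acc)
                (loop (cons datum acc))))))))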
Entry: m-exit
Date: Wed Jun 11 09:49:43 CEST 2008

ok, another choice to be made. does 'macro->data' use 2stack or
compilation-state or something else? maybe it can be parameterized?
since it's more about a configuration issue than anything else: when
not using mexit and in-line word creation, use 2stack, otherwise use
the extended compilation state. ok, something else to clean up:

  state:2stack : create a state object with update function
  make-2stack : create the raw struct

now, the solution i have is to use a parameter called
macro-eval-init-state, but isn't it better to store this kind of type
information in the macro itself? in fact, this is a universal
property: each scat/rep.ss word has a native type on which it
operates, so let's make that mandatory. the type class can be
represented by a constructor for an empty state. plan changed:

* add a new record to words to indicate type. the type is actually a
  state constructor.
* new-state:2stack -> parameterized constructor
* state:2stack -> type value

ha, this doesn't work for compositions! there it needs to be inferred
at compile time, but that's not possible. anyway, i'm going to keep
it to see where it ends up. maybe an order relation for state types
can be defined, so at least this type analysis can be performed at
run-time. no.. it's too flakey, let's get rid of it.

Entry: error reporting
Date: Wed Jun 11 11:27:43 CEST 2008

  box> (assemble! (all-code))
  asm-overflow: (bpz (p R) ((112 . 7) (p . 1) (R . 8))) (bpz 126 1) (-131 8 -1)

that's nice, but where does it come from? what i want to know here
is:

* where in the assembly code does this happen
* what is the corresponding source location

the latter might be difficult, but at least it should be possible to
find out in which word this is. ok, got better error reporting now:
it tells you which word chain it's in, and dumps out the assembler
code before it re-raises the exception.

  /home/tom/staapl/pic18/interpreter.f:59:2: n!f+
  n!f+:
      [jsr 0 async.rx>]
      [movwf 4068 0]
      [drop]
  _L72:
      [jsr 0 async.rx>]
      [movwf 4085 0]
      [tblwt*+]
      [drop]
  _L75:
      [decf 4071 1 0]
      [bpz _L72 1]
      [movf POSTDEC1 1 0]
      [jsr 1 ack]
  asm-overflow: (bpz (p R) ((112 . 7) (p . 1) (R . 8))) (bpz 129 1) (-134 8 -1)

the problem is quite clear now: relative instructions need to be
initialized differently so they don't overflow in the first pass.
no.. the problem is something else:

  (bpz (p R) "1110 000p RRRR RRRR")

p is the first argument. where did that come from? i think i
remember: all jump instructions were changed such that the target
address is the first argument. however, apparently the assembler
hasn't changed accordingly. this looks like a relic. let's change it
back.. ok, seems to work now.

next problem: all jumps are long. (typo)
next problem: dead code elimination for jump tables (fixed)
next problem: org

Entry: comma
Date: Wed Jun 11 16:42:06 CEST 2008

the problem with comma is that it is one of the reflective words.
(like 'constant' it accesses the run-time stack and produces code). i
use it in purrr to change postponed literal semantics to inlined raw
words / bytes. since this has no standard semantics i took the
freedom to use it as a replacement for ';' for jump-tables, one that
doesn't terminate the code. is there a better way to do this
actually? can't the current code word be marked that it contains a
run-time jump and thus needs to have chain splitting disabled? let's
keep it manual: write a macro on top of the low-level dispatcher
later.

Entry: some slogans..
Date: Wed Jun 11 17:37:45 CEST 2008
.. to later remember why some decisions are taken:

* it's important to have purely functional macros + some abstract way
  of handling the threaded state.
* for the parser i've opted not to, because this purely functional
  infrastructure is not necessary: it's merely a frontend for forth
  code, and has a composition mechanism in the form of substitution
  macros. internally it uses an explicit serial interpreter ('next'
  routine).
* i'm trying to find a good trade-off between low-level control (i.e.
  raw jump tables, where the language is basically an assembler) and
  high-level code analysis and manipulation, which serve as the basis
  for high-level metaprogramming constructs.
* yes, i like to split the code in chunks of about 300 lines

Entry: org
Date: Wed Jun 11 20:38:24 CEST 2008

so, maybe just hack it. whenever a name is a number, it's a permanent
org change. the obvious requirement is that it's an expression that
can be evaluated at compile time. this is already used: if the target
name is a number, the current chain will be assembled inside a code
pointer push/pop. in macro/instantiate.ss the function
'combine-if-org' is used to combine multiple chains if the current
store has an org specified, to make sure it stays bundled. so, what
needs to be done is a simple marker in the assembly code that sets
the code pointer. that's all really.. why is this so difficult?
because it's a dirty operation, and there's no clean way to do it. it
probably needs some management at some point, i.e. to disallow it for
certain code contexts.. to summarize:

ORG FOR CHAINS:
* compiler uses state push/pop to control stack
* on org change: all chains are combined
* assembler recognizes org chains

PERMANENT ORG:
* some 'magic packet' in the chain stream.

permanent org needs to terminate the current chain, but doesn't need
to link up chains. let's just make the names (org ) and (org! )

Entry: compiler state operations
Date: Thu Jun 12 12:19:39 CEST 2008

These need to be factored out a bit. I do need to be careful about
providing primitives that are non-destructive, and leave destruction
to simple destructors. org-pop is: terminate-chain combine-chains
pop-chain. I'm going to leave this as is until I need a different
mechanism that needs to split/merge the compiler state.

AHA: split and merge.

  pop-chain  = terminate-chain
               combine-chains   ;; only for org
               merge-state
  push-chain = split-state

split-state: save current asm, rs and dict on the control stack, and
start with a clean slate. merge-state: merge current asm, rs and dict
with the one on the control stack.

Entry: Compiler Code Hierarchy
Date: Thu Jun 12 13:06:34 CEST 2008

During compilation the assembly code (the result of instantiating
macros) is organized in the following hierarchy:

* A word is a single entry point, represented by a target-word
  structure associated to a chunk, which is a list of consecutive
  assembly code instructions. Code inside a word can only be reached
  through a jump to its label, and is thus not observable to the
  world. Words serve as the unit of code generation (and
  recombination). Any operation on code that doesn't alter semantics
  is legal within a chunk.

* A chain is a list of words (chunks) with implicit fallthrough. Each
  word indicates a single entry point. Chains are terminated by exit
  points. Chains are the unit of target address allocation: each
  chain can be associated to an address independent of other chains.
  Some chains have fixed addresses (org).

* The store is a stack of recently constructed chains.
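the hierarchy translates almost directly into data definitions; a
hypothetical sketch (the real staapl structs carry more information):

  #lang scheme/base

  ;; a word: one entry point (label) + its chunk of instructions.
  (define-struct word (label instructions))

  ;; a chain: words with implicit fallthrough; allocating an address
  ;; for the chain fixes the address of every word in it.
  (define-struct chain (words org)) ; org = #f or a fixed address

  ;; the store: a stack (list) of recently constructed chains.
  (define store '())
  (define (push-chain! c) (set! store (cons c store)))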
Entry: next
Date: Thu Jun 12 14:08:10 CEST 2008

fix some bugs..

* "if ; then" doesn't work -> plug-in library support for macros
* labels are registered when compilation fails. is this ok?
* looks like macro delegation in 'pattern' is not a tail-call, is
  this a problem?
* begin doesn't work (typo)
* comma: byte or word?
* org is sometimes dead code, sometimes not? silly: gets redefined.
  + another problem.. some org labels seemingly get dropped.

Entry: library fallback
Date: Thu Jun 12 15:01:40 CEST 2008

It would be really nice to be able to automatically link in
functionality. It's actually not too difficult to do so, but it needs
access to the filesystem, which currently is only possible in the
forth syntax layer (macros are pure). So, where to put it?

This was solved using macro redefinition, without automatic inclusion
of library functions. The convention is to prefix library fallback
functions with a tilde '~' character.

Entry: comma
Date: Thu Jun 12 17:22:57 CEST 2008

In the case of pic18, should comma compile bytes or words? Byte
tables are useful, but the native code word size is 2 bytes, and all
code words use word addresses. Since comma is mostly for data tables
it's probably best to let it compile data word size instead of code
word size when they are different.

Entry: next
Date: Fri Jun 13 17:51:16 CEST 2008

i think i got the most important bugs nailed down. time to go from
code -> ihex and upload something.

  (save-ihex (all-binary-code) "/tmp/broem.hex")

Entry: words and chains
Date: Fri Jun 13 22:02:10 CEST 2008

I'm already regretting the internal linking of word structures. It
feels unnatural to have to convert things to a list, and have to
remember if something is a sequence or not. On the other hand, it's a
clear sign of fallthrough: a word can never be mistaken to be
standalone.. Maybe i should define iterators / comprehensions? It's
probably better to define an explicit chain type, instead of using
words for that.. A chain is a list of words. The address of a chain
is the address of the first word.

Basic idea: directly encoding fallthrough from a given word instance
is less important than having a proper data structure that
distinguishes entry points from code grouping due to fallthrough.

Entry: Toplevel vs. module namespaces
Date: Fri Jun 13 19:35:37 CEST 2008

When you talk to somebody, you'd like them to remember what you were
talking about before. Conversation needs context. When you read
something, you hope for all context to be explained in the text you
are reading. Exposition needs completeness. Same goes for interacting
with a machine. This is the image based accumulative repl vs. the
transparent repl debate. I find it quite interesting to give it some
thought, as both are valuable for some uses. This is about load vs.
require: interactive incremental development vs. the 'run' button.
(explain)

Entry: machine code org
Date: Sat Jun 14 10:05:19 CEST 2008

* instruction -> list of binary machine code
* word
* chain
* store

In the code base, whenever assembly code, binary code, or word
structures appear in a list, it is REVERSE SORTED. This is easier
from the point of view of compilation. Code ANALYSIS needs the
reverse of that: code is linked in the direction of instruction flow.
This is how words are internally linked: given a target word (entry
point) its inline fallthrough can be easily obtained using
'target-chain->list', which again returns a reversed list.
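a mini version of that linking, just to pin down the convention
(made-up struct, not the real target-word):

  #lang scheme/base

  ;; hypothetical: each word points to the word it falls through
  ;; into.
  (define-struct word (name next))

  ;; analogous to 'target-chain->list': collect the fallthrough
  ;; sequence starting at an entry point; consing makes the result
  ;; REVERSE SORTED again, as described above.
  (define (chain->list w)
    (let loop ((w w) (acc '()))
      (if w
          (loop (word-next w) (cons w acc))
          acc)))

  ;; usage: c falls through into b, b into a (flow order: c b a).
  (define a (make-word 'a #f))
  (define b (make-word 'b a))
  (define c (make-word 'c b))
  (map word-name (chain->list c)) ; => (a b c), i.e. reversed flow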
Entry: addresses
Date: Sat Jun 14 10:26:20 CEST 2008

The Forth uses byte addresses, because it might access data bytes in
flash memory. The Assembler uses word addresses, because this is the
basic unit for flash memory.

Entry: Don't step on composition.
Date: Sat Jun 14 20:45:55 CEST 2008

Whenever you write a program, do not EVER limit the way in which
people can combine your primitives. This is actually quite hard:
limitations tend to creep in whenever a minilanguage arises. I.e. in
Brood-4 there was a problem with interaction macros: they did not
have a composition method that was easily accessible from Forth
files. The catch-all solution in Brood-5 is to provide access to the
underlying scheme core through Forth syntax. (explain)

Entry: The compiler dictionary structure.
Date: Sun Jun 15 09:10:46 CEST 2008

There's a problem right now which loses labels, probably because the
state reaping in the compiler is too complicated:

  asm code collects with current word -> current chain
  chains collect to current store

this state can be pushed/popped. Drops should be made explicit.
Trouble is I see no drops. Problem is probably that 'terminate-chain'
for 'org' also needs to terminate the current word. The bug occurs
after 'comma' of the config values, which are not terminated.. So,
let's make sure that terminate-chain only works for #f labels. No,
that's not it.. weird one.. needs proper machinery to track down.

Entry: dataflow language
Date: Sun Jun 15 09:39:48 CEST 2008

Compiling an expression language to C is fairly trivial, since C has
an expression language built in. GCC also has SSA (static single
assignment) form, so presenting C code that uses single assignment
should be ok. Expression evaluation is straightforward, so trusting
GCC to handle this properly should be no problem. GCC also has a
mechanism for proper tail calls:

http://community.schemewiki.org/?gcc-does-no-flow-analysis

So, as long as there are no first class functions or comprehensions,
compilation is really easy.

Entry: array comprehensions
Date: Sun Jun 15 10:09:14 CEST 2008

The trouble is then, the problem I want to solve is to generate
efficient code for a dataflow graph + array comprehension
combination. The real problems for comprehensions (translation to
nested loops) are performance (P) and correctness (C).

* Inner loop generation (P)
* Cache memory optimization (P)
* Handle border conditions. (C)

Here (P) needs to be fast and (C), if applicable (i.e. in
convolution), might be approximately correct.

There is a discussion on the concatenative list about Backus' FP and
APL not having first class functions, but comprehensions. Is it fair
to say that this is the thing to do for numerical code? First class
functions are overkill, but anything you would want to solve with
them can be solved with comprehensions: you turn higher order
operations into syntax, which converts them to easily inlined loops.
There's a thread on concatenative about this:

http://www.nabble.com/Joy%27s-relationship-to-FP-%2B-a-Joy-variant-with-combining-forms-td17576284.html

My answer, without thinking too much, would be: macros. Languages
based on composition can have partial evaluation. Higher order macros
can expand to combination forms: have higher order macros, but don't
allow such functions at run time. Is this cheating, or completely
beside the point?
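a tiny scheme illustration of that last point (not purrr): a 'higher
order' map written as a macro, so the operation is inlined into a
first order loop and no function value exists at run time.

  #lang scheme/base

  ;; map-inline is second order, but only at expansion time: the
  ;; 'op' expression is spliced into a plain loop, so the expanded
  ;; code is first order.
  (define-syntax map-inline
    (syntax-rules ()
      ((_ (x) op lst)
       (let loop ((in lst) (out '()))
         (if (null? in)
             (reverse out)
             (let ((x (car in)))
               (loop (cdr in) (cons op out))))))))

  ;; usage: no lambda survives to run time.
  (map-inline (x) (* x x) '(1 2 3)) ; => (1 4 9)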
"We owe a great debt to Kenneth Iverson for showing us that there are programs that are neither word-at-a-time nor dependent on lambda expressions, and for introducing us to the use of new functional forms." - John Backus, 'Can Programming Be Liberated from the von Neumann Style?' The basic idea; use concatenative syntax for specification of: * pure functions, which are translated to a dataflow representation. * higher order macros which serve as combining forms (combining forms). Maybe I should try to answer this, and relate it to unrolled reflection and partial evaluation: http://www.nabble.com/Re%3A-Joy%27s-relationship-to-FP-%2B-a-Joy-variant-with-combining-forms-p17802001.html I don't know how though.. Maybe best to try the combinator route first. The idea here is that "combining forms" as found in FP and APL are related to the macros in Purrr that are not compilable. My stress on macros is really about giving up first class functions at runtime, but not at compile time. Entry: data flow + aspects Date: Sun Jun 15 10:09:26 CEST 2008 So, for simple scalar DSP code (i.e. no FIR filters), it should be possible to define such a language fairly easily, and concentrate on 'hints' for compilation. What I mean is this: make the description highlevel, and add hints to influence compilation. These hints would choose number systems, scalings and bit widths for fixed point, etc.. Entry: for .. next Date: Sun Jun 15 11:36:02 CEST 2008 The (hypothetical) higher order macro: [ ... ] for-next and the Forth equivalent for ... next do the same thing, but in current RPN representation, 'for-next' does have access to the macro quotation to perform optimizations. It would be nice to make the first one primitive. In the current implementation however, this is difficult to do because the ... in the 2nd form is evaluated before 'next' is evaluated. This approach would need a change in representation. It does seem that using quotations directly is a far better approach: it doesn't need any forensics to recover quotations from flat Forth syntax. A limited translation of Forth to this form as a syntactic operation might be feasible, but it's not possible to take it all.. This is at the core of the conflict of 2 forces in Purrr: (L) The lowlevel compiler which gives explicit control to the programmer about how to use nesting. (H) The highlevel approach based on explicit quotations and s-expression syntax. Unifying both is difficult, but they can probably be built on common ground: providing a macro language with simple (conditional) jump primitives. This is (virtual) machine design: the Purrr control primitives. Entry: Purrr control primitives Date: Sun Jun 15 12:07:38 CEST 2008 conditional jumps : VM primitives Entry: Combining Forms and Higher Order Macros Date: Sun Jun 15 12:25:15 CEST 2008 EDIT: This used to be a blog entry, but is too confused to qualify as such. Most of this is about the clean coma language with higher order macros, which needs to be implemented first before any of the array processing extensions can be added. Coma is used as the pure functional macro kernel of the Forth language dialect template called Purrr, which is further specialized to yield a PIC18 Forth dialect. Coma is essentially a template programming language: all words that occur in a source file are generate and process intermediate code if they occur in an instantiating context, i.e. a Forth word. Classical macros expand code. 
The essential extra step to make program specialization work is a
reduction step: after expansion of several primitives, examine the
resulting code and reduce it by evaluating part of it. In Coma the
expansion and reduction operations are combined into a single step,
which is specified as primitive intermediate code pattern matching
words. In a compositional/concatenative language, the reduction
operation becomes simpler compared to lambda reduction, due to the
absence of variable names. Other than that, the principles are the
same: anything done in Coma can be extended to expression languages
with formal parameters with a few extra substitution rules.

What I am interested in is to see in what way this can be applied to
higher order programming. Currently Coma is used mainly to support
Forth (a first order language), using a simple extension that allows
the implementation of Forth's structured programming on top of
conditional and unconditional jumps. Alternatively, a more Joy-like
language can be constructed on top of Coma, where all control flow is
built on top of recursion and the 'ifte combinator.

This is where I'm a bit in the dark. It is possible to create
replacements for Forth's structured programming words by using higher
order combinators that perform partial evaluation on code quotations,
inlining them. A construction like

  (a b c) (d e f) ifte

would essentially compile to Forth's

  if a b c else d e f then

There are two main problems with this approach. One is: should 'ifte
have a semantics if it has arguments that cannot be evaluated at
compile time? It doesn't seem too difficult to allow run-time code
quotations to _exist_ at run time; they are just code pointers.
However, allowing them to be _constructed_ at run time (the analogy
of closures) requires some run-time memory management (either full GC
or linear memory).

Another problem is explicit recursion. The main difficulty about PE
is when to stop. It's not possible to unroll recursive definitions
completely since this produces an infinitely large program. Wadler
talks about this in the original deforestation paper, where such
infinite expansions are caught by introducing run-time recursion of
newly generated functions. An alternative approach (Bananas and
Lenses) is to solve the PE problem at a higher level: by identifying
(a limited set of) higher order combinators and deriving a set of
rules about how they combine, program manipulation is possible on
that higher level, instead of working with the mostly unstructured
recursion (goto!) in unfolded combinator bodies.

In Backus' FP, all the combinators are fixed, and there is no
composition mechanism for combinators. According to Backus (Iverson?)
this would force programmers to learn to use the available operators
efficiently. What I'd like to figure out is what set of data types +
functional forms would be good for certain classes of problems
related to resource constrained programming. How to reduce the space
of programs such that efficiency is guaranteed, but programs can
still be written in a high level declarative form.

References:
http://www.stanford.edu/class/cs242/readings/backus.pdf
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.1030
http://citeseer.ist.psu.edu/wadler90deforestation.html
http://citeseer.ist.psu.edu/meijer91functional.html

Entry: partial evaluation of higher order functions
Date: Sun Jun 15 23:46:16 CEST 2008

* higher order macros (HOM)
* list comprehensions
* combining forms + 1st order functions (FP)
* forth -> pure macros.
what is this level shift?

* deforestation

EDIT: STA(ck)APL

According to the wikipedia FP entry, limiting a language to 1st order
functions and a limited number of (non-composable) 2nd order
functions creates a simple algebraic structure. Combining forms are
quite simply defined as 2nd order functions. List comprehensions are
similar to limited order combining forms: they avoid the use of
higher order functions to perform iteration/folding. However, they do
not generate first order functions as objects: they are merely
syntax.

I found a definition for HOMs here:
http://foldoc.org/?higher-order+macro

  "A means of expressing certain higher-order functions in a first
  order language."

  P.L. Wadler "Deforestation: Transforming programs to eliminate
  trees" http://www.springerlink.com/content/7217v376n7388582/

original paper:
http://homepages.inf.ed.ac.uk/wadler/papers/deforest/deforest.ps

Where i'd like to end up is to find the relationship between my forth
macro approach and fixed / composable 2nd order functions. I need a
theoretical framework, probably some type system restriction, to get
out of the anything->anything lisp world. What I use this for is not
necessarily to define a clean language specification, but to see if
it can help choose between higher order functions and inlined
expansion. Since I have higher order functions, but would like to
have inlining as optimization.

Somewhat related, i can make things such that each word is linked to
its originating macro. This might lead to functions that are both
instantiated, and available as a macro whenever they are used in a
combining form that cannot accept a function at run time.

Entry: more input
Date: Mon Jun 16 16:18:41 CEST 2008

Shopping around for input:

http://citeseer.ist.psu.edu/440438.html
"Macros as Multi-Stage Computations: Type-Safe, Generative, Binding
Macros in MacroML (2001)"

This is actually an interesting paper which deals with a lot of stuff
that's on my mind now. Might be interesting to take my implementation
directed approach a bit further. Macros in statically typed
languages: difficult? Is syntax tree manipulation in dynamic
languages less ad-hoc? This is an interesting hub-paper.

Entry: preparing for state shuffling
Date: Mon Jun 16 16:29:14 CEST 2008

the current syntax in instantiate.ss and 'state-lambda' is not very
good. i already ran into trouble mixing out variables into something
returning a next state object + data. it's also not abstract enough
to be able to replace structure type based encoding with list (stack
of stacks?) encoding.

-> use syntax parameters

Oh, they are not so scary :) Maybe it's best to make the rpn
transformer like that too, instead of all these compile time
parameters. That's a major overhaul that could be used to write it in
cps actually.
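for reference, the mechanism in a minimal form: a syntax parameter
standing in for a hidden state variable (plain PLT scheme, not the
actual rpn transformer; all names made up).

  #lang scheme/base
  (require scheme/stxparam)

  ;; 'state' is only meaningful inside with-state; outside it is a
  ;; syntax error instead of a captured/marked variable name.
  (define-syntax-parameter state
    (lambda (stx)
      (raise-syntax-error #f "used outside with-state" stx)))

  ;; with-state builds a single lambda expression and re-points
  ;; 'state' at its argument, hygienically.
  (define-syntax-rule (with-state body ...)
    (lambda (s)
      (syntax-parameterize ((state (make-rename-transformer #'s)))
        body ...)))

  ;; usage:
  ((with-state (cons 'tagged state)) '(1 2)) ; => (tagged 1 2)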
Entry: wandering into confusion
Date: Tue Jun 17 19:13:25 CEST 2008

In very general terms, what I want to do is to see if I can do DSP in
a concatenative language with the proper higher order combinators for
list/array processing, and find a way to optimize all combinators
away to produce optimal loop code. That's all really: design a
language that feels high-level and lispy, but is guaranteed to
compile to efficient constructs. My hunch is that this is actually
not so very difficult to do with a concatenative language. What I
don't see is how to perform instantiation automatically. How to move
from high -> low, treating higher order functions as macros. And, how
much am I not just re-inventing APL.

As far as I read, Wadler's deforestation paper deals with this kind
of higher order macros in section 5. Do I need to understand
deforestation before being able to use these macros? I don't really
need arbitrary intermediate tree data structures: i'm really just
using arrays. Looks like deforestation is necessarily expressed in a
first-order language, and higher order macros are a way to get some
of the effects of higher order functions in a first order language.
Thinking about this, it probably needs automatic instantiation: no
manual macro/forth defs.

Now this: http://www.cs.ucsd.edu/~goguen/ps/will.ps.gz looks like it
is pretty close to what i'm doing. I was wondering about how to
relate C++ templates to purrr composable macros, but it looks like
the root of this idea is in the OBJ language.

Entry: apology
Date: Tue Jun 17 23:39:50 CEST 2008

Why am I doing this? It is really about the language, about its
algebraic feel. Maybe I should be honest and keep that as the only
real reason. It's like Legos. It clicks. Then there are explanations
of why I might like it:

* Concatenative languages span a wide spectrum in a useful way. This
  allows me to use similar paradigms from the very low to the very
  high.
* One can get far without closures (which take the form of curried
  quotations created at run-time).
* Partial evaluation is simple for functional concatenative
  languages: scopes don't get in your way.
* An imperative concatenative language can have a large functional
  subset.
* Linear memory management becomes non-intrusive.

Some fragments of correspondence. Question about breaking new ground
with the Staapl approach:

Well, I know for a while now I need some form of compile-time program
specialization that can turn higher order functions into specialized
first order loops. The real question is how to simplify the
programming language such that the problem of writing the compiler
can be solved by me in a limited time. Doing it only with an
interpreter + specialized manually crafted C core routines (like PF,
Pure Data, Supercollider, Matlab, ...) is not powerful enough.
Untyped lambda calculus is too general to solve the problem with a
simple compiler. Typed lambda calculus works better, but such a
language is not so straightforward to implement. So I'm looking at
something first-order with higher order macros, closer in spirit to
APL, Backus' FP and C++ templates than lisp.

The only ground I broke is that I ended up with a non-intrusive way
to combine compile time operations and run time operations in one
language without semantic problems, simply by taking a functional
programming view where evaluation time might be thought of as
unspecified. Concatenative macros are a very natural way to do
template programming, because name bindings don't get in the way.
Concatenative form can also be easily transformed to nested
expression form, so when I need data flow analysis I can do it, but
for some program transformations it's really easier to keep it
concatenative. Code is more 'algebraic' and less 'logic' in that
form, if that makes sense at all.. Lists instead of trees.

What I have now is still manual: there is no automatic loop inlining
happening. I'd like to figure out if this is necessarily a part of
the language (1st order language + some 2nd order functionals) or if
i can automate it, so it becomes a language with higher order
semantics preserved in case the opti doesn't apply.
So, while I find it interesting, I am getting into territory where I should be careful not to be too general, and try to stick to the problem of making a language which is very close to machine language, but has access to higher order constructs. I'm already there: the macro-assembler on steroids idea: for PIC18 the bottom layer concatenative language almost maps 1-1 to assembler. It's the automatic code juggling part on top of it that is giving me headaches.

About writing beginner languages: I did some workshops with forth now, and i find people pick it up pretty fast. The real problem is not language though. Some languages go smoother in the beginning than others, but I found the real problem to be the point where you leave simple scripting (filling in parameter values) and code composition enters the picture: how to divide and conquer. I think a beginner language should stretch the scripting part as long as possible, but i sort of gave up on that idea. It only makes hitting the abstraction wall more painful. What i got a bit discouraged about is that more often than not, no matter what you try, people like to stay in that scripting area. I don't know if there's a way to trick people into crossing that barrier unknowingly. Did you run into something like this with scheme?

Entry: about deforestation
Date: Wed Jun 18 12:39:51 CEST 2008

Maybe Wadler's earlier work on lists is better suited to translate to my problem: folding combinators for array processing. I'm facing a gap between my understanding of theory and a particular practical problem. I should just try to solve one such problem with higher order combinators to get a better view of concrete problems, to get out of the muck of abstract confusion.. The problem is really one of data structures and their iterators. In FP, there are only nested arrays and some map and shift operators.

So.. Deforestation is about eliminating intermediate nested data structures in a first order language with recursive pattern matching for tree deconstruction. Wadler defines the property 'treeless', which is used to construct transformation rules to transform a composition of treeless functions into a treeless function.

Definition: a term is treeless with respect to a set F of function names if:

* it is linear (each variable is used only once; this is to make sure transformations don't introduce redundant computations, and can be relaxed for integers)

* it only contains functions in F (these are the 'exceptions' that will be expanded)

* every argument of a function application and every selector in a case term is a variable (obviously, otherwise there would be an intermediate tree result)

The algorithm that performs deforestation maps a linear term which contains variables and functions with treeless definitions to a treeless term and a possibly empty set of treeless definitions. The core of the algorithm is the standard function 'inlining': replace each function application with an inlined definition.

If i start to think from that point, shouldn't i get to something simple? After all, i have no variable names to worry about, and no destructuring or creation of run-time data structures. That is all quite different. So, question: what is the equivalent of deforestation in a concatenative language with a simply managed array data structure and a 'map' operator?

Let's write down some things that need to 'actually happen':

* loop transformation to eliminate intermediate buffers
* array memory management and reuse (linearity?)
* dereferencing indirect addressing (on PIC18)

Dereferencing indirect addressing is a particular problem i ran into writing highly specialized DSP code for microcontrollers. It's a fairly extreme level of templating which makes sense due to very limited indirect addressing and multiplication on PIC18. There is a difference between translating 'map f l' to (cons (f l1) ...) and making sure the same operation happens in place, or with a fast reuse array.

There's something in Wadler's paper about instantiate/unfold/simplify/fold, found in Burstall and Darlington: "A transformation system for developing recursive programs."

Entry: eliminating intermediates in a concatenative language
Date: Wed Jun 18 15:14:52 CEST 2008

Suppose [ f map ] applies an operation to a number of data structures and produces a number of data structures, according to the arity of [ f ]. The transformation that eliminates intermediate storage is:

  [ f map g map ] -> [ [ f g ] map ]

In opti talk this is called loop fusion. For simple video processing, this is about the single most important optimization: it eliminates storage of intermediate frames, which take up a large part of cache memory. This optimization is in practice bounded by:

  - dependency depth for deep pipelines
  - instruction cache size

Naively, to take care of those issues it can be beneficial to limit loop fusion and allow for a limited size intermediate buffer. These 2nd order problems can be ignored for now. The most important life saver is loop fusion, which, if not for speed, can save a lot of memory.

Translating this optimization to the current concatenative macro architecture, it requires access to all functions in macro form. Final instantiation of 'map' can be the generation of a for loop and buffer allocation, but the important step is the fusion. How to use the current code substitution rules to implement this? For one, it requires a 2-pass algorithm in the current form. The map macro cannot be instantiated until all fusion has happened. Postponed partial evaluation is solved in the pattern language using pseudo assembler instructions.

EDIT:
http://www.randomhacks.net/articles/2007/02/10/map-fusion-and-haskell-performance
apparently, in haskell this kind of optimization is pluggable. There are a couple of interesting links in that article. It's all starting to become a bit more clear: concentrate on properties of higher order combinators. One of the links seems to be about automatically deriving these kinds of rules (Wadler's Theorems For Free).

Entry: code transformations
Date: Wed Jun 18 17:22:49 CEST 2008

The multi-pass optimization algorithm has an ad-hoc form: the first pass instantiates macros, while subsequent passes perform specific substitutions only. I can't express it properly, but shouldn't code originally have the form ([qw ] [qw ] ...) with a 'run' application?

An important question: is there a way to get rid of this 2-pass mechanism? Is it necessary, or just more elegant, to use a demand driven pipeline?

To implement [ f map g map ] -> [ [ f g ] map ] we need to represent 'map' as a pseudo op:

  (([map f] [qw g] map)  ([map (macro: f g)]))
  (([qw f] map)          ([map f]))

And the map can be eliminated in a 2nd pass by instantiating it:

  (([map f] map-pass)  (macro: f do-map))

This is entirely too trivial, so all the beef is hidden in the instantiation of 'map'. Which makes me think: what is 'map'? An implementation of a data structure iterator + a specification of its abstract properties used for source manipulation.
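To make the fusion step concrete outside of Staapl's actual pattern language: a toy Scheme rewriter, a minimal sketch assuming code is just a list of pseudo ops (qw f), (map fs) and the bare symbol map, processed eagerly against an assembly stack. The representation and helper names here are mine, illustrative only.

  #lang scheme/base

  ;; An op is (qw f), (map fs) or a bare symbol like 'map'.
  (define (op? x tag) (and (pair? x) (eq? (car x) tag)))

  ;; One eager rewrite step: push op onto the stack of already
  ;; rewritten code, fusing where the rules above apply.
  (define (step stack op)
    (cond
      ;; [map f] [qw g] map -> [map (f g)] : fuse two loops
      ((and (eq? op 'map)
            (pair? stack) (pair? (cdr stack))
            (op? (car stack) 'qw)
            (op? (cadr stack) 'map))
       (let ((g (cadr (car stack)))
             (f (cadr (cadr stack))))
         (cons (list 'map (append f (list g))) (cddr stack))))
      ;; [qw f] map -> [map f] : introduce the pseudo op
      ((and (eq? op 'map)
            (pair? stack)
            (op? (car stack) 'qw))
       (cons (list 'map (list (cadr (car stack)))) (cdr stack)))
      (else (cons op stack))))

  (define (rewrite program)
    (reverse (foldl (lambda (op stack) (step stack op)) '() program)))

  ;; [ f map g map ] fuses into a single loop:
  ;; > (rewrite '((qw f) map (qw g) map))
  ;; ((map (f g)))

The only interesting bit is that fusion happens purely locally, by inspecting the top of the assembly stack, which is exactly the eager pattern matching style used elsewhere in this log.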
next: Think about 'fold' over arbitrary data structures and related things like loops, iterators, comprehensions, compile time folding, ... There's some rewriting in Cat that does this:

  http://www.cdiggins.com/cat/cat.pdf

Entry: doubts about compilation
Date: Thu Jun 19 09:12:11 CEST 2008

I'm wondering whether it isn't better to keep the macro code representation in list form. The assembler output is again a concatenative language in source form, so why doesn't it have the same form as the input?

Also, maybe the eager algorithm is too simple? The problem is that given a composition [a b c] it could be split as [ab c] or [a bc]. One could be significantly simpler than the other syntactically, while semantically they denote the same function. An eager algorithm is always going to pick the first option. This is the reason why some rewriting operations are postponed to a next pass. More specifically: the multipass algorithm now is a simplification of a general non-deterministic algorithm that optimizes the ideal combination of terms. Maybe multiple passes need to be defined more abstractly?

Entry: back to fixing bugs
Date: Thu Jun 19 11:39:57 CEST 2008

since i'm just getting confused over ideas that need fermentation and more reading, maybe best to start fixing bugs.

the alleged problem with 'org' seems to be a problem with forth/macro mode switching.

OK:

  macro
  \ : config #x10 ;
  forth
  #x20 org
  : bla

WRONG:

  macro
  : config #x10 ;
  forth
  #x20 org
  : bla

probably 'forth' needs to terminate previous macro defs, otherwise non-labeled code will be concatenated to the last macro def.

looks like that was indeed the problem. that was the last known bug in the way of uploading code. time for hands-on!

Entry: interaction code
Date: Thu Jun 19 13:25:59 CEST 2008

might be better to try to get some communication going with the previous monitor, before uploading the freshly compiled one. start with target.ss. the first part is the upgrade to plt 4.0 'for' loops: comprehensions.

Entry: lazy-connect: book vs. conversation
Date: Thu Jun 19 15:06:04 CEST 2008

about the 'current state' issue for interactive development. i guess it's ok to have state. the previous approach of making everything temporary is maybe a bit too brutal. i.e. a current connection is really ok. use custodians to manage that kind of stuff, not parameters. On the other hand, for lowlevel interaction it might be a good idea to flush buffers on every message exchange, since things tend to go wrong.

basic interaction works:

  box> (with-io-device '("/dev/ttyUSB0" 9600)
         (lambda () (scat> ping)))
  CATkit
  <0> box>

ordinary target access seems to work without trouble. the thing that needs to change is interaction with the target dictionary, which is now a scheme namespace + serialization, incremental compilation and code upload.

Entry: dictionary / namespace
Date: Thu Jun 19 20:51:48 CEST 2008

Should the interaction code live in the same namespace as the compiler? It would be nice to be able to specify interaction code in source files, so maybe it's best to do that. I need to be careful here not to fall into the same pit: host side code should be composable + interaction templates too. Prefix syntax is ideal to override default target semantics, so I need to have it, but it should stay composable.

Entry: data types + HOF
Date: Thu Jun 19 22:08:06 CEST 2008

So, along the lines of

  http://www.randomhacks.net/articles/2007/02/10/map-fusion-and-haskell-performance

The basic idea is: whenever you define a data type and a map/fold/...
HOF, you need to somehow obtain transformation rules to simplify compositions by moving operations inside loops. The problem is, mapping this to Purrr, it's quite easy to add these transformation rules (manually), but the problem is really the representation of the data type. How to add the runtime support for, say, a video frame?

It is also pretty clear that typing is essential here: i'd really like '+' to be polymorphic, so i can make every occurrence of 'map' implicit. Functions should be upgraded (coerced?) automatically.

I'm getting confused now.. Polymorphic macros. Is assembly pattern matching a genuine type system?

Just had a look at the wikipedia C++ template page, and it says:

  a feature of the C++ programming language that allow code to be
  written without consideration of the data type with which it will
  eventually be used

It looks like what i'm doing is more general than that (compile time decisions based on literal values), but on the other hand, you can probably hide everything that can be done with values in Purrr in classes in C++. Maybe this is an essential difference really: value based metaprogramming instead of type based? Does this make sense at all? ( Objects instead of classes? Prototype templates? )

Entry: Faust
Date: Fri Jun 20 00:28:30 CEST 2008

http://faust.grame.fr/

basically what i want to do, but i don't like the syntax. makes me think that i'm on the right track with a concatenative specification frontend to solve the 'bussing' problem: connecting multi in/out things.. another thing to solve when doing a dsp language like that is block-based algorithms.

so, am i on the right track with programmable macro semantics? say i can use a pic18 program that imports a module with a different macro semantics that produces a static dataflow network + buffer management? maybe i need to just write another synth to pull this thing through. something more classic dataflow + feedback, with emphasis on compilation to pic18 architecture.

NOTE: translating concatenative code to expressions has one advantage: it makes usage explicit, so allocation might be simpler?

Entry: rewriting
Date: Fri Jun 20 00:51:18 CEST 2008

should i give up eager pattern matching and move to a different rewriting system, or is the current one good enough when it's equipped with an easier to use multipass architecture?

this is interesting:
http://lambda-the-ultimate.org/node/1658#comment-20313

Entry: live parsing words
Date: Fri Jun 20 09:54:32 CEST 2008

problem is that this is a map from (live) -> (scat) while the other substitutions macro defines endomaps. it's straightforward to map to a different namespace:

  (unquote (ns (scat) id))

but doing this one loses nested macros, which i just conveniently used for defining substitution types. (using a primitive language, one needs 2 composition methods)

wait.. the live->live substitutions for 2sim can remain as is. just need to define the primitive properly. it's probably easiest to just use quoted code + run.

next: tfind

Entry: 3 different languages
Date: Fri Jun 20 11:01:48 CEST 2008

  compositions:          (name w1 w2 w3)
  postfix asm patterns:  ((a1 a2 name) (b1 b2 ...))
  prefix substitutions:  ((name a1 a2) (w1 w2 w3 ...))

Compositions are the core of the functional language. Postfix asm patterns are used to implement eager rewrite rules during translation, and prefix substitutions are used for changing the semantics of symbolic names and numbers.
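For concreteness, one made-up instance of each shape. These are illustrative only, not actual Staapl source; the '+' pattern mirrors the qw folding rule that shows up later in this log, and 'quote-next' is a hypothetical parsing word.

  ;; composition: a new word as a concatenation of existing ones
  (square dup *)

  ;; postfix asm pattern: fold two literals into one at compile time
  (([qw a] [qw b] +) ([qw (+ a b)]))

  ;; prefix substitution: give the symbol following a prefix word a
  ;; different semantics, e.g. compile it as a quoted symbol instead
  ;; of executing it
  ((quote-next name) ('name))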
Entry: the meta namespace
Date: Fri Jun 20 12:06:22 CEST 2008

i run into a problem with different instances of the target-word structure. maybe the solution is to make sure badnop runs in a module namespace? looks like the problem is with using the namespace anchor attached to the module namespace.. maybe best to attach it to the repl's toplevel namespace.

There's still some confusion: if a module A is in namespace NS, and module B is required, but B requires A, then A will be re-instantiated, right? Unless NS is a module namespace. Let's see, what's the difference between:

  > (define ns (make-base-namespace))

  > (dynamic-require "target/rep.ss" #f)
  instantiating target/rep
  > (namespace-require "target/rep.ss")

  > (namespace-attach-module (current-namespace) "target/rep.ss" ns)
  > (parameterize ((current-namespace ns))
      (namespace-require "target/rep.ss"))

and doing this from within a module doesn't work... An explanation: when a require form is evaluated inside a module, the module registry of the required module is not the same as that of the namespace in which it is required.

A toy example:

  (module A scheme/base
    (printf "instantiating A\n"))

  (module B scheme/base
    (require 'A)
    (printf "instantiating B\n"))

  box> (require 'A)
  instantiating A
  box> (require 'B)
  instantiating B
  box>

again:

  box> (require 'B)
  instantiating A
  instantiating B
  box>

so... A is not re-instantiated.. what am i doing wrong?

ok.. the namespace.ss code works just fine:

  ;; Create a namespace with shared and private module instances.
  (define (shared/initial-namespace src-ns shared private)
    (let ((dst-ns (make-base-namespace)))
      ;; See PLT 4.0 guide, section 16.3 Reflection and Dynamic
      ;; Evaluation -> Sharing Data and Code Across Namespaces
      (define (load-shared mod)
        (parameterize ((current-namespace src-ns))
          (dynamic-require mod #f)   ;; make sure it's there
          (namespace-require mod))
        (namespace-attach-module src-ns mod dst-ns) ;; get instance from here
        (parameterize ((current-namespace dst-ns))  ;; create bindings
          (namespace-require mod)))
      (define (load-private mod)
        (parameterize ((current-namespace dst-ns))
          (dynamic-require mod #f)
          (namespace-require mod)))
      (for-each load-shared shared)
      (for-each load-private private)
      dst-ns))

The problem seems to be about other modules that are loaded into that namespace. They seem to re-instantiate the target/rep module. Ok, it was really really stupid:

  (namespace-require "pic18.ss") -> (namespace-require 'staapl/pic18)

Just a module name issue.

Entry: prj
Date: Fri Jun 20 13:54:13 CEST 2008

Prj is a small bit of glue to enable loading of different specialized compilers in their own namespace. All namespaces share some code for space efficiency and some data structures so they can communicate through the badnop layer.

Now, what about 'find'? This is a reflective operation: it looks in the current toplevel namespace to map a symbol to a value. Ok, looks like everything is there now, just needs to be patched together.

Entry: live interaction language
Date: Fri Jun 20 20:06:01 CEST 2008

the current interface doesn't seem to support what i want to do.

- to invoke macros properly i need to set the ide map to the (live) prefix
- however, this map leads to undefined words for default semantics, which would take items from the (target) prefix.

however, it is possible to put the macros in the 'target' namespace. there's a simple workaround: make sure the target word is executable, or add an interpretation step.
the latter is probably going to be simplest, since performance is not really an issue here :)

used the interpretation step. interaction seems to work fine now. got both a prj> and a live> language. what is missing is incremental compilation with suspended state + inspection.

Entry: terminology cleanup
Date: Sat Jun 21 10:21:53 CEST 2008

  chunk: list of binary/asm code associated to a word (entry point)
  chain: list of chunks with fallthrough

for manipulating blobs of code for ihex + upload, a different name is necessary. let's call it blob then instead of chunk.

  bin:   (number (listof number)) binary code list that happens to be consecutive
  line:  upload unit (8 bytes on PIC18)
  block: erase unit (64 bytes on PIC18)

to ease uploading, the chunk/chain subdivision is converted to lumps, which are then concatenated into bigger lumps and split into lines.

Entry: binary data objects
Date: Sat Jun 21 10:51:04 CEST 2008

Handling binary data involves a lot of fixed size tables. Instead of using index lists, it might be more elegant to use 'for/list' comprehensions and sequences. This deserves a bit of attention. Let's define the type better.

  - bin      = (listof binchunk)
  - binchunk = (listof number (listof number)) : address + codelist

Operations:

  - splitting/joining bytes/words
  - line splitting
  - alignment
  - binchunk combinations

Entry: avoiding O(N^2)
Date: Sat Jun 21 12:10:04 CEST 2008

The problem of combining binchunks when they are consecutive is interesting: I can't find a way to use my usual collection of higher order functions without running into O(N^2) complexity due to iterated append. The problem: given a list of address/code pairs, create a new list which combines them if they are consecutive.

  ((a c) (a c) ...)

To avoid N^2 and multiple traversals, the easiest way to do this is using a state machine with an accumulator. But is it possible to use HOFs for this, other than fold (which just factors out the explicit recursion/looping), with the choice of using constant space (left fold) or minimal hassle (right fold)? Maybe i need to look into parser combinators, or try to write one figuring out the core routines. On the other hand, the right abstraction for this might be stream processing: convert binchunks to a stream of word/address pairs, and recombine them. Let's try that first.

Looks like i'm looking for 'for/fold' (a sketch follows below, after the enumerators entry).

This is actually an interesting point where first order functions are syntactically more convenient than higher order ones ('for' is a form!). The difference seems to be really syntax: it's probably straightforward to convert between HOF and comprehension form. In the guide: 11.8 Iteration Performance. Looks like they've been thinking about optimization too :) This actually looks like a good candidate for a ``pre-scheme'' style first order language that compiles to straight machine code without runtime support.

I don't understand why there is no 'in-append'. This would be a nice exercise for sequence combinators.

Entry: Parsing combinators.
Date: Sat Jun 21 14:53:40 CEST 2008

A lot of code in Staapl is about converting one data structure into another one. Serializing one is simple, but collecting into another seems more difficult. Up to now I've been using manual stack manipulation to collect data structures. Is there a better way to tackle this? Almost always this is insertion into trees + postprocessing (reversing). Let's see..
stack levels:

  1 -> push          (a b c) -> (x a b c)
  2 -> push + push'  ((a b c) (d e f)) -> ((x a b c) (d e f))
                                       -> ((x) (a b c) (d e f))

Let's start with a simple list-of-list parser with the operations:

  ->0  0->1  1->2

I tried it with vectors:

  ;; ----
  #lang scheme/base
  (require "list.ss")

  (define (llp-push-level! v n x)
    (let ((stack (vector-ref v n)))
      (vector-set! v n (cons x stack))))

  ;; (llp-move! v n x)  push x to stack n
  ;; (llp-move! v n)    push stack n-1 to stack n
  (define (llp-move! v n
                     [x (let* ((n- (- n 1))
                               (x- (vector-ref v n-)))
                          (vector-set! v n- '()) ;; move to x
                          x-)])
    (llp-push-level! v n x))

  (define (llp-push! v x) (llp-move! v 0 x))

  (define (llp-compact! v n)
    (for ((i (in-range n)))
      (llp-move! v (add1 i))))

  (define (make-llp n) (make-vector n '()))

  (define v (make-llp 3))
  ;; ----

But it's probably better to just use lists, since the operations themselves are simple tree operations.

  push:    (a b ...) -> ((x . a) b ...)
  compact: (a b ...) -> (() (a . b) ...)

this becomes:

  ;; ----
  ;; Stack of stacks.
  (define (make-sos n)
    (for/list ((i (in-range n))) '()))

  ;; Convert a collapsed sos to a list of lists, applying an operation
  ;; to each list level.
  (define (sos->lol sos [op reverse])
    (let ((dim-1 (- (length sos) 1)))
      (let down ((l (list-ref sos dim-1))
                 (n dim-1))
        (if (zero? n)
            (op l)
            (op (map (lambda (le) (down le (- n 1))) l))))))

  (define (sos-push sos x)
    (cons (cons x (car sos)) (cdr sos)))

  (define (sos-collapse sos n)
    (if (zero? n)
        sos
        (cons '()
              (sos-collapse
               (sos-push (cdr sos) (car sos))
               (- n 1)))))
  ;; ----

Entry: comprehensions + delimited control
Date: Mon Jun 23 08:14:22 CEST 2008

Looks like a nice alternative to lazy lists. Screams for coroutines, so might want to add some abstraction to it.

http://groups.google.com/group/plt-scheme/browse_thread/thread/d0ff99391f9ac53f/f5de867c296afcbe?lnk=gst&q=yield#f5de867c296afcbe

  (define (in-yielder f)
    (define end (list 1))
    (define i (iter f end))
    (make-do-sequence
     (lambda ()
       (values (lambda (_) (i))
               void void void
               (lambda (e) (not (eq? end e)))
               void))))

  (for/list ([x (in-yielder
                 (lambda (yield)
                   (for-each yield '(1 2 3))))])
    x)

Entry: enumerators vs. cursors
Date: Mon Jun 23 11:08:51 CEST 2008

http://okmij.org/ftp/Computation/Continuations.html#enumerator-stream
http://okmij.org/ftp/papers/LL3-collections-talk.pdf
http://okmij.org/ftp/Scheme/enumerators-callcc.html
http://lambda-the-ultimate.org/node/1882

Oleg makes a case for providing enumerators natively, and deriving cursors from them if necessary, since they are the less useful variant.

  stream:     encapsulated iteration state
  enumerator: collection fold

Comprehensions are similar to enumerators, but they do not iterate over an abstracted data structure, but over a concrete sum/product of (possibly abstract) data structures. They are trivially translated to a map/fold HOF + fold function + a data constructor.

http://srfi.schemers.org/srfi-42/srfi-42.html

According to Sebastian Egner the main reason for this srfi is a simple form for the naturals. I find the middle way quite convenient: using 'in-generator' from tools/seq.ss to convert generators based on delimited control to sequences usable in comprehensions.
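Following up on the 'avoiding O(N^2)' entry: a minimal for/fold sketch of the binchunk recombination, assuming a binchunk is (list address code-list) as in the 'binary data objects' entry. The helper names are mine, not Staapl's.

  #lang scheme/base

  ;; Combine consecutive binchunks in one linear pass. Accumulators:
  ;; the finished chunks (reversed), plus the start address, end
  ;; address and reversed code of the chunk currently being grown.
  (define (join-binchunks chunks)
    (define (emit start rcode acc)
      (if start (cons (list start (reverse rcode)) acc) acc))
    (let-values
        (((acc start end rcode)
          (for/fold ((acc '()) (start #f) (end 0) (rcode '()))
                    ((chunk chunks))
            (let* ((a (car chunk))
                   (code (cadr chunk))
                   (a-end (+ a (length code))))
              (if (and start (= a end))
                  ;; consecutive: grow the current chunk
                  (values acc start a-end
                          (append (reverse code) rcode))
                  ;; gap: emit the current chunk, start a new one
                  (values (emit start rcode acc)
                          a a-end (reverse code)))))))
      (reverse (emit start rcode acc))))

  ;; > (join-binchunks '((0 (1 2)) (2 (3 4)) (10 (5))))
  ;; ((0 (1 2 3 4)) (10 (5)))

Keeping the growing chunk's code reversed makes each step O(length of the incoming chunk), so the whole thing stays linear, which is exactly what the iterated-append version loses.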
Entry: got ihex
Date: Mon Jun 23 11:48:23 CEST 2008

with the comprehension based binary code formatters we're now at the point where we can upload code and generate the monitor code in the proper format:

  :020000040030CA
  :0E00000000C80F000080800003C003A0034072
  :020000040000FA
  :020000001FD00F
  :020000040000FA
  :02000800FFD027
  :020000040000FA
  :02001800FFD017
  :020000040000FA
  :10004000C8D09EBA01D0FDD7ABA203D0AB98AB8885
  :10005000F8D7EC6EAE50ABA402D0ED50F2D7120040
  :10006000ACB201D0FDD7AD6EED5012000500E9DF56
  :10007000FD6EED50E6DFFE6EED501200EC6E000EF0
  :10008000EFD7A68EA69C88D8F9D7A68EA69C87D82F
  :10009000F5D70F0B8DD812000ED0E2D70ED041D07D
  :1000A00047D0ECD7FF0019D00DD02CD020D047D0AE
  :1000B00033D0E7D7EAD7C5DFE1D7D8DFDFD7C1DF55
  :1000C000E8DFFDD7BEDFE46EED50EC6E0900F550C1
  :1000D000C7DFE706FAE1E5521200B3DFE46EED5048
  :1000E000EC6EDE50BDDFE706FBE1E5521200A9DF52
  :1000F000E46EED50A6DFF56E0D00ED50E706FAE177
  :10010000E552BCD79EDFE46EED509BDFDE6EED5016
  :10011000E706FBE1E552B2D75BD8EC6E0800F528A4
  :10012000D2D78FDF8EDFDA6EED50D96EED50A6D7C5
  :1001300088DF87DFF76EED50F66EED509FD7EC6EDF
  :10014000400EE46EFF0EEC6E0900F550ED14E7066C
  :10015000FAE1E55285D7EC6EFD50F66EFE50F76E73
  :10016000ED5006001200D96E800AE834DA50DA5AEF
  :10017000ED501200F8DFEC6EDE501200F4DFDE6EA0
  :10018000ED501200EC6EF29E550EA76EAA0EA76EF1
  :10019000A682F28EED501200A684A688F3D7A6841C
  :1001A000A6980A00EFDF09001200EF60EF6EED5035
  :1001B000E844FD26010BFE22ED50120000EE7FF018
  :1001C00010EE8FF08A68EC6E700ED36EED50120058
  :1001D0001200000EFC6EF2DFC18AC18C93889382FC
  :1001E000EC6E330EAF6E240EAC6E900EAB6EED5017
  :0801F00081A801D064D704D0FE
  :00000001FF

Entry: ssa
Date: Mon Jun 23 13:07:38 CEST 2008

http://www.cs.princeton.edu/~appel/papers/ssafun.ps
http://lambda-the-ultimate.org/node/2860

Entry: DSP language
Date: Mon Jun 23 22:12:09 CEST 2008

let's write a simple synth that runs on PIC18, but uses a nontrivial hardware mapping.

Entry: booting the monitor
Date: Tue Jun 24 10:05:51 CEST 2008

and it's not working :) good. time to get some debugging tools online. that's what we're here for, right?

first: slurp + printing a hexdump from a sequence. got hexdump + easier io stuff working. the problem is not the serial line but something else (debug-transmit and debug-loopback work):

  box> (io> (target> 1 2 3 ts))
  <3> 1 2 3

the problem seems to be just with 'ping'. ok, i remember: there's no 'hello' string defined. fixed. looks like it's running now, but i can't seem to execute words.

Entry: playing with generators
Date: Tue Jun 24 10:30:59 CEST 2008

A problem that i've run into is 'wrapping' a sequence around a loop to build a 2D view. It pops up more than once (i.e. list->table), so let's make an abstraction for it. This is trivial to solve with a generator + comprehension:

  (for ((row (in-naturals)))
    (printf "~a \n" row)
    (for ((columns (in-range 8)))
      (printf "~a " (generate)))
    (printf "\n"))

But the problem here is the termination condition. Can this be turned into a comprehension that abstracts termination? A simple way is to turn the printing loop itself into a generator:

  (define printer
    (in-generator
     (lambda (yield)
       (for ((row (in-naturals)))
         (yield (lambda (x) (printf "~a ~a" (* row 8) x)))
         (for ((i (in-range 6)))
           (yield (lambda (x) (printf " ~a" x))))
         (yield (lambda (x) (printf "~a\n" x)))))))

which can then be easily combined with the sequence to be printed:

  (for ((p printer)
        (i sequence))
    (p i))

The only thing that's awkward here is the missing newline in case the sequence terminates in the middle of a line. This could be solved by using some 'strings with backspace'.
This is the cleaned-up hex printer sequence:

  (define (in-hex-printer [start 0]
                          [data-nibbles 2]
                          [address-nibbles 4])
    (in-generator
     (lambda (yield)
       (define-syntax-rule (lp formals . args)
         (yield (lambda formals (printf . args))))
       (define (addr x) (hex->string address-nibbles x))
       (define (data x) (hex->string data-nibbles x))
       (for ((row (in-naturals start)))
         (lp (x) "~a ~a" (addr (* row 8)) (data x))
         (for ((i (in-range 6)))
           (lp (x) " ~a" (data x)))
         (lp (x) " ~a\n" (data x))))))

Which can be used as in:

  (define (slurp)
    (for ((i (in-thunk in))
          (p (in-hex-printer)))
      (p i)))

Now, turn it into an enumerator + a sequence derivative. Does that make sense? What does a fold look like if the print output is seen as a data structure? It actually makes more sense as an unfold operation. So, really, print is a consumer, but i turned it into a consumer-producer.

Moral of the story?

* comprehensions abstract termination conditions, which makes them easier to use than generators (eof?/read)

* in cases where generators are more convenient than nested/parallel loops (parsing/printing-style data representation conversion problems), the consumer can be turned into a sequence of consumer procedures, which can then be linked to a producer sequence in a simple for loop.

Entry: target mode / simulator
Date: Tue Jun 24 15:43:04 CEST 2008

in target mode, it would be interesting to allow all kinds of macros, and try to simulate them, at first only the 'qw' and 'cw' instructions.

Entry: interactive compilation
Date: Tue Jun 24 16:33:25 CEST 2008

2 things:

* what to do with the global code accumulator?
* no more 'compile' mode for quick & dirty interactive compilation: use files/buffers instead.

let's have a look first at serialization. it's not possible to serialize macros, but it might be possible to serialize constants and aliases.

ok. got a function to read/write the namespace. on write, the macros get recreated.

  (define (target-words [words #f])
    (if words
        (for-each*
         (lambda (name realm address)
           (let ((word (eval `(new-target-word #:name ',name
                                               #:realm ',realm
                                               #:address ,address))))
             (eval `(begin
                      (define-ns (target) ,name ,word)
                      (define-ns (macro) ,name
                        ,(case realm
                           ((code) `(scat: ',word compile))
                           ((data) `(scat: ',word literal))))))))
         words)
        (for/list ((name (ns-mapped-symbols '(target))))
          (let ((word (find-target-word/false name)))
            (list name
                  (target-word-realm word)
                  (target-word-address word))))))

the global accumulator could be replaced by a parameter, so file->words conversion is possible locally. this deserves some cleanup.

Entry: multi-stage programming
Date: Tue Jun 24 19:18:23 CEST 2008

http://www.cs.rice.edu/~taha/teaching/03F/511/

talks about metaprogramming (manual staging) in a type-safe manner.

Entry: questions
Date: Wed Jun 25 02:25:19 CEST 2008

The problem is not answers; it's asking the right questions. An attempt:

Q: Why are multiple passes for the rewriter so essential? Given a satisfactory answer to this, is it better to rewrite first to a simple qw,cw language, or are per-target patterns better?

Q: Is it possible to see non-compilable pseudo assembler results somehow as type errors or contract violations, and associate blame?

Following John Nowak's advice, let's have a look again at the Joy page about rewriting, and Backus' Turing Award lecture.
http://www.latrobe.edu.au/philosophy/phimvt/joy/j07rrs.html
http://www.stanford.edu/class/cs242/readings/backus.pdf

Entry: Joy and rewriting
Date: Wed Jun 25 10:46:45 CEST 2008

The setting: any programming language can be given a rewriting system, but for Joy it's particularly simple. The idea is thus to put it on top: a rewriting system as a metaprogramming system: a source code transformation. ( Actually, no. It seems to be about giving a full semantics to a Joy program using JUST rewrite rules. )

Q: Does it make a difference if the rewriting system works on the source language (as in Joy) or the target language (as in Purrr)?

It becomes interesting at the point where "The role of the stack" is discussed: reduction strategies. There's this 'duality' between programs and stacks that's right in the middle of my representation. Looks like Manfred was there first. This is the key for a semantics without a stack:

  Joy programs denote unary functions taking one program as arguments
  and giving one program as value. The literals denote append
  operations; the program returned as value is like the program given
  as argument, except that it has the literal appended to it. The
  operators denote replacement operations, the last few items in the
  argument program have to be replaced by the result of applying the
  operator. Similarly the combinators also denote (higher order)
  functions from programs to programs, the result program depends on
  the combinator and the last few quotations of the argument program.
  It is clear that such a semantics without a stack is possible and
  that it is merely a rephrasing of the semantics with a stack.
  Purists would probably prefer a system with such a lean ontology in
  which there are essentially just programs operating on other
  programs. But most programmers are so familiar with stacks that it
  seems more helpful to give a semantics with a stack.

This is exactly what the Purrr primitive macros do: they take programs to programs. Essentials:

PRIMITIVES: rewrite rules as endomaps of target machine code. The semantics of a concatenative program expressed in terms of these primitive machine rules is the composition of these rules applied to the empty target program.

COMPOSITION: already hinted above: ordinary composition serves as the main abstraction mechanism to construct new endomaps of target machine code.

PARTIAL EVALUATION: the 'stack' shows up here as the local view of target machine code. If the target language has a notion of a run time parameter stack, there is a possibility for staging: moving computations to compile time while preserving semantics.

In "Quotation revisited" Manfred talks about the "draconian" measure of not equating lists and quoted programs. In Scheme terms, this is about constructing lambda expressions vs. quasiquotation + eval. Using the solution of only allowing the construction of quotations, but not the destruction (intensional definition, defined by its properties), isn't really that bad. (It's how Scat does it: all quotations are opaque, no reflection.)

Q: For Purrr it's possible to talk a whole lot about the semantics of macros without even mentioning target semantics. Does it make sense to see target semantics as an extension of the semantics introduced by the rewriting rules, to capture the cases that the rules don't handle: those that are somewhat general? Or is it better to see the macro semantics as the extension of limited target runtime semantics?
So, to relate Purrr and Joy a bit more: using rewrite primitives and function composition, Purrr will reduce a program to a value whenever it is a pure program. However, the target semantics isn't pure, so not all programs can be completely reduced.

Entry: code registration
Date: Wed Jun 25 12:12:03 CEST 2008

The point is to record all word structs as they appear in code, in the proper load order. This operation should be nestable. The problem I run into is that 'define' needs toplevel/module context, and making code registration nestable seems to conflict with this: trying a parameter gives problems, because the defines will be expanded in an expression context.

If it can't be made nestable, let's make the code storage write-once. Maybe it's better to define a 'compilation unit' (one invocation of 'register-code' per 'forth-begin' = module or load).

The problem to solve is to figure out which code was already uploaded. Maybe it should just be marked? Done. The remaining problem is: how to handle errors during upload? It might be wiser to only mark code as synced AFTER upload was successful. Let's provide an enumerator interface instead.

Entry: Swapping the two stacks : using just rewrite primitives?
Date: Wed Jun 25 18:32:16 CEST 2008

In fact, since the 'assembly stack' is of such paramount importance for giving a semantics to the macro language:

Q: why not use it as the primary stack, and define Forth primitives that manipulate program entry points (conditional jumps) as an extension to that? (sticking with pure rewrite rules at first?)

Q: if so, can a concatenative eager rewriting macro language like Purrr be equated with a purely functional typed concatenative stack language without full reduction?

To answer the first question: if code quotations are allowed without higher order functions, then my gut feeling is that this should work pretty well. This brings the metalanguage VERY close to Scat: simply extending Scat with assembly code data types already does the trick. It looks like this is the way to find a better link between target semantics and macro semantics.

Entry: Automatic instantiation
Date: Wed Jun 25 19:34:45 CEST 2008

from the blog post, which will probably be edited once i find a way to express this properly (and solve the problem maybe..)

Now, that's a nice story, but that's not how it happened :) The specification of rewriting rules came naturally as a syntactic abstraction for some previously manually coded peephole optimizations. This rewrite system however seemed way more useful than just for performing simple optimizations, and by introducing pseudo target assembly instructions it has been transformed into a different kind of two-level language semantics: a powerful compile time type system, with an analogous run time type system whose 'projected semantics' is derived from the eager macro rewrite rules.

The downside of this is that in order to write programs, one has to make decisions about what to instantiate and what not. It might be interesting to try how far this can be automated: don't instantiate what cannot be instantiated, but try to isolate subprograms that have real semantics.

Two things:

Q: macro semantics and target semantics are not the same for some words like '+'. is this good or bad? it's useful for computing constants, but dangerous for overflows. is it better to completely embed the target semantics, and use different symbols for the metaprogramming operations?

Q: is automatic instantiation really that difficult? compile time + might be seen as a different type..
(like + and +. in ocaml). Except for optimality (inlining might sometimes be better), solving the instantiation problem based purely on semantics (inline when a composition is 'real' + doesn't mess with the target interpreter) might not be so difficult.

Entry: and?
Date: Wed Jun 25 20:07:57 CEST 2008

did we learn anything?

- to really know what i'm talking about, i need to concentrate on a simpler concatenative macro language without higher order functions, using a single stack and automatic instantiation of real macros.

- the relation between the semantics introduced by the rewrite rules and the partially/fully postponed operations needs to be clarified a bit.

Entry: them stacks
Date: Wed Jun 25 20:29:13 CEST 2008

In the current compiler there are 4 stacks. Following the previous remarks about concentrating on rewriting first, the order will probably change to this:

1. Assembly stack: contains target assembly code and is used for target code rewriting. (in Forth compilers this is the allot buffer; in non-rewriting compilers it grows monotonically.)

2. The Forth control stack: used for recording jump labels to implement looping and conditional structured programming constructs.

3. The dictionary stack: (actually a set of stacks, supporting multiple entry points / fallthrough and control flow analysis)

4. The macro exit stack: for supporting multiple exit points in forth style macros. (an emulated run time stack).

Only the first one is essential for Pure Purrr (Puurpr, Purrepr, ... Paars?). The other ones are there to support Forth's state in a functional macro system + give some freedom to exchange macros and instantiated words and perform control flow analysis. The problem I mentioned before is that the Forth approach with a separate control stack is a bit of a dead end, since it's not a very structured way of dealing with code. Macro quotations are probably a lot better. Unfortunately, there is no simple way to convert forth syntax to quoted macros without getting rid of the control stack.

5. Probably, in a language based on 1. with automatic instantiation (otherwise there would probably be some kind of code explosion), a stack of instantiated words might need to be added. However, this is just a write-only registry (log).

Entry: 2 stage semantics
Date: Thu Jun 26 10:15:58 CEST 2008

correspondence:

It might be helpful to put on a background light: i'm trying to write a system for parameterized programming of tiny computers (currently Microchip PIC18 microcontrollers) based on concatenative and functional languages. I'm interested in limited order semantics mostly from a perspective of optimal implementations: how simple can the eventual semantics be made without having to sacrifice space/time efficiency? Currently I'm leaning towards a full macro system with first class macros, but I'm interested in this limited order semantics, and would like to see if it can somehow be embedded in my approach.

What i'm trying to figure out is how i can use 'higher order macros' in my system to allow for limited order semantics as you are suggesting. The approach i'm taking is:

* Use 2 stages: concentrate on the first stage, which consists of a joy-like language that operates on a stack of machine code instructions (stage 1), and a stage that executes the resulting machine code (stage 2).

* Start building stage 1 semantics from rewrite rules that operate on programs built from a single stage 2 instruction QW (quote word), which loads a number onto the run time stack.
For example, the stage 1 function '+' performs the following program transformation:

  ... [QW 1] [QW 2]  ->  ... [QW 3]

A complete Joy-like semantics can be built from this, if the fact that QW can only accept numeric arguments is ignored. At this point, some operations might not be defined for all input programs. For example '+' applied to the empty program is not defined. What can be done here is to start building target semantics based on the program rewrite rules: use a couple of instructions that manipulate the run time stack to make sure '+' can be defined on all input programs:

  ... [QW 1]  ->  ... [ADDLW 1]
  ...         ->  ... [ADDWF POSTDEC0 0 0]

Doing this for the whole set of primitives gives a language with 2 stage semantics:

  stage 1: program text represents machine code rewrite functions
  stage 2: rewriting of the empty program results in 'real' programs

The remaining problem is that some values used as arguments to machine code instructions might not be numbers: the Joy-like language is higher order, so quoted programs are an example of such values. (We can create another problem by introducing intermediate instructions, which are stage 1 data objects that represent neither target values nor instructions. However, this is beneficial for the eventual goal of parameterized programming.)

As a result, not all (macro) programs that have a stage 1 semantics can be attributed a stage 2 semantics, because applying them to the empty program does not yield a program that lies in the target program space, due to the use of non-numeric values or the use of pseudo machine instructions.

What I'm already convinced about is that this approach works pretty well for manual metaprogramming: by requiring the programmer to instantiate the 'real' programs as parameterized general macros, programs can be built in a Forth style language. (Think Forth run-time and immediate words). Allowing for compile-time data types that do not translate to the (necessarily) limited target machine semantics gives access to a very powerful way to factor/modularize parameterized programs (specialized code generators).

What I'm interested in is to figure out how to perform automatic instantiation, which gets rid of the 2-mode word/macro Forth-style semantics, how to turn non-specialized program generation into type errors where the source can somehow be blamed, and how to embed limited order operators in a sound way.

Entry: rewrite semantics
Date: Thu Jun 26 12:18:20 CEST 2008

a remaining problem in my reasoning about rewrite semantics is this:

* Manfred talks about giving Joy a semantics through rewrite rules. This REPLACES the stack semantics, but stack semantics is later re-introduced as a STRATEGY for implementing the rewrite rules.

* I talk about rewriting target programs.

The definition of the function + can be written as

  [qw 1] [qw 2] +  ->  [qw 3]

This syntax represents the definition of a function which maps a target program of 2 instructions to one of 1 instruction. (Let's not use parameterized numeric values, for simplicity.) But this is trivially changed into a system of purely syntactic rewrite rules like:

  [qw 1] [qw 2] [qw +]  ->  [qw 3]

What's the difference? They are really the same, no?

Extending the function system with functions that are 'self-compiling', i.e.

  123  ->  [qw 123]

and extending the rewriting system with a preprocessing step that maps all syntax elements X to [qw X], we have two ways of interpreting

  1 2 +

as functions, and inside the rewrite system.
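A toy Scheme version of the function view of '+', following the three rules above: an endomap of target code, with the code list most-recent-first like the assembly stack. The tagged-list encoding is mine; the opcode names are the PIC18 ones quoted above.

  #lang scheme/base

  ;; '+' as a function from target programs to target programs.
  (define (plus code)
    (cond
      ;; ... [QW a] [QW b]  ->  ... [QW (+ a b)]
      ((and (pair? code) (pair? (cdr code))
            (eq? (caar code) 'qw)
            (eq? (car (cadr code)) 'qw))
       (cons (list 'qw (+ (cadr (car code)) (cadr (cadr code))))
             (cddr code)))
      ;; ... [QW a]  ->  ... [ADDLW a]
      ((and (pair? code) (eq? (caar code) 'qw))
       (cons (list 'addlw (cadr (car code))) (cdr code)))
      ;; ...  ->  ... [ADDWF POSTDEC0 0 0]
      (else (cons '(addwf postdec0 0 0) code))))

  ;; full compile time evaluation:
  ;; > (plus '((qw 2) (qw 1)))
  ;; ((qw 3))
  ;; partial evaluation, one literal known:
  ;; > (plus '((qw 1) (addwf postdec0 0 0)))
  ;; ((addlw 1) (addwf postdec0 0 0))
  ;; fully run time:
  ;; > (plus '())
  ;; ((addwf postdec0 0 0))

The purely syntactic rewrite version differs only in representation: the '+' would sit in the program as data ([qw +]) and a driver loop would apply rules, instead of '+' being a Scheme function applied to the code stack.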
Entry: chip bootstrap and monitor protocol
Date: Thu Jun 26 14:58:17 CEST 2008

The problem with CATkit is that it needs a bootstrapped chip that listens for commands on the serial port. The problem with this is that there is a threshold for people to start using Purrr for PIC18 without buying a programmer: they have to build one. It would be a lot more convenient to do all the communication through the ICD port. In theory this isn't so difficult, but it does require some juggling to get going.

The Purrr console is an RPC protocol: the host sends a command to the target and waits for a reply. The Microchip debugger protocol is a master-slave protocol. After bootstrapping using the one-way Microchip programming protocol the PIC can be made to do anything, but requiring the host to wait for asynchronous replies isn't so easy to do with custom hardware. I was thinking about a serial port based interface in which the target -> host protocol is simple RS232, but host -> target can be synchronous (for initial programming) or asynchronous.

Entry: Backus Turing Award Lecture
Date: Thu Jun 26 20:39:54 CEST 2008

Interesting points about FP:

* all functions are unary. (this fits in a concatenative, but not necessarily stack-based, approach)

* primitive combining forms are chosen not only for their computational properties, but also for how they behave in the algebra of programs.

It has crossed my mind to use objects other than stacks to perform the chaining. Looks like that is what FP is: lists of lists.

( PF is FP backwards. very funny. giving FP a postfix syntax wouldn't be completely insane, really.. with an embedded array processing language in a concatenative one, you could get away with the strange semantics of 'map' in a concatenative language: turning a value from a list into a stack, processing it and turning the result back into a value. )

* functional forms and parameterized programming are quite related.

* to apply a defined symbol, replace it by the RHS of its definition.

so, if you change the syntax around such that forms are implemented by postfix macros that expect quoted programs, but do NOT allow quoted programs to survive the compilation, you're done, right? a concatenative functional macro language.

Entry: point-free style and monads
Date: Fri Jun 27 11:24:25 CEST 2008

  Cons: M t
  Unit: t -> M t
  Bind: (M t) -> (t -> M u) -> (M u)

A monad M is a way to organize a collection of t, together with a way to sequence computations of such collections. The 'bind' operation takes values t from M t, produces a collection of monadic values M u and combines those into a single collection M u.

The thing with '>>=' and 'do' is that they introduce new names. Instead of mapping one monad to another one directly, this is unwrapped to a 'do' comprehension that is then 'iterated' by the implementation of '>>='. It's probably better to look at the alternative formulation, replacing Bind by:

  Map:  (t -> u) -> (M t -> M u)
  Join: M (M t) -> M t

which can be used in point-free style, and is closer to the spirit of FP and stack languages.

Entry: lambda: why names?
Date: Sun Jun 29 10:25:11 CEST 2008

http://www.latrobe.edu.au/philosophy/phimvt/joy/j08cnt.html

Lambda names are a user interface: lexical locality works well for human brains. However, manipulating lambda expressions is tedious. (De Bruijn indices fix this problem, but are not so readable.. Maybe it's best to regard names as syntactic sugar?)

What I don't understand in a lot of texts about Joy is the emphasis on composition instead of application.
I understand that the _structure_ of a program is better seen in terms of composition alone, but the eventual use of a program is really application. Let uppercase denote data items and lowercase denote functions:

  S a b

The first space between S and a is an application, while the second one is a composition. Eventually, you're interested in the value. Maybe the nuance is to really get rid of the value altogether and see the semantics of [a b] as the 'output' of a program? Feels wrong..

  syntax:    concatenation
  semantics: composition of functions
  execution: application

The only real down-to-earth use i see is syntactic manipulation leaving semantics invariant: optimizing compilation. Joy in its 'composition only' lore is really about compilation, about 'relative semantics' which stops at the actual application. Maybe my intuition is too much attracted by operational semantics (the 'real' world?). After all, there is something to say for "Application has not a single property. Function composition is associative and has an identity element" -Meertens.

In S a b, dragging the S on the left along is rather pointless: even if the 'real' thing that happens is ((S a) b), semantically all that matters is the composition of a and b, because the application can be associated out: (S (a b))

Anyways.. Onward: Backus' FP: all functions are unary, but functional forms can take multiple parameters. Embedded in Purrr this means that at runtime there is a single 'token' going around, but at compile time there might be a stack of functions and forms combining them. Actually, this seems like an interesting embedding! Purrr as a metalanguage for a non-stack language, based on the observation that both languages are concatenative, but having a different threaded state: stack vs. list of lists.

Then about Category Theory and CAM. Too much for now..

Entry: is interpretation really different?
Date: Sun Jun 29 12:37:10 CEST 2008

This popped up before, but I'm not sure if it's an arbitrary re-arrangement. Consider the expression from the last post:

  S a b

where 'S' is a state, and a and b are functions. Turning the data/code roles around, one could interpret S as a function and a, b as data, where application of S yields a new function: ((S a) b). This has the semantics of an interpreter: 'S' is an interpreter state that takes the input code sequence (a b) to produce a new state. Compare this to the state monad in Haskell.

Somehow it feels as if (S a) or (a S) are really only two sides of the same coin: producing a new state interpreter S from the message a, or computing a new state S to be interpreted by function a. Is this related to different order of evaluation/currying of the same function?

Entry: oleg metaprogramming
Date: Sun Jun 29 23:29:32 CEST 2008

http://okmij.org/ftp/Computation/Generative.html#framework

Entry: gnuplot
Date: Tue Jul 1 14:37:34 CEST 2008

First: there's a small inconvenience in sandbox that shuts down ports created inside a sandbox whenever there are eval-limits. Setting space and time limits to #f fixes this.

For the rest, gnuplot works nicely, but i get zombie processes. FIXME: closing the port doesn't stop the process. add a custom port wrapper.

Entry: custodian + custom port
Date: Tue Jul 1 21:06:13 CEST 2008

Trying to avoid the creation of zombie processes when the custodian shuts down the gnuplot pipe. This works well with 'close-output-port but doesn't work with custodians, presumably because the inner port gets shut down first?
  (define (open-gnuplot)
    (let ((co (current-output-port)))
      (match (process/ports co #f co "gnuplot")
        ((list stdout stdin pid stderr control)
         (make-output-port
          'gnuplot stdin
          (lambda (bytes start endx _ __)
            (write-bytes bytes stdin start endx))
          (lambda ()
            (printf "closing gnuplot\n")
            (close-output-port stdin)
            (control 'wait)))))))

  (define p #f)
  (define c (make-custodian))
  (parameterize ((current-custodian c))
    (set! p (open-gnuplot)))
  (custodian-shutdown-all c)

Q: does the custodian shut down custom ports?

  (define (make-dummy-port)
    (make-output-port #f
                      (current-output-port)
                      void
                      (lambda () (printf "closing\n"))))

  (define p #f)
  (parameterize ((current-custodian c))
    (set! p (make-dummy-port)))

  (custodian-shutdown-all c)   -> nothing happens..
  (close-output-port p)        -> prints "closing\n"

EDIT: solved with double fork using an external utility:

  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main (int argc, char **argv) {
      int pid;
      if (argc == 1) {
          fprintf(stderr, "usage: %s ...\n", argv[0]);
          return -1;
      }
      pid = fork();
      if (!pid) {
          char *a[argc];
          int i;
          for (i = 0; i < (argc-1); i++) {
              a[i] = argv[i+1];
          }
          a[i] = 0;
          execvp(a[0], a);
          fprintf(stderr, "%s: can't execute %s\n", argv[0], a[0]);
      }
      return 0;
  }

Entry: matlab-like behaviour
Date: Wed Jul 2 00:59:08 CEST 2008

Except for heavy-duty floating point, and a ton of library code, most of what is in the matlab language is ease of working with vectors and matrices on the syntax level. What would it take to have a scheme-like clone? Are there any already, to take some inspiration from?

Entry: chaos
Date: Wed Jul 2 01:01:45 CEST 2008

got lost in chaotic patterns again.. what i've been doing the last couple of days:

- LFSR + Hough
- Reading the FP paper + joy rewriting (algebra of programs)
- misc stuff about point-free languages
- metaprogramming
- simplified core semantics: 2stack -> 1stack + sexp (code quotations)

EDIT: using dynamic-wind

  (define c (make-custodian))
  (parameterize ((current-custodian c))
    (thread
     (lambda ()
       (dynamic-wind
         void
         (lambda () (let l () (sleep 1) (l)))
         (lambda () (printf "shutting down\n"))))))

nope, doesn't work either. Custodians don't manage processes, and it doesn't look like there is a way around that.. Maybe ignore the problem for now? Or try to get something similar working with subprocess? Maybe the "double fork" trick works here?

Entry: Multi Stage Programming: Its Theory and Applications
Date: Wed Jul 2 10:26:40 CEST 2008

PhD Thesis of Walid Taha:
http://www.cs.rice.edu/~taha/publications/thesis/thesis.pdf

About typed metaprogramming (MetaML).

Entry: datatypes and iterators
Date: Wed Jul 2 17:14:37 CEST 2008

Starting from the ideas in FP, how can we build a minimalistic algebra of programs specific to image processing? This means:

* keep the data type simple (tiled images)
* functionals are special cases of map/fold/shift for image ops

starting from this, building a framework for effective loop fusion should be doable. the problem is composition of shift operators: combining two convolution maps gives a bigger convolution map. (Is it possible to work only in terms of the 4 direction unit shifts?) This can be tested in 1D first, i.e. for block based audio processing.

Algebra of programs. Ingredients:

* binary functions +,-,*
* scalars + vectors

Is it possible to make something smaller than FP?

Entry: lab/image-io.ss
Date: Wed Jul 2 17:27:50 CEST 2008

PGM input and YUV4MPEG output seem to work, but it's quite slow.
Working towards the algebra of image processors, it might be interesting to start with the same basic structure in scheme: represent images as 1D vectors, and generate 'fast iterators' for them.

Entry: that DSP language
Date: Wed Jul 2 21:47:25 CEST 2008

Buzzword time. Or: what are the different ideas I'm trying to solve at the same time by being confused for months in a row? Basically, I can't take the time to read all research on this topic, and find it hard to follow such work without proper hands-on experience. So how far can I actually get with common sense alone?

metaprogramming:

* dynamically typed metaprogramming (Purrr)
* concatenative composition based languages + evaluation time

dsp:

* an algebra of programming languages / rewrite rules
* real-time memory allocation + organization: maximise locality
* combining prototyping + implementation (meet-in-the-middle language)
* solve the tiling + shifting problem

Currently I've got 2 projects to finish (snowcrash + staapl PIC18), but after that, I need some free hacking time to tackle the next problem, or some study time. Before I do anything else, I need to do:

* Pierce TAPL
* Muchnick ACDI

Entry: tile problem
Date: Wed Jul 2 21:58:43 CEST 2008

Suppose it is possible to obtain a data-flow graph which maps inputs to outputs. Use this to:

* create a core loop for infinite data
* optimize the core loop to introduce software pipelining and eliminate multiple reads
* solve boundary conditions

Maybe it's time to start reading Muchnick ACDI, and combine it with information from Pierce TAPL and the vague idea of the Algebra of Programs + how to use it to perform loop fusion for DSP stream and image processing.

Entry: image iterators + dont-care regions
Date: Thu Jul 3 20:34:44 CEST 2008

Algorithms simplify a whole lot if dont-care regions can be constructed: no need to handle border conditions, except for the initial tiling step (duplication). The duplication this gives is probably not problematic since it's 2nd order.

The players:

semantics:

* unary operators + 1 image mapper
* binary operators + 3 image mappers:
  - 2 images
  - 1 image with X or Y shift

implementation:

* image accumulators
* coordinate iterators

Maybe I'm just getting tired, but it's really hard to chop this into primitives. One of the things that keeps getting in the way is how to create the correct type of result accumulator from the input type. The problem is: the right model is not (+ a b) but (set! r (+ a b)), which can then be used to create a (+ a b). Maybe build it around this:

  (define (inner-loop! i i->j fn r a b)
    (vector-set! r i
                 (fn (vector-ref a i)
                     (vector-ref b (i->j i)))))

  r    result vector
  i    main index
  i->j main index to secondary index map
  fn   binary function
  a    first input vector
  b    second input vector (can be same as first)

Funny, what i'm re-inventing here are the x and y operators in the generating function / Z transform (xy transform?) representation of 2D sequences, but bound to a function. I.e. lifters: function => sequence operator

  lift   : +  ->  +
  lift-x : +  ->  1 + x
  lift-y : +  ->  1 + y

Maybe the base language should just be ordinary math functions and 2-variate polynomials? This should be more than enough to generate tiling + appropriate iterators. I've come full circle: starting with algebra -> DSP -> implementation of algorithms -> generalization in language -> algebra.

Anyways, for the 2 shifts in a framework of 2^n tiles, it is possible to use modulo addressing, which simplifies code a lot.
Sobel: (1 - y)^2 + (1 - x)^2

Ok, got it to work:

(define (sobel i)
  (define ^2 (U (lambda (x) (* x x))))
  ((B +)
   (^2 ((X -) i))
   (^2 ((Y -) i))))

U  unary lift
B  binary lift
X  binary x-shift lift
Y  binary y-shift lift

So.. These are 2 different views: the operator view (where X and Y denote the shift operators) and the higher order function view, where unary and binary scalar operations are mapped to unary and binary image operations. The latter language seems more general: easier to work with multiple arguments.

Entry: automatic lifting
Date: Sat Jul 5 10:41:02 CEST 2008

Some of the lifting operations can be automated: U and B can be inferred from the arity of the operation. X and Y need to be specified.

Entry: scheme vs. purrr PE macros
Date: Sat Jul 5 12:25:00 CEST 2008

It's about macro arguments. The fundamental idea is that the expansion depends on the input _values_, not just the input structure. In ordinary day-to-day Scheme macros this is seldom the case. What I'd like to find is a way to explain the essential difference between Scheme's macro system and Purrr's macro system, which is a polymorphic concatenative language where values represent postponed operations.

Building such a Scheme partial evaluator by transforming all functions to macros shouldn't be too difficult. This is called "introducing staging". The analogous intelligent scheme macro:

(define-syntax-ns (pesel) +
  (lambda (stx)
    (syntax-case stx ()
      ((_ a b)
       (let ((da (syntax->datum #'a))
             (db (syntax->datum #'b)))
         (let ((na (number? da))
               (nb (number? db)))
           (if (and na nb)
               (datum->syntax stx (+ da db))
               #`(+ a b))))))))

So, is there a real difference between the concatenative (string) rewriter and the tree rewriter? Not really. The only problem is that for a tree rewriter which optimizes applications, the appropriate rules for lambda rewriting need to be implemented. The only difference is thus convenience: this kind of stuff is easier to do in concatenative languages due to the absence of names.

* Concatenative languages: non-primitives can be expanded as a concatenation of primitives, which are simply applied in order.

* Lambda languages: non-primitives need to implement the usual lambda reduction mechanics.

So, partial evaluation of pure lambda expressions is actually not so difficult: if you start from normal order reduction, just reduce things that can be reduced for a certain expression.

So.. Applied to the case of image processors. If they can be written as pure expressions, making evaluation order irrelevant, a program is easily specialized:

1. library of primitives + combinator HOFs
2. specialized expressions -> eliminate all applications of HOFs to yield a single expression

So, really, this seems quite straightforward. Am I missing something? Yes. This does not include deforestation (or, the simplified version for image data structures).

Roadmap:
* write a lambda expression reducer
* obtain rewrite rules for image HOFs
----
* alternatively, formulate it in a concatenative language to avoid the lambda reducer.

Entry: order of parameters
Date: Sat Jul 5 14:07:59 CEST 2008

Q: For highly parameterized code, the order of arguments in a higher order function decomposition is a bit ad-hoc. Is there a way to make this less so?

Entry: split coma/macro
Date: Sat Jul 5 18:02:40 CEST 2008

Merged the split-off staapl-coma project: swaps the order of the two stacks, such that there is a 1-stack metalanguage that doesn't use Forth style control words. The log entries are inlined below.
What this does is give a clear separation between the languages:

* COMA: an s-expression based COmpositional MAcro language of which the values represent atomic target programs. Using pattern matching, program rewrite rules are implemented that perform partial evaluation and program parameterization.

* MACRO: on top of COMA, a Forth macro language with Forth control words, labels, code fallthrough and local exit macros.

-----

_Entry: swap the 2 stacks
_Date: Wed Jul 2 23:38:47 CEST 2008

I'd like to move to a single stack model for a clean Macro language; all the other stacks are for Forth style control words. This is the prototypical "deep change" that's hard to make in a dynamic language. Is there a way to make this easier? Maybe separating out part of the macro language (mos) which will implement the core compiler + pattern matcher. It involves changing all primitives, since they no longer move stuff from the Scat stack to the asm stack, but transform data in-place.

Got pretty far already: got the basic coma macro language to run + a simple macro> command line.

OK, got a bit further. Stuck at:

box> (require "pic18.ss")
...
box> (repl ": asdf 123 23")
;; (macro) asdf
STATE:#
non-null-compilation-stack: ((23) qw (qw 123))

=== context ===
/home/tom/darcs/staapl-coma/macro/postprocess.ss:36:0: empty-ctrl->asm
/home/tom/darcs/staapl-coma/macro/postprocess.ss:44:0: assert-empty-ctrl
/home/tom/darcs/staapl-coma/macro/instantiate.ss:218:0: compile-forth
/home/tom/darcs/staapl-coma/macro/instantiate.ss:384:0: target-compile-1
/home/tom/darcs/staapl-coma/macro.ss:35:0: target-compile
/usr/local/plt-3.99.0.26/collects/scheme/sandbox.ss:459:4: loop

while (macro> 123 23) works fine.. time to go to bed..

_Entry: bug fixes
_Date: Sat Jul 5 17:54:14 CEST 2008

Nothing serious, just some missing dependencies due to file splits, and the expected ctrl/asm confusion here and there. What I did note: pic18/test.f doesn't 'require' but it does 'load'.

Looks like we're done. Time to merge.

Entry: rewrite rules for HOFs
Date: Sat Jul 5 21:25:02 CEST 2008

Q: What is the essence of the 7 deforestation rules in the Wadler paper? (page 8)

(1) variables are left alone
(2) distribute over type constructors
(3) function application: substitute terms in parameterized body, and recurse transformation
(4) distribute over case (variable)
(5) given constructor, pick one branch and substitute terms
(6) case of function application: substitute argument
(7) case of case: push inner case through to the branches

The case statements are there to handle pattern matching for union types. You need those to be able to stop recursion! The rest is really just term substitution and elimination of constructors through rule (5).

Translating this to what I want to build: either I find a way to use this representation together with a final step that optimizes data recursive constructors to use arrays, or I use a special set of data types..

The higher order macros are interesting. (Wadler mentions OBJ btw. I probably got it from there.) 'where' terms are introduced, a kind of 'let' for local function definitions. Macros are then like functions whose variables can reference function names, but cannot be recursive. Lack of recursion guarantees they can be expanded out. First order recursion is still allowed using 'where' clauses.

Q: Make a summary of what the OBJ system is about.

Rewriting + first order equational logic + ordered sorts (types). Quite elegant, but I'm not sure whether I can use any of this in my untyped ad-hoc approach..
Theories are quite surprising: an ability to define properties of operations.

Q: Algebra of programs?

The Backus paper really seems to be seminal for all this work in functional programming about program transformation.. Nobody seems to call it algebra of programs though..

Q: "Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire"

"We develop a calculus for lazy functional programming based on recursion operators associated with data type definitions. For these operators we derive various algebraic laws that are useful in deriving and manipulating programs."

Seems to be about moving from basic, low-level recursion to transformation on a higher level: using combinators.

Entry: optimizing lists to arrays
Date: Sun Jul 6 00:35:12 CEST 2008

This deforestation business seems doable. The remaining problem is to map recursive list processing algorithms to vector algorithms by somehow faking 'cons'. Or, it's really 'cons' with cdr coding. On the other hand, HOFs could be used for this: operations that lift scalar ops to container ops.

Entry: not writing a single line of C code..
Date: Mon Jul 7 18:08:52 CEST 2008

Attempt to generate C code for Sobel + Hough transform, based on a higher order macro specification. Basically the same as hough.ss but with partial evaluation of some functions.

Roadmap:
* start from a purely functional description in HOF combinator form
* prove some transformation laws for the combinators
* use these to transform the algorithm

The real problem is making the X and Y combinators combine into something that can be easily compiled out into n x m rectangular region combinators. Basically, start with the loop you want to end up with, and factor it into separate parameterizable pieces. It seems like 'adding parameters' to an inner loop is what makes this difficult. This can be solved with HOFs and partial application. When going that way (fixed arity), maybe a stack approach can be used immediately?

EDIT: see onward entry://20080710-121719

Entry: about recent changes and insights
Date: Wed Jul 9 02:06:36 CEST 2008

Moving from 2stack->1stack for the core macro language (Coma) seems like a pretty significant step. This will allow most Forthisms (semantics) to be concentrated in a single implementation, next to its syntax.

For the array/dsp language design: moving toward an algebra of combinators seems to be the right approach. The problem is how exactly. I'm trying to align with these ideas in finished research to see where I can add a bit of originality without re-inventing everything.

Entry: That DSP language: fanout
Date: Wed Jul 9 14:02:19 CEST 2008

Thinking about the particular problem of writing the sobel algorithm in combinator form, doing this in stack notation makes me miss a ``parallel'' function mapper. Basically, a 'distribute' operation.

The essential problem is that a lot of DSP algorithms are many -> many maps: they are far from linear (single use of variables), and are quite parallel. To solve this, combinators need to do the duplication and parallel application (anamorphism followed by catamorphism).

It might be interesting to allow for code quotations to be parallel. I.e. a list of functions can be interpreted as a composition, or as a parallel application. It's useful to have a vector of functions/closures.

Entry: porting old ip.ss
Date: Thu Jul 10 12:17:19 CEST 2008

Let's start from the previous approach from entry://20070330-160157 and see if there's something to do with the new insights.
This code is mainly about generating C code from a grid-function specification:

(fn (gain x)
    (* (gain) (+ (x 0) (x -1) (x +1))))

This has the advantage that it translates straight to array accesses: arrays are finite functions mapping coordinates to values.

Porting to the new cgen.ss involves some switch from symbols/lists -> syntax. Got the syntax port working, now need to change the 'grid' and 'loop' macros + check if they worked before.. Looks like there's some code missing.

Hmm.. Should I go back to working with symbols, or get the syntax working? The problem is 'substitute'. This should be replaced by a variant of 'syntax-case' that recurses over expression arguments. Ok, using 'syntax-case/r' this becomes quite simple. Got it ported.

NEXT: this implements 'map'; now find a way to express fold/integral style functions like the Hough transform, on top of this mechanism. No DSP without inner product.. Maybe focus on the Hough-like accumulator style first: that requires random access, which is genuinely different. It seems like a lot of generality is necessary to express such a specific data flow.

Entry: functional forms (FP in Coma)
Date: Thu Jul 10 16:21:51 CEST 2008

So, embedding functional forms in Coma.

Q: What is a functional form?

It is a macro which takes multiple macros as arguments. Elaborating on the first example in Backus' FP paper, the inner product, this becomes

    inner = trans (*) map (+) insert

This is as much about datatypes as it is about functional forms. 'trans takes a 2D vector and transposes it, 'map takes a 2D vector and applies a function to each inner vector, returning a vector of results. 'insert takes a vector and folds it with a binary operator.

The thing which feels a bit strange here is to have unary functions. All stack operations are automatically mapped to vector -> scalar functions. We could use a vector -> vector lift too, for macros with multiple return values.

Embedding FP in a concatenative macro language:

* all functions are unary (arity 1 -> 1)
* pure stack macros can be lifted to FP functions in 2 ways:
    vector -> scalar (one result)
    vector -> vector (one or multiple results)
* functional forms have arity n -> 1 and operate on unary functions.

So, the trick is to somehow hide the lifting of unary stack ops -> unary vector ops. The easiest way is to see the stack as the outer vector wrapper. This however doesn't unpack things by default. I.e., in FP, the '+ function behaves like

    +:<1,2> = 3    and not    +:<1,2> = <3>

Using only the top element seems a bit dirty (for the same reason that 'map feels dirty), but it does seem to be the most convenient approach.

Q: Does everything look like a nail? Is this just a small gimmick, to have an embedded FP macrolanguage for creating (inlined) expression evaluators, or is it genuinely useful to construct (static) DSP structures in control applications where most glue logic is Forth?

I guess if I can re-implement Krikit I have a proof of concept.

Entry: fold
Date: Fri Jul 11 11:26:04 CEST 2008

Given a way to transform a loop body specification (- (x 0) (x 1)) into an expanded loop, how to perform folds? The Hough transform: for each (x,y), accumulate r = x cos t + y sin t into an (r,t) plane.

Maybe it's best not to write this as a fold. The problem is that this is not a fold of a simple symmetric binary operator, but of a piksel and the state (r,t) accumulator. In pseudo:

(fold (lambda (piksel accu x y)
        (+ accu (sino x y)))
      accu0
      image)

How to add this kind of operation?
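One way to make the pseudo code concrete (a sketch with hypothetical names: accu!, threshold and accu0 are not defined anywhere here): a generic image fold that threads the accumulator and passes pixel coordinates along, with the image as a row-major 1D vector.

;; Sketch: fold f over all pixels, threading acc and coordinates.
(define (image-fold f acc img w h)
  (let yloop ((y 0) (acc acc))
    (if (= y h)
        acc
        (yloop (+ y 1)
               (let xloop ((x 0) (acc acc))
                 (if (= x w)
                     acc
                     (xloop (+ x 1)
                            (f (vector-ref img (+ x (* y w)))
                               acc x y))))))))

;; Hough-style use: one vote per edge pixel into a mutable (r,t)
;; plane, so the threaded 'state' is just the plane itself:
;;
;; (image-fold (lambda (piksel accu x y)
;;               (when (> piksel threshold) (accu! accu x y))
;;               accu)
;;             accu0 image w h)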
Maybe formalize the grid version first. Sobel is

(fn (a)
    (let ((dx (- (a 0 0) (a 0 1)))
          (dy (- (a 0 0) (a 1 0))))
      (+ (* dx dx) (* dy dy))))

Aha! This is why sets are necessary for local binding of loop pointers. It eliminates common subexpressions!

The 'let' form seems to work; however, current parsing messes up the syntax by mis-identifying expressions as grids. This needs some kind of parameter substitution. 'let' doesn't seem to work for expression statements. The code below expands fine, but cgen.ss expands the 'let' incorrectly.

(src->code
 '(loop (= (grid result 0 0)
           (let ((dx (- (grid a 0 0) (grid a 0 1)))
                 (dy (- (grid a 0 0) (grid a 1 0))))
             (+ (* dx dx) (* dy dy))))))
=>
(statements
 (block
  (var int i)
  (for-head (= i 0) (< i (* 400 300)) (+= i 300))
  (block
   (vars (float* a_p0 (+ a (+ i (* 0 300))))
         (float* result_p0 (+ result (+ i (* 0 300))))
         (float* a_p1 (+ a (+ i (* 1 300)))))
   (statements
    (block
     (var int j)
     (for-head (= j 0) (< j 300) (+= j 1))
     (block
      (vars (float* a_p0_p1 (+ a_p0 (+ j 1)))
            (float* a_p0_p0 (+ a_p0 (+ j 0)))
            (float* result_p0_p0 (+ result_p0 (+ j 0)))
            (float* a_p1_p0 (+ a_p1 (+ j 0))))
      (loop (= (grid result_p0_p0)
               (let ((dx (- (grid a_p0_p0) (grid a_p0_p1)))
                     (dy (- (grid a_p0_p0) (grid a_p1_p0))))
                 (+ (* dx dx) (* dy dy)))))))))))

Seems like this is called let* -> some small bug still. Ok, fixed. It's working now:

(p '(loop (= (grid result 0 0)
             (let ((float dx (- (grid a 0 0) (grid a 0 1)))
                   (float dy (- (grid a 0 0) (grid a 1 0))))
               (+ (* dx dx) (* dy dy))))))
=>
{
  int i;
  for (i = 0; i < (400 * 300); i += 300) {
    float* a_p0 = a + (i + (0 * 300));
    float* a_p1 = a + (i + (1 * 300));
    float* result_p0 = result + (i + (0 * 300));
    {
      int j;
      for (j = 0; j < 300; j += 1) {
        float* a_p1_p0 = a_p1 + (j + 0);
        float* a_p0_p1 = a_p0 + (j + 1);
        float* a_p0_p0 = a_p0 + (j + 0);
        float* result_p0_p0 = result_p0 + (j + 0);
        *(result_p0_p0) = ({
          float dx = (*(a_p0_p0) - *(a_p0_p1));
          float dy = (*(a_p0_p0) - *(a_p1_p0));
          ((dx * dx) + (dy * dy));
        });
      }
    }
  }
}

Now, how should I see this? It is an image comprehension with built-in shift operators. What about the following specification syntax:

(for/grid (a result)  ;; grids
          ((i 100)    ;; dimensions, possibly inferred
           (j 100))
  (= (result 0 0)
     (let ((dx (- (a 0 0) (a 0 1)))
           (dy (- (a 0 0) (a 1 0))))
       (+ (* dx dx) (* dy dy)))))

where 'let' uses type inference from the values. I'm not sure whether it's a good idea to have a grid iterator. Maybe factoring in smaller for-loop like comprehensions is a better idea? Sufficiently confused again..

The Hough loop + edge detection should look like this:

(for/grid (a)  ;; grids
  (let ((dx (- (a 0 0) (a 0 1)))
        (dy (- (a 0 0) (a 1 0))))
    (let ((sobel (+ (* dx dx) (* dy dy))))
      (if (> sobel 600)
          (accu! x y)))))

It doesn't look like this is going to work. Too experimental still. Let's move to straight C.

Entry: writing C
Date: Sat Jul 12 12:24:01 CEST 2008

Funny, how I've been disgusted by C to then move on to a higher level of abstraction, only to find that I'm actually enjoying writing C quite a bit because I'm getting better at writing properly factored code. Maybe the trick is really to define an s-expression based language that can do anything C can do, so the compilation becomes incremental rewriting? The approach in ip.ss should maybe be a bit more factored + expose lowlevel constructs.

How to make a better C?
-> type inference + polymorphism
-> local functions (macros): maybe like purrr: manual inlining?
-> 2 stacks, which
would make downward-only local functions easier.

My intuition says that it really can't be too hard to do this in a proper, not too ad-hoc way, but any time I dive into it, it seems as if I don't understand the problem fully. It does look like types are the essential part. Maybe it's best to look at C without the polymorphism of the math operators?

Another thing that might help is to simplify the control flow constructs. Maybe only 'if and 'goto should be kept? Or 'if and 'while(1)? The 'for loop is at least better replaced with a 'while loop. With SSA and CPS being equivalent, how to generate C code such that the SSA form the compiler sees is actually the one we intend?

Entry: typed vs. untyped
Date: Mon Jul 14 10:59:55 CEST 2008

Been browsing through Oleg Kiselyov's papers on code generation. Most of it seems to be based on MetaOcaml and a translation to C. It might be interesting to try to summarize the difference between MetaOcaml's approach to staging, and the hygienic 'syntax->datum / 'datum->syntax.

Entry: multi-stack Forth support
Date: Mon Jul 14 11:24:01 CEST 2008

Currently I'm using structure type inheritance together with pattern matching to be able to perform base type operations on derived types: they simply leave the extended state alone. This is essentially the same as operating on a stack: each type extension adds one stack element. Does this have implications on the implementation level? Is it better to represent state as a stack of states right from the beginning? This would make the update method trivial.

Current conclusion: maybe it's best to keep that method abstract.

Entry: rewriting
Date: Mon Jul 14 11:34:12 CEST 2008

What I call 'eager' rewriting probably has a better accepted name in the literature.

Entry: Generating optimal code with confidence
Date: Mon Jul 14 14:55:20 CEST 2008

http://okmij.org/ftp/Computation/Generative.html

"A Methodology for Generating Verified Combinatorial Circuits", joint work with Kedar N. Swadi and Walid Taha. Proc. of EMSOFT'04, the Fourth ACM International Conference on Embedded Software, September 27-29, 2004, Pisa, Italy. ACM Press, pp. 249 - 258.
http://www.cs.rice.edu/~taha/publications/conference/emsoft04.pdf

This paper and related work seems to be a ticket into the field of Resource Aware Programming (RAP), to find a way to place Staapl's dynamic type approach, and see how static type systems can be of benefit. Reference number [3] talks about a linear functional language, which is pretty close to where I'm going.

The roadmap seems to be something like:
* get educated about type systems (TAPL)
* see what there is to learn about Cat's type system
* translate this to a type system for Coma

References:
[3] related to http://www.sac-home.org/
[30] "Generating heap-bounded programs in a functional setting", TAHA Walid, ELLNER Stephan, HONGWEI XI.

Resource aware programming:
* highly expressive untyped substrate
* stage distinction
* static type systems

The latter is about typing code/circuit generators so they can be composed. I don't know what the untyped substrate is about.

Entry: binary operations
Date: Mon Jul 14 15:23:22 CEST 2008

Composition of binary operations has the following structures:

non-struct:  tree
assoc:       list
comm+assoc:  set

Read the OBJ paper yesterday, and I'm thinking whether the 'theories' approach might be usable in Coma: expressing properties of operators, i.e. associativity, commutativity,...
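For concreteness, a minimal sketch (hypothetical code, not Coma) of how such declared properties could drive normalization: associativity flattens nested applications into a list, and commutativity reorders operands so literals group together for constant folding.

;; Minimal sketch, not Coma code.
(define (flatten-assoc op expr)
  ;; associativity: (+ a (+ b c)) -> (+ a b c)
  (if (and (pair? expr) (eq? (car expr) op))
      (cons op
            (apply append
                   (map (lambda (e)
                          (let ((f (flatten-assoc op e)))
                            (if (and (pair? f) (eq? (car f) op))
                                (cdr f)
                                (list f))))
                        (cdr expr))))
      expr))

(define (sort-comm op expr)
  ;; commutativity: move literals to the front for constant folding
  (if (and (pair? expr) (eq? (car expr) op))
      (let ((lits (filter number? (cdr expr)))
            (rest (filter (lambda (x) (not (number? x))) (cdr expr))))
        (cons op (append lits rest)))
      expr))

;; (sort-comm '+ (flatten-assoc '+ '(+ x (+ 1 (+ y 2)))))
;; => (+ 1 2 x y)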
Entry: MetaOCaml / MetaScheme
Date: Tue Jul 15 12:18:50 CEST 2008

( Edit from: Fri Jun 27 13:02:41 CEST 2008 )

http://okmij.org/ftp/Computation/Generative.html#meta-scheme

4 special forms:

  bracket
  escape
  lift (cross-stage-persistence)
  run

"Scheme's quasiquotation, being a general form for constructing arbitrary S-expressions (not necessarily representing any code), is oblivious to the binding structure."

But quote-syntax and unsyntax do this correctly, right? Hmm.. I don't see it without thinking..

EDIT: "... uses a complex macro ALPHA that is aware of the binding structure. ALPHA traverses its argument, presumed code expression, and alpha-converts all manifestly bound variables to be unique symbols."

This I can understand: alpha-renaming to make sure names are unique before splicing in code.

"Since syntax-rules can only produce globally-unique identifiers but not globally-unique symbols, we must use syntax-case or a low-level macro-facility."

OK, if the goal is to create code that has a symbolic representation, this is clear. The syntax-case version uses generate-temporaries for the unique names. But does it need to be like that? If we're generating code that is eventually to be compiled, why not generate a graph structure directly?

"The macro ALPHA is implemented as a CEK machine with the defunctionalized continuation."

Ok, so it's an interpreter basically. CEK is the machine underlying Scheme, as opposed to e.g. SECD for lisp. The CEK is implemented in syntax-rules.

It might be interesting to see how alpha renaming and cross-stage persistence are problematic or avoided in Staapl/Coma. Alpha renaming is avoided by using a point-free target language. CSP is qw. It supports numbers, target-address dependent expressions which can be reduced during the assembly stage, and macros which need to be eliminated during postprocessing.

Ok. Staapl is quite a bit simpler, because I'm doing metaprogramming and code generation in the same spot. It's only because point-free code is linear + all code generators need a finalization step that this trick works.

Entry: cleanup
Date: Tue Jul 15 12:33:47 CEST 2008

Since the roadmap for further work is pretty clear (TAPL, MetaOcaml, typed stack languages), it's time to finalize Coma so it can be released.

* Fix undefined symbol bugs for monitor=module (OK)
* Load monitor as a module (OK)
* Upload monitor from within Staapl (OK)
* Load/save addresses (OK)
* Write documentation (OK)
* Incremental upload.
* Make Snot repl
* Get the synth going

Entry: control.ss and label.ss
Date: Tue Jul 15 13:53:51 CEST 2008

About the space between state:2stack and state:compiler. It is possible to define the control primitives in 2stack using the 'label' pseudo-op as it used to be. Later, replacing 'label' with the intelligent construct in instantiate.ss gives the possibility to build structured code graphs. Maybe it's worth separating the two?

This works well. It leaves the words

  exit or-jump sym label:

as hooks that can be used to plug in the control flow analysis code from instantiate.ss. The code then uses the pseudo ops

  label jw/if

to represent labels and conditional jumps.

Finalized this: added a separate control/ project directory and renamed the remaining macro/ to comp/ to indicate it's purely about compilation (code tree generation) and postprocessing, not about language definition.

Entry: name troubles
Date: Tue Jul 15 16:01:16 CEST 2008

Loading the monitor code into the target namespace works fine. However, requiring it gives trouble.
(undefined macro/f->)

This is probably due to the use of macro/f-> in the parsing words, while that word is later defined in a source file: at the time of expansion, the macro isn't yet defined. So it needs a stub. This can be solved by adding pic18 specific parsing words that are required into the file.

OK, that works. Added a stub of 'f->' in pic18/macro.ss and created pic18/parsing-words.ss to add the word 'fstring:'.

Entry: documentation -> parser itches
Date: Wed Jul 16 13:00:14 CEST 2008

I feel that the most important layer, Scat -> Coma + Control, is ready. It's simple enough now to be documented. However, the Forth + parser part isn't very well written.. Does it make sense to spend some time on cleaning it up? The main questions are:

Q: Is it possible to write the compiler more as a substitution system instead of a CPS parser?
Q: Is that desirable?

Sticking with the CPS approach, the state that's passed might need some simplification. The mistake I made is to pass an expression that will only be added to on the outside, but what is really necessary is to be able to insert into expressions. This requires a cursor into a tree. The problem is that I don't know that data structure well enough to get to this implementation from an intuitive approach.

Q: What is a zipper?

http://okmij.org/ftp/Scheme/misc.html#zipper

There are two views. One is a fairly straightforward 'reversal of pointers' where each node encodes a path through to the top node using the following data structures:

(define-struct path (left path right))
(define-struct cursor (current path))

A path contains a list of left sibling trees, a list of right sibling trees and a path. If the path is #f the current node is the top node.

According to this:

http://okmij.org/ftp/Scheme/zipper-in-scheme.txt

a zipper can be represented as a delimited continuation of a depth first traversal. It seems this works only for updating nodes, not for adding them. (or not?)

Anyways, simple straightforward manipulation might be enough to transform the tree used in the expression accumulator of the Scat parser into a zipper structure, to get rid of the 'wrapping' problem. Zipper as continuation is actually not so hard to understand: the 'reversed pointers' are really nothing more than stack frame links in the recursive descent of the structure.

Entry: incremental dev
Date: Thu Jul 17 10:51:19 CEST 2008

I think all is in place, I just need to flesh out the normal flow of operations.

How to create a project?

- make sure Staapl is installed in plt/collects/staapl (sudo make install)
- (require (lib "pic18/monitor-p18f1220.f" "staapl"))

This file is an example of a self-contained Purrr project for PIC18. Loading it like this will compile the file and import all its macros and target words into the current namespace. In the following we'll take the interactive approach, but remember that it is possible to automate all of this: .f files are really PLT Scheme modules and can be composed as such.

In order to handle the code interactively, it's more convenient to use a prj environment. This is a scheme namespace object into which all support code can be loaded.

- (require staapl/prj/pic18)

Once the environment is loaded, a Forth repl is available using (repl <string>). This will provide the string to the reader present in the prj namespace. The following command will load a file into the current namespace. Note that this is different from require.

- (repl "load staapl/pic18/monitor-p18f1220.f")

This loads, compiles and assembles the code. Use (print-all-code) to view it.
Use (ihex) to view an intel hex dump, or (save-ihex <filename>) to save it. For convenience, it's possible to call (piklab-prog <device>) to program it. Set the port the target is connected to, i.e.:

(current-console '("/dev/ttyUSB0" 9600))

Use (prj> . code) to execute prj Scat code, and (target> . code) to execute possibly simulated interaction code. I.e. (prj> ping).

- re-establishing contact: this requires target word addresses.

(save-target-words <filename>)
(load <filename>)

Entry: Walid Taha RAP video
Date: Thu Jul 17 13:25:40 CEST 2008

Jan 22 2007 @ google:
http://video.google.com/videoplay?docid=915594482273345538&q=type%3Agoogle+engEDU

Research question: What are the high level abstractions that can be used to keep control over resource use?

goal:
- support expressive abstractions
- ensure safety by static analysis
- don't let this get in the way

means:
- multi-stage programming (MSP)
- reactivity (I/O events)
- advanced type systems

( Staapl does MSP in a traditional untyped / partly dynamically typed way, without special tools for reactivity and static type analysis. Contrasting principle: get MSP to work first in a simple paradigm, add static tools later using DSLs. )

Ideas behind MSP are old. The new approach is to combine this with static tools. Reactivity is combined with MSP by creating program generators for reactive programs with static guarantees.

Essence behind typed MSP: extend with types that are
- delayed values
- annotated with what kind of delayed value

( An essential concern that seems to be solved by MetaOcaml is variable capture. The lucky thing about Staapl is that this problem is avoided: there is never any confusion about binding of values: in the pattern definition language standard Scheme lexical binding is used, while in composition, there are no bound names: all is point-free. The big disadvantage of course is that a stack language has no parameters. For multi->multi DSP code this is a problem. However, FP like array processing languages can be embedded in a similar point-free style, replacing the stack with array structures that also can be re-used in every step. The essential insight is that it's not stack computing that's important, but point-free threaded state that gets discarded after each function application. )

What pops out of the FFT example in the talk is the use of mathematical properties in the generation of the code: algebra of programs, where the program in this case is ordinary algebra :)

Reactive programming: E-FRP, a scaled down version of FRP from Haskell which compiles to event-loop C-code.

Linear types: values can be used only once (consumed). Hoffmann (LFPL). The idea is that in pattern matching, one deconstructs a cons cell, which is then passed to the RHS where it can be reused:

  cons(x, xs) at d  ->  cons(1, xs) at d

Indexed types: a bit like polymorphic types, but with parameters that have values, i.e. lists of a certain size. This goes pretty far: you provide proofs that the type checker checks. This can be very specific: basically, the types could be made to perform the whole computation completely in the type system.

EDIT: look at dependent types, Pierce p. 462

Onwards, maybe this is interesting: gradual typing:
http://lambda-the-ultimate.org/node/1707

( About domain specific knowledge and rewrites: basically, these are theorems about the data types and operators. This might be abstracted by generating parameterized theorems, but then those can probably be specified as more complicated rewrite rules. One possible step is to start from equations, and distill directed rewrite rules.
)

Entry: The Expression Lemma
Date: Thu Jul 17 16:20:18 CEST 2008

http://blogs.msdn.com/ralflammel/archive/2008/07/16/the-expression-lemma-explained.aspx

Is this right in the middle of point-free code, where imperative and functional meet? I.e. the application of the composition (f g h) to the state x in

  x @ (f g h)

can also be seen as the interpretation of the sequence of messages f g h by the object x:

  [x f] [x g] [x h]

Entry: Algebraic types
Date: Fri Jul 18 10:11:48 CEST 2008

Roadmap for today: find out exactly how 'pattern is based on algebraic types or not. Explain this in the scribble doc, upload the doc to the server and send an email to Walid Taha.

http://planet.plt-scheme.org/package-source/dherman/struct.plt/2/4/doc.txt
http://en.wikipedia.org/wiki/Algebraic_data_types

"An algebraic data type is a datatype each of whose values is data from other datatypes wrapped in one of the constructors of the datatype. Any wrapped datum is an argument to the constructor. In contrast to other datatypes, the constructor is not executed and the only way to operate on the data is to unwrap the constructor using pattern matching."

Let's stick to duck typing. I don't see any essential differences between the way the instruction mapping works and a re-implementation in terms of algebraic types. The pattern matcher solves the basic organization of the data stack. On top of that (data within instructions) any scheme data type and plt-match syntax can be used.

Entry: Generating typed programs
Date: Fri Jul 18 22:39:03 CEST 2008

Actually, generating a statically typed system in a dynamically typed language isn't such a crazy idea. The type checking of the generated program can happen at generator run time, at which point the generator's dynamic types are also available.

However, what I am doing in Staapl is to generate the target program, but also to compile it immediately. There is not really an intermediate generated program representation other than the data types exchanged between code processors. This data however could really be representing code bodies for embedded languages. In any case, static guarantees about the target program are really dynamic checks of the data passed between code processors / generators. At generator compile time, I can't check anything.

But is this really necessary? What is lost is the ability to typecheck generator components. Only when they are instantiated can they be tested. Adding an explicit test suite for generators solves that problem.

Entry: The API
Date: Fri Jul 18 22:48:13 CEST 2008

I'm worrying about publishing the API to the Staapl system. The important observation however is that the 'patterns, 'compositions, 'scat: and 'macro: forms are really enough to start building abstractions on top of, if necessary. Maybe I should really take the slow approach: document only those functions that are necessary, and take a slow start.

Entry: Zipper for the parser
Date: Sat Jul 19 12:28:27 CEST 2008

The idea is this: currently the parser api uses 2 syntax elements to pass around, and some context. It would be better to dump as much of the context into the passed state, and also encode that state into a single object that acts as a normal syntax transformer.

The standard tree to accumulate is this:

  (lambda (state) x) | state

where the x point is added to. The 'locals' parser takes the current lambda expression and assigns it to a variable in a let expression, and builds a new lambda expression.

How to represent a zipper in syntax objects?
If we're only representing trees where the cursor is at the rightmost slot in a node, the data structure becomes simpler:

  (state ((lambda (state) #f)))

(define (stx-zip-up stx)
  (syntax-case stx ()
    ((node ((siblings ...) parent))
     #`((siblings ... node) parent))))

(define (stx-zip-down stx)
  (syntax-case stx ()
    (((siblings ... node) parent)
     #`(node ((siblings ...) parent)))))

Entry: incremental upload
Date: Sat Jul 19 14:07:20 CEST 2008

The idea is that code goes 'somewhere'. It is associated to a resource. All code compiled inside a compiler namespace is registered to a central code registry. Every time code gets TRANSFERRED somewhere else, the corresponding code is marked as old. Transfer means to either write out a hex file or similar, or to upload it directly to a target.

The simplest interface seems to be indeed the mapper: this can ensure the operation completed before state is changed. Onward: using map/mark-target-code, create a function that uploads the binary code using code defined in tethered.ss.

Ok, I remember: the last thing I did here was to use the comprehensions to build a formatter for upload-bin. That code now needs to be tied to getting the last binary code. Simple match? No: I got annoyed by the absence of a Scheme interface to the interaction code in tethered, one that automatically connects to the console. Let's write that first. Maybe the simpler solution is to add a default somewhere?

I moved some of the tools/io.ss code to live/console.ss since it's quite specific. Can probably merge together. Added the 'with-console function to run arbitrary code in connection with the console. This is still not good enough. Need a real connection. But let's postpone this decision until after the highlevel part of the code is done. Verified that the programming worked using a read 'fbytes>list.

The 'ihex function in pic18.ss uses 'auto-bin to produce binary code. Maybe upload should work similarly?

( This level is really a bit of a mess.. Internally code is organized well, but on the outer levels it's a patchwork.. probably because it's a state machine written around code state, and there are several format conversions going on.. )

The interactive state consists of:

* code compiled up to now, possibly marked as old
* core + project macros (the concatenative language)
* the current upload point for interactive dev

The last one is still missing: assembly doesn't save the memory pointers. Let's place them in target/code.ss.

Entry: functional code graphs
Date: Sat Jul 19 20:02:24 CEST 2008

I'd like to move back to functional data structures for the code graphs. There is simply too much fuss with bookkeeping, so let's move to functional types.

* Graphs: in order to make graphs in a functional language, one needs to see the graph as an infinite tree. Such structures can only be defined in a lazy manner. In scheme, this requires explicit use of delay/force.

* Unroll the updates: it is necessary to write updates as different data structures that refer to their parents. This involves:
  - code compilation + linking (target-word-code target-word-next)
  - assembly (target-word-address + target-word-bin)

Of course, thinking about it now, the reason this all is imperative is that linking is simplified using the code -> word patch after instantiation, and assembly is easy because the address and bin slots can be updated multiple times.

Maybe this is a more sane approach: once code is marked 'old' it is effectively frozen, and can never change (be re-compiled or re-assembled).
It is also completely concrete at that point, and should be serializable to disk.

FIXED: made it a bit simpler, using separate *chain* and *bin* stacks in pic18.ss

NEXT: 'upload-bin' seems to perform a binchunk-split, while this is also done in pic18.ss: where is best?

Entry: Reachable vs. Incremental
Date: Sat Jul 19 20:40:51 CEST 2008

There are 2 models of developing code:

* Standard incremental Forth: assemble everything that's generated, in the order in which it appears in source code. For subsequent code, just append to the already defined code.

* Reachability: define some entry points into the code, and assemble the serialized reachable graph. This allows for more elaborate dead-code elimination.

To make it clear, I'm renaming code.ss to incremental.ss since that part is only necessary for incremental dev.

Also, it seems there was a confusion between 2 parts of state that need to be maintained:

- compilation: symbolic assembly code
- assembly: binary code + allot pointers

Maybe it's best to provide a simplified interface that performs all of this at once? What is necessary is some kind of transaction model. Instead of having 'repl' perform just compilation, it needs to do assembly too. The result of 'repl' is an updated target/incremental.ss state and a list of to-be-uploaded code.

Q: Is separate compilation/assembly necessary?

Probably only when debugging macros, and then it is probably easier to use the 'macro> interface, so the control flow analysis and optimization doesn't get in the way. We'll see later on if a finer granularity in the api is necessary.

Let's replace 'register-code' with a hook, so the behaviour of what exactly happens when a file is loaded/required is pluggable. Hook works fine: this makes things a lot easier. For example, now prj/pic18.ss has full control over what happens when code gets loaded: it defines two modes: one that accumulates binary code and increments code addresses, and one debugging mode that simply prints out symbolic asm.

Entry: hygiene
Date: Sun Jul 20 03:28:30 CEST 2008

Looks like variable capture in macros is quite a bit more complicated than I thought. Reading the MacroML paper:

http://portal.acm.org/citation.cfm?id=507635.507646

Basically, introducing binding forms during a source code transformation requires the assurance that no free variables are captured (= hygiene). A sure way of doing this is to use generated names that come from a namespace exclusively allocated to that particular source code transformation. The other way around (referential transparency), the names introduced should be related to those visible at syntax transformer definition time, not those visible in the lexical expansion context. As far as I understand, in MetaML and MacroML renaming is used also.

Q: what are freshness conditions?

No idea. Maybe this has to do with generated names?

The paper contains an interesting section about recursive macros and 'early' parameters: those necessarily evaluated to make sure expansion is finite. Hmm.. So MetaML is really about evaluation order: making sure some evaluations happen before other ones, independent of the language's default normal/applicative order. However, the point about substitution at the end of section 3 I don't really understand. Why is there never any variable capture?
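A toy illustration of the capture problem (generic Scheme sketch, not MacroML): a code transformer that introduces a binding silently breaks when the user's code already uses the introduced name, unless the name is freshly generated.

;; Naive version: introduces the fixed name 'tmp', which captures
;; the user's variable when a or b is literally 'tmp'.
(define (naive-swap! a b)
  `(let ((tmp ,a))
     (set! ,a ,b)
     (set! ,b tmp)))

;; Renaming version: a generated symbol cannot clash with user code.
(define (renaming-swap! a b)
  (let ((tmp (gensym 'tmp)))
    `(let ((,tmp ,a))
       (set! ,a ,b)
       (set! ,b ,tmp))))

;; (naive-swap! 'tmp 'x)
;; => (let ((tmp tmp)) (set! tmp x) (set! x tmp))  ; x is NOT swapped
;; (renaming-swap! 'tmp 'x) works: the introduced name is unique.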
Entry: Staapl pillars
Date: Sun Jul 20 03:51:45 CEST 2008

STACK/POINTFREE:
- stack machines have efficient VMs / hardware implementation
- maps to clean functional semantics
- imperative code looks functional (stack gives referential transparency)
- easy to express partial evaluation / rewrite rules
- metaprogramming simplified: no hygiene or reftrans problems

DYNAMIC TYPING:
- simple dynamic type system: easy to understand. basis = pattern matching transformation.

INTERACTION:
- incremental development
- target-view console

OPEN QUESTIONS:
- type systems: how to add more static analysis
- embed array processing languages

As a summary, I think over the course of a couple of years I've found the proper factorization of the program, and as good as optimal syntactic constructs for extending it. Disadvantages? Mostly that the base language is a stack language (matter of taste), and that dynamic generation can produce obscure errors and won't catch type errors.

Bottom line: a simple highly extensible metaprogramming system for tiny controllers without arbitrary abstraction walls + a practical interactive framework.

Entry: the target: language
Date: Sun Jul 20 12:59:14 CEST 2008

Somehow '(target> ts) doesn't work any more:

reference to an identifier before its definition: scat/ts in module: "/home/tom/staapl/live/target-lang.ss"

but it is defined in the namespace.

FIXED: didn't include target.ss in parsing-words.ss, so the substitution macros didn't see those words.

NEXT: full target console + synth

Entry: the synth
Date: Sun Jul 20 14:37:39 CEST 2008

I don't have control stack juggling words defined, so I'm using the opportunity to use some macros + locals (very useful for 1 -> many maps like accessing 2-byte variables).

Entry: ,,geo-seq test case
Date: Sun Jul 20 15:06:38 CEST 2008

An opportunity to test table generation and recursion in macros.

geo-seq ( start endx length -- )

This brings up an important issue: availability of target values. In the MacroML paper these are called 'early' parameters. Let's define them in Coma to mean values that do not depend on target word addresses, and as such can be evaluated at compile time.

The generator works fine; had to change some things due to the new tscat: macro. But.. there's something wrong with phases: requiring the code doesn't really seem to work! Identifiers don't get required in time.. This looks like trouble... Let's avoid this for now.

Entry: asm overflow errors
Date: Sun Jul 20 20:17:37 CEST 2008

Forward jumps cause problems due to target addresses being aligned at zero. The easiest way around this is probably to ignore these errors in the first phase? Done. Got it to compile now.

Entry: pattern matching guards
Date: Sun Jul 20 21:23:28 CEST 2008

Next problem:

bang:
  0401 6EEC [dup]
  0402 52EF [movf INDF0 1 0]
  0403 52EF [movf INDF0 1 0]
  0404 52EF [movf INDF0 1 0]
  0405 52EF [movf INDF0 1 0]
  0406 52EF [movf INDF0 1 0]
  0407 52EF [movf INDF0 1 0]
  0408 52EF [movf INDF0 1 0]
  0409 50E9 [movf 4073 0 0]
  040A 6E18 [movwf other-task 0]
  040B 0E10 [movlw 16]
  040C 6EFC [movwf 4092 0]
  040D 0EF0 [movlw 240]
  040E 6EE1 [movwf 4065 0]
  040F 0EE0 [movlw 224]
  0410 6EE9 [movwf 4073 0]
  0411 52EF [movf INDF0 1 0]
  0412 501A [movf (sound 1 +) 0 0]
  0413 D50E [jsr 1 execute/b]

The first part comes from suspend, which properly expands using 'macro>:

box> (macro> suspend)
[save] [movf 4085 0 0]
[save] [movf 4086 0 0]
[save] [movf 4087 0 0]
[save] [movf 4057 0 0]
[save] [movf 4058 0 0]
[save] [movf 4092 0 0]
[save] [movf 4065 0 0]
[save] [movf 4073 0 0]

It's in the binary .hex code too.
Maybe a bug in postprocessing? It's this one:

(([,op POSTDEC0 0 0] [save] opti-save) ([,op INDF0 1 0]))  ;; NEED SYNTAX

Hmm.. how to match against the value of a parameter? Ok, fixed by using a general curried function creator.

Entry: compile/execute vs. run
Date: Mon Jul 21 11:23:30 CEST 2008

Due to multi-stage semantics, the meaning of these 3 words requires a little thought. There are several cases of quoted data to be handled:

  macro
  label
  symbol

Currently, 'compile can handle it all, 'run handles macros and delegates to ~run, while 'execute handles labels and delegates to ~run. What about providing a basic ~run, and wrappers around it? (Note that this is probably a symptom of ill-typed code: macros cannot be target values.. why is that?)

Conclusion: they really do different things. 'run is the clean Coma version (Coma doesn't have labels), 'compile won't delegate to the runtime ~run, and 'execute is a possibly optimized lowlevel execute which delegates to ~run.

Entry: Higher order macros
Date: Mon Jul 21 12:38:23 CEST 2008

It seems pretty clear now that higher order macros should be built on top of the Forth control primitives.

* Forth code is not structured on the syntactic level: all control structures are a consequence of the semantics of control macros. Now, this is a powerful mechanism in itself, but it really is more concrete/lowlevel than quoted code fragments: I don't see a simple way to extract structured data from this.

* Otoh, all functionality to implement higher order macros is defined in the Forth control language.

So, to add control structures to Coma (anything that involves branching), it is better to build those on top of control.ss and shield that namespace using the module system. Because higher order Coma has loop bodies in a clean rep, it can perform more optimizations.

Conclusion:
- Forth Control depends on pure core Coma
- Coma Control depends on Forth Control.

Entry: snot repls
Date: Mon Jul 21 12:46:48 CEST 2008

Roadmap:
- compilation repl OK
- parser + interaction repl OK
- polish commands.ss

It's probably more useful to only keep track of assembly code that's not been uploaded or saved yet, so I'm changing pic18.ss, moving kill-bin! to kill-code!

Upload is working from the console now. Next: load .f files into the namespace using something a little less raw than "load <filename>". This requires moving a piece of code from forth/parser-tx.ss to forth/lexer.ss ... In order to get the relative loading to work properly, forth-load/compile just expands to the 'load' word + filename inlined as string.

One more thing: in order to be able to use 'load' in the interactive console, one needs to have access to reflective operations. So this should work:

(define (forth-load filename)
  (eval `(forth-load/compile ,filename)))

This seems to work. I put it in live/reflection.ss.

Next: mark. (hmm.. a lot of this convenience stuff needs to be re-implemented..)

Entry: mark & empty
Date: Mon Jul 21 17:28:29 CEST 2008

Mark probably won't work like it used to: it needs a stack of current words.. Maybe the run-time state in pic18.ss needs to be implemented as a stack?

Q: Can this be implemented properly instead of hacked together? This means: perfect restoration of a namespace. Can the namespace itself be dupped?
Q: Can we somehow serialize the namespace?
Q: A procedure can be serialized, but a closure can't: is this true?

Let's hack it together first. Hmm.. All this depends much on what I want to accomplish. Simply put, the only operation I'm interested in is to REPLACE some interactively loaded code.
In all the cases I've been working in so far, the application consists of:

(C) a fixed core
(I) incremental replacements

Here (C) is completely source-defined, while (I) is the incremental part. Maybe this is a better model to work with than setting (C) to be only the monitor code. Maybe 'empty' should always go right up to (C), and not use a stack of restore points. It sounds as if it is cleaner, but I've never used it effectively, because it requires some mental tracking while mostly you just want to start from a clean sheet.

So, let's pick the best of both worlds: no mark/empty. If you want empty, recompile and reflash the app. This also reflects a need that occurred in the previous approach: sometimes things go bad, and what you want is to go back to a working point fairly quickly. Eventually, this will require custom programmers. But let's do it with the ICD2 first.

'mark and 'empty are currently implemented in the simplest way possible: just tracking the words. Some extra safety can be built on top of this, but essentially, once you use 'empty the namespace and target are out of sync.

Entry: substitutions
Date: Mon Jul 21 19:44:50 CEST 2008

Something I don't really get is why substitutions don't get name-checked before they are used.. They are macros, maybe that's why? The problem is that some definitions might not work. Is there a way around this? Maybe evaluate the code somewhere? No.. the identifiers are only interpreted when the macro is invoked. Before that, no checks can be made.

Entry: project reload
Date: Mon Jul 21 21:48:52 CEST 2008

Can't install a new namespace from within the namespace, so need to work around this by throwing some exception/abort.

Entry: done?
Date: Mon Jul 21 22:07:27 CEST 2008

Need to check the synth code if it's still working, but as far as I can see I'm done. Some minor toplevel organization things + ICD2 programmer interface.

Entry: problems
Date: Tue Jul 22 00:08:20 CEST 2008

So, what didn't work out? I'm getting a bit hyped up with a nearing release, maybe time to list the things that I've been stressing about:

* catching loop bodies into functional representations
* the simulator: over-the-top staging challenge
* dsp language: probably will become AP language
* C code excursion: need firm ground to work on the grid iterators

Entry: disassembler
Date: Tue Jul 22 00:58:05 CEST 2008

Forgot about that.. Maybe try to get it working first.

Entry: Graph structured lambda calculus, SECD, ...
Date: Tue Jul 22 01:29:08 CEST 2008

I'm tired so this might be nonsense.. Something I never understood is the obsession with keeping lambda representations flat. For source transformations it makes a lot more sense to represent lambda terms as a graph instead of a tree: explicitly connecting reference sites with binding variables.

EDIT: this is actually what de Bruijn indices do: they point upwards in the graph structure, counting abstractions.

Writing this as a graph gives a directed acyclic graph which is (related to?) the dataflow graph of the computation.

Anyways: SECD and Forth:

  S = param stack
  E = allot stack
  C = instruction pointer
  D = return stack

http://www.cs.utah.edu/classes/cs6520-mflatt/s00/secd.ps

SECD is lispey while CEK is schemey.

http://planet.plt-scheme.org/package-source/robby/redex.plt/1/0/doc.txt

Entry: dasm
Date: Tue Jul 22 12:41:44 CEST 2008

The assembler has an ad-hoc type system, where operand names determine the type. This is used for checking overflows of jumps and implementing absolute/relative addressing.
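As an illustration of that kind of operand typing (a sketch with a simplified addressing model, offset counted in words from the next instruction, not the Staapl assembler's actual rules): the operand's declared kind selects both the encoding and the range check.

;; Hypothetical: an 8-bit relative-jump operand, so its encoder also
;; performs the overflow check.
(define (encode-relative here target)
  (let ((offset (- target here 1)))
    (unless (and (>= offset -128) (<= offset 127))
      (error 'asm "relative jump out of range: ~a -> ~a" here target))
    (bitwise-and offset #xFF)))  ;; two's complement, 8 bits

;; (encode-relative #x0240 #x0243) => 2
;; (encode-relative #x0000 #x0200) => error: out of range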
Anyways, I'd like to use the disassembler to build target code chains, so they might be used later to be re-generated. The question is: where should the labels refer to? Maybe solve that problem later, and first get a bin->chain converter working.

Ok, minimal dasm working. Needs some tuning + more configurable behaviour (symbol resolve + word / address size etc).

Entry: the synth
Date: Tue Jul 22 21:05:59 CEST 2008

Almost there. Things to do:

* boot + isr vectors
* whole app build script
* piklab-prog
* project reload (scratch)

Entry: piklab
Date: Wed Jul 23 14:01:16 CEST 2008

Synth doesn't work. Time to get piklab-prog to work without having to re-plug the board: using run etc.. OK

The problem seems to be in the binary code chunking: the first chunk it produces is correct, but the remaining ones are not:

(map car (car *bin*))
(576 0 688 48 50 2142)

The problem is data chunks. How do they end up in the code? The problem is the conversion of words to binary: this should take only code words. The error is in 'target-chain->bin: added a realm filter.

Looks like there's still a problem: there are 3 code chunks remaining now:

box> (map car (car (bin)))
(576 688 2142)

The problem could be that data chunks do not get disconnected. Looks like that was it: added 'terminate-chain after the variable macro.

Ok, booted the synth, but it doesn't work properly. This means I get a chance to test some of the debug features. There's something wrong with the 2nd instruction: 089A E1B3 [bpz _L717 1]. The address is way off. The 3 is correct, but where does the 'B' come from? (It should be E103.)

compile> : boo 0 xor z? if -1 else 0 then ;
command> print-code

boo:
  0898 0A00 [xorlw 0]
  089A E1B3 [bpz _L717 1]
  089C 6EEC [dup]
  089E 0EFF [movlw -1]
  08A0 D002 [jsr 1 _L718]
_L717:
  08A2 6EEC [dup]
  08A4 0E00 [movlw 0]
_L718:
  08A6 0012 [return 0]

The 3 above looks like it's accidental.

compile> : boo z? if 123 then
command> print-code

boo:
  08A8 E1AC [bpz _L719 1]
  08AA 6EEC [dup]
  08AC 0E7B [movlw 123]
_L719:

This should be E102. Let's go back to only the monitor. This is a problem with this:

(bpc  (p R) "1110 001p RRRR RRRR")
(bpn  (p R) "1110 011p RRRR RRRR")
(bpov (p R) "1110 010p RRRR RRRR")
(bpz  (p R) "1110 000p RRRR RRRR")

which I thought was fixed. This line was wrong:

(([flag? opc p] [qw l] or-jump) ([,opc (flip p) l]))

compile> : foo z? if 123 then
command> print-code

foo:
  0240 E102 [bpz 1 _L184]
  0242 6EEC [dup]
  0244 0E7B [movlw 123]
_L184:

It works!

Entry: packaging + prepare release
Date: Thu Jul 24 10:01:05 CEST 2008

http://docs.plt-scheme.org/mzc/plt.html

- clean up darcs project init (local collects?) OK
- build plt package
- clean up forth.pdf

Entry: old web site
Date: Thu Jul 24 12:04:55 CEST 2008

To understand the development approach and the current form of the source code, it might be necessary to see it in the right context. I am an electrical engineer working mostly on embedded control and signal processing projects. I seek to optimize the development process of highly specialized software for embedded systems by small groups of 1 to 3 people. I got fed up with ad-hoc methods of metaprogramming and code generation that I see used in this engineering subculture, and decided to build a clean system on a solid base that can be understood and used by a single electrical engineer with an open mind towards modern programming language technology. I am not a programming language theorist, and if you want to use Staapl, you don't need to be either.

The current emphasis is on work towards Purrr, a stand-alone standard Forth layer for generic microcontroller architectures, and Purrr18, an interactive tethered cross-compiled Forth dialect designed for the 8-bit Microchip PIC18 microcontroller. Future goals include the design of a linear concatenative language as a successor or drop-in replacement for the Packet Forth interpreter, and the design of a declarative Scheme derived data-flow language to implement DSP functionality on a microcontroller or DSP processor. Eventually I want to cover the whole spectrum from tiny 8-bit microcontrollers to 32-bit machines that can run unix, with an integrated language tree based on Forth and Scheme dialects, and an interaction system that can handle live software updates and debugging for distributed embedded applications.

Entry: staapl home
Date: Fri Jul 25 14:19:48 CEST 2008

In the reflection code, there are hard links to the location of the staapl tree. Maybe these should be made soft such that staapl can be installed anywhere: trying to host it on Planet gives some trouble.. Maybe I should provide a 'staapl-install function that will install wrapper modules around the planet modules.

OK, I've got a solution. The preferred module language is

  #lang planet zwizwa/staapl/pic18

This will allow easy install of staapl through planet. I've removed all #lang references from the .f files though: going to use 'load' for most things, and only wrap toplevel code in module languages.

Entry: cleanup dist + docs
Date: Sat Jul 26 09:26:22 CEST 2008

- make sure examples are in the planet dist
- clean docs + add to planet dist

Examples. There are 3:

* compile + burn monitor
* compile only synth
* start only repl

REPL is moved to core and renamed to staapl/prj/pic18-repl, while the examples are now written as modules and accessible through

  mzscheme -p zwizwa/staapl/examples/upload-monitor
  mzscheme -p zwizwa/staapl/examples/build-synth

Next is to clean up the docs. Maybe the Purrr manual should be moved to scribble too? This would allow some testing + documentation of live interaction. For the Forth doc, it would be nice to write a small macro for evaluating chunks of literal forth code.

Entry: offline compilation example
Date: Sun Jul 27 12:07:27 CEST 2008

What is necessary is a script that compiles a PIC18F1220 application from an input forth file, including a monitor and a proper boot sequence. This means:

* read input arguments
* create namespace with instantiated monitor
* instantiate the script
* add a simplified boot mechanism
* dump out .hex and .dict

Added staapl/prj/pic18f1220-serial which loads the 18f1220 defs and the monitor code.

NEXT: need to solve the path issues with 'load'. But where? For convenience I'm going to put it in the lexer module, but it should really be somewhere else.. Yes. This is not trivial, since the path needs to be available at compile time. Currently:

- lexer is free of paths
- relative paths come from the rpn-search-path parameter in parser-tx
- the pic18 path is encoded in parser-tx
- that can move to forth-begin, where the param can be set

Problem: how to make the load path available at compile-time? More specifically, how to set the parameter rpn-search-path? Simply setting it at runtime doesn't work since it's a different instance. Maybe using 'eval' helps? Hmm... this is a can of worms. Maybe a way out is to add a form that sets the load path, just like before.

This works. Remaining question: should this be a permanent state? Also, what about modules?
Requiring a module will already use load-relative I think. YES. So, the remaining problems are purely about interactive 'include'. This means it can probably best be solved there. The remaining question is whether in parser-tx.ss the search path should be reset on each compilation.

Entry: the state file
Date: Tue Jul 29 10:03:44 CEST 2008

So, now that I'm converging to a certain workflow (fixed core application + incremental dev on top of that), it's possible to define a state file which contains:

- a reference to code to be loaded for macros   FIXME
- a dictionary with target words                OK
- the pointers                                  OK
- console                                       OK

FIXME: Make setting up the console part of the responsibility of prj, so it becomes easier to metaprogram from Scheme. In the end, Scheme is the main composition mechanism, not Scat.. Oops. That sounds good but it messes up the unquoting in the macros.

Entry: associativity after instantiation
Date: Wed Jul 30 10:30:07 BST 2008

Working on the staapl/pic18 documentation... There's one thing I'm noticing now: the high-level semantics talks about associativity. This holds for composition of macros, due to the associativity of function composition. However, this property is NOT preserved through instantiation:

  I(C(x,y)) != C(I(x),I(y))

  C = concatenation
  I = instantiation

Actually, this property is essential for some optimizations that expose 'observable' code: jump targets. This is why the chain splitting is so important in the instantiation step.

Entry: Onwards: concurrency / types / ans
Date: Wed Jul 30 08:51:04 CEST 2008

* This is interesting: http://www.transterpreter.org/docs/index.html  I'm a bit sad the allotted time didn't allow me to implement the distributed debugging system for KRIkit last year. But surely, concurrency is the next step for Staapl, next to some more static analysis.

* Reading about types lately, especially TAPL. This article summarizes it well: http://www.pphsg.org/cdsmith/types.html  Types are "things to prove about programs". This can really mean anything. According to Chris that's the big idea. This is significantly different from "types are sets" in the dynamic/lisp sense.

* ANS Forth on top of Staapl Forth? It is not straightforward, but it could take the form of a standard Forth compiled to Staapl primitives with a simulated dictionary. More importantly, is it necessary? Should Staapl contain a mechanism to build a standard reflective Forth on top of the unrolled macro Forth? It looks like this is a ``marketing problem'' more than anything else.

Entry: editing the forth paper
Date: Wed Jul 30 12:01:32 BST 2008

Took this out because it might confuse people:

\footnote{Being aware of patterns is what programming is all about. It is important to see patterns in your problem, so you can compress the problem down into a feasible solution. However, it might be \emph{more} important to close the loop and see the patterns in your \emph{solution}, so you can bring your understanding of the problem to a higher level. For significantly complicated solutions buried in code, the code can really talk back by throwing residual patterns at its creator.}

Entry: time for play?
Date: Thu Jul 31 02:23:33 CEST 2008

I'd like to use the dsPIC (PIC30) to make some sound. Was thinking about targeting the gpasm assembler for it, instead of writing one from scratch, since the architecture is significantly different from the 8-bit ones. It would be an interesting test case for serializing target label computations. Hmm.. Looks like gpasm doesn't support PIC30. Ok, from scratch then.
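For reference, the core of a from-scratch assembler is small. A minimal sketch in plain Scheme (not the Staapl assembler; names are hypothetical) of two-pass label resolution: pass 1 assigns an address to each label, pass 2 substitutes those addresses into the operands.

;; Program: a list where a symbol is a label and a pair is an
;; instruction.
(define (assemble program)
  ;; Pass 1: walk the program, recording label addresses.
  (define labels
    (let loop ((p program) (addr 0) (acc '()))
      (cond ((null? p) acc)
            ((symbol? (car p))  ; a label: record address, emits no code
             (loop (cdr p) addr (cons (cons (car p) addr) acc)))
            (else (loop (cdr p) (+ addr 1) acc)))))
  ;; Pass 2: replace label references by their addresses.
  (define (resolve ins)
    (map (lambda (arg)
           (cond ((assq arg labels) => cdr)
                 (else arg)))
         ins))
  (map resolve (filter pair? program)))

;; (assemble '(loop (goto loop)))  =>  ((goto 0))

The real thing additionally needs relaxation passes for variable-size encodings, but the label bookkeeping stays this simple.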
Will start tagging posts.

Entry: dsPIC30
Date: Thu Jul 31 12:54:50 CEST 2008

I'd like to generate an assembler from an instruction set table for the dsPIC30, however, I can't seem to find one. Just sent a request for information about the dsPIC30/33 to Microchip. Also added the 12 and 14 bit core instruction sets. Maybe try to get to a blink-a-led app for the 14-bit arch? It would be interesting to figure out some PIC code sharing, to test the flexibility of the Staapl design.

EDIT: got an unhelpful reply from Microchip, trying again. In the meantime I found this:

  http://ww1.microchip.com/downloads/en/DeviceDoc/mplabalc30v3_00.tgz

which contains a file src/c30/c30_device.info and some C routines to manipulate it.

Entry: 14 bit arch
Date: Thu Jul 31 16:58:15 CEST 2008

How to add a new architecture?

ASM
- create an assembler description

COMP
- add some pseudo ops for stack manipulation
- write metapatterns for the arithmetic and logic operations

COMBINED
- connect code to purrr like purrr.ss (so 'forth-compile can be used)
- connect the assembler

Works pretty well. Now trying to restructure it a bit.. After fixing a bug in meta-pattern which prevented the use of (macro: ...) and some minor cleanup, the 14-bit core seems to work:

  box> (forth-compile ": foo 123 + ;") (print-code)
  foo:
      0000 307B [movlw 123]
      0001 0780 [addwf INDF 1]
      0002 0008 [return]

I'm taking a different approach for MC14: no intermediate instructions except for the ones used in purrr. See where I get..

Entry: meta-pattern
Date: Thu Jul 31 18:09:24 CEST 2008

This is a classical evaluation order manipulation. I know what I want to do from a high level, but I somehow don't understand the particularities of it. Quasiquotation is really confusing.. Factoring it into simple steps might help:

  meta-pattern (M0) is a macro that generates a macro M1;
  M1 expands to a number of applications of a template defined in M0.

Let's try to construct a toy example first. What I don't understand is the nesting of syntax-case, and the nesting of quasisyntax and unsyntax. The rule: an unsyntax corresponds to the toplevel quasisyntax, just like quasiquote, and nesting of syntax-case just binds new variables, but the toplevel ones are still visible. Nothing special.. Basically, syntax-case is a binding mechanism that allows you to avoid unsyntax. Nested quote/unquote is a mess, so solving this by merging namespaces (variables and metavariables) is more convenient. For higher order macros it's best to stick with syntax-case and syntax, and leave nested ellipsis and quasisyntax/unsyntax alone.

Looks like it's working now. It's quite readable. I've also added a macro 'patterns-class that combines 'meta-pattern and its invocation. This gives a pretty compact representation:

(patterns-class
 (macro)
 ;;---------------------------------
 (op    pe/op opcode w/op  ~op)
 ;;---------------------------------
 ((+    +     addwf  w/+   ~+)
  (-    -     subwf  w/-   ~-)
  (and  and   andwf  w/and ~and)
  (or   or    iorwf  w/or  ~or)
  (xor  xor   xorwf  w/xor ~xor))
 ;;---------------------------------
 ((w/op)             ([opcode INDF 1]))
 (([qw a] [qw b] op) ([qw (tscat: a b pe/op)]))
 (([qw a] op)        (macro: ',a movlw w/op))
 ((op)               (macro: ~op))
 ((~op)              (macro: w<-top drop w/op)))

Entry: multiple targets
Date: Fri Aug 1 00:04:29 CEST 2008

Time to start factoring and parameterizing code. First thing to tackle is the chain/bin state management: everything that goes into mc14.ss should be factored out. Created live/state.ss

Entry: ANS Forth
Date: Fri Aug 1 01:17:06 CEST 2008

So.. It seems like a good idea to pick up the ANS Forth again.
The only freedom in implementing one is what kind of threading model to use, and where to put stuff. Some requirements:

* subroutine threaded
* call/jump/or-jump
* data word doubler

This can probably be implemented as a small layer on top of standard Forth as done before:

  num -> 'num dup upper

Entry: control flow analysis
Date: Fri Aug 1 10:35:10 CEST 2008

Chunks of code in target-word structs are basic blocks in Muchnick's terminology. (Function calls don't count here, because they don't actually change the flow of control arbitrarily: they are equivalent to inlined instructions.) One thing I probably need to change is to separate basic blocks after conditional branches. These are the basic building blocks for control structures:

* unconditional jump
* conditional jump
* conditions

It seems essential to be able to represent the condition generators in an abstract form, so they can be easily inverted. (The code that generates the flag can be inverted, instead of the flag being inverted after it's generated.) On the other hand, it should be possible to build a control flow graph with non-instantiated code. This is the problem I tried to solve with delimited control, but there have to be better ways.. Some form of reflectivity on the macro end might be necessary: representing non-primitive macros as lists?

Entry: tool integration
Date: Sat Aug 2 10:41:35 CEST 2008

Preparing for professional usage, this project needs:

* better integration with MPLAB (linker)
* interface with C-based development.

Entry: Factor
Date: Mon Aug 4 18:43:59 CEST 2008

Let's take another look at Factor. What I'm interested to find out is how the compiler is structured. Let's see if there are any documents on the blog describing it. These seem to be interesting links:

  http://factor-language.blogspot.com/2007/09/two-tier-compilation-comes-to-factor.html
  http://factor-language.blogspot.com/2008/01/compiler-overhaul.html

Hmmm.. They are more about the dynamic vs. static debate. I think I've converged on that: both are nice, but static declarative modules win. Toplevels can be built on top of that. For PIC, everything is static, and redefinitions need to be reloaded, but it does allow for an 'allot-stack' like development which allows separation of kernel and application.

Entry: ANS Forth
Date: Tue Aug 5 09:32:01 CEST 2008

It seems that standardizing is an essential part of getting to some adoption. Basically, nobody cares about nonstandard Forths: people write their own. Makes sense really. So, let's bring the PIC18 Forth to standard. The goal is to do it as conveniently as possible, without losing too much time on optimization. Some ideas I gathered before:

* Data doubler: add a layer that performs just primitive data size doubling.
* Unified address space: map part of RAM into the namespace.
* Interpreter: it seems a good idea to stick with a native Forth, and write a dispatching interpreter on top of this. That way all primitives can be re-used.

It's simplest to first make the data doubler, so words written in the doubled language are usable in the unit language. On top of this, memory access words can be written, which then can support a trivial interpreter loop. Alternatively, I can implement Taygeta's primitives, and optimize them.

Entry: Data doubling
Date: Tue Aug 5 09:52:59 CEST 2008

PRIMITIVES:
* math primitives: coded manually (DONE)
* macro mapping: coded manually

COMPOSITION:
* parser map:
  - num  -> num hilo
  - word -> _word

This should be written as a pure parser.
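A minimal sketch of that parser map as a pure function over the token stream, assuming (hypothetically, per the map above) a 'hilo word that splits a doubled literal and an underscore prefix for doubled words:

;; Expand one source token into its doubled form.
(define (double-token tok)
  (cond ((number? tok) (list tok 'hilo))          ; num  -> num hilo
        ((symbol? tok)                            ; word -> _word
         (list (string->symbol
                (string-append "_" (symbol->string tok)))))
        (else (list tok))))

;; Map over a whole program: pure, no compilation state involved.
(define (double-program tokens)
  (apply append (map double-token tokens)))

;; (double-program '(123 +))  =>  (123 hilo _+)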
I think I'm running into a composition problem here: I can't find a straightforward way to plug the 'derived: word into 'forth-begin. Let's go to the definition of 'forth-compile and work from there.

(define-syntax (forth-compile stx)
  (syntax-case stx ()
    ((_ str)
     #`(forth-begin #,@(string->forth-syntax #'str)))))

This inserts lexed syntax into forth-begin. Maybe 'forth-begin needs a namespace argument? Let's see where '(macro) is hardcoded. That's in 'forth-begin-tx, where the records returned by the parser are assigned to a namespace. I thought there was one more hardcoded reference, in rpn-tx.ss, where the '(macro) namespace is used to check if a particular identifier is a parser, but this is actually parameterized by 'rpn-map-identifier. What about using the latter in 'forth-begin-tx too? Next: should the '(target) namespace be remapped too?

Ok, parameterized a bit, but that leads to other spaghetti being exposed. At the moment the thorn is the fact that instantiate.ss references the '(macro) namespace. Shouldn't this be generic? I just moved these references out of 'forth-begin, but the only thing that macro does is bind instantiate.ss to the forth parser-tx.ss

Trying to fix something else first: parser-tx.ss doesn't need to be aware of the wrapping words, it should just provide an abstract data structure with 'forth 'variable and 'macro tags. OK. The code classes are interpreted in 'forth-begin-tx and not passed to 'forth->records. Maybe the next step is to also generate the toplevel Scheme forms as part of the records structure, to avoid awkward passing of out-of-band data through mutating parameters? Done. Alright, it's a bit cleaner now.

Maybe this is enough to build a front-end that takes macros from a different namespace? Still there's the problem of how to link target functions back to the original target namespace. Maybe this is more of a nested namespace problem actually? Using '(macro derived) and '(target derived) does make symbols accessible as derived/+ derived/- etc... in the core space. This does require a decision: to make namespace mapping standard.

Entry: Derived Forth
Date: Tue Aug 5 13:10:00 CEST 2008

Let's concentrate the ideas from the previous post. To create a derived Forth, create a separate namespace that is a child of the one we build on top of. The language is then defined through a parameterized forth-begin-tx such that:

* 'derived-forth-begin uses only the (macro _) and (target _) namespaces for direct reference and definition.
* the corresponding prefix macros in (macro _) map to (macro) for implementing functionality.
* all (macro _) forms are accessible in the (macro) namespace through their direct mapping xxx -> _/xxx, but the (macro _) namespace is completely isolated from (macro).

Once this works, maybe the current 'live mode can be written as a compile mode? But it's not really necessary, since it probably doesn't require toplevel forms. Unless we allow definitions in the live mode.. Maybe it is a cleaner model..

Entry: return stack
Date: Tue Aug 5 15:19:12 CEST 2008

Thomas Pornin: "ANS doesn't require the return stack to consist of stackable elements... What ANS specifies is that, for each activation context, there is a stack-like storage area in which you may write cell values with >R, and get them back with R>. But these values are accessible only from the word itself, not from the caller and neither from the callees. Moreover, you are supposed to clean that stack before exiting from the word."

Elizabeth D Rather:
"Exactly. A standard system must have a Return Stack whose entries are the same size as cells and data stack items. And it must respond to >R, R@, and R>. What the standard *doesn't* require is that the system must use it for return addresses."

This is interesting. I didn't know that. This means it's probably best to rename my 'x' stack to the 'r' stack. Anyways, I've removed all references to 'r' so it can be added easily. Also, x will be renamed to x@.

Entry: ANS Forth frontend
Date: Tue Aug 5 18:43:04 CEST 2008

The non-reflective words are going to be straightforward, but the reflective ones are problematic. The lexer, prefix processor and macros are DIFFERENT entities in the unrolled structure, while in a reflective Forth, they are all just Forth words. I don't see a solution for this, other than completely replacing the lexer and preprocessor parser with something more akin to an interpret mode simulator.

  http://www.ultratechnology.com/meta.html

Entry: documenting a port
Date: Wed Aug 6 12:24:47 BST 2008

Maybe it's a good idea to publicly port to the 14-bit architecture, so the process can be documented?

Entry: Formalizing Coma
Date: Wed Aug 6 20:26:42 CEST 2008

Scat is a concatenative language modeled after Joy. Syntactically, a program p is a concatenation of programs p_i, or a primitive program word p':

  p = (p_0 p_1 ...) | p'

This is isomorphic with the semantics, where each program word can be associated with a function, and syntactic concatenation maps to function composition. For Scat, the operational semantics (the implementation in terms of a primitive machine) is given by primitive Scheme functions closed over a state space represented by a Scheme data type. In the case of Scat and Coma, this is a stack; in the case of Coma+Control, this is a pair of stacks. For most practical use, though not necessarily for theoretical use, the state contains at least a stack. Its reason for being is to introduce locality in the effect of functions. It is useful for creating a practical programming language, and for deriving simple local syntactic rewrite rules from local stack operations.

Reduction (evaluation) of Scat expressions is eager, and happens from left to right, where each primitive function that is part of a larger composition is applied to the state, which is threaded through the computation. This is the same as a sequential machine with global state. Note that, because function composition associates with function application, the order of evaluation is arbitrary:

  (S_0 a) b = S_0 (a b)

The function a applied to the initial state S_0 returns a value that, when passed to the function b, yields the same result as evaluating the composition (a b). Now, this is only useful if you can prove that there is some c with nice properties such that

  (a b) = c

which allows the application to be written as

  S_0 c

The ``nice properties'' can be simplified to mean ``shorter code''. For Coma, the eventual goal is to generate machine code (syntax) from a concatenative source program (syntax). So instead of looking at associativity of composition, we should look at associativity of concatenation (syntax). More specifically, at rules that allow the substitution of a concatenation of program words, i.e. (x y z), by another concatenation of words (s t):

  a b x y z c = a b (x y z) c = a b (s t) c = a b s t c

This uses the rule

  x y z = s t

Now, where do these rules come from? In Staapl, they are syntactic transforms that preserve the associated semantics.
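To make the mechanics concrete, here is a minimal sketch of such a semantics-preserving transform, applied greedily while code is extended one word at a time. This is plain Scheme, far simpler than Staapl's pattern language; the (qw n) / (cw op) representation is sketched per the QW/CW description below.

(require scheme/match)

;; Code is a list of instructions, most recent first.
;; (qw n) quotes a literal; (cw op) is an opaque code word.
(define (concatenate code word)
  (match (cons word code)
    ;; two quoted literals followed by '+' fold into one literal
    ((list (list 'cw '+) (list 'qw a) (list 'qw b) rest ...)
     (cons (list 'qw (+ b a)) rest))
    ;; a quoted value followed by 'drop' cancels out
    ((list (list 'cw 'drop) (list 'qw _) rest ...)
     rest)
    ;; default: emit the word as opaque target code
    (_ (cons word code))))

;; Thread the code stack through a whole program, left to right.
(define (compile-words words)
  (reverse (foldl (lambda (w code) (concatenate code w)) '() words)))

;; (compile-words '((qw 1) (qw 2) (cw +)))  =>  ((qw 3))

Note how the greedy left-to-right application falls out of the threading: each word only ever inspects already-compiled code to its left.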
These semantics are operational semantics derived from a stack machine's instructions.

TODO: implementation. First: translate to the QW, CW language. Then implement rewrite rules.

Todo:
* explain where the asymmetry comes from: why does a rewrite rule operate on code from the right only?
* explain in a simple way that all semantics comes from target idealized (un-projected) machine operations.
* explain how you go from arbitrary substitutions to a greedy left->right substitution scheme.

EDIT: This needs some cleanup. It deserves a separate paper. What I'm trying to explain:

* Coma's primitives are intermediate language transformers. The intermediate language has essentially 2 instructions: execution and quotation. This is then extended with termination, jumps and conditional jumps.
* Some of the intermediate language is real target language. Open question: good or bad? Should this be separated out? Due to the pattern matching, it behaves as opaque black-box code + it allows the implementation of simple peephole optimizations. (This is ok: it's a natural extension of the opaque CW data vs. QW that can be partially evaluated. Actually, it's a mix of the 2.)
* It also allows arbitrary data to be passed from macro to macro, which is a vehicle for arbitrary incremental code generation: this is the presence of QW in the language: it embeds a dynamic stack language.
* This can be viewed as eager partial evaluation. In a concatenative language, PE is rather trivial (no variable substitutions). But, in most cases it is not complete: not all primitives exist at run time: they need to be specialized / combined.
* It is not PE. Actually, it is, on the lowest level (math + stack shuffling).

--

Essentially: this contains quite some posts about the semantics of Coma. It is work in progress. The semantics is defined by a set of rewrite rules that implement the CONCATENATE operation as a binary function taking a program in intermediate form and a program word. This operation performs semantics-preserving transformations.

Entry: evaluation
Date: Sat Aug 9 09:03:24 BST 2008

It's been a busy time lately. What needs to happen next? One of the priorities is to get back into industry as soon as possible. You there, hire me! Possible next steps:

* reference documentation + API fixing
* CSP
* TAPL
* other Microchip targets
* zero-cost viral platform
* better boot monitor protocol
* array processing language
* a 24-bit virtual dsp for PIC18
* finding collaboration
* standard Forth frontend

Entry: occam
Date: Sat Aug 9 09:18:21 BST 2008

See http://en.wikipedia.org/wiki/Occam_%28programming_language%29

- Communication between processes works through named channels. One process outputs data to a channel via ! while another one inputs data with ?. Input and output will block until the other end is ready to accept or offer data.
- SEQ, PAR and ALT for sequential, parallel and alternative execution.

The difference between Concurrent ML (on which PLT Scheme's concurrency is based) and occam is the way in which channels are treated. In CML they are first class (dynamic), while in occam they are static entities.

Entry: monitor rewrite
Date: Sat Aug 9 10:01:47 BST 2008

This involves 2 main parts: an asynchronous message-passing mechanism over an abstract channel, and the definition of a low-level protocol for different transports. The current problem with the monitor protocol is that it is RPC based. This is fine for 1-1 communication, but won't work well over a many->many network.
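One direction, as a sketch (assuming nothing about the final protocol and using hypothetical helper names): make every message on the wire self-delimiting, so intermediate nodes can route or replay messages without interpreting them. In PLT Scheme, over arbitrary byte ports:

;; Frame = 1 length byte + payload (payload limited to 255 bytes,
;; which is fine for small monitor messages).
(define (send-message out payload)      ; payload: a byte string
  (write-byte (bytes-length payload) out)
  (write-bytes payload out)
  (flush-output out))

(define (receive-message in)
  (let ((len (read-byte in)))
    (read-bytes len in)))

A relay only needs receive-message + send-message; the RPC pairing (or an asynchronous replacement for it) can then be layered on top.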
Entry: Microchip programming protocol
Date: Sat Aug 9 10:07:51 BST 2008

Towards a zero-cost standard platform for the Microchip PIC. All PICs support the Microchip programming protocol, consisting of:

  1 /MCLR Vpp
  2 Vdd
  3 GND
  4 PGD
  5 PGC
  6 PGM

( see http://www.prc68.com/I/ICD2.shtml )

PICs without a charge pump need 13V Vpp and can't program themselves. It is necessary to split the development workflow into two parts:

* Single chip applications: this can use a combined full programmer + debug monitor to also support chips that need Vpp. This is the one useful for teaching, so it makes sense to make the tethering hardware as simple as possible, i.e. to not depend on a Microchip programmer.

* Networked applications: here interconnect needs to be as simple as possible, so the 5-wire programmer protocol is impractical. Cost of tethering hardware isn't critical, so it can have more complexity. It's best to stick with something standard here, e.g. the I2C or CAN bus. The topology can be symmetric: the programmer host interface can be just one of the nodes.

Entry: ICD2 + serial?
Date: Sat Aug 9 13:46:38 BST 2008

Is it possible to combine the programmer port and the serial port into a single connector? If the hardware flow control pins can be used, this could work. I believe the FTDI chip has a bitbang mode too, which could be useful.

  serial   ICD2
  GND      GND
  /CTS     VCC
  VCC
  TXD
  RXD
  /RTS

  DTE = master (Data terminal equipment)
  DCE = slave (Data circuit-terminating equipment)
  RTS = request to send
  CTS = clear to send

Using standard serial, there are only 2 output lines: TXD and RTS. The Microchip protocol needs at least 2, clock and data, assuming the MCLR and PGM pins are set correctly. Maybe the best approach is really to stick with the ICD2 connector, and devise a protocol on top of that.

Entry: Staapler connector
Date: Sat Aug 9 14:31:37 BST 2008

Before the low-level bootstrap can be solved, I need an adapter that converts serial to the ICD2 protocol. Let's combine the Olimex 6-pin header with the FTDI 6-pin header into a 12-pin double header. This allows the use of 2x3 female headers for the Staapler. How to arrange it? Daisy chaining Staaplers should work without trouble (using one Staapler to program another one). The other alignments can be taken to simplify GND and VDD connections.

Staapler connector: target board male 2x6 header with ICD2 and TTL232R serial connector. This is placed at the board edge. The Staapler (outline = dotted lines) fits on top of this, sticking out (downward) over the target board edge (dotted edge). The serial connector is optional.

  . . . . . . . . . . . . . .
  .  +-------------------+  .
  .  |  1  2  3  4  5  6 |  .  ICD2
  .  |  7  8  9 10 11 12 |  .  Serial (optional)
  .  +-------------------+  .
  .                         .
  .    target board edge    .
  -----------------------------
  .                         .
  .                         .

  ICD2
    1 /MCLR  white
    2 VDD    red
    3 GND    black
    4 PGD    blue
    5 PGC    green
    6 PGM    yellow

  SERIAL
    7 GND    black
    8 /CTS
    9 VDD    red
   10 TXD    orange
   11 RXD    yellow
   12 /RTS

Entry: Staapler programmer
Date: Sat Aug 9 16:33:00 BST 2008

I started building an 18F1320 prototype which has this male connector as its programming connector. Next is to find out where to connect the outputs and inputs of the female connector.

INPUTS: A single input is necessary for the bit-banged serial receive. Probably another one, or this one shared, for some client->host signal when using the ICD2 port alone for monitor operation. Either interrupt-on-change or the INT0-INT2 pins can be used.
OUTPUTS:

  TXD         bit banged serial transmit
  /MCLR   O   target reset
  VDD     O   optional target power
  PGD   I/O   data
  PGC     O   clock
  PGM     O   low voltage programming

Later this could be extended with a charge pump for generating the programming voltage. Other constraints:

* RA4 = open drain
* general purpose ports (not using analog): RA0-3
* more (not using oscillator): RA6-7

Probably best to use RA0-3 for the 4 output-only ports /MCLR VDD PGC PGM. Let's use INT0/RB0 for PGD I/O.

So, what should I do first?

* Get the monitor to work with the ICD2 connector only.
* Build a programmer.

The programmer seems rather trivial. The most difficult problem to solve now is to get bidirectional communication going over the ICD2 connector. Some things to think about: the only benefit of this device is to be able to use both from-scratch programming + the staapl console. It is beneficial for smaller targets when there is actually a charge pump available. Maybe it is better to just modify the firmware of an already existing programmer? The ICD2 would be a good target.

Entry: Staapler roadmap
Date: Sat Aug 9 17:54:17 BST 2008

Eventual Staapler goals are:

* standardize on the ICD2 connector for interactive debugging and get rid of the serial connector.
* add both LVP and HVP through the ICD2 connector.
* create a Staapler bootstrap method using a parport programmer.

To bootstrap Staapler itself, this approach can be used:

(1) Standardize all PIC development on the ICD2 connector only. Create a protocol for bi-directional serial communication on top of the master-slave Microchip programmer protocol.
(2) Build two Staapler boards A and B with ICD2+serial input (or a Staapler + one other board with just ICD2+serial input).
(3) Connect A's serial output to B's serial input and devise patch-through code. This process then gives a workflow for working with multiple projects (namespaces) at the same time.
(4) Build the ICD2 comm master on A and the ICD2 slave on B, using the serial->serial patch-through for B development.

Independently, after (3):

(5) Connect A's ICD2 output to B's ICD2 input and write PIC LV programmer code.
(6) Use programmer code to emulate a minimalistic parport programmer.
(7) Write host-side bootstrap code for a PC parport programmer.
(8) Staapler v2: add support for charge pump + USB, write HV programmer code, add support for different busses (for networked debugging).

Entry: Staapler protocol
Date: Sat Aug 9 19:10:37 BST 2008

1. Document the ICD2 master-slave protocol
2. Add to this slave->master messaging (maybe just polling?)
3. Allocate pins on master + slave side

1. From the programming manual for the 18F1220 (DS39592B): Commands and data are transmitted on the rising edge of PGC, latched on the falling edge of PGC, and are sent Least Significant bit (LSb) first. All instructions are 20 bits, consisting of a leading 4-bit command followed by a 16-bit operand. Depending on the 4-bit command, the 16-bit operand represents 16 bits or 8 bits of data.

  COMMANDS FOR PROGRAMMING             4-Bit Command

  Core Instruction                     0000
    (Shift in 16-bit instruction)
  Shift out TABLAT register            0010
  Table Read                           1000
  Table Read, post-increment           1001
  Table Read, post-decrement           1010
  Table Read, pre-increment            1011
  Table Write                          1100
  Table Write, post-increment by 2     1101
  Table Write, post-decrement by 2     1110
  Table Write, start programming       1111

2. The hardware protocol used is really enough to provide bidirectional communication if it is extended with a simple 'data ready' signal from the slave. This could either be an asynchronous slave signal (pull a line low/high) or an answer to a poll.
A direct slave signal is probably easiest to implement. Then it should be the data line, since that is already a multiplexed port at the master, leaving the clock to remain output-only.

3. On the slave side it's easy: use data and clock from the programmer protocol. On the master side, the clock will always be an output, but the data line receives an asynchronous signal. In case of asynchronous signalling, the data line is probably best implemented as a wire-or bus. The slave side data line for the 18F1220 is RB7. It has a weak pullup; maybe this can be used instead on the master side, i.e. thinking about driving with a PC parport, which has open collector outputs? This http://www.beyondlogic.org/spp/parallel.htm suggests using a 4k7 pullup.

  target   host
  ---------------------
  PGC      RA0
  PGD      RB0
  PGM      RA1
  VDD      RA2
  /MCLR    RA3
  GND      GND

Notes. It's probably best to start with reading the device using the ICD2 protocol. That way core routines for write and read can be created. The return protocol is self-delimited: each return message is prepended with the size of the message. Probably the master->slave protocol should do the same so it can be routed. RB7 is the slave data line; sending is a simple shift. Same for RB0 master sending. What about Microchip's in-circuit debugging protocol? Is that specified somewhere?

Entry: Revising boot monitor
Date: Sun Aug 10 09:43:44 BST 2008

To prepare for proper routing of the monitor protocol, all commands should be self-delimiting. Note that the protocol remains RPC: each request receives a reply.

Q: Should the interpreter ignore messages it doesn't understand? If so, what should be the reply? Probably not. This is a debugging protocol where the slave gives full access to the host. Limiting access by checking if some messages are legal or not makes no sense in this setting: it's the host's responsibility to properly drive the target in this mode.

Q: Should the monitor protocol be explicitly specified? No. It is the responsibility of the application developer to use the proper protocol, since it might be extended for specific applications.

Q: What about PING? I'm starting to run out of boot monitor space. Maybe it's best to take this out. It's not essential. And identification data can always be added to block 0, which is essentially unused: the host knows what kind of target it is, and for each target type, a storage area can be assigned.

Next: clean up live/tethered.ss so there is a clear delimited message send/receive part in the protocol, instead of the current "send header then send body" approach.

Q: Should we send "write at address" or "set address pointer" + "write"? Opting for the latter. It seems to be easiest when sending multiple chunks.

So, the rewrite is done. All messages in both directions are now length-prefixed byte strings that do not require interpretation to be repeated or routed. The tethered.ss code is refactored into async send/receive for messages and RPC functions. 'ping is removed and replaced with a simple target-sync ack = OK mechanism, which enabled the monitor code to fit into the 256 words again, after the interpreter changes for delimited messages.

Entry: popularity
Date: Mon Aug 11 08:01:12 BST 2008

To Arduino or not? It would sure help popularity, but I'm afraid it will shift focus too much toward AVR, and leave PIC in the shadow.
I've invested quite some time in getting familiar with Microchip's architectures, so for the tool as a whole, it might be better to stick to that single architecture until most of the high-level workflow and interoperation design reflects my knowledge there. This is nontrivial: it includes the whole monitor.f + tethered.ss chain.

Standard Forth frontend or not? It might help to get more people interested, but would distract from the original idea. If I find a proper way to combine standard Forth with the current approach so they can interoperate, and provide metaprogramming support only for the standard one, it might work though.

Usage statistics. I have no control over the PLaneT version. How to find out usage stats? Maybe the PLaneT version should download updates? Or, I could put the installer in PLaneT only?

Entry: Staapler
Date: Mon Aug 11 11:30:15 CEST 2008

( It looks like Staapler is redundant since the PicKit2 provides all the necessary functionality. It can program .HEX files and act as serial passthrough. )

I started building 2 prototypes for the first iteration of the Staapler based on an 18F1320. Currently limited to programming / debugging of Staapl based projects for PICs that support LVP. It uses the Microchip 6-pin ICP/ICD interface, using the pinout from the Olimex ICD2 clone (RJ jacks are too cumbersome).

  http://www.olimex.com/dev/images/PIC/PIC-ICSP.gif

In addition, the connector has an optional second row of 6 pins with an FTDI serial TTL header, in case an additional serial port is desired. The hardware interface is a male 2x6 header with ICD2 and TTL232R serial connector. This is placed at the target board edge. The Staapler is plugged on top of this (board outline = dotted lines), sticking out (downward) over the target board edge (dotted edge).

  . . . . . . . . . . . . . .
  .  +-------------------+  .
  .  |  1  2  3  4  5  6 |  .  ICD2
  .  |  7  8  9 10 11 12 |  .  TTY Serial (optional)
  .  +-------------------+  .
  .                         .
  .    target board edge    .
  -----------------------------
  .                         .
  .                         .

  ICD2
    1 /MCLR  white
    2 VDD    red
    3 GND    black
    4 PGD    blue
    5 PGC    green
    6 PGM    yellow

  SERIAL
    7 GND    black
    8 /CTS
    9 VDD    red
   10 TXD    orange
   11 RXD    yellow
   12 /RTS

Next to the female connector for programming a target board, the Staapler has a male Staapler-compatible connector. This is used to bootstrap the Staapler boot monitor using an ICD2, and to connect to the host using the serial interface. It contains the following connections for the female header:

  Target   Staapler 18F1320
  ------------------------------
  PGC      RA0
  PGD      RB0
  PGM      RA1
  VDD      RA2
  /MCLR    RA3
  GND      GND

This has PGD wired to RB0 (INT0) so the Microchip protocol can be easily extended with a target -> host ``terminal ready'' signal, enabling the host to wait for replies without the need for polling. The bootstrap plan is documented here:

  http://zwizwa.be/ramblings/staapl/20080809-175417

Entry: ANS : Forth in Forth + ???
Date: Mon Aug 11 13:11:57 CEST 2008

What are the necessary primitives to implement a Forth in Forth? The problem I'm trying to solve is to simulate an on-target Forth, somewhere between full simulation and full stand-alone. The only real primitives are

  @ ! execute

which means: the memory and execution model are abstract. This works for the standard PIC18 boot monitor, which is really nothing more than the 3-instruction Forth [1] together with some implemented primitives for programming, and block transfer with less overhead. So, let's first build a complete abstract Forth machine. What does a completely abstract ANS Forth machine look like?
* two stacks of cells (the parameter stack and R stack: >R R> R@)
* a cell array allocation mechanism (ALLOT)

From the standard document [2]:

  3.3 The Forth dictionary

  Forth words are organized into a structure called the dictionary.
  While the form of this structure is not specified by the Standard,
  it can be described as consisting of three logical parts: a name
  space, a code space, and a data space. The logical separation of
  these parts does not require their physical separation. A program
  shall not fetch from or store into locations outside data space.
  An ambiguous condition exists if a program addresses name space or
  code space.

  3.3.3 Data space

  Data space is the only logical area of the dictionary for which
  standard words are provided to allocate and access regions of
  memory. These regions are: contiguous regions, variables,
  text-literal regions, input buffers, and other transient regions,
  each of which is described in the following sections. A program
  may read from or write into these regions unless otherwise
  specified.

So, '@' and '!' can _only_ access data space.

Q: The important next question is: when defining reflective words (macros), do they have access to data space at all? No. Data space does not exist during compilation, which means that all words that are accessible at run time should also be simulated. This is the only proper way to unroll the behaviour completely, and have simulated reflection that can be TRANSPARENTLY moved to real reflection.

Roadmap:
- write a reflective ANS Forth that can generate simulated programs using some access to the target memory (for I/O)
- from the representation of this, extract a kernel using dependency analysis of words, drawing primitives from a library.

Q: How to represent the reflective Forth? I'm not sure if it's useful to write this on top of Coma. The result of reading a Forth file is a structured code graph that can be processed to generate a Forth kernel in terms of Coma and some primitives.

Q: Where to start? Let's port JONESFORTH [3]. In fact, it might be a good exercise to stick with Richard Jones's literate file, and replace the x86 assembly with Scat code. Actually, it can be ported to plain Scheme code. Let's write it in PLT's r5rs language.

Hmm.. it's got me completely confused again. There are some problems. I'd like to write this on top of STC primitives, which are not compatible with direct threaded code. Also, the dictionary model needs to be worked out a bit. So I need a standard model where primitives can be plugged on top of some execution/dictionary model I can live with. Maybe the best way is to implement one myself after all, or figure out how to modify one of the portable ones. It doesn't look like JONESFORTH is a good starting point. Going to remove it from the darcs archive. I need a different set of primitives.. Maybe eForth [4] is the way to go after all? I found a link on comp.lang.forth [5] about this. This brings me back to Taygeta's MAF [6], which is what I was looking for actually.

EDIT: One night of sleep later, I think the effort is best spent elsewhere. The essential problem is that the dictionary layout and threading model need to be abstracted. If the Forth has to run on a Flash controller, Flash programming needs to be in there too.. This is already a large part of the interpreter.
References:

[1] http://pygmy.utoh.org/3ins4th.html
[2] http://lars.nocrew.org/dpans/dpans.htm
[3] http://www.annexia.org/_file/jonesforth.s.txt
[4] http://www.baymoon.com/~bimu/forth/
[5] http://groups.google.com/group/comp.lang.forth/browse_thread/thread/287c36f0f2995d49/10872cb68edcb526?#10872cb68edcb526
[6] ftp://ftp.taygeta.com/pub/Forth/Applications/ANS/maf1v02.zip

Entry: Minimal bootstrap
Date: Mon Aug 11 15:46:05 CEST 2008

From http://groups.google.com/group/comp.lang.forth/browse_thread/thread/287c36f0f2995d49/10872cb68edcb526?#10872cb68edcb526

--- FORTH Primitives Comparison (use a fixed width font) ---

3 primitives      - Frank Sargent's "3 Instruction Forth"
9 primitives      - Mark Hayes theoretical minimal Forth bootstrap
9,11 primitives   - Mikael Patel's Minimal Forth Machine (9 minimum, 11 full)
13 primitives     - theoretical minimum for a complete FORTH (Brad Rodriguez)
16,29 primitives  - C. Moore's word set for the F21 CPU (16 minimum, 29 full)
20 primitives     - Philip Koopman's "dynamic instruction frequencies"
23 primitives     - Mark Hayes MRForth
25 primitives     - C. Moore's instruction set for MuP21 CPU
36 primitives     - Dr. C.H. Ting's eForth, a highly portable forth
46 primitives     - GNU's GFORTH for 8086
58-255 functions  - FORTH-83 Standard (255 defined, 132 required, 58 nucleus)
60-63 primitives  - considered the essence of FORTH by C. Moore (unknown)
72 primitives     - Brad Rodriguez's 6809 CamelForth
74-236 functions  - FORTH-79 Standard (236 defined, 147 required, 74 nucleus)
94-229 functions  - fig-FORTH Std. (229 defined, 117 required, 94 level zero)
133-? functions   - ANS-FORTH Standard (? defined, 133 required, 133 core)
200 functions     - FORTH 1970, the original Forth by C. Moore
240 functions     - MVP-FORTH (FORTH-79)
~1000 functions   - F83 FORTH
~2500 functions   - F-PC FORTH
FIXME 27 ?        - C. Moore's MachineForth

For comparison:

8 commands        - BrainFuck (small, Turing complete language)
8 primitives      - Stutter LISP
8 primitives      - LISP generic
11 functions      - OS functions Ritchie & Thompson PDP-7 and/or PDP-11 Unix
14 primitives     - LISP McCarthy based
18 functions      - OS functions required by P.J. Plauger's Standard C Library
19 functions      - OS functions required by Redhat's newlib C library
28 opcodes        - LLVA - Low Level Virtual instruction set Architecture
51-56 functions   - CP/M 1.3 (36-41 BDOS, 15 BIOS)
56 functions      - CP/M 2.2 (39 BDOS, 17 BIOS)
40 syscalls       - Linux v0.01 (67 total, 13 unused, 14 minimal, 40 complete)
71 opcodes        - LLVM - Low Level Virtual Machine instructions
92+ functions     - MP/M 2.1 (92 BDOS, ? BIOS)
102 functions     - CP/M 3.0 (69 BDOS, 33 BIOS)
~120 functions    - OpenWATCOM v1.3, calls - DOS, BIOS, DPMI for PM DOS apps.
150 syscalls      - GNU HURD kernel
170 functions     - DJGPP v2.03, calls - DOS, BIOS, DPMI for PM DOS apps.
206 bytecodes     - Java Virtual Machine bytecodes
290 syscalls      - Linux Kernel 2.6.17 (POSIX.1)

eForth primitives (9 optional)
----
doLIT doLIST BYE EXECUTE EXIT next ?branch branch ! @ C! C@ RP@ RP! R> R@ >R
SP@ SP! DROP DUP SWAP OVER 0< AND OR XOR UM+ TX! ?RX !IO $CODE $COLON $USER
D$ $NEXT COLD IO?

9 MRForth bootstrap theoretical
----
@ ! + AND XOR (URSHIFT) (LITERAL) (ABORT) EXECUTE

9 Minimal Forth (3 optional)
----
>r r> 1+ 0= nand @ dup! execute exit drop dup swap

23 MRForth primitives
----
C@ C! @ ! DROP DUP SWAP OVER $>$R R$>$ + AND OR XOR (URSHIFT) 0$<$ 0=
(LITERAL) EXIT (ABORT) (EMIT) (KEY)

20 Koopman high execution, Dynamic Freq.
----
CALL EXIT EXECUTE VARIABLE USER LIT CONSTANT 0BRANCH BRANCH I @ C@ R> >R
SWAP DUP ROT + = AND

46 Gforth
----
:DOCOL :DOCON :DODEFER :DOVAR :DODOES ;S BYE EXECUTE BRANCH ?BRANCH LIT @ !
C@ C! SP@ SP! R> R@ >R RP@ RP! + - OR XOR AND 2/ (EMIT) EMIT? (KEY) (KEY?)
DUP 2DUP DROP 2DROP SWAP OVER ROT -ROT UM* UM/MOD LSHIFT RSHIFT 0= =

36 eForth
----
BYE ?RX TX! !IO doLIT doLIST EXIT EXECUTE next ?branch branch ! @ C! C@ RP@
RP! R> R@ >R SP@ SP! DROP DUP SWAP OVER 0< AND OR XOR UM+ $NEXT D$ $USER
$COLON $CODE

BrainFuck
----
> < + - . , [ ]

Stutter LISP
----
car cdr cons if set equal lambda quote

generic LISP
----
atom car cdr cond cons eq lambda quote

LISP, McCarthy based
----
and atom car cdr cond cons eq eval lambda nil quote or set t

Entry: next
Date: Tue Aug 12 09:11:43 CEST 2008

Maybe it's a good idea to leave the standard Forth idea alone for a while. It is definitely doable and an interesting challenge, but at this moment, there are probably more useful things to focus on. Additionally, having two different paradigms for Forth might be needlessly confusing. So let's move on. To do:

* Staapler
  - just continue the roadmap. Next goal = device ID readout.
* Reference documentation
  - the forms 'patterns 'compositions and 'substitutions.
* Internal language standard
  - control flow primitives: document this when writing the 14-bit core port.
  - standard library: I'm not sure if this is useful yet. Probably best to wait and see until there are more targets. It would be nice to be able to share most of the monitor code though.

Entry: comp.lang.scheme
Date: Tue Aug 12 09:37:14 CEST 2008

Trying a different kind of announce here..

--
Hello,

Announcing the recent release of Staapl, a library for metaprogramming microcontrollers. It is centered around the concept of an ``unrolled'' Forth language tower, impedance-matched to PLT Scheme's declarative module system, and uses a stack-based pattern language to implement primitives for code generation, partial evaluation of the pure functional target language subset and parameterized metaprogramming. The representation language is a thin layer on top of Scheme implementing a concatenative language with threaded state, which can be used independently of Staapl.

The current implementation contains a Forth syntax frontend to the concatenative macro language, a backend code generator for Microchip's PIC18 architecture, a tethered interaction system, and a test application implementing a sound synthesizer.

Download & Documentation at http://zwizwa.be/staapl

Enjoy!
Tom

Entry: debugger protocol
Date: Tue Aug 12 11:55:27 CEST 2008

Apparently the debugger protocol for the 18F is proprietary, but for the 16F877 it's available here:

  http://www.beyondlogic.org/pic/f877-6bk.pdf
  http://ww1.microchip.com/downloads/en/DeviceDoc/51242a.pdf

The main idea behind the debugger is the use of a breakpoint register and external halt. Looks like this is for the ICD and is obsoleted, replaced by ICD2. Anyways, I don't really need it. The use I've found for ICD2 is to debug the debugger.. I might add some support for ICD2 later, but let's focus on a more direct interpreter approach.

Entry: double debugging
Date: Tue Aug 12 12:21:49 CEST 2008

A problem I ran into during development of KRIkit is the double debugger problem. When writing an application involving a client and a server, it is beneficial to be able to access both systems from the same host. I'm thinking about a simple daisy-chained system. The unused bits in the boot monitor interpreter could be used as address bits.
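A sketch of what daisy-chained addressing could look like, assuming (hypothetically) the length-prefixed framing sketched in the monitor rewrite entry above, with one address byte in front of each frame; every node handles frames addressed to 0 and forwards the rest with the address decremented:

;; Host side: wrap a payload for a node 'hops' links downstream,
;; using the send-message helper sketched earlier.
(define (send-to out hops payload)
  (send-message out (bytes-append (bytes hops) payload)))

;; Node side: dispatch or forward. 'handle' and 'forward' are the
;; node's own (hypothetical) I/O procedures.
(define (route frame handle forward)
  (let ((hops (bytes-ref frame 0))
        (payload (subbytes frame 1)))
    (if (zero? hops)
        (handle payload)
        (forward (bytes-append (bytes (- hops 1)) payload)))))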
The next step is serial patch-through.

Entry: parameterized code
Date: Wed Aug 13 09:14:46 CEST 2008

Context: writing a synchronous serial slave for the ICD2 programmer protocol. This involves code parameterized by 'clock' and 'data' pin macros, which provides 'read' and 'write'. It is time to properly tackle the problem I tried to solve with loading different code modules into a namespace. Currently, the only way to introduce new bindings is to write parsing words. These are essentially prefix words that expand into arbitrary (prefixed) Forth code. I'm not entirely happy with this:

* 'load' into a shared namespace only works for 1 instance.
* this pattern is too important to have it specified on top of the Forth prefix parser.

Q: Is it possible to write down a simple solution in Scheme/Coma and translate this to a Forth prefix solution?

Entry: Coma code + instantiation
Date: Wed Aug 13 09:29:23 CEST 2008

Maybe it's time to start moving things from Forth syntax to Coma/sexp syntax. What's currently missing is an instantiation syntax for Coma words. Something like:

;; Define code generators
(compositions (macro) macro:
  (a 1 2 3)
  (b a c)
  (c a b))

;; Declare which of them employ run-time instantiation.
(instantiate (macro) c b)

Problems:

* Recursive macros. During instantiation some recursive expansions might lead to infinite code size. This needs a detection mechanism and possibly automatic instantiation.
* Fallthrough and multiple exit points. This needs special syntactic support. Moreover, the ';' used in Forth is awkward to use in Scheme syntax.
* Somehow it feels wrong to use Forth's structured programming words in the s-expression definitions. Code blocks in the form of higher order macros seem to make more sense there.. Is this just aesthetics?

The fallthrough/local-exit problem could be avoided by not allowing them in a simple version of 'instantiate'. The cost of these features needs to be analyzed more: they are not free, and significantly complicate the code graph instantiator.

Entry: problem with darcs-1 -> darcs-2
Date: Wed Aug 13 09:59:56 CEST 2008

I missed a patch on my laptop.. The one that cleans up instantiate.ss. How to fix? Roll back the darcs-2 repo to the point right before this patch, compute a diff and patch the new tree. Alternatively: inspect the patch itself, see what changed and copy over the files. FIXME: the test doesn't work any more since the monitor changes. OK.

Entry: Staapler change of plan.
Date: Wed Aug 13 18:52:06 CEST 2008

The plan has changed: move to the PicKit2, since there's no way to do it cheaper, and the platform seems open. So what to do with Staapler? Maybe just focus on using the ICD2 connector as a serial port. Interesting: the PicKit2 uses an 18F2550. It might be directly reprogrammable for Staapl use.

Entry: Interaction simulator
Date: Thu Aug 14 13:56:32 BST 2008

Maybe.. Instead of using a Forth-style interaction mode, it is possible to just completely simulate everything: interaction mode is built on top of core Coma without target specialization, and the resulting QW,CW code is interpreted. Compiled definitions are kept in the interpreter so they can be used in interaction. This looks like a much saner model than the current one + it allows working towards some standard Coma semantics.

Entry: next
Date: Thu Aug 14 18:05:20 CEST 2008

* specification of an internal language standard through simulation of non-specialized Coma output.
* create an instantiation syntax usable from Scheme and write the forth-begin form on top of this.
* think a bit about this whole csp/occam-pi thing. Figure out what the core automation problem is in the occam-pi compiler. Maybe a 'manual' version can be included in Staapl?

Entry: books
Date: Thu Aug 14 19:37:02 CEST 2008

This is the collection of books I'd like to finish. I'm being foolish and read books without doing the exercises, trying to incorporate the knowledge into the design and implementation of Staapl. TSPL, EOPL and SICP were real eye-openers.

Done:
* TSPL   http://www.scheme.com/tspl3/
* EOPL   http://www.cs.indiana.edu/eip/eopl.html
* SICP   http://mitpress.mit.edu/sicp/full-text/book/book.html (except logic)

Reading:
* CSP    http://www.usingcsp.com/
* TAOCP  http://www-cs-faculty.stanford.edu/~knuth/taocp.html
* TAPL   http://www.cis.upenn.edu/~bcpierce/tapl/
* TAPOC  http://www.comlab.ox.ac.uk/people/bill.roscoe/publications/68b.pdf

Todo:
* PLAI   http://www.cs.brown.edu/%7Esk/Publications/Books/ProgLangs/
* CTMCP  http://www.info.ucl.ac.be/~pvr/book.html

Entry: dsPIC
Date: Fri Aug 15 08:48:46 BST 2008

Microchip is not really being very helpful providing anything other than the .pdf programmer's reference. So, let's see if there's a way to get hold of the instruction set without typing it in. The difference between the dsPIC and the 8-bit PICs is the addressing modes. This chip has more of a classical RISC ISA.

Data memory hierarchy:

  RAM, first word      (WREG)
  RAM, first 16 words  (Wxx - Working registers)
  RAM, first 4K        (File registers, Near RAM)
  RAM, all 64K

Data addressing modes:

  Basic
    File Register
    Immediate
  Indirect (File Register)
    No Modification
    Pre-Increment
    Pre-Decrement
    Post-Increment
    Post-Decrement
    Literal Offset
    Register Offset
  DSP MAC
    Direct
    Indirect
      No Modification
      Post-Increment (2, 4 and 6)
      Post-Decrement (2, 4 and 6)
      Register Offset

The only difficulty is to somehow encode the addressing modes properly. The generic template is:

  (file  (o b f d)       "oooo oooo obdf ffff ffff ffff")
  (lit10 (o b k d)       "oooo oooo obkk kkkk kkkk dddd")
  (lit5  (o b w k d)     "oooo owww wbqq qddd d11k kkkk")
  (alu3  (o b w q d p s) "oooo owww wbqq qddd dppp ssss")

  lit5 -> lit4 for shifts

I'm not feeling much for typing it all in.. Isn't there a way to snarf the assembler from a file generated by MPLAB? Typing the address modes manually, the opcodes I can probably get that way. So, roadmap:

1. generate an ASM file with all opcodes
2. run mpasm30
3. interpret output (binary?)

Setting up mpasm30.. I have an XP image somewhere.. Wait, there are linux binaries.

Entry: Architectures: where to draw the line?
Date: Fri Aug 15 16:22:27 BST 2008

The dsPIC has a gcc toolchain:

  http://iridia.ulb.ac.be/~e-puck/wiki/tiki-index.php?page=Cross+compiling+for+dsPic
  http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-binutils_2.01-1_i386.deb
  http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-gcc_2.01-1_i386.deb
  http://iridia.ulb.ac.be/~e-puck/share/cross-compiler/packages/pic30-support_2.01-1_all.deb

It's a bit silly to try to compete with that. If Staapl should support the dsPIC, it needs to do so on top of C. It might make sense to try to write some dsp-ish language that compiles to assembler, but it doesn't look like there is much to gain in writing an assembler + Forth compiler in the same style as for the PIC18. Once it's a multi-register RISC chip, C is really the way to go. Same for 32-bit ARM/MIPS. Also, when there's a C compiler available, not being able to integrate with it is commercial suicide.
For the small controllers, you're going to be the only tool in the chain. Not so for the bigger ones... there are going to be libraries and C developers. So, where should Staapl live?

- For 8-bit controllers up to the PIC18: Staapl = Forth based macro assembler. Implements a native code generator.
- For 16/32-bit controllers that have a decent C compiler: Staapl provides a Forth based scripting language + DSP-ish array processing languages, built on top of C.
- For 32-bit systems based on PPC/Intel: Staapl's PLT Scheme based meta system.

The unifying idea is concatenative languages: the bare metal macro Forths for the low end, a linear typed Forth for the mid end, and the functional Scat/Scheme for the high end.

Entry: LuT comment
Date: Sun Aug 17 08:32:07 CEST 2008

Forth is very elegant minimalism, and hard to improve if you want a minimally complex self-hosted system. But when you switch to a cross-compiled Forth system, the target side can be simplified a lot by taking out reflection. For interactive applications, Staapl uses Frank Sergeant's 3 instruction Forth. On the host side however, minimalism isn't really necessary. Having "just Forth" on the host side really seems like a limitation. I see no reason why a cross compiler written in Forth can't be replaced by one written in Scheme. To make this easier, Staapl uses an impedance matching language, Scat, which is a concatenative language based on Joy. Staapl's code transformers are modeled after Forth's immediate words, but are represented as pure Scat functions. All reflection is unrolled as acyclic PLT Scheme modules, making metaprogramming more straightforward. When you skew ordinary Forth from words towards macros, an approach like this, where macros are as clean as possible, seems to make sense. Staapl's PIC18 Forth starts out as all macros: there are no kernel words.

I've just updated the scribble docs in the PLaneT package. The main project site is at http://zwizwa.be/staapl.

Entry: PicKit2 arrived
Date: Mon Aug 18 16:58:46 CEST 2008

  > piklab-prog -p pickit2 -d 18f1220 -c connect
  piklab-prog: version 0.15.2 (rev. distribution)
  programmer: pickit2
  device: 18F1220
  Using port from configuration file.
  Connecting PICkit2 Firmware 1.x on USB Port with device 18F1220...
  Error: USB Port: Could not find USB device (vendor=0x04D8 product=0x0033).

  > sudo piklab-prog -p pickit2 -d 18f1220 -c connect
  piklab-prog: version 0.15.2 (rev. distribution)
  programmer: pickit2
  device: 18F1220
  Using port from configuration file.
  Connecting PICkit2 Firmware 1.x on USB Port with device 18F1220...
  Firmware version is 2.1.0
  Warning: The firmware version (2.1.0) is higher than the version
  tested with piklab (1.20.0). You may experience problems.
  set Vdd = 5 V and Vpp = 12 V
  Error: USB Port: Error receiving data (ep=0x81 res=-110)
  (err=could not get bound driver: No data available).

Starting over: from the PICkit 2 Interface Guide, PICkit2SourceGuidePCv2-52FWv2-32.pdf, available in

  http://ww1.microchip.com/downloads/en/DeviceDoc/FirmwareV2-32-00.zip

Looks like that's what I need. The proper way is to integrate this into Staapl, instead of using an external programmer. Mostly because the serial patch-through is what is most useful. What is interesting about the v2 firmware is that it is essentially an interpreter for a script language. This allows quite direct manipulation of the interface, so it might be used for all kinds of things! It gives access to serial port emulation, I2C, SPI and the normal clocked protocol.
Note: instead of adding the directory where the .dat file resides to the run-time path, just copy that .dat file to /usr/local/bin together with pk2cmd. Do I need to start digging into the .dat file? Programming can probably be outsourced to 'pk2cmd'. The only interaction necessary is access to the port directly. The code in pk2cmd can be snarfed if reading the .dat file should be necessary, but it's not a priority it seems.

Next: libusb in PLT Scheme. This looks like the excuse I need to dig into the FFI. The PICkit2 uses HID, which should be quite straightforward. Does the FFI have a reader for C structs? Hmm.. Just using emacs to edit the struct into an s-expression was easy enough. OK, got the reader to work... Why do people create such ad-hoc formats? The rest is for later.. Maybe the dsPIC compiler has a similar way of reading in meta-data?

Entry: libusb and FFI
Date: Tue Aug 19 10:05:35 CEST 2008

Some questions:

* is it possible to automate this?
* what about self-referential structures?
* pointer-pointer?

First contact seems ok, just need to resolve some issues. Also, it might be a good idea to define the usb structs on a higher level, so they can be used in Forth too. Probably need to read this first:

  wget http://repository.readscheme.org/ftp/papers/sw2004/barzilay.pdf

Then start here:

  http://libusb.sourceforge.net/doc/function.usbopen.html

A bit too much reading for now.. Need to switch to output mode again.

Entry: Parameterized programming
Date: Tue Aug 19 15:26:18 CEST 2008

Two forms of metaprogramming: anonymous macros and name templates. I'm writing a blog article about this..

Now, objects. An object is something identified by a single reference (for compile-time objects this would be a macro) that can be sent a message. I'm interested in a _static_ version of this abstraction, which is not much more than a way to link namespaces at compile time, or some way of plugging in behaviour. I.e. a bidirectional communication port accepts two messages: 'read and 'write. I'd like to write code that uses these messages, but is parameterized by the object implementing them. What could be done is to declare 'read and 'write as methods. A method is a macro that sends a message to a (couple of) object(s).

Hunch: I think it's better to pursue that route instead of that of prefix parsers, because they don't compose well. Try to stick to concatenative s-expression syntax, and improve the partial evaluation rules. So... does 'late binding at compile time' make any sense? The idea is to let a method be something that sends a message to an object.

Entry: The s-exp language
Date: Tue Aug 19 16:31:56 CEST 2008

Maybe it's best to force the s-exp language into something really distinct from Forth, to make it less confusing. The idea in the end is that instantiation is arbitrary: the current differences between macros and Forth words, the latter allowing fallthrough and local exit etc., are better captured somewhere else. Quotations that are not instantiated can be instantiated and associated with an execution token. For 8-bit this gives 256 quotations. So, can instantiation be automated? This would give a cleaner language semantics. In that case, manual instantiation is no longer necessary, and can be classified as merely an optimization hint. When not to instantiate?

- doesn't produce compilable code   <- easy to check
- if inlining produces BETTER code  <- needs a measure

Entry: lazy composition : concatenative vs. compositional
Date: Fri Aug 22 10:57:48 CEST 2008

I'm not using the concatenative vs. compositional property anywhere.
compositional property anywhere. This extra inspection level could be useful for optimizations. It boils down to 'lazy composition'.

Bottom line: hiding the primary composition mechanism behind lambda is NOT a good idea, because it throws away information that might be exploited during optimization. The lambda representation IS a good idea for introducing arbitrary primitives however. The good part is that this is easy to change. Also note that a CPS style representation is actually better than a nested lambda expression representation.

Entry: abstracting over names
Date: Sat Aug 23 11:19:47 CEST 2008

#lang scheme/base

;; An attempt to find a standard mechanism for parameterizable
;; modules. This consists of:
;;
;; * Find a way to build instantiated code from Scheme. Currently it
;;   only uses macros.
;; * Create instantiation macros for parameterizable code.
;; * Possibly write the Forth instantiation on top of this Scheme
;;   code. Somehow unify ``data parameterization'' usable from
;;   macros with ``name parameterization'' usable from prefix macros.
;; * Should this use higher order macros or just Forth control words?
;;
;; The example is parameterized code for a synchronous read operation.

(define-syntax-rule (sync-reader clock data read write)
  (begin
    (compositions (macro) macro:
      (wait  clock low? if begin clock high? until then
             begin clock low? until)
      (read  0 (wait data @ 1 and or <<) 8 times)
      (write (wait dup 1 and data ! >>) 8 times drop)
      (instantiate read write))))

;; This creates two instantiated words parameterized by two macros.
;; The 'clock and 'data arguments could possibly be passed as macro
;; arguments, but the 'read and 'write NAMES are identifiers; the
;; macro language doesn't know identifiers.

;; Original Forth syntax:
;;
;; \ Instantiate 'read and 'write procedures for slave mode synchronous
;; \ communication.
;; macro
;; : wait-falling | reg bit |
;;     reg bit low? if begin reg bit high? until then
;;     begin reg bit low? until ;
;; forth
;; : read 8 for clock wait-falling ... next

Entry: planet build problems
Date: Sat Aug 23 11:24:34 CEST 2008

Planet will compile all the Scheme files, so they had better be working. Make sure to run "make planet-test" before uploading!

Win XP: readline not available, so let's take it out.

Entry: second documentation pass
Date: Sun Aug 24 11:10:42 CEST 2008

Let's look at the available documentation. Is it any good? The first thing that comes to mind is: "How clear is the purpose?" Not very, I think.. This should be paragraph #1. Let's fix the introduction first.

* Stack machine model: can keep fine granularity down to very low complexity hardware. This influences code size and processor complexity. RISC has a factorization problem due to the global nature of registers. Abstracting small machines as stack machines is feasible, since they are usually not too ingrained in the register model. ( Is this really so? Get data about popularity. )

* Concatenative language model built on top of this: simple framework for staging and partial evaluation, alternative to lambda calculus.

Entry: machine model
Date: Sun Aug 24 19:25:38 CEST 2008

Now, pragmatically, it would probably be easier to market Staapl as a machine model instead of a programming language. This model should be a stripped down version of ANS Forth, with reflective words removed. Let's have a look at Moore's "core set" and his processor primitives. Staapl needs more: the Harvard architecture can't be abstracted.
Two possible complementary usage scenarios:

* individual embedded developers using the macro Forth or the static combinator language directly to write applications and provide new target chips.

* using the language model as a target language for higher order description compilation in a model-based design approach.

Now, compared to other machine models, the one in Staapl is a 'mid point': you write the standard primitives in terms of appropriate machine primitives in the same system. There is no intermediate "byte code" representation that requires an explicit byte code interpreter with optional JIT: the idea is to use just code transformers.

Let's see. All arguments are data of cell size, or arbitrary compile-time identifiers that can be optimized away. This exposes 3 stacks. One stack is partially evaluated (the D stack), which means it is present at run time and at compile time (containing quoted, opaque machine instructions or hybrid immediate instructions). The R stack is only present at run time, and the M stack only at macro execution time.

Note that the machine model does not specify machine flags: macros for flags should be hidden in the lower layers. When these primitives require runtime support, they should redirect to the same name prefixed with a tilde "~" character. Conditions "?" are probably compile-time constructs to ensure optimal encoding of branch and skip instructions. If they survive to runtime, they are encoded as 0=false and true otherwise.

(primitives
 ;; WORD       D           R         M       
 ;; Data
 (@          (a -- x)    (--)      (--)    "Fetch data")
 (!          (x a --)    (--)      (--)    "Store data")
 (+ - * / and or xor
             (x y -- z)  (--)      (--)    "Binary operator")
 (<< >> 2/   (x -- y)    (--)      (--)    "Unary operator")
 ;; Aux stack
 (>r         (x --)      (-- x)    (--)    "To r stack")
 (r>         (-- x)      (x --)    (--)    "From r stack")
 (r@         (-- x)      (x -- x)  (--)    "Fetch from r stack")
 ;; Control
 (call       (--)        (-- a)    (l --)  "Call label")
 (jump       (--)        (--)      (l --)  "Tail call label")
 (exit       (--)        (a --)    (--)    "Return to caller.")
 (or-jump    (? --)      (--)      (l --)  "Conditional jump to label")
 ;; Indirect memory access
 (@a         (-- x)      (--)      (--)    "Fetch RAM.")
 (@a+        (-- x)      (--)      (--)    "Fetch RAM, increment a")
 (@p         (-- x)      (--)      (--)    "Fetch ROM.")
 (@p+        (-- x)      (--)      (--)    "Fetch ROM, increment p"))

Entry: Broad spectrum
Date: Sun Aug 24 22:45:11 CEST 2008

It is easy to implement a stack machine quasi-optimally on a low-end 8-bit machine, the PIC18 being the canonical example. For RISC architectures, some whole-program analysis is probably in order to optimize register usage. Is it worth it to keep that road open? In other words: should Staapl aim at broad-spectrum complexity, or does a C backend suffice?

There are a couple of things to distinguish (same for 16 vs 32):

* ease of porting 8-bit apps to 16-bit cores.
* ease of introducing a 'data doubler' for 8-bit cores.
* data-flow analysis and register allocation for DSP/RISC cores

What about ease of porting? If it is possible to define chip targets in a way that allows static checking and possibly derivation of optimization rules, a lot could be gained. Is a unified assembler / simulator feasible? For processor cores this doesn't seem so difficult: once, say, 3 random chips are implemented, generalizing them should be straightforward. So, what about solving the vendor lock-in problem? Maybe C already solves this..

Yes, simulators again. That's where the real beef is. If I take the effort to write an assembler, I should perform a little more work and provide an instruction simulator too.
Otherwise it's probably best to try to work with the supplier-provided assembler in textual form. Anyway, I should have a look at the Small Device C Compiler (SDCC), see if there's nothing to snarf there.

http://sdcc.sourceforge.net/

Entry: trade magazines
Date: Sun Aug 24 23:03:44 CEST 2008

http://www.dspdesignline.com/
http://www2.electronicproducts.com
http://www.techonline.com/
http://www.microcontroller.com
http://www.embedded.com/mag.htm
http://www.deepchip.com/
http://www.semiconductor.com
http://www.design-reuse.com/

Entry: interoperation and snarfing
Date: Mon Aug 25 10:28:07 CEST 2008

Yes, finding a niche so the project can survive is difficult. Maybe it is better to start focusing on using 3rd party tools. As mentioned in the last post: the only real reason to write an assembler is to also write a simulator. Doing this without machine readable processor descriptions is asking for trouble.

I had this idea a few weeks ago about writing an instruction extractor that snarfs directly from the vendor-supplied assembler, to extract opcodes when addressing modes are provided. Maybe a similar thing could be done with semantics? Run tethered programs to see how each instruction behaves, which flags it sets etc.. There are ways to make this not a completely manual, unverified "copy from manual" story. However, this is a road that needs focus, so it might be a better idea to avoid it, start using vendor assemblers, and forget about the simulator idea.

About simulation: the difficulty is not necessarily simulating processor cores, but simulating peripherals. Without help from the manufacturer, this is really not doable.

So.. It looks like processing text files is going to be part of the problem. I need to have a look at the OMeta parser language. This might come in handy for integration work: use grammars, not parsers.

Entry: PICkit2 + libusb
Date: Mon Aug 25 10:35:54 CEST 2008

Today I'm going to have a look at the PICkit2 serial patch-through. I think the target-side software problem (the parameterized code problem) that made me drag my feet is solved. The approach I found should help make the control combinator Coma extension more useful for metaprogramming.

Next step: understand the FFI. I've got the Barzilay/Orlovsky paper in front of me. There's a thread about libusb and some code by Jakub Piotr Cłapa here:

http://list.cs.brown.edu/pipermail/plt-scheme/2007-March/016671.html

Entry: USB / model based design?
Date: Tue Aug 26 15:28:15 CEST 2008

I'm writing some support code for the Universal Serial Bus. The problem with USB is its (too) complex setup procedure. It is quite general, and has a lot of overhead when you want to do simple things with little meta-data. It seems that this is the main reason most people stick to the HID subsystem, more so because it does not require device driver development.

I'm looking at this from two perspectives:

* Need to support the USB PICkit2 through libusb/FFI
* Client code for the PIC18

The host code necessary to drive the PICkit2 is interesting to figure out how things work, while the client code for the PIC18 is an interesting exercise in code generation from a higher level description. I've attempted this before without much success.

The eventual state of a USB connection is data packets moving back and forth between virtual pipes called ENDPOINTs. Apparently, the host side isn't that complicated. For PICkit2 I get away with claiming the interface and sending/receiving 64 byte buffers to ENDPOINT1.
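To make that concrete, here is a minimal sketch of the FFI binding, assuming libusb-0.1 and PLT's scheme/foreign layer; device discovery (walking usb_get_busses) is left out, and pk2-send/pk2-receive are hypothetical names, not anything from pk2.ss:

#lang scheme/base
(require scheme/foreign)
(unsafe!)

;; Bind the handful of libusb-0.1 entry points needed. The device
;; handle (usb_dev_handle*) is kept opaque as _pointer.
(define libusb (ffi-lib "libusb"))

(define usb-init
  (get-ffi-obj "usb_init" libusb (_fun -> _void)))
(define usb-claim-interface
  (get-ffi-obj "usb_claim_interface" libusb (_fun _pointer _int -> _int)))
(define usb-interrupt-write
  (get-ffi-obj "usb_interrupt_write" libusb
               (_fun _pointer _int _bytes _int _int -> _int)))
(define usb-interrupt-read
  (get-ffi-obj "usb_interrupt_read" libusb
               (_fun _pointer _int _bytes _int _int -> _int)))

;; PICkit2 traffic is fixed-size 64 byte reports on endpoint 1:
;; writes go to ep 1, reads come from ep #x81 (IN direction bit set,
;; matching the ep=0x81 in the piklab error above).
(define (pk2-send handle buf)       ;; buf: a 64-byte byte string
  (usb-interrupt-write handle 1 buf (bytes-length buf) 1000))
(define (pk2-receive handle)
  (let ((buf (make-bytes 64)))
    (usb-interrupt-read handle #x81 buf 64 1000)
    buf))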
Entry: prescheme
Date: Mon Aug 25 23:32:02 CEST 2008

As mentioned before, for RISC chips some dataflow analysis is necessary to do proper register allocation. I should really have a look at Factor's new intermediate representation. Probably a lot to learn there. Also, I'd like to define an applicative first order language that's as simple as possible, to get an idea of the complexity of a compiler that supports such a language, and to see how far such a thing is from properly data-flow analysed stack code. Maybe have a look at Jal? (Hmm.. according to Wouter the source code is a mess..)

Entry: PK2 replies
Date: Tue Aug 26 19:36:45 CEST 2008

box> ,staapl (enter! "staapl/pickit2/pk2.ss")
box> (open-pk2)
box> (pk2 GETVERSION)
(2 32 0 41 64 181 68 24)

That's firmware 2.32.0 and some other stuff. Got a minimalist interpreter.ss working, so commands and scripts can be created conveniently:

box> (pk2-cmd (EXECUTE_SCRIPT (pk2-script (VDD_OFF) (VDD_GND_ON)))
              (EXECUTE_SCRIPT (pk2-script (VPP_PWM_OFF) (VPP_OFF))))
(166 2 254 253 166 2 248 250)

Replies work too:

box> (pk2-cmd (GETVERSION))
(2 32 0)

OK.. this is complete overkill, but it was fun :)

Something doesn't seem to work:

box> (pk2-cmd (EXECUTE_SCRIPT (pk2-script (READ_BYTE_BUFFER))))
box> (pk2-cmd (UPLOAD_DATA))
()

Changed names to 'script-begin and 'command-begin, cleaned up the code a bit more, and all replies from executing multiple commands in a single 'command-begin are now concatenated.

Next: test the serial output. It's pretty clear that something doesn't work with executing scripts: this doesn't turn on the LED.

box> (pk2-script (BUSY_LED_ON))
()

From the manual: "RUN_SCRIPT and EXECUTE_SCRIPT commands will be ignored when any of StatusHigh bits 7:1 are set". Apparently, status needs to be read before it works.

OK, going to have some more fun. Now I've got the interpreter macros, but they behave a bit awkwardly for composition. I'd really like to model the instructions as functions. Let's organize the namespaces so the commands can be in plain sight. This works pretty well! Got free tab completion and all :)

Entry: demos
Date: Wed Aug 27 12:30:07 CEST 2008

* model-based design demos: the (graphical) DSP language compiled down to the PIC18 Forth machine + registers, and a generic USB driver generator.

* PICkit2 interaction: complete the driver + write a bitbang serial monitor.

Entry: two distinct uses of byte code
Date: Fri Aug 29 16:19:24 CEST 2008

I'm sick, so this might not make sense. There are 2 uses of byte code:

* as decoupling interface
* as serialization protocol

In the first case, bit patterns are associated with specific semantics. All programming uses this byte code explicitly, possibly from a shared table where it's specified only once, used by both sides of the interface.

In the latter case, byte code is a consequence of having to squeeze information over a channel between two components of the SAME system, and its values are essentially arbitrary: if they are never touched by a programmer, they can be generated automatically. This allows for COMPRESSED byte codes: ones that result in dense code.

They are distinguished by the code being published or not.
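As a toy illustration of the second case (a sketch, nothing that exists in Staapl): when the byte code is a private serialization protocol, opcode values can be assigned mechanically from the word list on every build, e.g. ordered by static call count for a denser encoding; a published decoupling interface could never re-assign like this.

;; Hypothetical sketch: assign opcodes 0..n-1 to words in order.
;; Sorting 'words' by usage frequency first would give the
;; "compressed" encoding mentioned above.
(define (assign-opcodes words)
  (map cons words (build-list (length words) values)))

;; (assign-opcodes '(dup drop swap)) => ((dup . 0) (drop . 1) (swap . 2))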
Entry: standard programming cable
Date: Fri Aug 29 17:30:54 CEST 2008

http://www.electro-tech-online.com/micro-controllers/36934-icd2-vs-pickit-2-a.html#post289566

When I designed the Inchworm (ICD2) and more recently the Junebug (PK2) I recall why I chose the 2x5. The RJ12 on the Microchip ICD2 does not have 0.1" pinning (breadboards & protoboards are usually 0.1"). The 1x6 female header on the PICkit 2 IMO simply comes loose too easily, plus it's not really designed for use with a programming cable (the Junebug has a 16 pin extended version available). Some clones use a 1x5 (the prehistoric PGM pin is omitted) polarized male header; you need a crimp tool to build them. There is also no strain relief, so you have to be careful. Other clones use a 2x5 male header and IDC cable. This is the system my kits use.

1. Inexpensive
2. Robust: the IDC cable has strain reliefs
3. Reliable: on my kits I double up the connections
4. Can be assembled easily: a small vice works well
5. Can be used on a solderless breadboard (the doubled connections work well here)

Looks like a plan.

Entry: PICkit2 UART mode
Date: Fri Aug 29 18:45:21 CEST 2008

Apparently, READ_STATUS clears the UART mode.. This gives a nice square wave:

(begin
  (ENTER_UART_MODE (baud 300))
  (DOWNLOAD_DATA (build-list 60 (lambda _ #x55))))

The baud calculation is correct too. Measured 35 ms pulse distance, which is about 30 Hz (one 10-bit frame at 300 baud = 33 ms). Hardware loopback works too.

Bootstrap roadmap:

* Connect the ISP and UART pins on the target together, so the current monitor can be used.
* Set up the host side data connection.
* Implement target side bit-banged serial.
* Support native programming, so we can go from full recompile & upload straight to interaction without external tools.

Maybe it's actually easier to set up programming first. Let's have a look. How to request the device ID? A script? Yes, looks like it.

Entry: Applications
Date: Fri Aug 29 21:38:13 CEST 2008

http://www.automotivedesignline.com/howto/210200925;jsessionid=ZUA0HLFOIMXXKQSNDLPSKHSCJUNN2JVN?pgno=3
http://www.autosar.org/find02_07.php

Have a look at the microcontroller abstraction layer.

Entry: MetaML
Date: Sat Aug 30 10:26:45 CEST 2008

http://en.wikipedia.org/wiki/MacroML

MacroML is an experimental programming language based on the ML programming language family that seeks to reconcile ML's static typing system with the kinds of macro systems more commonly found in dynamically typed languages like Scheme; this reconciliation is difficult, as macro transformations are typically Turing-complete and so can break the type safety guarantees static typing is supposed to provide.

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.6987

Read that and figure out the idea behind macros in a staged setting.

Entry: more examples
Date: Tue Sep 2 18:58:27 CEST 2008

After a couple of days of rest it's become pretty clear that what the project needs most is more examples. Plenty of things to do though:

* Finish the PICkit2 interface (module = bitbanged serial port)
* USB drivers (module = generic usb + hid + serial + audio)

Other things that keep coming back:

* parsers (OMeta)
* quasiquotation, MetaML and syntax-case
* difference btwn. metaprogramming Scheme and lambda calculus only. (define global identifiers?)

Entry: quasiquotation and staging
Date: Tue Sep 2 22:23:08 CEST 2008

quasiquotation = merging of namespaces

syntax-case does this by importing identifiers. Look at Oleg's paper about MetaML vs. quasiquote and renaming. Maybe the essential difference is the compile-time identification of bound names?
Scheme's macro facility only makes sure that names come from the right lexical environment, but it does not identify binding forms and variable references. Another thing is cross-stage persistence: quoted code creates closures (variables from generation time are still available).

So, quasiquotation. Apparently I got the nested version wrong:

http://repository.readscheme.org/ftp/papers/pepm99/bawden.pdf

"The innermost comma is associated with the outermost backquote."

I found a definition of quasiquote here:
http://srfi.schemers.org/srfi-46/eiod.scm

(define-syntax quasiquote
  (syntax-rules (unquote unquote-splicing quasiquote)
    (`,x x)
    (`(,@x . y) (append x `y))
    ((_ `x . d) (cons 'quasiquote (quasiquote (x) d)))
    ((_ ,x d) (cons 'unquote (quasiquote (x) . d)))
    ((_ ,@x d) (cons 'unquote-splicing (quasiquote (x) . d)))
    ((_ (x . y) . d) (cons (quasiquote x . d) (quasiquote y . d)))
    ((_ #(x ...) . d) (list->vector (quasiquote (x ...) . d)))
    ((_ x . d) 'x)))

Entry: That safe language
Date: Thu Sep 4 01:20:24 CEST 2008

Maybe what I need is a language that can be expressed completely as rewrite rules instead of stacks? Maybe it makes sense as a didactic tool?

Entry: pk2cmd
Date: Thu Sep 4 13:12:03 CEST 2008

Simplified entry trace in pseudo-C:

main()
  pk2app.PK2_CMD_Entry(argc, argv);
    checkHelp(argc, argv);
    PicFuncs.ReadDeviceFile(tempString);
    Pk2OperationCheck(argc, argv);  // check if conn is necessary
    findPICkit2(pk2UnitIndex);      // find pk2 and init comm
    processArgs(argc, argv);        // execute commands
    PicFuncs.ReadPkStatus();        // only if it did comm

The beef is in processArgs():

  PicFuncs.VddOff();                // make sure VDD is off
  // look for part name first
  PicFuncs.FindDevice(tempString);  // look for the device in the device file
  // from then on, process args according to 4 priorities
  // these are in cmd_app.cpp and take you directly to the action
  priority1Args(argc, argv);
  checkArgsForBlankCheck(argc, argv);
  priority2Args(argc, argv);
  priority3Args(argc, argv);
  priority4Args(argc, argv);
  delayArg(argc, argv);

Example: program a .hex file:

tom@zzz:/tmp$ pk2cmd -pPIC18F1220 -fsynth.hex -m
PICkit 2 Program Report
4-9-2008, 13:32:55
Device Type: PIC18F1220
Program Succeeded.
Operation Succeeded

This contains the options:

-p device

-f hex file selection:

  // in cmd_app.cpp
  bool Ccmd_app::priority1Args(int argc, _TCHAR* argv[])
    case 'f':
      ret = ImportExportFuncs.ImportHexFile(tempString, &PicFuncs);

-m program memory:

  // in cmd_app.cpp
  bool Ccmd_app::priority2Args(int argc, _TCHAR* argv[])
    // -M Program
    if ((checkSwitch(argv[i])) && ((argv[i][1] == 'M') || (argv[i][1] == 'm')) && ret) {
      if (hexLoaded) {
        if (argv[i][2] == 0) {
          // no specified region - erase then program all
          if (PicFuncs.FamilyIsEEPROM()) {}  // ignoring eeprom programming
          else {
            bool rewriteEE = PicFuncs.EraseDevice(true, !preserveEEPROM, &usingLowVoltageErase);
            if (PicFuncs.FamilyIsPIC32()) {}  // ignore PIC32 programming
            else {
              // program all but configs and verify, as configs may contain code protect
              // code here mainly calls PicFuncs.WriteDevice
            }
          }
        }
      }
    }

Finally, in PICkitFunctions.cpp:

  bool CPICkitFunctions::WriteDevice(bool progmem, bool eemem, bool uidmem, bool cfgmem, bool useLowVoltageRowErase)

This function contains the main programming flow:

  // compute configuration information.
  int configLocation = (int)DevFile.PartsList[ActivePart].ConfigAddr / DevFile.Families[ActiveFamily].ProgMemHexBytes;
  int configWords = DevFile.PartsList[ActivePart].ConfigWords;
  int endOfBuffer = DevFile.PartsList[ActivePart].ProgramMem;

  SetMCLR(true);  // assert /MCLR to prevent code execution before programming mode is entered.
  VddOn();

  // ignoring some device specific things (OSCCAL/Keeloq/PIC24)
  RunScript(SCR_PROG_ENTRY, 1);
  if () {
    DownloadAddress3(0);
    RunScript(SCR_PROGMEM_WR_PREP, 1);
  }
  while data {
    DataClrAndDownload(downloadBuffer, DOWNLOAD_BUFFER_SIZE, 0);
    RunScript(SCR_PROGMEM_WR, scriptRunsToUseDownload);
  }

  // The functions called here are very simple wrappers around
  // writeUSB() that build a command packet.
  // Programming EEPROM, DEVID and CONFIG is similar.

Entry: pickit disassembler
Date: Thu Sep 4 14:30:15 CEST 2008

Because it's probably not too hard to do, and because I need a way to see if the PK2 v2 VM can be ported to the AVR (not using the SFRs while programming):

box> (pk2-script 1 dasm)
Standard 12V Vpp. 100ms delay after entering. Drop Vpp & retake high
to try entry on parts with internal MCLR & osc, and ICSP pin(s) high
outputs.
(VPP_OFF MCLR_GND_ON VPP_PWM_ON BUSY_LED_ON SET_ICSP_PINS 47872
 DELAY_LONG 20 MCLR_GND_OFF VPP_ON DELAY_SHORT 127 MCLR_GND_ON
 VPP_OFF VPP_ON MCLR_GND_OFF DELAY_LONG 19)

Entry: microchip forum post
Date: Thu Sep 4 17:02:50 CEST 2008

First, I'd like to point out that it is possible to use Staapl without the metaprogramming. I've tried to be careful to not let high abstractions get in the way when problems are simple and don't call for them. The website contains a couple of tutorials for this, and Staapl's main test application (a sound synthesizer) is of that class. The only thing that's generated there is a lookup table of note frequencies, a geometric series. This is the Scheme code that implements that macro:

http://zwizwa.be/darcs/staapl/staapl/pic18/geo.ss

For the metaprogramming, I'll macro-expand the sales pitch a bit. I haven't found a concise way to introduce the idea to people not familiar with both Forth and Scheme..

> Staapl is a collection of abstractions

A library of Scheme functions and macros

> for metaprogramming microcontrollers

that can be used to construct programs in a lower level language based on Forth, and interactive code for testing program operation.

> from within PLT Scheme.

Staapl integrates with PLT Scheme's macro/module system, a solid base for writing large applications and domain specific languages. The main point is that all name space and name scope management is PLT Scheme's.

> The core of the system is a programmable code generator

Think of it as a macro assembler which uses a stack to pass arguments. This leads to a postfix assembly language, instead of the normal prefix OPERATOR OPERAND ... syntax.

> structured around a functional concatenative macro language

The 'concatenative' part is where the ease of metaprogramming is hidden: composition of programs is just concatenation of strings. ( This is how PostScript is mostly used. ) The macros are not quite Forth, but a side-effect-free higher-level version that makes the Staapl core a lot simpler.

> with partial evaluation.

This is basically 'constant folding' optimization, but it is also used to write template code. A concatenative functional language makes partial evaluation almost trivial.

> On top of this it includes a syntax frontend for creating Forth style languages

The core is not Forth, but I've added a frontend that looks a lot like Forth.
This is historical, since the original idea was to build a Forth compiler, but the core evolved into a different language.

> a backend code generator for the Microchip PIC18 microcontroller architecture

A standard macro set implements a stack machine layer for the PIC18. I'm working on the 12 and 14 bit cores.

> and interaction tools for shortening the edit-compile-run cycle.

One of the great benefits of Forth is its interactivity: with the target chip running, you can create new functions and try them out. However, Staapl is not self-hosted (not stand-alone), so interactivity needs to be simulated. This allows interactive development on tiny chips.

Now, why does this matter? The original goals of this project are basically:

* Interactive Forth development for tiny chips.
* Easy generation of code that has a large amount of red tape, but is difficult to express in a low-level language.

The first one is really just the standard Forth approach. If you've never tried it, give FlashForth a try ( http://flashforth.sourceforge.net ). It stays closer to traditional Forth. Staapl's Forth is more optimized for the integration with Scheme.

The latter one is more ambitious, and is heavily inspired by the Lisp/macro approach of writing domain specific languages. This is sometimes called model-based design/development: use high level descriptions to compile to low level form. Add to this that Forth is very simple to generate, and you have a high+low level system that is impedance-matched.

The problem is that for small projects, this approach is overkill. I'm aiming mostly at large code bases which contain a lot of parameterization. I'm working on a USB driver framework written using such an approach, USB containing enough red tape to make you want to keep specification and implementation separate (the implementation then becomes a special purpose translator from specification to concrete code). If this model-based idea rings a bell, drop me a line. I'm looking for concrete problems to apply this to ('compiling' specifications).

* * *

quote: ORIGINAL: DavidP5

I know that functional languages can be quite helpful in language translation. The purpose of this Staapl system seems to be for translating from one language to another (in this case Forth-style languages are translated, using the functional language Scheme, into PIC18 assembly code?). So it seems that you can create your own Forth-style language and then use this system to make your own compiler for the language. But, I'm sure you would have huge reservations about such a plan for microcontroller code generation. I would be interested in hearing from the OP whether I have this correct.

There are 2 translation phases involved.

1. The Forth macro language translates concatenative code to machine code. To port this to a new target, some key macros need to be defined that generate the appropriate machine code for a simple set of primitives. Optionally, this can add some optimizations. Most of the core compiler is reused. This layer mostly presents a 'generic microcontroller' to the upper layers if desired. It's perfectly feasible to use this Forth language as a programming language by itself.

2. Code for this basic low-level Forth language can then be generated from within Scheme. This is if you want to use the 'generic microcontroller' stack machine as the machine to target your domain-specific description language.
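To make phase 1 concrete, a sketch of what such key macros look like, using the [qw ...] pseudo-instruction notation that shows up later in this log; the exact pattern syntax and PIC18 opcode spellings here are illustrative, not copied from the source:

;; Three rewrite rules for '+'. Greedy matching on the generated code
;; implements partial evaluation: two literals fold at compile time,
;; one literal becomes an add-literal instruction, and the general
;; case pops the second operand from the memory-mapped data stack.
(patterns (macro)
  (([qw a] [qw b] +)  ([qw (+ a b)]))          ;; constant folding
  (([qw a] +)         ([addlw a]))             ;; literal + runtime value
  ((+)                ([addwf POSTDEC0 0 0]))) ;; runtime + runtime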
Entry: more pk2
Date: Thu Sep 4 18:51:32 CEST 2008

Added accessors to the pk2 module namespace that query the current device's properties:

box> (print-script (ProgMemWrWords))
Erase progmem (09) & EE(0B), int timed 10ms
WRITE_BITS_LITERAL #x06 #x00
WRITE_BYTE_LITERAL #x00
WRITE_BYTE_LITERAL #x00
WRITE_BITS_LITERAL #x06 #x09
DELAY_LONG #x02
WRITE_BITS_LITERAL #x06 #x0B
DELAY_LONG #x02

box> (script (ProgMemWrWords))
Erase progmem (09) & EE(0B), int timed 10ms
(238 6 0 242 0 242 0 238 6 9 232 2 238 6 11 232 2)

This required only minor modification:

;; Also create accessor thunks for properties on top of the hash DB.
;; These will get the property of the current part (makes
;; name-checking static).
(define-syntax-rule (define-reader/provide name (type id . args) ...)
  (begin
    (define-reader name (type id . args) ...)
    (begin (define (id) (property 'id)) ...)
    (provide id ...)))

(define part (make-parameter 'PIC18F1220))

(define (property tag [dev (part)])
  (let* ((part (hash-ref *part-index* dev))
         (fam  (vector-ref *family* (hash-ref part 'Family))))
    (hash-ref part tag (lambda () (hash-ref fam tag)))))

Trying to make a device-read script now, but something is missing: the code executes scripts on the device, but I didn't see the part where the script gets uploaded. The low-level routine is:

// PICkitFunctions.cpp
bool CPICkitFunctions::downloadScript(unsigned char scriptBufferLocation, int scriptArrayIndex)

The one that calls this is:

void CPICkitFunctions::downloadPartScripts(int familyIndex)

This is called by:

void CPICkitFunctions::PrepPICkit2(void)

which also sets the VDD and Vpp voltages.

Scripts start from index 1, not 0!!

box> (SCRIPT_BUFFER_CHKSM)
(0 0 0 0)
box> (DOWNLOAD_SCRIPT 1 (script (ProgMemRdScript)))
Reads 64 words of program memory.
()
box> (SCRIPT_BUFFER_CHKSM)
(1 0 238 0)

So uploading seems to work. Execution not, though:

;; CPICkitFunctions::ReadDevice
(define (read-program-memory)
  (READ_STATUS)
  (MCLR_GND_ON)
  (EXECUTE_SCRIPT (VDD_GND_OFF) (VDD_ON))  ;; we power the device
  (EXECUTE_SCRIPT (CLR_DOWNLOAD_BFR) (DOWNLOAD_DATA 0 0 0))
  (EXECUTE_SCRIPT (script (ProgMemAddrSetScript)))
  (CLR_SCRIPT_BFR)
  (DOWNLOAD_SCRIPT 1 (script (ProgMemRdScript)))
  (EXECUTE_SCRIPT
   (CLR_UPLOAD_BFR)
   (RUN_SCRIPT 1  ;; index ???
               1) ;; repetitions?
   (UPLOAD_DATA_NOLEN)
   (UPLOAD_DATA_NOLEN)))

I'd like to eliminate the use of 'script' too. For printing scripts, one should use the 'properties function. This requires some type spec, maybe derived from the name?

Entry: remarks about pk2.ss
Date: Fri Sep 5 01:45:58 CEST 2008

Can I say something intelligent about the pk2.ss implementation? It's about primitives and composition, and hiding as much as possible behind NAMES, exposing only a single behaviour for each... I.e. a selector name in the device data base representing a script has the ONLY behaviour of reproducing the script code, parameterized on the current device. All indirection in the data structure can be hidden behind this one name. 3 namespaces are brought together: commands, script primitives and scripts from the .dat file.

One remaining confusing bit is that scripts can be invoked as if they were commands without error. 2 semantics: commands and scripts. What I fail to see is why these 2 semantics are necessary. Basically, the scripts are composable with something like an ordinary 'call' in the form of RUN_SCRIPT, which has an extra repeat parameter. The commands are a flat meta-language.

Something to fix: make sure it's not possible to confuse scripts and commands, by making their return values distinct.
It is convenient to not have to manually 'execute' commands, but confusion is easy.

Entry: MetaML vs Scheme Macros
Date: Fri Sep 5 02:16:24 CEST 2008

In MetaML's staged programming, the substitution is done on binding expressions: the static analysis knows which are binding constructs and which are variable references. For Scheme macros this is not possible for generated code; however, it is possible to identify bindings using bound-identifier=?

I'm confused...

Entry: blink-a-led
Date: Fri Sep 5 09:48:55 CEST 2008

As requested on the Microchip forum. Actually not a bad idea. Preparing some code on the 452 proto board. Requires a bit of shuffling. Made a command line compiler wrapper script to use in the examples.

Entry: The 18F452
Date: Mon Sep 8 11:59:18 CEST 2008

Trying to share some code for the monitor, making it easier to have a "default" interpreter running (the current code contained too much red tape). It is implemented as macros, so separate instantiation is possible. The word 'init-all/console will perform all initialization necessary to set up the machine model (stacks) and initialize the serial port for the console. In addition, it starts the interpreter if a serial connection is detected (this requires a pulldown resistor on the serial port).

Some things to fix: the serial port baud rate should be derived from a single specification. OK.

What needs to happen still? I'd like to use this 'sc' script installer, but that's a pain..

Entry: Mail from JPC
Date: Mon Sep 8 15:28:27 CEST 2008

* http://www.sics.se/~adam/pt/ -- Isn't that just a state machine?
* Objective-C AutoreleasePools -- Isn't that just tree rewriting?
* ARM THUMB code: use this as a native Staapl target, and use C for the 32-bit mode. Also, check out fmt.ss: http://planet.plt-scheme.org/display.ss?package=fmt.plt&owner=ashinn
* Look at Typed Scheme for the namespace hierarchy stuff..

Entry: make-mzscheme-launcher
Date: Mon Sep 8 17:56:49 CEST 2008

I wanted to send this to the list before I ran into 'make-mzscheme-launcher.

--

Hello,

I'd like to figure out a way to install (platform specific) wrapper scripts that call the main entry point of an application stored on PLaneT. I.e. the Staapl compiler has a command line front-end that can be invoked as:

mzscheme -p zwizwa/staapl --

But for convenience I'd like to wrap this in a script called 'sc':

#!/bin/bash
exec mzscheme -p zwizwa/staapl -- "$@"

Now, I wonder if there's a way to do this so that it "just works" on all platforms supported by PLT Scheme, after doing something like:

sudo mzscheme -p zwizwa/staapl/install

Also, is it possible to generate some progress report during installation of a package? I understand the default of quiet is to be preferred in many cases, but this seems to confuse people.. (Is there really something happening??) Maybe as a flag for the 'planet' binary?

Cheers,
Tom

Entry: old partial evaluation explanation
Date: Tue Sep 9 12:55:40 CEST 2008

( This was removed from the blog and replaced with a post that stresses the BINDING and QUASIQUOTATION mechanisms used to implement SUBSTITUTION RULES that are INSPIRED by PARTIAL EVALUATION. )

http://zwizwa.be/ramblings/staapl-blog/20080526-203330

-------------

So, how does it work?

PE from (greedy) deterministic pattern matching = a typed template language

So, by fixing the algorithm used to implement PE, a language emerges that is useful for other kinds of code generation. Let's spin that out a bit.
PE in a concatenative language is quite straightforward: function composition is associative, which makes evaluation order a parameter to play with. Compositions of pure functions can be performed at compile time to generate composite functions that are more efficiently implemented, while other compositions can be postponed to run time due to dependence on dynamic data (1). This is because concatenative syntax allows the composition method to be abstracted away: a function can always be trivially inlined, instead of being invoked at runtime using the run-time composition mechanism: the machine's function call (2).

When inlining multiple functions, there can be an opportunity for specialization by moving some computations to compile time. For example, inlining the functions [ 1 ], [ 2 ] and [ + ] produces an inlined composition [ 1 2 + ] which can be replaced by the composition [ 3 ]. This is automatic program specialization by Partial Evaluation in its purest form.

In Purrr, the Partial Evaluator is not implemented as a separate component. PE is a consequence of the actions of the machine model, which is specified in terms of primitive Purrr macros, which implement a map from recently generated code (the top of the compilation stack) to new code to be generated (placed on the compilation stack). These primitives are expressed in a language with deterministic pattern matching. It allows the specification of the following compiler components:

* target code generation
* peephole optimization
* partial evaluation
* generic parameterized template instantiation

The first 3 could be counted as components of pure partial evaluation. The last one however is not: it is an interface that connects the concatenative macro language to explicit code generation tools. It allows the use of templates that have no target semantics unless they are parameterized.

Why is this useful? Say you want to implement 'cos' as a function of two arguments, like

cos ( angle scale -- value )

Realizing that a true 'cos' function is never used in the target code, because the scale can be fixed and is available at compile time, it can be implemented as a template that generates a lookup table and code to look up the value. If generic cosine routines are necessary later, this template macro can be extended to compile a call to library code in case the parameter is not available at compile time. (A sketch of this idea in plain Scheme follows below.)

One can be surprised how many times this pattern occurs: due to the lack of target support for specific primitive abstractions, it is often easier to write something as a template for specialized code. Note that this is different from programming for non-embedded systems, where this primitive functionality is usually available.

The advantage of doing it this way is that the code is easier to read: code expresses semantics more easily without instantiation annotation getting in the way. This annotation can be expressed somewhere else in the form of 'forth' and 'macro' mode indicators. The disadvantage is that a lot of code will be pushed towards implementation as a macro. If this is taken too far, possible sharing might be endangered. For that reason, moving between macro and instantiated target code is made really straightforward in Purrr, but it remains an explicit operation under programmer control.
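The promised sketch, in plain Scheme rather than Purrr, just to show the specialization step: 'scale' is fixed at generation time, so the table can be computed once and 'cos' degenerates into an indexed fetch. In Purrr the table would be emitted into program memory; the names here are illustrative:

#lang scheme/base
(require scheme/math)  ;; for pi

;; Specialize 'cos' to a fixed scale: tabulate 256 samples at
;; generation time, and return a function that only does the lookup.
(define (specialize-cos scale)
  (define table
    (build-list 256
      (lambda (i)
        (inexact->exact
         (round (* scale (cos (* 2 pi (/ i 256)))))))))
  (lambda (angle)  ;; angle: 0..255
    (list-ref table angle)))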
Explicit code generation in Purrr is useful when

* partial evaluation becomes too hard to do automatically
* some on-target primitives are not available
* algorithms are hard to express in concatenative syntax

So as long as it is possible to express a general algorithm in the purely functional macro sublanguage, the built in PE can be used to specialize the code. The advantage here is that the difference between compile time and run time can be silently ignored as an implementation detail. However, in practice it might sometimes be easier to make the code generation process a bit more explicit. In Purrr it is made very straightforward to plug in arbitrary Scheme code for parameterized code generation.

Conclusion

Stack languages are interesting for writing parameterizable low-level code because the composition mechanism is so simple:

* They are very straightforward to implement on the target architecture, with a small run-time footprint of two stacks.

* Automatic specialization through partial evaluation is very straightforward to implement off-target.

* Implementing the code generator (including PE) using deterministic pattern matching exposes an interface that can be reused for plugging in arbitrary parameterized code generators.

In Purrr, the code generator language is Scheme. Within Scheme all of Purrr and the underlying compiler is exposed: you can decide to generate (pseudo) assembly code, Purrr code, or interface to external code generators.

--

(1) Of course, the Purrr target semantics is not purely functional. It contains language primitives that introduce machine state (determined by world state) through a side channel into the concatenative language. This is no problem for PE, since it merely limits PE to the subset of pure functions. Procedures that depend on other parts of the machine state aside from the (threaded) parameter stack simply have to be instantiated, and cannot be performed at compile time.

(2) Except when it interferes with the implementation of the run-time function composition method, i.e. modifies the return stack. A more correct statement would be that the subclass of pure functions can always be trivially inlined.

Entry: concatenative email
Date: Tue Sep 9 15:10:14 CEST 2008

Hello folks,

I'm trying to write a paper about the core idea of Staapl from the perspective of concatenative code rewriting, and how a particular _implementation_ of this gives a convenient applicative -> concatenative metaprogramming framework. These two blog posts try to explain the mechanism:

http://zwizwa.be/ramblings/staapl-blog/20080806-212034
http://zwizwa.be/ramblings/staapl-blog/20080625-162839

As side information, a post that deals with the 'impedance match' between Scheme and the concatenative macro language, based on pattern matching and quasiquotation:

http://zwizwa.be/ramblings/staapl-blog/20080526-203330

I'm wondering mostly if this makes sense.. While using the abstractions in real life works beautifully, I have great trouble trying to explain in a few words why this all works so well. Basically, it's the interplay of:

* pattern matching for destruction/construction of stack machine code
* using this for eager partial evaluation implementing rewrite rules
* macro hygiene and lexical closures

Any comment welcome. (Best one gets a free PIC kit ;)

Cheers,
Tom

Entry: blink-a-led on the 18F452
Date: Tue Sep 9 16:03:19 CEST 2008

It's working.. Don't know what went wrong last time; something to do with the FTDI cable giving up.
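For the record, the shape of such a program, as a sketch only: the pin words and the delay loop below are illustrative assumptions, not copied from the actual example code (assume the LED sits on PORTB bit 0 and the pin is already configured as output):

\ Sketch of a blinker in the PIC18 Forth, using the reg/bit style
\ seen in the sync-reader example above. Word names are assumptions.
macro
: led    LATB 0 ;  \ expands to register + bit number
forth
: delay  100 for 255 for next next ;
: blink  begin led high delay
               led low  delay again ;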
What needs to be fixed to make the demo work smoothly in interactive mode?

* State should contain the toplevel macro file, so macros can be regenerated.
* The interpreter should evaluate macros if it doesn't find a target word or substitution macro.

Entry: USBprog
Date: Wed Sep 10 12:03:00 CEST 2008

I'm thinking about contributing to the USBprog PICkit2 v2 firmware clone:

http://www.embedded-projects.net/index.php?page_id=218

Preliminary source code is in the SVN:

svn co http://svn.berlios.de/svnroot/repos/usbprog

The directory with the firmware is here:

svn co http://svn.berlios.de/svnroot/repos/usbprog/trunk/usbprogPIC/firmware2/

Since there is already C code, writing it in Forth is maybe not a good idea. A more interesting approach might be reverse-engineering of the .HEX file + building some tools for that. Writing C for AVR is probably also a good starting point to get more familiar with the architecture + it could allow some code generation experiments.

Conclusion: for me this project is about:

* Working towards a standard programmer for Staapl, next to the native PICkit2.
* Adding some reverse engineering functionality for PIC18 (including some simulator stuff?).
* Getting to know the AVR and its gcc-based toolchain, and getting an idea of how good the compiler actually is, to see if there's much to gain with a stack-based HAL.

Entry: 8051
Date: Wed Sep 10 12:46:39 CEST 2008

Another important target next to AVR and ARM/THUMB might be the 8051. This can serve as inspiration:

http://www.camelforth.com/page.php?4

Also, this might make Gert get off his ass and try my software..

EDIT: just requested samples for the AT89C51RB2 and ATMEGA168.

Entry: Re: DavidP5 (Microchip forum)
Date: Fri Sep 12 08:33:05 CEST 2008

In the following thread: http://forum.microchip.com/tm.aspx?m=356807

DavidP5:
> So, what you made is a variant of the Forth programming language for
> PIC18s?

Essentially, yes. That's what is finished now. But there's a lot more to it than that, mostly moving in two different directions:

Higher abstraction: It is Forth, but with its conventional macro system replaced by a functional programming language based on the Joy language:

http://en.wikipedia.org/wiki/Joy_(programming_language)

Why?

* As you already mentioned, functional programming languages are good for writing compilers: a compiler is a data structure converter == a function. A tool most useful for this is pattern matching:
http://en.wikipedia.org/wiki/Pattern_matching

* Due to their similarity, the Joy/Forth interplay makes partial evaluation easy to express as simple algebraic manipulation. I.e. there is a rule that says that "1 2 +" can always be replaced with "3", or more generally for any two numbers preceding "+". The trick is that this mechanism can be used for more general template programming.

This, together with the integration in the PLT Scheme module system, makes a solid base to start experimenting with different languages and domain specific models on top of the low-level typeless stack machine language.

Portability: The idea is to find a standard set of macros that can abstract the microcontroller architecture to a large degree. This set will be based as much as possible on standard Forth and existing stack computer instruction sets. Compared to C this can go a lot further, because it is based on a macro system instead of a function library API, as is common for other abstraction layers. Staapl contains a (dis)assembler generator that should make porting to new architectures relatively straightforward.
I'm currently working on the 12 and 14 bit cores. The step to integrating this with a processor simulator generator isn't that big, but this would really need closer cooperation with chip vendors, as I'd like to do this with machine-readable chip descriptions.

What I'd like to find out is how people who spend most of their time ``in the trenches'' would use this tool. For me it's in the first place a tool to build things on top of, but I'm quite curious whether using it just as a low-level machine abstraction layer would work, and what would need to change to make that really practical. I wrote some nontrivial applications in it (a sound synthesizer toy and an ASK/PSK modem), so I'd say it can definitely be used as such.

For more information, here's some documentation that talks only of the lower Forth layers:

http://zwizwa.be/archive/pic18-forth.pdf
http://zwizwa.be/archive/pic18-synth.pdf

Cheers,
Tom

Entry: simplify simulation
Date: Fri Sep 12 12:46:53 CEST 2008

Actually, if I only knew the stack effect of a macro for all literal arguments, it would be straightforward to simulate it during live target interaction. This could get rid of all explicitly coded simulation in live/command.ss, making the command frontend less dependent on extensions (all semantics would be encoded in macros). Approaches:

* annotate macros with stack effect (pre-checks)
* let it operate on data directly (either lazily, or copy the whole stack)

I.e. for +:

1. copy the whole stack: (1 2) -> ([qw 1] [qw 2])
2. apply the macro:      -> ([qw 3])
3. success? replace the stack: -> (3)

Copying the whole stack isn't too much communication overhead, and it's easiest to implement. The change needs to be made in live/target-lang.ss in 'target-interpret. This needs to change to full symbolic interpretation relative to the compiler namespace: map-id needs to do nothing. Maybe the undefined symbol hook for the target/xxx namespace reference should be added, to then try the macro/xxx namespace? It's simpler than that: at compile time, check if the name is defined, otherwise compile a macro/xxx reference. Use 'identifier-binding. See next posts.

The trouble is that writing a generic simulator is a lot of work. Probably not so much the infrastructure, but entering all the processor specific data. Looks like I'm better off just keeping the current simplistic interpreter + a macro evaluator.

Entry: simulator simulator
Date: Fri Sep 12 13:22:50 CEST 2008

I don't have a target board at hand atm, but I'd like to test the interaction code.. How to proceed? Let's connect the default IO to a stack machine simulator that simulates the 3-instruction Forth.

Got it +- working, except for data stack access: this is not abstract enough in the current tethered approach. The simplest solution seems to be to add a 'meta' command to the interpreter, which will load a pointer to a meta struct area. This could then contain info about the target, without the need for more interpreter instructions. Alternatively, this could be stored in a per-micro specific location, i.e. the boot block.

OK.. now I'm getting ambitious.. What about turning this into a seed for a proper machine simulator? Instead of writing a version of the interpreter for the machine, compile the current .f code to another machine and run that in the simulator. Might be a good idea to converge on the standard machine model. Let's try to separate things out a bit..

1. Reference semantics = Scheme. The pattern matching rules define the different type signatures.
2. Machine model = stack ops + memory reference. The stack ops can be derived straight from the macros.

So, build a function that translates a bunch of macros into a symbolic interpreter for a stack machine with a certain word length. What should be the goal?

* To be able to compile .f code to a standard virtual machine for bit-accurate simulation + testing of control structures.
* To simplify this kind of simulation + generic platform-specific simulation and porting.

The idea is that if I want to compile to a huge number of architectures, I had better think about a decent simulator/assembler architecture. It's probably best to allow for an incremental assembler construction path: not all opcodes are necessary to create a basic uC port. A lot of the assembler infrastructure for PIC18 is not really used.

Entry: towards standard machine architecture
Date: Fri Sep 12 16:29:41 CEST 2008

There are essentially 2 kinds of semantics involved: the PE semantics, which is Scheme with infinite precision and all kinds of data types, and the concrete machine semantics. All the rest is just intermediate and should be derived, or at least verified. In order to improve test-driven development, these need to be made comparable: some redundancy needs to be inserted to check compiler correctness and machine model correctness.

So, is it possible to create a model for the 'generic RISC' chip emulating a stack machine, and then gradually use this model in compiler generation/verification?

Microcontroller HAL:

* Harvard architecture (2 memory pointers)
* 2 stacks
* arithmetic operations operating on the parameter stack
* function call/jump
* conditional jumps

It looks like this is the same problem as creating a simulator for a specific processor based on a generic simulator writer framework.

Roadmap: build a verifier for the current PIC18 compiler first. This means that all instructions used should have annotations that allow semantics to be added, enabling simulation and thus verification. The processor (logic) specification language can be a simple functional language with named nodes (multi in/out functions). Maybe SSA, essentially parallel dataflow (really?), is actually the most convenient representation?

(define-io (add (a b) (dst w n ov z c dc))
  (! (c W) (+ a b))
  (! z  (zero? W))
  (! n  (ref W 7))
  (! dc (ref W 3))
  (! ov (xor c n)))

Then the base language: if I'm not going to express it down to the gate level, some meaningful intermediate level needs to be used. Here there is really no alternative other than C.

Simulator conclusions:

* Compared to compilation, ultimately simulation needs to be FAST. The only way to make things run fast on arbitrary host systems (workstations) is to generate C, and specialize a simulator to a program.

* This answers the question of primitive semantics: C and its bit manipulations. The composition mechanism can be kept at SSA, which is trivially transferred to/from other dataflow representations.

* Such a primitive language is easily simulated in Scheme when speed isn't important, but it's important for that layer to not use too many higher order tricks that would prevent compilation to C later.

Entry: summary of activity
Date: Sat Sep 13 09:23:20 CEST 2008

* PICkit2: working towards an integrated solution that can program + communicate over the ICD2 connector.

* thinking about a standard machine model + how this can be combined with a concrete machine simulator, and how to write such a thing, or if a virtual simulator is better.
* thinking about more targets and what needs to change in the assembler generator to support addressing modes (the PIC doesn't have any: its registers are memory mapped).

* occam-pi and Staapl: how much does it take to run just the synchronization primitives in Forth?

* documentation and correspondence

Entry: MetaScheme
Date: Sat Sep 13 10:16:27 CEST 2008

* Trying to re-implement this:
http://okmij.org/ftp/Computation/Generative.html#meta-scheme

We implement the very clever suggestion by Chung-chieh Shan and represent a staged expression such as .<(fun x -> x + 1) 3>. by the sexp-generating expression

`(,(let ((x (gensym))) `(lambda (,x) (+ ,x 1))) 3)

which evaluates to the S-expression ((lambda (x_1) (+ x_1 1)) 3). Thus bracket is a complex macro that transforms its body to the sexp-generating expression, keeping track of the levels of brackets and escapes.

The CEK machine: code, environment, continuation. ?

* Trying to find the key points of difference between MetaML and Staapl. This is quite difficult, because there is no clear analogue of a "code object" in Staapl. There are two: low level stack machine code, and anything Scheme (including Coma code), which represents a stack machine code transformer. The presence of partial evaluation makes switching between these two straightforward, but also a bit confusing..

1. Bracket == Quasiquote in Staapl, because template code cannot generate new binding constructs. This makes things simpler.

2. Code objects are opaque (implemented as machine code transformers closed over the lexical Scheme environment) instead of abstract syntax trees. This allows one to never have to leave the semantics of generators, and Scheme -> Coma translation isn't really staging, but namespace mixing.

Entry: FSM-hume
Date: Sat Sep 13 17:49:49 CEST 2008

FSM-Hume: check the primitives, and assimilate.

http://www.macs.hw.ac.uk/~greg/hume/

Hume is a novel programming language, intended for resource bounded domains, designed at Heriot-Watt University and the University of St Andrews. It is based on concurrent finite state automata controlled by pattern matching and recursive functions over rich types. Hume has been designed as a multi-level language, where different levels have different formal properties amenable to different analyses. HW-Hume is a relatively impoverished language of bits and tuples for characterising hardware, with decidable equivalence and termination, and predictable time and space behaviour. FSM-Hume introduces fixed precision abstractions over bit tuples, including integers, reals, strings and vectors, with associated operators and conditional constructs. This level, oriented to wider finite state machine-based designs, has strongly bounded time and space behaviour. HO-Hume augments FSM-Hume with a repertoire of higher-order functions with known cost models, such as map and fold, and user-defined non-recursive functions. PR-Hume extends HO-Hume with user-defined primitive recursive bounded functions, and full Hume is a Turing complete language.

EDIT: Reading the introduction of the Hume report, the thing that struck me is the use of a restricted set of higher order combinators without full recursion in one of the intermediate steps. Looks like a more formal treatment of the higher order macro ideas that leaked into Staapl (e.g. ifte).
http://www.macs.hw.ac.uk/~greg/hume/hume03.pdf

Entry: the Transterpreter
Date: Sun Sep 14 10:03:17 CEST 2008

> This is an email from Matt Jadud describing the TVM and occam
> http://www.transterpreter.org/

The TVM is a virtual machine for the instruction set of a Transputer. In particular, we support the 4xx and 8xx series instruction sets, which includes floating point operations. The Transputer had a 3-deep stack (for integer and FP each; I'll ignore FP from now on, as the FP side is really no different/no more interesting). The A, B, and C registers (as they are called), along with a workspace pointer, instruction pointer, a front pointer and back pointer for the scheduling queue, and a timer queue represent all of the machine state needed for an occam process. (occam was the language designed to execute on the Transputer.)

By keeping things to just seven words, you could do very fast context switches; the Transputer could do nanosecond context switches between parallel processes. This is, in part, because it had some very fast RAM on board. Its specialized nature and expensive construction are just two of the reasons it ultimately died; the Intel 286 was a much cheaper processor.

The occam language is grounded in the CSP algebra for reasoning about concurrent and parallel processes.

http://www.usingcsp.com/

In CSP (and therefore occam) you reason about sequential processes that execute in parallel. They communicate over channels, and sending and receiving were designated in the algebra by '!' and '?', respectively. In occam, these are the operators for sending and receiving as well. Producer/consumer looks like:

PROC producer (CHAN BYTE ch!)
  WHILE TRUE
    ch ! 42
:

PROC consumer (CHAN BYTE ch?)
  WHILE TRUE
    BYTE c:
    ch ? c
:

PROC main ()
  CHAN BYTE ch:
  PAR
    producer(ch!)
    consumer(ch?)
:

The channel is represented in the TVM (and on the Transputer) as a single word in memory. This is a channel word that is in shared memory; the workspaces of the two processes are private. (This privacy is, and in fact MUST be, guaranteed by the compiler, or true parallelism breaks.) When the writer comes in, they tweak the word; when the reader comes in, they either wait, or complete the rendezvous. The communications are therefore the synchronization points in a program, and dictate when we enter the scheduler. Further, they are synchronous (we block on read and write) and point-to-point.

The extended language, occam-pi, introduces SHARED channels (which make implementing bus architectures easy), as well as some other nifty bits, like mobility of data and channel ends. For now, I'm staying in the core/original language for purposes of this discussion.

How this is implemented does not matter. You could implement that protected word using transactional memory. You can rip my producer/consumer apart, and replace that channel with a wireless link... as long as you preserve the semantics of the channel. It takes work, but the powerful thing about the channel abstraction is that it is a clean, well-defined point at which we can ignore what is going on at the other end, and just assume we're either reading or writing to a word in memory. In the TVM, we just use a word and have one or two protected values that indicate the channel is empty and whatnot (MAX_INT and MIN_INT play roles here, I think; I can't remember.)
Entry: the Transterpreter
Date: Sun Sep 14 10:03:17 CEST 2008

> This is an email from Matt Jadud describing the TVM and occam
> http://www.transterpreter.org/

The TVM is a virtual machine for the instruction set of a Transputer.
In particular, we support the 4xx and 8xx series instruction sets,
which includes floating point operations. The Transputer had a 3-deep
stack (for integer and FP each; I'll ignore FP from now on, as the FP
is really no different/no more interesting). The A, B, and C registers
(as they are called), along with a workspace pointer, instruction
pointer, and a front pointer and back pointer for the scheduling
queue, and a timer queue represent all of the machine state needed for
an occam process. (occam was the language designed to execute on the
Transputer.)

By keeping things to just seven words, you could do very fast context
switches; the Transputer could do nanosecond context switches between
parallel processes. This is, in part, because it had some very fast
RAM on board. Its specialized nature, and expensive construction, are
just two of the reasons it ultimately died; the Intel 286 was a much
cheaper processor.

The occam language is grounded in the CSP algebra for reasoning about
concurrent and parallel processes.

http://www.usingcsp.com/

In CSP (and therefore occam) you reason about sequential processes
that execute in parallel. They communicate over channels, and sending
and receiving were designated in the algebra by '!' and '?',
respectively. In occam, these are the operators for sending and
receiving as well. Producer/consumer looks like:

    PROC producer (CHAN BYTE ch!)
      WHILE TRUE
        ch ! 42
    :

    PROC consumer (CHAN BYTE ch?)
      WHILE TRUE
        BYTE c:
        ch ? c
    :

    PROC main ()
      CHAN BYTE ch:
      PAR
        producer(ch!)
        consumer(ch?)
    :

The channel is represented in the TVM (and on the Transputer) as a
single word in memory. This is a channel word that is in shared
memory; the workspaces of the two processes are private. (This privacy
is, and in fact MUST be, guaranteed by the compiler, or true
parallelism breaks.) When the writer comes in, they tweak the word;
when the reader comes in, they either wait, or complete the
rendezvous. The communications are therefore the synchronization
points in a program, and dictate when we enter the scheduler. Further,
they are synchronous (we block on read and write) and point-to-point.

The extended language, occam-pi, introduces SHARED channels (that make
implementing bus architectures easy), as well as some other nifty
bits, like mobility of data and channel ends. For now, I'm staying in
the core/original language for purposes of this discussion.

How this is implemented does not matter. You could implement that
protected word using transactional memory. You can rip my
producer/consumer apart, and replace that channel with a wireless
link... as long as you preserve the semantics of the channel. It takes
work, but the powerful thing about the channel abstraction is that it
is a clean, well-defined point at which we can ignore what is going on
at the other end, and just assume we're either reading or writing to a
word in memory. In the TVM, we just use a word and have one or two
protected values that indicate the channel is empty and whatnot
(MAX_INT and MIN_INT play roles here, I think; I can't remember.)

If you're familiar with the Cell, their mailboxes (which provide a
single word for communicating between the CPU<->SPU as well as all of
the SPU<->SPU communications, I think) are effectively the same
primitive. They have implemented what amounts to a CSP channel for
transferring one word at a time between these units directly in
hardware, and a small API for interacting with that word. Of course,
the C compiler doesn't help make sure you're not doing something
braindead, and they do provide for DMA transfers between the SPUs...
but my point is that many hardware devices that are going parallel
have similar operations baked in. At a higher level, the blocking
calls in MPI are equivalent, and I've mapped occam channels to them in
the past. Things "just worked," by-and-large... with a few caveats
about comms failure, which (sadly) the original occam model never
considered. (It wasn't possible... or, something catastrophic
happened, so you didn't care, since your computer was on fire.)

So that's a bit of history and a bit of a peek into the way
synchronization happens. The Transterpreter is more interesting than a
Transputer in a lot of ways, since we can programmatically map channel
ends to most anything. For example, in the Blackfin port (and soon the
ARM port, once we get our dev boards), we map the interrupt-driven
UART to a channel end.

0. The VM is sleeping; the CPU is sleeping.
1. A character lands on the UART. The CPU wakes up.
2. The interrupt stuffs the character into an occam channel word.
   (We have a C API for this.)
3. The interrupt sets a flag in the VM, wakes the VM, and goes away.
4. The VM runs the occam bytecode, and the "channel" handling comms
   is ready. The corresponding read happens.
5. Everyone goes back to sleep if nothing else needs to be done.

The fun thing is that we can do this with an arbitrary number of
interrupts, and the occam program will be semantically sound, even
though we're doing spurious/arbitrary interrupts on hardware. This is
because channels only guarantee sync, they don't guarantee time.
Therefore, the VM can be "waiting" on any number of "hardware"
channels, but since it thinks they're just normal occam channels, the
VM is happy, and the occam programmer can get on with building a
highly parallel network of communicating processes on their embedded
target without wondering about the safety of their hardware
interrupts. If it works in test (on a single processor with
non-interrupt driven code), then it will work with the
interrupt-driven version as well. (I'm not trying to wave my hands
overly aggressively... things should be cool, but we're still
exploring this space. The semantics, we believe, hold.)

Given that the Transputer was originally used (and still used, in some
form) for codecs, comms, and video processing, it makes sense that
there is a good mapping.

Entry: Trench dwellers
Date: Sun Sep 14 11:53:03 CEST 2008

Got quite a to-the-point reply from DavidP5 on the microchip forum:

    The people who inhabit these forums are probably "in the trenches"
    as much as anyone. It is ironic that you seem to be having some
    difficulty explaining what your tool actually does in a way that
    is comprehensible to the people you are talking to (translating it
    into trenchese?). Clearly you love the concepts that you work
    with, and they sound so exotic (the lambda domain for example),
    but they may be getting in the way of you showing the
    trench-dwellers how you are proposing to make their lives easier.
    I suspect that using it "just as a low-level machine abstraction
    layer" is as much as you can hope for, but you will have to stop
    talking like that if you want to get any traction around here. I
    think you would be better to say that you have made a Forth
    implementation for PIC18s, and ask people to test it and suggest
    improvements.

    I have read your pic18-forth pdf. It is interesting but way too
    abstract for "trenchers". A tutorial about how to use your Forth
    for a variety of useful tasks (how to use a timer, how to send and
    receive data over a USART, responding to an external interrupt,
    writing to flash etc) would be better received. Trench-dwellers
    are primarily practical people so you must show them how to do
    stuff with your Forth if you want them to be interested. It would
    also be worth demonstrating that your compiled programs are at
    least as good as compiled C in terms of efficiency and resource
    utilisation.

This puts the finger on the sore spot. Looks like there is only one
way around this: differentiate the documentation into a Scheme camp
and a machine code camp.

My reply:

    Thanks. Your suggestions are very helpful. It does look indeed
    like a different approach to documentation is needed. I'm getting
    more convinced that the real problem I'm trying to solve is not
    really the technical one of building this system, but to write the
    documentation, bridging two different engineering cultures that
    generally just ignore each other..

A question to the PLT Scheme list: (at least, something I want to ask,
but writing down the question makes me think more.. this needs to
ferment a bit more..)

    Hello folks,

    This list being a lair of educators, I guess it is the right place
    to get some inspiration.

    I'm looking for advice on writing documentation which needs to use
    the concept of lexical scope and closures for people that have
    already spent several years in an Electrical Engineering
    background.

    I have an EE/DSP background myself, and after absorbing Scheme and
    a bit of basic language theory over the last 3 years I think I
    finally get it, in that this ``exotic lambda domain'' has become a
    natural way of working. I recall that learning this was not an
    easy process, and involved partially un-learning the use of real
    machines as my only reference point.

    However, in the meanwhile I seem to have not made any progress at
    all extracting knowledge from my own learning process to better
    explain the basic ideas in simple terms to my former peer group,
    or avoiding unnecessary lore.

    So the question: Does anyone have ideas or experience to ``sell
    the benefits of functional programming'' to an EE biased audience?

---

I need two chapters.

1. The Forth language.
2. Its compiler.

I'm taking the following post off the blog..

Entry: Can you sell a language?
Date: Sun Sep 14 16:43:08 CEST 2008

( Appeared on the blog on Sun Aug 24 14:39:43 CEST 2008 but removed
because of its rambling style. This problem is much deeper than I
assumed, and I need to take another approach. )

This article is an attempt to clear out some ideas about design
choices in Staapl, from the _cultural_ point of view. What is culture
and what has pure technical merit? It turns out this is not so easy to
answer.

I don't remember who said it, but it was at a keynote talk at MWSCAS
2000: ``As engineers, we should be aware about not falling in love
with our approach.'' It's an idea that I very much needed to hear, and
which taps me on the shoulder from time to time. Ask yourself this
question: how much of your approach is just tradition?
A couple of days ago I ran into this post on LtU:

http://lambda-the-ultimate.org/node/687#comment-18074

A nice display of insight. This made me think a bit about the real
merit of Staapl, aside from the cultural aspects tied to the Scheme
and Forth paradigms (it's cleaner). See the previous post that tries
to identify the design choices, illustrating possible other approaches
to the problem Staapl solves. Of the 4 trade-offs illustrated, I would
say the first two are technical and the last two are cultural.

The choice between library or language is what Frank Atanassow
stresses in the LtU comment. I have no data to prove it, but it does
look like implementing a compiler in a functional language is simpler
because the problem of compilation is expressing maps between data
structures. Compiler writing, a hobby for data structure junkies:

http://flint.cs.yale.edu/cs421/case-for-ml.html

Within functional programming the typed vs. untyped choice is one of
much debate. I picked a middle road here, using PLT Scheme, a dynamic
language with static aspirations. This choice is definitely about
culture, but in my opinion not so terribly important. The main players
that set Scheme apart in the land of dynamic languages are hygienic
macros and PLT's module system.

Using a stack machine as a machine model is quite a gamble because of
the incredible pressure of the C programming language, which fueled
the rise of RISC machines. However, going against this culture makes
things simpler. Maybe it's a false hope, but I have the idea that once
parallel programming becomes _culturally accepted_, machine
architectures will become simpler. So, this choice is cultural, or
better _counter-cultural_.

Let's try to answer Frank's 5 questions:

1. What problem does this language solve? How can I make it precise?

   As a representation language it contains an easily retargetable
   machine model that supports a simple metaprogramming framework in
   the form of quasiquotation, combined with a binding form for target
   code pattern matching.

2. How can I show it solves this problem? How can I make it precise?

   A subjective statement: partial evaluation for a concatenative
   language (string rewriting) is simpler to understand than lambda
   expression reduction (tree reduction rules).

3. Is there another solution? Do other languages solve this problem?
   How? What are the advantages of my solution? of their solution?
   What are the disadvantages of my solution? of their solution?

   Yes. Metaprogramming based on (typed) lambda calculus: MetaOcaml.
   The disadvantage of my solution is the lack of a type system (which
   does look like it is straightforward to add) and the forced use of
   a point-free programming style.

4. How can I show that my solution cannot be expressed in some other
   language? That is, what is the unique property of my language which
   is lacking in others which enables a solution?

   It can, therefore it is a Scheme library.

5. What parts of my language are essential to that unique property?

   Scheme -> Coma quasiquotation, the machine code pattern matching
   binding construct, and semantics guided by partial evaluation
   transformation rules.

* * *

Both Forth and Lisp are firmly rooted in a culture of thought that you
can only appreciate once you went through the "aha-moment" yourself.
This statement reflects some of the smugness present in these
programmer cultures. A smugness that deters many people new to the
paradigms.
But from my experience (I come from a C/assembly background) I can
testify that after this aha moment, you can't look at things in the
same way as before. So for those that had the aha for both Forth and
Lisp, I don't think much justification for Staapl is necessary. You
are probably not interested, because you're writing your own
abstraction, right this moment!

The rest of this article is for the other audience I try to reach:
embedded engineers that do not know Forth or Lisp and their relation
to macros and staging.

* * *

The problem with abstraction knowledge is that it is difficult to
bring up the motivation to acquire it when you already have a method
that is ``good enough''. It's certainly not a good thing to make
things too abstract.

http://en.wikipedia.org/wiki/Worse_is_better

But sometimes the smart thing to do is to move beyond the barriers of
the programming system, up the abstraction ladder. The main point is:
for complex code that requires lowlevel control in such a way that
excludes the use of higher level languages directly (can't trust that
compiler!), it might be a good idea to find an intermediate solution
between using a low or a high level language approach exclusively. Use
a highlevel language to _generate_ code for a lowlevel one. Don't
trust a generic compiler to generate good code for you, write a
specific one yourself by automating the high -> low translation of the
abstractions you need.

If you can ``manually compile'' a highlevel solution description by
expressing it in a low level programming language, and perform the
same kind of translation multiple times, you can probably automate
this process. The good news is that writing a special purpose compiler
for a specific abstraction is quite a lot easier than trying to write
a compiler that optimally tries to compile any kind of program.
Additionally, this problem can be simplified by using a set of
primitive compiler components.

So, let's suppose you've decided you need a staging solution. Why
would you choose Staapl?

http://en.wikipedia.org/wiki/Not_Invented_Here

I know I can't sell Lisp. I know I can't sell Forth. But I know that
if I want to build a framework for code generation (a.k.a. model based
design), I can't ignore the knowledge encoded in the design of the
Forth and Lisp languages. To use either of the two directly seems like
a limitation: Forth's over-concrete standard semantics doesn't use a
source representation that is abstract enough, and Lisp's (or better
Scheme's) machine model is too high level. The Joy language seems to
be the grain boundary between these two models. Staapl is really
nothing more than a possible implementation of the bridge between a
simple machine model and code generation viewed in a functional
programming setting.

The Staapl approach, through being built on paradigms from Forth and
Lisp, brings together a large collection of design choices that formed
the Forth and Lisp languages, leading to a fairly optimal encoding
system. The path is quite straightforward: start with Forth, a
language that is already good at efficiently encoding problems
occurring in embedded design, and replace its compiler, usually
written in Forth, by a mechanism that provides an impedance match to
languages based on lambda abstraction. This is done by viewing Forth
macros as functional combinators (the Joy language model). These
macros provide an API point that separates a multi-target backend from
highlevel metaprogramming and code generation on top of an abstract
machine.

http://en.wikipedia.org/wiki/Joy_programming_language

In staapl the impedance match between lambda abstractions and
combinators happens in two places. The Scheme->Scat quasiquoting
mechanism allows the construction of parameterized _concatenative_
code,

    (let ([a 123]
          [b (macro: +)])
      (macro: ',a ,b))    ;; results in (macro: '123 +)

while a pattern matching mechanism for target (stack) machine code
allows the use of lambda abstraction to build _primitive_ code
transformers.

    (patterns (macro)
      (([qw a] [qw b] +)  ([qw (+ a b)])))
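Put together, the two mechanisms compose. A schematic example ([qw _]
is the "quote word" instruction that pushes a literal; the
intermediate code sequence is written informally):

    (macro: '1 '2 +)
    ;; quasiquotation produces the code sequence   [qw 1] [qw 2] +
    ;; the pattern rule above then folds this to   [qw 3]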
In other words: Staapl removes some of the difficulty of having to use
a concatenative language at the higher abstraction layers, while
keeping the simple concatenative model as the basic composition
mechanism for metaprogramming.

The real advantage of Staapl is its structure as a machine model
(primitives + composition mechanism) combined with a programmable code
generator adapted to that model (composition of concatenative macros)
that is accessible from an applicative Scheme language through its
lexical scoping mechanism. The simple 2-stack machine model allows
simple re-targeting to any machine architecture, with a slight bias
towards the low-end ones, where C might be sub-optimal from a code
size perspective. The overall algebraic feel of combinator code is an
asset for writing code generators.

In short, stack languages are good for writing lowlevel primitives.
Staapl embeds this stack model in Scheme, which is based on the more
standard function application model, and as such interfaces more
easily to generic domain-specific models, without requiring a
specification in a stack language representation.

To conclude, the merits of Staapl summarized:

  * minimalistic machine model
  * simple staging framework (code = sequence instead of directed graph)
  * language promotes fine granularity (small functions / macros)
  * functional languages make source transformation easier
  * a bridge to lambda-based abstraction (quasiquoting + patterns)
  * once in scheme, use scheme's function+macro composition mechanisms

Entry: How not to sell a language
Date: Sun Sep 14 21:32:21 CEST 2008

OK. Different strategy. I need to use two different points of view.
There seems to be too much of a divide between CS and EE to try to
explain everything from a single point.

It's important to try to build a community around the practical
aspects of Staapl, ignoring the ``exotic lambda domain'', and to
complement this with metaprogramming internals. The documentation as
it is now is already somewhat partitioned in these two classes. It
just needs to be made more clear:

Abstract:

  * The general idea behind Staapl metaprogramming for PLT Scheme.
  * The reference guide for the Scheme API.
  * Some articles on the blog.
  * The excellent PLT Scheme documentation.

Concrete:

  * An overview of the PIC18 macro Forth dialect.
  * A practical tutorial about interactive tethered development.
  * A tutorial about writing a low-level DSL using bottom up
    programming with procedures and macros.

Entry: Break
Date: Mon Sep 15 11:38:02 CEST 2008

I started rewriting the pic18 forth doc, but ran into a problem with
scribble and forth syntax. Overall I think it's time to take a break
and re-order priorities.. I think I have an idea of where to go next,
but some focus is in order.

Main: get PICkit2 working so examples can be constructed.

Side: work on the reference doc + the concrete docs in terms of
examples

  - explain the differences with standard forth:
      * based on macros
      * no reflection
      * no 'postpone' necessary due to partial evaluation

  - move metaprogramming examples ('patterns and 'macro:) from the
    introduction to the reference doc.
Entry: non-orthogonal part
Date: Tue Sep 16 12:43:06 CEST 2008

There's one thing that always gets in the way while explaining the
quasiquotation and pattern matching facilities central to Staapl's
metaprogramming model: the fake algebraic type constructors perform
the same role as quasiquotation of macros. Can the late detection of
this pattern somehow be bent into an asset? It is an essential part,
because without it, it is impossible to create primitive code
generators.

So, there are 3 important parts:

  code:         (fake) algebraic data types
                  * code deconstruction (pattern matching)
                  * code construction
  composition:  concatenative + quasiquotation
  evaluation:   turn abstract (macro) code into concrete code.

The main problem is that I have no clear idea of stages. Basically,
there are only two ``real'' stages:

  1. The creation and composition of code generators. Mostly based on
     higher order programming.

  2. Execution of these.

Maybe the real idea is: there is no need for multi-stage programming
if your metalanguage has higher order functions. (Then, in a HOPL,
macros can be useful, yielding again a multi-stage language.)

Entry: MetaML and Cross-Stage-Persistence
Date: Tue Sep 16 16:16:51 CEST 2008

What I don't really understand is why CSP is stressed so much.. Why
isn't the whole code represented as a closure that evaluates down to
something with a much simpler lexical structure, i.e. machine code?
Why do you need intermediate representation layers, that are littered
with opaque CSP values, but nevertheless cannot be inspected?

It seems MetaML is really fairly tightly rooted in the manipulation of
ML abstract syntax trees, in such a way that staged code can be
presented to the ML compiler. In staapl, the abstract representation
is quite straightforward because the code IS the compiler: it is a
closure that when applied to the empty program dumps out concrete
machine code.

I guess renaming in MetaML is necessary because the ASTs manipulated
are probably still flat trees, and do not contain direct
variable->binder connections? (NO: it works modulo alpha-equivalence)

So I wonder.. Am I missing something? The MetaML story starts with
explicit manipulation of alpha-equivalent template code (lambda
calculus), I start with manipulation of flat transformer combinators
and primitive combinators (primitive stack machine code transformers).

Entry: algebraic types
Date: Wed Sep 17 08:27:54 CEST 2008

I've been selling the pattern matching as being based on algebraic
types. Maybe it's time to make that a bit more formal? The most
interesting problem to solve is to perform type analysis on the
macros. Is it possible to compute every in-out behaviour?

The reason for asm code to be in source form is mostly confusion about
what to do with it were it to be an abstract data structure.

Maybe the most important changes are:

  * replace the extensional composition mechanism by an intensional
    one so type inference can reach the primitives.

  * add annotation to primitives. (done)

The first one is less straightforward.. This is simple for anything
not built on top of parser extensions.. Looks like the parser needs to
be rewritten for this. That's a lot of work... It's probably better to
try to incrementally improve it.

Make it more abstract.. One thing I don't like is the way parameters
are used to record the 'closing' of expressions. It might be better to
make 'expr' an abstract data type (zipper?) and encode all the
information in it.

First part is quite simple: it's only the immediate/function sites
that need modification for the normal syntax (not forth/parser-tx.ss).
Second part involves the 'close' operation, which can be separately
abstracted. Let's first remove this. It's trivial for the pure
concatenative code, but not for parser-tx.

OK, got a better idea now about intensional representation: all the
parser macros convert ultimately to pure concatenative code through
the quotation mechanism. The only thing parser-tx does is to add
dictionary functionality. It looks like replacing the default lambda
wrapper is all that's necessary. Then, the "current-close" operation
is only used in the locals-tx.ss transformer.

It's not just the default lambda wrapper, it's also the assumption
that the initial state has a syntactic representation. Let's try to
model the expression as a function parameterized in the internal
state.

Got first draft of functional representation working:

    (define (rpn-open-expression)
      (lambda (x) x))

    (define (rpn-on-cursor fn expr)
      (compose fn expr))

    (define (rpn-close-expression expr)
      (let ((s #'*state*))
        #`(lambda (#,s) #,(expr s))))

    (define rpn-expr? procedure?)
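A sketch of how these compose (transformer-level code; the wrapping
transformer here is a made-up example):

    ;; Build an expression that wraps the threaded state in a cons,
    ;; then close it into a lambda over the state variable.
    (define expr
      (rpn-on-cursor
       (lambda (s) #`(cons 1 #,s))   ; example transformer
       (rpn-open-expression)))

    (rpn-close-expression expr)
    ;; => syntax object for (lambda (*state*) (cons 1 *state*))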
Next: fix locals, since it's broken now. This is the one that needs a
cursor, since it needs to wrap the code. This uses a non-hygienic name
capture trick that apparently doesn't work in the current approach.
This needs to be factored out a bit. I don't see a way to do this
without capturing the state and re-binding it.. So I put rpn-state
back in, but it's not a parameter.

Entry: loading library code
Date: Thu Sep 18 12:01:40 CEST 2008

Using "load p18f1220.f" from a file not in the staapl/pic18 directory
doesn't work.. It does work when using staaplc, but not when using the
require form from scheme.

EDIT: this is because the module level code has no concept of load
path: it only uses absolute and relative references. The prj/pic18
version does have this functionality, and is the preferred way of
using the compiler.

Entry: documenting
Date: Fri Sep 19 19:27:56 CEST 2008

I'm getting a bit fed up with trying to explain Staapl to the wrong
audience.. Learned a lot about how _not_ to write documentation, and
hopefully moving towards a better approach. It's improving, but the
way is longer than I thought.

Entry: static assembler
Date: Sat Sep 20 00:39:38 CEST 2008

Let's see what is necessary to get some static analysis going. Before
any compile time analysis can be added, the assembler needs to be
moved to static names. One remark: if the assembler becomes static,
it's straightforward to convert it to concatenative code.

There are 2 uses for assembly instructions:

  * as a data structure supporting pattern matching and construction.
  * as a reducible expression

A pseudo instruction is non-reducible, and serves only as a data
intermediate. Assembly requires the association of the data structure
to a reducer. Let's make all assembler structs carry a pointer to a
reducer, and let them derive from an abstract assembler opcode which
has only a reducer. Make sure the pattern matching form ignores the
reducer when matching, and provide a constructor that creates a proper
reducer thunk.

NEXT: create a macro for this.. requires access to the lowlevel
version of define-struct.
Entry: more about macros and MetaML
Date: Sat Sep 20 15:58:53 CEST 2008

More than I can process, but there was some talk about macros on LtU
recently.

http://lambda-the-ultimate.org/node/2987

Some things that struck me:

Laurence Tratt:

    The Template Haskell school (for want of a better phrase - it
    might better be called the MetaML school) of compile-time
    meta-programming inverts the traditional Lisp notion of macros.
    Put very simply, in Lisp macros are special things that are
    explicitly identified, but macro calls aren't (they're
    normal-looking function calls that happen to do special stuff at
    compile-time); in Template Haskell, macros are normal functions
    (that just so happen to return ASTs) but macro calls are
    explicitly identified. This means that you can do anything you
    would normally do with functions, so 'macros' are first-class in
    such an approach.

Interesting, since this is one of the things that I consider a feature
in Staapl: heavily metaprogrammed source code can be read as if it is
single-stage.

naasking:

    Staging is strongly-typed runtime compilation. So you can replace
    an interpreted parser expression built from closures with a
    compiler using staging, and get a specialized parser that executes
    at full speed with no interpretive overhead.

I'm probably missing a lot of nuances trying to compare Staapl with
MetaML by ignoring the type system..

Frank Atanassow:

    The problem with LISP-style macros is that they force the user to
    solve those problems over again. But, if the interpreter/compiler
    writer has already done that, why should they have to? Why can't
    they reuse that functionality?

    To put it another way, the problem is that LISP doesn't
    sufficiently abstract from the way programs are represented. It
    forces you to work with surface syntax, represented as trees of
    atoms, when in fact a program has a great deal more structure.

Now, hygienic macros do solve quite a lot of problems here, since at
least the identifiers are handled appropriately, but since the "great
deal more structure" in lisp is hidden behind the effect of a macro,
there is no way to check that structure before executing a macro with
the "wrong parameters" and generating a "run-time error at compile
time": you can't statically check macro code, and have to wait until
it is used in a transformation for it to fail.

This leads to the following references:

About macros:
http://lambda-the-ultimate.org/classic/message9532.html

Peter Van Roy:

    Higher-order programming is more expressive. Macros are
    syntactically more concise and give more efficient code. (Note
    that expressiveness and verbosity only have a tenuous connection!)

In Staapl, expressiveness is traded for efficiency. The thing that
glues HOFs and macros together seems to be a good partial evaluator,
i.e. higher order traversal function applications replaced with static
'structured programming' code.

Peter Van Roy:

    My definition of expressive power has nothing to do with Turing
    equivalence. It's something like this: adding a new language
    construct increases the expressive power of a language if it can
    only be expressed in the original language with nonlocal changes.
    For example, adding exceptions to the declarative model of chapter
    2 in CTM increases the expressive power, since the only way to do
    the same thing in the declarative model is to add boolean
    variables and checks to all procedures (checks like: if error then
    return immediately). This was proved in the 1970s!
    (All of CTM is structured in this way, with computation models
    becoming more and more expressive, in the precise sense defined
    above. See Appendix D of CTM for a brief summary.)

There's some more talk about quasiquotation being the really important
idea, not s-expressions. Frank Atanassow talks about macros just being
notational convenience, not really extending expressiveness according
to PVR's definition.

Some threads about MetaML:
http://lambda-the-ultimate.org/classic/message8778.html

Frank Atanassow:

    One of the limitations of MetaML, incidentally, is that its
    representation of programs is too abstract. Although you can
    reflect a program---turn it into a value---the only thing you do
    with it is compose it with other reflected programs, and evaluate
    it. You can't, for example, rewrite parts of a reflected program.
    This is sufficient to turn an interpreter into a compiler, but not
    into an optimizing compiler.

http://lambda-the-ultimate.org/node/2438

Need to read that again.. But it does bring me back to the idea that
the combinator approach in Staapl makes things really a lot simpler..
But what is lost?

http://web.cecs.pdx.edu/~sheard/staged.html

Tim Sheard: A Taxonomy of meta-programming systems. Quasiquotation is
defined as a different way to write syntax trees built from algebraic
data types, instead of writing the trees by manual construction.

Entry: ee vs. cs
Date: Sat Sep 20 18:23:18 CEST 2008

Reading these cs-jargon filled posts makes me think that documentation
for Staapl should really play a bridging role between two worlds, as I
find myself right in the middle of it now, not being part of any :)
Anyways, the message is clear: more examples.

Entry: typed scheme
Date: Sat Sep 20 19:34:40 CEST 2008

http://lambda-the-ultimate.org/node/2622

occurrence typing:

    The key feature of occurrence typing is the ability of the type
    system to assign distinct types to distinct occurrences of a
    variable based on control flow criteria. You can think of it as a
    generalization of the way pattern matching on algebraic datatypes
    works.

Entry: type classes in scheme
Date: Sat Sep 20 19:56:24 CEST 2008

http://groups.google.com/group/comp.lang.scheme/msg/ad9df65985068c16

cool.

Entry: musing
Date: Sat Sep 20 21:43:45 CEST 2008

After a day of reading and not really learning that much, my heart
aches for some down-to-earth engineering.. Looks like I want too much
and need to slow down a bit. I've learned a lot in the last couple of
years, but before I can get further on the big ideas front I need to
read more. Finish TAPL and have a look at CSP/CTM, then look at HUME.

Entry: static asm identifiers
Date: Sun Sep 21 11:13:51 CEST 2008

The problem is to give the static identifier (used in match) the right
kind of prefix. Maybe I should have a look at Dave Herman's algebraic
datatypes instead of structs.

Entry: C code generation
Date: Mon Sep 22 10:25:48 CEST 2008

http://docs.plt-scheme.org/dynext/index.html

Entry: pickit2 debugging
Date: Mon Sep 22 17:27:42 CEST 2008

Problems: apparently (EXECUTE_SCRIPT (ProgEntryScript)) was missing.
Also, the voltages are messed up. It looks like the connection code in
Jeff's pk2-3.00 sets the pgm voltage. Look at pickitGetDevice2() for a
proper initialization.

Entry: concatenative.org
Date: Tue Sep 23 08:54:38 CEST 2008

Staapl (http://zwizwa.be/staapl) is a 2-stage dynamically typed
metaprogramming system for the PLT Scheme family of programming
languages. It consists of a concatenative transformer language Coma
that operates on a stack machine language.
It is based on two main observations.

  * An imperative stack machine model (abstracting a processor as a
    Forth machine) works particularly well for small embedded
    microcontrollers.

  * Staging and partial evaluation are simple to express for
    functional concatenative programming languages due to the absence
    of problems related to identifier hygiene.

Staapl's Scheme bridge consists of the dynamically typed functional
concatenative Scat language, implemented as a hygienic Scheme macro.
In its basic form, Scat is Joy with extensional compositions: code
quotations are not lists, but abstract entities represented by Scheme
closures. Scat compositions support quasiquotation for lexically
scoped template programming from Scheme, and use a hygienic
hierarchical name management system based on PLT Scheme's declarative
module system.

Staapl's code transformers are written in a derivative of Scat called
Coma, a functional concatenative language operating on stacks of stack
machine code instructions. Staapl contains a pattern matching
mechanism for implementing Coma primitives in Scheme, which is used to
implement partial evaluation and writing architecture backends.
Currently there is an optimizing backend for Microchip PIC18.

Staapl contains a simplified Forth frontend that hides most of the
metaprogramming system. This Forth dialect has the standard
Forth-style metaprogramming removed and replaced with a mechanism
based on partial evaluation and template macros. It is possible to use
the Forth language as a stand-alone programming system.

Entry: MetaOcaml
Date: Tue Sep 23 16:00:06 CEST 2008

Before I can make a roadmap for adding static analysis to Staapl, on
top of the already mentioned move to proper algebraic data types and
intensional Coma code representation, it might be a good idea to dig a
bit deeper into MetaOcaml. The ideal application point is generation
of PF's C-code video processing primitives from high level staged
algorithms.

Useful: http://www.mpi-sws.mpg.de/~rossberg/sml-vs-ocaml.html

Let's start at the examples page:
http://www.metaocaml.org/examples/

Looks like this is the most interesting part:
http://www.metaocaml.org/examples/fft.ml

Hello world in MetaOcaml -> C

    M-x tuareg-run-caml metaocaml

    # .!{Trx.run_gcc}(.<123>.);;
    - : int = 123

And generates the following file as a side-effect:

    --------------
    void initializer() ;
    #ifdef OCAML_COMPILE
    #include "/usr/local/lib/ocaml/caml/mlvalues.h"
    #endif
    void initializer() { }
    #ifdef OCAML_COMPILE
    int main() {
      return 123;
    }
    #endif
    --------------

Entry: Texas Instruments TMS320C67
Date: Fri Sep 26 07:10:54 CEST 2008

What would it take to port Staapl to the TI C67? For DSP apps a
stack-based approach without register allocation is not going to go
very far, since most DSP apps heavily rely on using all registers for
pipelining.. However, it might be possible to add compilers for a
small number of building blocks. It could be that registers are mainly
used for pipeline filling instead of random access...

Entry: partial evaluation vs. dynamic closures
Date: Fri Sep 26 09:07:42 CEST 2008

Partial evaluation and dynamic closures are related in that they are
both partial reductions (given an expression with abstractions and
applications, perform some beta-reduction(s) somewhere in the
expression), but are quite different in the way they are represented.
Dynamic closures are best implemented by delaying full application -
the execution of (machine) code - until all the values are present:
the intermediate representation is a chunk of parameterized (machine)
code and an environment of variable bindings.

Partial evaluation performs reduction by generating a (machine) code
representation that does not contain any free variables that need to
be filled in by a separate record of variable bindings at final
application time. The (machine) code produced after partial evaluation
is completely stand-alone.

But I apparently already said that in the blog..
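A minimal Scheme illustration of the two representations (schematic):

    ;; Dynamic closure: parameterized code + an environment; the
    ;; reduction is delayed until the closure is applied.
    (define (make-adder n)
      (lambda (x) (+ x n)))        ; n lives in the environment

    ;; Partial evaluation: residual *code* with no free variables.
    (define (gen-adder n)
      `(lambda (x) (+ x ,n)))      ; n is inlined into the code

    ;; ((make-adder 5) 1)  => 6
    ;; (gen-adder 5)       => (lambda (x) (+ x 5)), stand-alone code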
Entry: more about typed vs. untyped staging
Date: Fri Sep 26 09:51:45 CEST 2008

http://lambda-the-ultimate.org/node/2575

Oleg:

    By typed compilation I mean a transformation from an untyped to
    typed tagless representations.

Entry: MetaML vs. Template Haskell
Date: Fri Sep 26 09:53:21 CEST 2008

What I don't get.. On the TH website:
http://www.haskell.org/th/

    Template Haskell is an extension to Haskell 98 that allows you to
    do type-safe compile-time meta-programming.

Then on Oleg's site:
http://okmij.org/ftp/Computation/Generative.html

    In the process, we develop a simple type system for a subset of TH
    code expressions (TH is, sadly, completely untyped).

So what's this? Type-safe manipulation of untyped code? Maybe the
point is that TH allows _syntactic_ correctness by using a
type-checked AST, but doesn't allow the type checking to run until it
is effectively spliced into code and passes the compiler.

Entry: Matlab / Simulink integration
Date: Fri Sep 26 10:38:59 CEST 2008

Real-Time Workshop:
http://www.mathworks.com/products/rtw/

Based on Embedded Matlab:
http://www.mathworks.com/products/featured/embeddedmatlab/index.html

And the fixed point toolbox:
http://www.mathworks.com/products/fixed/

The language subset is what would be expected: everything works,
except things that need dynamic tricks + annotation is necessary.
Looks like a simple static type extension on top of dynamic Matlab.

Entry: Type system for Coma functions
Date: Fri Sep 26 10:45:23 CEST 2008

Path: Coma identifiers need to be linked to compile time information.
How to implement this? I'm already using this space for Forth parsers,
so it requires a different ns.. Time to look at how Typed Scheme does
this.

Now, as long as one stays within the primitives + composition model,
this seems rather trivial. Base types could be derived from primitive
stack code pattern matching rules, and composition is just figuring
out all the branches that are taken depending on Coma run-time.. If
there is remaining dynamic dispatch based on the value of
instructions, then this won't work: the search space will explode
exponentially. So the move to real algebraic types might be
necessary.. I.e. something like

    [qw n] -> (if (n < 0) [qw -1] [qw 0])

Here 'n can't be determined at compile time, but it still generates a
[qw ...] type in any case. When the two branches do not generate the
same type, this gives a problem, since it requires type unions (to
simplify) or an exhaustive search.

There is also a problem with Scheme -> Coma metaprogramming: how to
introduce typing info there? This requires a binding form that allows
information to travel at compile time.

    (let ((abc 123))
      (let-macro ((foo (macro: 1 +))
                  (bar (macro-value abc)))
        (macro: bar foo)))

Roadmap:

  * Figure out where this kind of dependency is used in an essential
    manner. Go from there to decide on what kind of restrictions to
    impose.

  * Scheme -> Coma template programming then requires some annotation.
    This could be solved by using a special binding form that
    introduces local Coma words, so it is not an essential problem,
    but it does reduce the freedom a bit in how to formulate things.

I do wonder if this is the right way to go.. This is a can of worms.
I'm better off starting with a typed language to build a similar
system to see where the problems are.. Most code in Staapl is
structured enough so it can be straightforwardly translated to Ocaml
or Haskell code.

Conclusion: stay with the current implementation, and gradually allow
static checks to be introduced. Avoid conversion to a full static type
system until the problem is understood (an implementation in a typed
language is working.) Looks like the road to follow is more towards
Typed Scheme instead: this already is written from the viewpoint of
integrating with a dynamic language.

Entry: typed vs. untyped metaprog
Date: Sun Sep 28 07:57:56 CEST 2008

I'm not really convinced about the benefits of typed metaprogramming.
The important issues are hygiene and the use of algebraic datatypes
for code transformations. The fact that type errors get caught at
generator compile time instead of generator execution time however
seems a bit moot. This only makes sense when one writes generators
that are not used and so not tested.

MetaOcaml is a nice system though. It does what it's supposed to do,
and the offshoring to C is particularly interesting: in that setting
typed metaprogramming does make sense. Generating C code from Scheme
requires one to deal with types in some way, which is not entirely
trivial.

However, Staapl has proper hygiene and models generators as closures.
The fact that it can generate ill-typed _generators_ doesn't seem to
be such a problem since this is still detected at generator execution
time.

Note however that Staapl isn't completely untyped. On the ADT level
(the machine instruction opcodes) some static checking is definitely
possible: Coma primitives and the composition mechanism are both
accessible and are _already_ compile-time computations. Typing the
Scheme->Coma unquote operation could be replaced by cross-stage
persistence only: require functions to always use a let-macro form
such that type info can travel at compile time using local syntax
bindings, and always interpolate Scheme lexical values as [qw ...]
generators, or provide a let-constant form.

Entry: Union types
Date: Sun Sep 28 09:31:47 CEST 2008

One could see disjoint union types as a means to control conditional
branching by keeping its effects local. Unadulterated dynamic
"decision making" makes static analysis intractable since different
branches of a "case" statement might have completely different
behaviour. Union types with corresponding exhaustive "case" statements
control this exponential explosion of possible paths through the code
by joining branches back together: at the type level, each branch does
essentially the same thing, allowing the identification of a union ->
union map as a "local conditional branch" construct.

So in essence: one puts in a little bit more effort to design types
(approximate behaviour) in order to be able to prove things about the
behaviour of a program without executing it. An essential tool to make
this practical when conditional branching is present is to abstract
conditionals as union types, avoiding behaviour search space
explosion.
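The [qw n] example from above, written out with explicit variants.
Plain Scheme doesn't check the case analysis statically, so the
"typing" lives in the comments; in ML or Haskell it would be enforced:

    ;; instruction = (qw value) | (cw address)   -- a disjoint union
    (define (sign-fold ins)
      (case (car ins)
        ;; both branches of the conditional produce a qw again, so
        ;; this is an instruction -> instruction map: the branching
        ;; stays local to this function.
        ((qw) (if (< (cadr ins) 0) '(qw -1) '(qw 0)))
        ((cw) ins)
        (else (error "not an instruction:" ins))))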
Entry: modifying MetaOcaml
Date: Sun Sep 28 12:11:36 CEST 2008

Maybe the best way to understand a bit more about the internals of
MetaOcaml is to dive right in and make some modifications. I.e. I'd
like to replace the "double" type with "float", which is more useful
for embedded development.

There are two occurrences of the literal '"double"' in
bytecomp/trx.ml. What happens if these are changed? Nothing.. It looks
like postprocessing the code is going to be simpler. For generating
float code this dirty hack should work:

    #define double float

Now.. The clean way. In cprint.ml there is a function called
"print_type_spec" which does support the Tfloat type. Maybe changing
Tdouble to Tfloat should work then? Test with:

    # (.!{Trx.run_gcc} .<fun x -> x +. 1.0>.) 123. ;;
    - : float = 124.

That seems to work. The generated procedure has correct type
declarations, and the marshalling uses implicit float<->double
conversions based on the types of "Double_val" and "Store_double_val".

    float procedure(float x_1 )
    {
      return x_1 + 1.0;
    }
    #ifdef OCAML_COMPILE
    value procedure_marshall(value arg_list )
    {
      initializer();
      {
        float ret ;
        float x_1 = Double_val(arg_list);
        value store = alloc(Double_wosize, Abstract_tag) ;
        ret = procedure(x_1);
        Store_double_val(store, ret);
        return (store);
      }
    }
    #endif

EDIT: some interesting extensions:

  - double vs. float
  - short ints
  - vectors

Entry: bug in interactive console -> testing
Date: Fri Oct 3 10:56:12 CEST 2008

Right when I needed to demo it.. Something to think about is a decent
regression test. The current .hex based all-in-one test doesn't cut it
for changes in the interactive console. Otoh, that code is as good as
stable..

Entry: next
Date: Fri Oct 3 11:03:47 CEST 2008

Been a bit out of it.. What's next?

  - pk2 + serial interface
  - dorkbot demo
  - start working on C-code generation

I'm a bit in the middle of wanting to try the MetaOcaml->C and
extending Staapl with a C-code target (ARM/dsPIC). To simplify things
it might be best to try to stabilize on the pk2 interface first, so
there's a low-threshold entry to the project.

Entry: MetaOcaml bytecomp/trx.ml
Date: Fri Oct 3 12:39:33 CEST 2008

Downloading the Ocaml distribution to see the difference between the
two projects. MetaOcaml is based on 3.09.1

http://caml.inria.fr/pub/distrib/ocaml-3.09/ocaml-3.09.1.tar.bz2

The cabs.ml file seems to be from the Ocaml project. Nope.. it's not
in there.. It's a separate library: cabs -- abstract syntax for
FrontC. Here's a link in the Caml Dev Kit:

http://pauillac.inria.fr/cdk/newdoc/htmlman/cdk_180.html#SEC206

The one in metaocaml is 2.1 while the one documented above is 3.0.
There seem to be some naming differences. The other file from FrontC
is cprint.ml. The parser itself isn't included.

EDIT: look at these:
http://manju.cs.berkeley.edu/cil/
http://frama-c.cea.fr/what_is.html

Entry: Generating 3 addr SSA (GIMPLE)
Date: Fri Oct 3 16:47:06 CEST 2008

Since all code in GCC goes through GIMPLE, it might be interesting to
generate such code in the first place. It's possible to have GCC dump
out a C-like representation of the intermediate tree forms using flags
like: -fdump-tree-gimple

http://gcc.gnu.org/ml/gcc/2002-08/msg01397.html

In any case, here's a draft of a design of a new SIMPLE
representation. It's a bit sketchy at this point, but I'm very
interested in comments:

------

    function:
        FUNCTION_DECL
          DECL_SAVED_TREE -> block

    block:
        BIND_EXPR
          BIND_EXPR_VARS -> DECL chain
          BIND_EXPR_BLOCK -> BLOCK
          BIND_EXPR_BODY -> compound-stmt

A BIND_EXPR takes the place of the current COMPOUND_STMT, SCOPE_STMT
and DECL_STMT; all of the decls for a block are given RTL at the
beginning of the block.
DECLs with static initializers keep their DECL_INITIAL; other
initializations are implemented with INIT_EXPRs in the codestream. The
Java "BLOCK_EXPR" is very similar.

    compound-stmt:
        COMPOUND_EXPR
          op0 -> non-compound-stmt
          op1 -> stmt

rth has raised some questions about the advisability of using
COMPOUND_EXPR to chain statements; the current scheme uses TREE_CHAIN
of the statements themselves. To me, the benefit is modularity; apart
from the earlier complaints about the STMT/EXPR distinction, using
COMPOUND_EXPR makes it easy to replace a single complex expression
with a sequence of simple ones, simply by plugging in a COMPOUND_EXPR
in its place. The current scheme requires a lot more pointer
management in order to splice the new STMTs in at both ends. It seems
to me that double-chaining could be provided by using the TREE_CHAIN
of the COMPOUND_EXPRs.

    stmt: compound-stmt | non-compound-stmt

    non-compound-stmt:
        block
        | loop-stmt
        | if-stmt
        | switch-stmt
        | labeled-block-stmt
        | jump-stmt
        | label-stmt
        | try-stmt
        | modify-stmt
        | call-stmt

    loop-stmt:
        LOOP_EXPR
          LOOP_EXPR_BODY -> stmt | NULL_TREE
        | DO_LOOP_EXPR (to be defined later)

The Java loop has 1 (or 0) EXIT_EXPR, used to express the loop
condition. This makes it easy to distinguish from 'break's, which are
expressed with EXIT_BLOCK_EXPR. EXIT_EXPR is a bit backwards for this
purpose, as its sense is opposite to that of the loop condition, so we
end up calling invert_truthvalue twice in the process of generating
and expanding it. But that's not a big deal.

From an optimization perspective, are
LABELED_BLOCK_EXPR/EXIT_BLOCK_EXPR easier to deal with than plain
gotos? I assume they're preferable to the current loosely bound
BREAK_STMT, which has no information about what it's exiting.
EXIT_EXPR would have the same problem if it were used to express
'break'.

    if-stmt:
        COND_EXPR
          op0 -> condition
          op1 -> stmt
          op2 -> stmt

    switch-stmt:
        SWITCH_EXPR
          op0 -> val
          op1 -> stmt

The McCAT SIMPLE requires the simplifier to make case labels disjoint
by copying shared code around, allowing a more structured
representation of a switch. I think this is too dubious an
optimization to be performed by default, but might be interesting as
part of a goto-elimination pass; a possible representation would be to
also allow a TREE_LIST for op1.

    labeled-block-stmt:
        LABELED_BLOCK_EXPR
          op0 -> LABEL_DECL
          op1 -> stmt

    jump-stmt:
        EXIT_EXPR
          op0 -> condition
        | GOTO_EXPR
          op0 -> LABEL_DECL
              | '*' ID
        | RETURN_EXPR
          op0 -> modify-stmt
              | NULL_TREE

I had thought about always moving the assignment to the return value
out of the RETURN_EXPR, but it seems like expand_return depends on
getting a MODIFY_EXPR in order to handle some return semantics.

        | EXIT_BLOCK_EXPR
          op0 -> ref to LABELED_BLOCK_EXPR
          op1 -> NULL_TREE
        | THROW_EXPR?

I'm not sure how we want to represent throws for the purpose of
generating an ERT_THROW region? I had thought about using a THROW_EXPR
wrapper, but that wouldn't work in non-simplified code where calls can
have complex args. Perhaps annotation of the CALL_EXPR would work
better.

        | RESX_EXPR

    label-stmt:
        LABEL_EXPR
          op0 -> LABEL_DECL
        | CASE_LABEL_EXPR
          CASE_LOW -> val | NULL_TREE
          CASE_HIGH -> val | NULL_TREE

    try-stmt:
        TRY_CATCH_EXPR

This will need to be extended to handle type-based catch clauses as
well.

        | TRY_FINALLY_EXPR

I think it makes sense to leave this as a separate tree code for
handling cleanups.
    modify-stmt:
        MODIFY_EXPR
        | INIT_EXPR
          op0 -> lhs
          op1 -> rhs

    call-stmt:
        CALL_EXPR
          op0 -> ID
          op1 -> arglist

Assignment and calls are the only expressions with intrinsic
side-effects, so only they can appear at statement context.

The rest of this is basically copied from the McCAT design. I think it
still needs some tweaking, but that can wait until after the
statement-level stuff is worked out.

    varname : compref | ID (rvalue)
    lhs: varname | '*' ID (lvalue)
    pseudo-lval: ID | '*' ID (either)

    compref :
        COMPONENT_REF
          op0 -> compref | pseudo-lval
        | ARRAY_REF
          op0 -> compref | pseudo-lval
          op1 -> val

    condition : val | val relop val
    val : ID | CONST

    rhs : varname | CONST
        | '*' ID
        | '&' varname_or_temp
        | call_expr
        | unop val
        | val binop val
        | '(' cast ')' varname

    unop : '+' | '-' | '!' | '~'
    binop : relop | '-' | '+' | '/' | '*' | '%'
          | '&' | '|' | '<<' | '>>' | '^'
    relop : '<' | '<=' | '>' | '>=' | '==' | '!='

Entry: About code generation and identifiers
Date: Fri Oct 3 17:29:43 CEST 2008

With all this focus on making identifiers lexically scoped, I'm
wondering how the move to an external compiler/assembler is going to
work out..

Entry: gimple
Date: Fri Oct 3 18:43:41 CEST 2008

    void boo(int *a, int *b){
        int i;
        for (i = 0; i<100; i++){
            a[i] += b[i];
        }
    }

the -tree-gimple option gives

    boo (a, b)
    {
      unsigned int i.0;
      unsigned int D.1200;
      int * D.1201;
      unsigned int i.1;
      unsigned int D.1203;
      int * D.1204;
      int D.1205;
      unsigned int i.2;
      unsigned int D.1207;
      int * D.1208;
      int D.1209;
      int D.1210;
      int i;

      i = 0;
      goto ;
      :;
      i.0 = (unsigned int) i;
      D.1200 = i.0 * 4;
      D.1201 = a + D.1200;
      i.1 = (unsigned int) i;
      D.1203 = i.1 * 4;
      D.1204 = a + D.1203;
      D.1205 = *D.1204;
      i.2 = (unsigned int) i;
      D.1207 = i.2 * 4;
      D.1208 = b + D.1207;
      D.1209 = *D.1208;
      D.1210 = D.1205 + D.1209;
      *D.1201 = D.1210;
      i = i + 1;
      :;
      if (i <= 99)
        {
          goto ;
        }
      else
        {
          goto ;
        }
      :;
    }

This is quite lowlevel. It doesn't have any structured elements. Just
"goto", "if" and assignment. Also, all derefs are direct. I wonder how
this is then mapped to addressing modes.

The -tree-original option gives this:

    ;; Function boo (boo)
    ;; enabled by -tree-original

    {
      int i;
      int i;

      i = 0;
      goto ;
      :;
      *(a + (unsigned int) ((unsigned int) i * 4)) =
        *(a + (unsigned int) ((unsigned int) i * 4))
        + *(b + (unsigned int) ((unsigned int) i * 4));
      i++ ;
      :;
      if (i <= 99) goto ; else goto ;
      :;
    }

It also just has if + goto.

Entry: readMode()
Date: Sun Oct 5 10:12:16 CEST 2008

A trace of the 'read' command for pk2 by Jeff Post, simplified to PK2
v2.x and 18F1220 programming.

http://home.pacbell.net/theposts/picmicro/
http://home.pacbell.net/theposts/picmicro/pk2-3.00-alpha12.tar.gz

In pk2-3.00-alpha12/pk2main.c :

    readMode()
      pickitGetDevice()
        readDeviceData1/2()
        pickitGetDevice2()
          findDeviceName2()
      SETVPP
      pickitRead()
        allocateDevice2Buffers();
        pickitReadProgram()

    pickitReadProgram()
      enableTargetPower()
      clearUploadBfr()
      (EXECUTE_SCRIPT (ProgEntryScript))
      setDownloadAdrs()
      while {
        clearUploadBfr()
        (EXECUTE_SCRIPT (ProgMemRdScript))
        readDataBlock()
      }
      pickitReadEeprom()
      pickitReadConfig()
      writeHexFile()

Some problems: UPLOAD_DATA_NOLEN doesn't seem to work in my
implementation, so I'm using UPLOAD_DATA.

Got config write to work too.. However, write protect can't be cleared
with just writing config data. I think this requires a chip-erase. OK,
works now.

For program write, I move to the Microchip pk2cmd sources:

    PICkitFunctions.cpp : WriteDevice

write-program-memory now finishes, but the pk2 hangs on the next
read-program-memory. Wait.. Got something back! After reset it does
seem to work.
Let's try with the uploaded script again. That doesn't work, but
without it works fine. Got the problem: the DOWNLOAD_SCRIPT function
had tag 255, which should be 254, but that doesn't work either.
Something fishy there: it's not tested yet.

Entry: demo / test
Date: Sun Oct 5 16:21:11 CEST 2008

With pickit2 programming working for the essential parts, it's time to
write a demo / test. Add this to staaplc: this leads to 2 operation
modes:

  * OFFLINE: generate .hex and .ss files to manually upload + connect
  * ONLINE: don't save those files, but upload the binary and stay
    connected

The interface to online programming should be the same as hex file
saving. 'upload-monitor is working now. Next: activate target + serial
pass-through. The latter requires a circuit and some thought, so leave
it for later.

Entry: last couple of weeks
Date: Sun Oct 5 21:16:50 CEST 2008

* MetaOcaml: I've been looking closer at MetaOcaml, trying to prepare
  for replacement of some PacketForth code by GCC offshored generated
  code. Also been thinking about Matlab/Simulink's Real-Time Workshop
  and how this might fit together with MetaOcaml.

* PICkit2: programmer is working. Getting closer to a standard 5-wire
  protocol for minimalist PIC programming.

* Documentation: saw the need to stabilize on two fronts: pure Forth
  without the Scheme stuff, and a Scheme side.

* Static analysis: some ideas about adding a type system and inference
  rules, but ran into problems with the dynamic nature of Scheme. I'm
  not sure if this is worth it.. Maybe a "static core" that is
  translated to Ocaml or Haskell as a _test suite_ is a better
  approach. The idea is: the fact that Staapl is dynamically typed is
  a plus as it makes things simpler, but it is important to keep an
  eye on how things would work were it implemented in a static type
  system, to get more inspired compile-time and test-time checks.

* C analysis and synthesis. Plenty of stuff here: MetaOcaml + FrontC /
  CIL, Haskell's language.c and the Ometa parser for PLT Scheme. The
  latter might be interesting to solve the "quasiquotation" problem: a
  template language with the language's concrete syntax is easier to
  use than syntax trees.

* Transterpreter + CSP. I've had little time for reading lately, but
  it would be nice to write the MC-303 ui application in a parallel
  Purrr dialect, one which has return stacks in RAM.

* Simulator simplification. This triggered a lot of thinking about
  static analysis, since the simulator needs to know the number of
  elements consumed by the macro to be able to evaluate it. The
  current mechanism doesn't give any information: it merely inspects
  the input and acts accordingly. Alternatively, it might be easier to
  copy the entire stack, but this requires extra annotation for the
  interpreter.

Entry: closures
Date: Tue Oct 7 09:57:29 CEST 2008

When explaining functional programming to somebody with a hands-on
imperative/oo background (say C, C++, Python), how to proceed? I think
the punch line should be something like: closures (lambda) make the
creation of plug-in behaviour really easy.

A closure is essentially a method bound to a hidden object. Apart from
typing "lambda" in Scheme, or "fun" in Ocaml or "\" in Haskell, there
is essentially no overhead when creating these objects. These objects
are first-class, in that they can be passed to other such objects as
function arguments. Essentially, functional programming is about
"building functionality by passing closures (parameterized behaviour
with context) to other closures".
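The standard two-liner that demonstrates this, probably the kind of
example such an explanation needs:

    ;; Behaviour ("count in steps") packaged with hidden context
    ;; (the variable 'count') -- a method bound to a hidden object.
    (define (make-counter step)
      (let ((count 0))
        (lambda ()
          (set! count (+ count step))
          count)))

    (define tick (make-counter 5))
    ;; (tick) => 5    (tick) => 10   -- the state lives in the closure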
Entry: macros at the console
Date: Tue Oct 7 10:13:43 CEST 2008

  - type system: allow compile-time annotations that know this
  - lazy stack read
  - full stack transfer

The latter seems best really, since it is also useful for debugging.
Maybe best implemented as an extra opcode? Dump stack.. This requires
the monitor to know its data stack location.

OK: full stack transfer works: the target provides just the stack
size, the host performs moves + copy.

NEXT: remove '+' from simulated words, but try to add it as a
simulated macro. OK. Got it to work.. It is a bit slow though, due to
the moving of all data. Removed the other simulation mechanism.
Possible optimization: perform multiple macros in sequence, instead of
moving data up and down.

Fixme: add real interpretation of QW, CW and JW. This does require
some optimizations to be disabled. However, the comp/ postproc seems
to be not used anyway. FIXME: make this all a bit more explicit.

There's another problem: with interactive usage, not all identifiers
are available at compile time. This is not problematic, except for
prefix words: those need to be available at the place where 'target:'
is evaluated. Actually, it is problematic.. Currently invocation of
commands doesn't work any more..

TODO:

  * replace the current simulation with an explicit QW/CW interpreter
  * clearly define the semantics of macro-eval

First one ok: see 'interpret-cw/qw in live/tethered.ss
The other one I need to check properly..
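For reference, the shape of such a QW/CW interpreter -- a minimal
sketch against a host-side list representing the target stack, with
made-up helper names (the actual code is 'interpret-cw/qw in
live/tethered.ss):

    ;; code = list of tagged instructions: (qw <n>) pushes a literal,
    ;;        (cw <word>) calls a target word.
    ;; run-word : word x stack -> stack   (remote execution, assumed)
    (define (interpret-qw/cw code stack run-word)
      (if (null? code)
          stack
          (let ((ins (car code)))
            (interpret-qw/cw
             (cdr code)
             (case (car ins)
               ((qw) (cons (cadr ins) stack))
               ((cw) (run-word (cadr ins) stack))
               (else (error "can't simulate:" ins)))
             run-word))))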
So, what is the model? The console talks to a machine, but the machine is variable ('login' to another project / device), and it includes a mode where commands don't work since there is no target connected. However, we want to keep the illusion that we're on the target chip.

Entry: Staapl selling points
Date: Sun Oct 12 20:07:20 CET 2008

I'm gearing up to try to sell Staapl to trench dwellers. I need an attack plan.. The particular language needs to be adapted to the public, but the real selling points need to be clear.

* The console. A console is something to control parameters, avoiding a recompile. In practice, everybody developing embedded apps provides a console. Usually this is done in an ad-hoc way, adding just the reconfigurability that is necessary. Lisp and Forth remove this ad-hoc feature: the console is a keyboard connected to a full language, giving a short edit-compile-test cycle.

* Pattern matching and closures. Useful for building language transformers.

* Closures, macro hygiene and declarative modules. Tools for namespace management.

EDIT: After one week in the mud, I think I'm going to try to scale down the expectations a bit.. You can't sell functional programming.

Entry: getting things done
Date: Sat Oct 18 14:46:21 CET 2008

Next event is (probably) dorkbot gent 2008/12/19. What needs to work? Pickit2 interface + examples.

SIMPLE PICKIT2 BASED. This is also for the low-end PIC chips.
1. Get the serial pass-through port to work.
2. Bit-banged serial console over ICD.

USB BOOTLOADER BASED. Johannes' USBpicstamp. The problem here is only software.

Entry: higher bandwidth to PC
Date: Sat Oct 18 15:08:14 CET 2008

I.e. for a simple logic analyser..

* USB: Relatively simple hardware, but driver software is complex. This is on the list, but not available yet. Disadvantage is the polled nature (not client driven).

* ETHERNET: Simpler software, but a little more complicated hw. What is necessary to get the ENC28J60 ethernet chips going? Hmm.. I don't have transformers, 1% resistors, RJ45 connectors and 3.3 V regulators. Order? Disadvantage: components.

* PC ISA. Can this be done without address decoder chips (using a single address bit on a dedicated machine)? If so, it's a possible solution. If not, other interfaces are probably better. Probably best with the PSP port on 40-pin devices. What about DMA? Disadvantage is compatibility.

* ATA. PIO-0 might be feasible? Try to figure out if a standard linux driver can be used. Might have some protocol overhead.

* The FTDI chips allow for more elaborate modes than just async serial. I have a bunch of them.. Is there a linux driver that allows more lowlevel access? The linux driver allows the 'warp' rate 460800, which would be 8 channels at 46.08kbit, which is probably enough for now, and should be straightforward to get going. The driver also supports a custom divisor.

* CANbus. I've ordered a bunch of PICs with CANbus support, and some line drivers. The problem here is the interface to the PC: it needs a separate bus.

Entry: PicKit2
Date: Wed Oct 22 08:25:50 CET 2008

Should I finish this, or first build a USB interface? For the serial passthrough I need a decent measuring device. Actually, an active brain would be enough.. EDIT: pk2 is probably more important. Let's get a setup going.

Entry: distracted
Date: Sun Oct 26 11:52:40 CET 2008

Got distracted a bit by ideas of using CANbus or RS485 as a way to connect measuring equipment to a PC. Currently I'm thinking of trying to make an ISA card with DMA. As long as the card is not bus master, this shouldn't be too hard.
However, it does need buffering on the PIC side, so it is limited to slow data rates (not possible to run a single loop). So, the easiest route is the FTDI at 460800 baud with plain 8 bit port readout. For a PIC18 @ 40 MHz that's a baud divisor of 21.7 (fosc / (4 x baud)), and at 48 MHz it's 26.

The reason I'm dragging my feet here is that I don't have a decent workflow setup. This application (getting serial comm to work) can be used as a comb for workflow documentation.

1. Create a standalone application: .f -> .hex. This involves: processor selection + oscillator configuration. Maybe this is a good time to also use standard config bit defines from the asm files.

2. Time base + delay loops. Use this to create a bit-banged serial send, and later receive.

3. Hardware UART config.

The deal is this: get the monitor config working using just busy loops and general purpose IO. This leads to simple examples. It removes the dependency on the hardware uart (i.e. when the uart is used by the app, or the chip doesn't have one). It should be enough to get the PK2 serial port working too.

Entry: reset
Date: Tue Nov 11 09:46:57 CET 2008

Working in the field for 4 weeks gave me a proper reset. Where is Staapl headed? I think I found some possible commercial applications for code manipulation foo, but they are more about code analysis, refactoring and aspect-oriented tricks. And it all has to be about C. So I guess it's time for a break, and continue work on usability and documentation.

From a more practical point, I have a need to build a couple of circuits for measuring and testing, which could pull Staapl development. The most important thing to focus on right now is to reduce the setup time to go from a blank chip to a working app (interactive or not). This should be nothing more than adding an icd2 connector. The second important thing is to be able to use multiple projects at the same time. Currently I want to use a 4550 board with a parallel port as a logic analyser to get the bitbanged serial port to work.

Entry: colours
Date: Tue Nov 11 12:16:38 CET 2008

Going to standardize the colorscheme for ICD/serial, following the thing that's already in my head. On the master side: orange = hot (TX), yellow = cold (RX). Orange is then clock (always out), while yellow is data (sometimes in). This makes sense.

( I never thought I'd lose so much time on connectors.. Or on fitting the stuff in my head.. Is it straight or reversed? RX or TX? I.e. the ICD connector is a bus with a well-defined master, while async serial is a symmetric point-to-point link, so we assign a master to keep colors from crossing, which means the slave has inverted colors. )

ICD2  serial               color
1     /MCLR                white
2     VDD                  red
3     GND                  black
4     PGD  RX (<- target)  yellow
5     PGC  TX (-> target)  orange
6     PGM

Entry: basic app config
Date: Tue Nov 11 16:01:33 CET 2008

It's quite simple now:

* compile chip configuration registers
* define fosc
* define monitor baud rate
* load chip specific macros
* load monitor + boot code

Also, the FTDI cable works up to 230 kBaud.

Entry: busy loop timing
Date: Tue Nov 11 16:13:00 CET 2008

I'd like to create a busy wait macro that's relatively accurate, using nested for loops. Got it:

\ This is optimized for fosc from 8 -> 48. The inner loop compensates
\ for the oscillator period. Using a convenient period of 50us, this
\ gives 33 iterations at 8Mhz and 200 iterations at 48Mhz. The macro
\ is exposed, but it is most accurate for +- 50 us.

macro
: usec  fosc 4000000 / *  \ instructions per us
        3 /               \ instructions per loop
        for next ;
forth
: 50usec 50 usec ;
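Checking the numbers in the comment: a PIC18 executes fosc/4 instructions per second, so fosc 4000000 / is instructions per microsecond: 2 at 8 MHz, 12 at 48 MHz. For "50 usec" that gives 100 resp. 600 instructions, and at 3 instructions per loop iteration, 33 resp. 200 iterations. A longer delay could then be built on top, something like (hypothetical, not in the codebase):

  \ 20 x 50 us = 1 ms
  : 1ms 20 for 50usec next ;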
Entry: logic analyser
Date: Tue Nov 11 17:32:13 CET 2008

Now, let's see what there is to do to pipe in data. Using 4 channels, there are 2 measure points per byte, which gives a rate of about 50kbit. But let's keep it simple and pipe only the bytes. Using 8 channels is easiest to code, but gives only 23k samples/sec.

At 230400 baud, a byte needs to leave every 434 cycles (10M instructions/sec at 40 MHz, 10 bits per byte on the wire). Let's work with a sample period of 54 cycles; this gives enough time to send out the bytes (TXREG !) and amounts to 185k samples/sec. Now, make this into 64 cycles, and a free running timer can be used to synchronize, making everything easier to get right. This still gives 156k samples/sec. A free running timer isn't necessary, we can easily update the timer every cycle.

Entry: dorkbot - a different angle?
Date: Sun Nov 16 17:20:04 CET 2008

programming = debugging

There's already too much said about programming. Writing software for deeply embedded systems isn't really about programming, it's about debugging. And there is only one way to debug: make sure you SEE what's going on, the rest is just getting there.

Entry: tools ready
Date: Thu Nov 20 08:36:38 CET 2008

Everything seems to be in place now after the move. Still don't have internet, but at least the PC in my workspace has USB working. No more excuses for procrastination. Time to do some serious work now.

Next: PK2 serial. First: make the PK2 accessible from the toplevel. Currently I have to use:

  box> (enter! "staapl/pickit2/pk2.ss")
  box> (boot)
  loading PICkit2 device file.
  pk2-open: PICkit 2 Microcontroller Programmer

  (define (test-uart)
    (boot)
    (uart-start)
    (let loop ()
      (display ".")
      (uart-write #x55)
      (sleep 1)
      (loop)))

There are several things wrong. First, VDD is not connected. OK. At 300 baud, hardware loopback works.

Next: make the buffer work. OK: sending is a sync channel, receiving is an async channel (we can't block the target).

Next: bitbanged serial over the ICD2 ports.

Entry: multiple consoles
Date: Sat Nov 22 08:33:10 CET 2008

Something I didn't try yet is to run a snot head to a forth console. It's quite straightforward: just load the .ss file into the toplevel, then activate the console with:

  ,(repls '(box command))

This implies the console is defined (i.e. in .snotrc or in the corresponding .snot file in the path) using the command:

  (register-language 'compile "compile> "
    (not-implemented 'compile-eval)
    (lambda (str) (box-eval `(forth-compile ,str)))
    #f)

The hook here is 'forth-compile. To enable multiple consoles, this needs to be parameterized. The .ss file generated needs to be a bit more general than a standalone console..

Entry: Getting rid of namespace management
Date: Sat Nov 22 09:00:15 CET 2008

This is a feature only useful for more advanced host/target setups. What about making things simpler? For direct console access, only a single namespace is necessary, so there is no need for the extra indirection that shields the compiler namespace. This also gives a more hackable interface. Argument against: you need a namespace to perform evaluations. But this can just as well be the initial mzscheme namespace. So, the .ss file is now something that can be loaded into a fresh namespace, be it a prepared one or an empty one. Note that staaplc also uses reflective operations to compile a .f file.
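In other words, loading the generated file should reduce to something like this sketch ("project.ss" stands in for the generated file):

  ;; Load the generated .ss file into a fresh, unshielded namespace.
  (define ns (make-base-namespace))
  (parameterize ((current-namespace ns))
    (load "project.ss"))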
Entry: host + client
Date: Sat Nov 22 11:01:44 CET 2008

So, what's the workflow for having both a client and a host communicating, instead of an emulated console? The whole deal with 'lazy connection' never really worked properly.. Time to fix it or get rid of it. What about: one namespace per target, so store the connection in the namespace?

Let's simplify.

* The tethered.ss code is parameterized by 'target-in' and 'target-out', so it doesn't care if the port is a channel or a scheme port.
* Lazy connections are not necessary. If there is no console, the target simulator is run instead.

Entry: pk2 console works
Date: Sat Nov 22 15:10:42 CET 2008

That's a milestone. There's a slight problem however: detection of the ICD cable.. It would be nice to keep the same idle line detection as for ordinary serial ports. Another disadvantage is that it doesn't fit in 512 bytes, but for applications that need it the boot block could be compacted. And, with the pk2 attached, a messed up bootloader isn't so problematic, so one could run without boot-protect, making fixed-size loaders unnecessary.

Next: reset pk2 with the ICD2 PGC line high. Probably something like this: send a single command buffer that switches the target on + switches to uart mode.

Maybe it's best to use the following logic: before connecting, the serial line acts as an interrupt input. Initially the line is in BRK, and upon reception of a 0->1 transition (BRK->IDLE) the interpreter is started, waiting for a start bit (0). But.. that uses an ISR, and is thus quite invasive.. Let's see.. If we can start the target with the PGC output high, then it's already ok. Hmm.. can't get it to work atm.

Entry: full circle
Date: Sat Nov 22 17:09:16 CET 2008

Next: do everything: programming + connecting in one go, and seal it up. Basic functionality works:

  staaplc -u -d pk2 rapid.f

This is without verify, and things go wrong when pk2 isn't closed properly. Close fixed too. Looks like it's working now.

Entry: next
Date: Sun Nov 23 08:47:48 CET 2008

The next big hurdle is USB. But it might be interesting to set up some design flow documents showing a test-centered design approach.

* Bootloader + serial cable. Flash once + enable boot protect. All application code is loaded in incremental mode.
* PK2. Flash application (whole program). Incremental code mainly for debugging.

In the previous approach I used relative addressing to modify fields in buffers. Is this the right abstraction? Going back to the radical roots, I think I saw an interview with Chuck Moore about not using C-like structures, but something else.. Is there anything there? What is filling in a struct? It is like calling a procedure and passing parameters. Procedures use the stack in Forth, so is there a correspondence here?

Entry: minilanguage for usb drivers
Date: Sun Nov 23 13:10:21 CET 2008

USB code contains a huge amount of red tape that is not easy to factor into procedures. This could be a good opportunity to document the construction of a minilanguage that abstracts this. Let's first get the picstamp to boot..

Entry: optimization terminology
Date: Sun Nov 23 13:12:43 CET 2008

* constant propagation: replacing variables initialized with constants by the constant itself.
* constant folding (constant expression evaluation): eliminating expression evaluation by replacing expressions with their results.
* inlining: eliminating procedure invocation, which is a run-time binding construct.
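In Staapl terms, all three show up when macros expand. A made-up illustration (not actual compiler output; whether the multiply really folds depends on which rewrite rules are defined):

  macro
  : width 5 ;             \ a constant, propagated wherever width is used
  : area  width width * ; \ inlining: macro bodies expand in place
  forth
  : f area ;              \ with "5 5 *" folded at compile time,
                          \ f reduces to a literal 25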
Entry: Parameterization
Date: Sun Nov 23 13:33:04 CET 2008

It's a bit hard to use.. The things I seem to use the most are:

* Literal macro arguments (for single procedures).
* Global variables that get redefined.
* Prefix parsers.

The first one is OK, but has limited use (it can't define multiple words in a simple way). The second one is not good, but used a lot. The third one doesn't really get used.

So, for redefinition of global variables (macros): what's the deal there?

  macro
  : foo 1 ;
  forth
  : bar foo ;
  macro
  : foo 2 ;
  forth

In this code, foo is:

  0840 6EEC [movwf 236 0]
  0842 0C02 [retlw 2]

In a single compilation unit, macro definition ALWAYS comes before code generation. If a macro gets redefined, the generated code will reflect this. This is a bit raw, but very usable. Note: this is used inside the core compiler too. Most language features are redefined or specialized in this way, so it is kind of basic.

Entry: problems shutting down uart mode
Date: Sun Nov 23 18:26:04 CET 2008

When I use EXIT_UART_MODE, the chip won't properly reconnect on the next start. When I don't do this, the pickit isn't closed down properly. I don't know what this is.. Anyways, the following code works as a hackaround:

  (begin
    (uart-write 1 6)
    (msleep 100)
    (unless (uart-try-read)
      (printf "uart-start hack\n")))

Entry: the problem with writing lowlevel code..
Date: Sat Dec 6 12:55:54 CET 2008

I've been doing a full-time C job the last 2 months. The things I lose most of my time with are not really programming tasks, but getting hardware to work: using the right initializations, the correct order of operations and a load of workarounds. The problem is STATE. Hardware is overall tremendously stateful, doesn't have clear interfaces that prevent illegal use, and has poor error handling/reporting.

It does look like there is little to be done about this: the problem is hardware itself: problems are pushed to the software level since they are easier to handle there. A device driver is then really mostly a design bugfix layer..

One thing which is tremendously useful in packet-based communication is a sniffer with a packet parser. For APIs it's a call trace. So.. maybe for the USB drivers, I try to route it over ethernet (I think that exists) to be able to sniff the traffic?

Entry: longer cables (pk2 over TCP)
Date: Wed Dec 10 22:41:57 CET 2008

I'm running into the problem that the host I can connect to my experimental setup isn't very fast, and the fast one I'd like to do dev on isn't near my soldering table. Basically, I'd like to transport the PK2 protocol over TCP. Since PK2 really isn't more than sending 64 byte buffers up and down, this should be quite simple..

Entry: demos for dorkbot
Date: Thu Dec 11 21:56:19 CET 2008

I'd like to do the synth + the TV. For the TV I have a 13.5 MHz xtal + now there's the bitbanged monitor freeing one of the serial ports for raw binary output. I don't remember which chip can do this though.. The serial port seems to be this one: in the 18f2550 datasheet, section 20.3, synchronous master mode. The 13.5 MHz is the standard video pixel clock. Can probably use one of the USB devices; they have higher clock ratings.

So... A better TV out circuit. What I tried before is to use the 75 ohm impedance as a voltage divider, to not have to use a 75 ohm output impedance. This is a dirty trick, yeah.. So, to do it properly, generating the 4 levels (0V : 0.33V : 0.66V : 1V) requires a buffer amp to isolate the load impedance from the resistor divider. This frees the resistor values a bit, but requires a rail-to-rail amp if fed from a single supply. For demo purposes it's probably best to stick with the two simple resistors.
Asked Bert: To do it properly, provide double levels + add a unity-gain buffer opamp with a 75 ohm resistor in series (this assumes the line is properly terminated), maybe also add an output transistor to deliver the current. Non-terminated: it's possible to get the line levels always correct if the buffer has zero output impedance.

Entry: Preparing for dorkbot
Date: Sun Dec 14 15:40:53 CET 2008

This works on my first 452 A/V proto board:

  staaplc -u -d pk2 452-40.f

But the synth boards don't work. They are running at 8 MHz, maybe it's too fast? Let's lower the baud rate. 2400 doesn't work either.. It's probably something else..

I forgot how to load a forth file into the scheme console, so the asm can be inspected. OK: using

  (require (planet zwizwa/staapl/prj/pic18))

instead of

  (require (planet zwizwa/staapl/pic18))

The other option is to use the latter, then run (init-namespace).

FIXES:
- baud rate was hardcoded
- 40MHz goes up to 19200
- 8MHz goes up to 4800

Look into this better: this is probably due to start bit timing issues.

NEXT: load the whole synth in the app. OK. Now, running the synth app. NEXT: fix the problem with the ICD2 port being shut off. OK, not initializing the digital inputs seems to work.

Now there's a problem when turning the synth engine on: this messes up the busy-loop timing. Maybe the serial busyloop should disable interrupts? Disabling the engine during command interpretation is probably simplest. Another option is to tie the bitbanged serial port to the interrupt routine, and use the main timer for bitbanged receive. The same problem is going to occur for ForthTV, so let's fix it properly. We need to poll for the start bit, disable the engine and then fall into receive. Is it OK to remove polling for the start bit TRANSITION in receive?

Entry: Polling the icd serial port
Date: Mon Dec 15 09:31:52 CET 2008

\ Custom startup
: warm
    init-all init-board-debug
    engine-on
    begin
        icd.rx-ready? if
            engine-off interpret-msg engine-on
        then
    again ;

This seems to work.

Entry: interrupting application
Date: Mon Dec 15 10:38:15 CET 2008

Shut down serial + power cycle?

Entry: list of interactive commands
Date: Mon Dec 15 12:59:58 CET 2008

Some of these are prefix commands: commands that take literal inputs from the command input stream, instead of the data stack.

  see               \ Disassemble
  ul                \ Replace current marked upload with file's program
  abd               \ Dump data block n (64 bytes)
  fbd               \ Dump flash block n (32 words)
  p | ps | px       \ Print byte value as unsigned, signed or hex
  _p | _ps | _px    \ Print word value (2 bytes) as unsigned, signed or hex
  kb                \ Print a memory map for the first n kilobytes.
  words             \ Print on-target words, usable interactively
  macros            \ Print all defined names, usable in program code
  ts | tss | tsx    \ Print data stack bytes in unsigned, signed or hex
  _ts | _tss | _tsx \ Print data stack words (2 bytes) in unsigned, signed or hex

Other more advanced commands:

  load              \ Load and compile a file (doesn't upload)

Entry: ctrl-C
Date: Mon Dec 15 13:35:14 CET 2008

It would be nice if ^C resets the device while the PK2 is being used. OK, did this: There's a parameter "tethered-reset" that provides target reset for a specific target IO mechanism. For serial ports, this is just the "cold" monitor command. For PK2 this is an external reset. This is driven by the console: whenever a user break arrives during the execution of a command, (command "cold") is executed. A user break during read-line is ignored. It isn't very robust though.
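Roughly, the idea in the console loop is this (a sketch, not the actual code; 'command' is the console command interpreter mentioned above):

  ;; Convert a user break (ctrl-C) during command execution
  ;; into a target reset.
  (define (console-step line)
    (with-handlers ((exn:break? (lambda (e) (command "cold"))))
      (command line)))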
For normal use it seems to work, but interrupting commands like "4 kb" doesn't work that well. Also, when there is no connection, there's no way to stop the console. OK.. After a reset, the console checks if "OK" works. If not, it will exit. Hmm.. It's messed up now.. Re-flashing helped.. Probably something got killed. OK. This is good enough.

Entry: forthtv
Date: Mon Dec 15 20:48:15 CET 2008

Next: porting the forthtv code. This requires mostly cleanups of language changes (already did "constant"), and some serial port logic. Getting this to work for the serial port is not going to be simple.. Maybe we should work with some zero bytes to synchronize.

Entry: more forthtv
Date: Tue Dec 16 18:31:27 CET 2008

Got it to compile today, but it doesn't seem to want to run. Something gets redefined. Maybe it's the "hook.f" stuff? I changed this arrow notation in the synth at the end of 2007. (ForthTV's last update is May 2008, for a workshop.) No, it's usec.

Entry: redefining names
Date: Tue Dec 16 18:51:04 CET 2008

It's necessary to find a way to manage redefinition of macro names. This mechanism is terribly convenient for specialization (configuration) but hard to debug.

Entry: dorkbot presentation
Date: Wed Dec 17 20:04:12 CET 2008

It's about 3 things really. What is it?

* a programming language
* a compiler to PIC18 machine code
* human interaction tools (console + inspection)

What is it more?

* a programmable programming language
* a retargetable code generator
* remote procedure calls

=> all features are programmable

The big ideas:

* Forth as an efficient machine model
* Lisp style macros (Scheme's hygienic version)
* Short debug cycle

What makes Forth so nice? It's a CUT & PASTE language (concatenative) -> _VERY_ convenient for test-driven prototyping.

Entry: sync jitter
Date: Wed Dec 17 20:36:05 CET 2008

The TV stuff is working.. It would be nice to also solve the 0-1-2 jitter problem. Originally it was done using interrupts, which gave zero jitter. It can be solved using an instruction that conditionally wastes one cycle: a zero offset conditional branch.

EDIT: It's actually not so difficult: Create an ISR that pops the return address, then add a routine

  : wait wait ;

Entry: dorkbot staapl presentation overview
Date: Fri Dec 19 09:16:38 CET 2008

- general idea: 3 things
- synth board:
  * interaction
    - set config (square, pwm, p0, ...)
    - upload script
  * demo "go"
- TV board:
  * interaction + stop
  * show code

Entry: midi
Date: Thu Dec 25 18:43:23 CET 2008

Been thinking about using some kind of network <-> midi converters where data is carried one-way over +- current loops, which can feed the target.

Entry: Staapl definition
Date: Sun Jan 4 14:42:10 CET 2009

Staapl is a 2-stage concatenative language inspired by Forth, Joy and Scheme. It is built as follows:

- [MACRO LANGUAGE] Semantics are defined in terms of primitive machine code transformers: each WORD is represented by a function mapping a string of machine code instructions to another one. Note this entails both EXPANSION and RECOMBINATION of primitive machine code: Macros perform computations.

- [CONCATENATIVE LANGUAGE] Because the transformers are functions, function composition can be used as the language's composition mechanism. This is represented syntactically by CONCATENATION.

- [STACK LANGUAGE] These transformers operate on ONE END of a machine code string, which makes it easy to attribute the semantics of a STACK language.
- [INSTANTIATION] The MACRO semantics is extended with run-time storage and indirection (2 stacks + the machine's indirect jump instruction) to provide run-time operations that mimic the semantics of the stack-of-code transformers.

The result is a Forth-like language with a well-structured macro system. It is 2-stage because of the existence of 2 very distinct code levels:

- base machine language
- machine language transformers

The transformers themselves can be programmed using multi-level techniques (i.e. Scheme macros) if necessary.

A different semantics model ignores the explicit distinction between target code generators and instantiated target code (MACROs and PROCEDUREs), and defines a program as a partial evaluation of a functional concatenative program. The main idea is to combine the macro semantics with the concrete semantics of the base machine language to get to this simplified semantics. However, in Staapl this is only approximate and acts more like an ideal design guide.

--

I'm happy with this in the sense that it is a story that keeps coming back, and it is fairly close to how it is implemented. However, the instantiation part isn't very clear yet... This needs a proper mathematical model: I guess the instantiation part will need proper constraints.. I.e. it's easy to define for pure functions operating on machine values, vs. transformation of literal load instructions, but this has to be made more explicit.

Entry: genoeg gezaagd.. [Dutch: enough nagging]
Date: Fri Jan 23 18:17:07 CET 2009

Time to get going again. What's on the plate?

- theory:
  * See above: complete the story. It's about macros and how forth/lisp fit into that picture.
  * Bridge to C. For commercial non-sealed applications, that's the only plug point. Targets: AVR 8bit, dsPIC 16bit and ARM/MIPS 32bit.

- practical:
  * High-bandwidth communication. I've been toying with ideas like CANbus, Ethernet and RS422/485, but it's probably best to stick to USB and build a router for serial protocols. Also, USB is usable for other things, definitely for the systems programming I'm doing lately. Also see: http://developer.berlios.de/projects/usb4rt
  * PICkit bugfixes. It's mostly working, but it still crashes. Get rid of the 'uart hack' by looking at the actual output of the chip. Set up a logic analyser.

- education:
  * Get kids involved. I've been aiming too much at trying to convince already fixed-minded embedded engineers. Fuck them. Get some fresh minds in the picture.

Entry: concurrent
Date: Mon Jan 26 13:29:12 CET 2009

For CSP/transputer style tasks, it might be best to create a pure stack machine VM for one of the architectures, one that has simple task switches.

http://occam-pi.org/list-archives/occam-com/msg01148.html

Entry: multi-stage semantics
Date: Tue Jan 27 09:33:45 CET 2009

http://lambda-the-ultimate.org/node/3179

"Ziggurat allows the language extender to optionally define static semantics for her new language, and connect these static semantics amongst language levels."

Isn't this what I'm looking for? The move from compile-time semantics coming from rewrite rules, to run-time semantics. The paper calls this "specification by compiler".

Entry: PIC30 binutils + gcc
Date: Fri Jan 30 14:02:08 CET 2009

http://www.baycom.org/~tom/dspic/

There are also deb packages, but I forget where I found them.
Mirrored here (/etc/apt/sources.list):

  deb http://zwizwa.be/debian unstable main

  // test.c
  int add1(int a) { return 1 + a; }

  tom@del:/tmp$ pic30-elf-gcc test.c
  pic30-elf-ld: cannot find -lpic30-elf
  tom@del:/tmp$ pic30-elf-gcc -c test.c

I guess libpic30-elf is libc?

Entry: hybrid systems
Date: Mon Feb 9 17:24:28 CET 2009

Staapl is about 2-stage systems. Currently this is PC-uC, but it could very well be PC-DSP on SoC designs like the TI DaVinci.

Entry: state machines / USB
Date: Wed Feb 11 15:44:46 CET 2009

Before finishing the USB driver, there are two more general tasks to solve:

* Figure out an approach to create state machine abstractions.
* Refine the compiler which translates a high-level device description into code that answers queries about it.

Keep this in mind: during device setup, code doesn't need to be fast, so it is better to aim for the right language abstraction instead of trying to write optimal code, as I did before.

This is going to make the water muddy for a while, as I can anticipate some problems. One in particular is "require" in combination with the way macros work in a flat language.

Steps:
1. Build the 18F2550 hardware setup. (USBPICstamp)
2. Set up USB debug on the host side.

Let's make this into a story about how to write a reasonably complex application with Staapl's macro approach.

--

Getting it to run:

  tom@zzz:~/staapl/app$ staaplc -u -d pk2 452-40.f
  loading PICkit2 device file.
  pk2-open: PICkit 2 Microcontroller Programmer
  program: 532 bytes
  config: 14 bytes
  (Vdd 4.814453125)
  (Vpp 11.9072265625)
  uart-start hack
  Connected (pk2 19200)
  Press ctrl-D to quit.
  OK

The same one for the 2550 doesn't work. Some 1220 boards also do not work, so I think this is a circuit issue. What I do miss here is a memory verify operation after programming. Let's build that first.

OK. Apparently the 2550 doesn't allow flash write. I think this is a boot-protect problem: need to disable boot protect in the config before writing is possible. Apparently this was the problem:

  #x30000000 org  \ wrong
  #x300000 org    \ right

Will add a warning when there's no config memory. I have to program it with

  pk2cmd -I -p PIC18F2620 -F logan.hex -M

Also, the console will only come up after one ctrl-C.

Entry: fixing bugs w. bitbanged serial
Date: Thu Feb 12 09:05:16 CET 2009

Maybe I should switch to a proper bit-banged serial port implementation first. However, without knowing what exactly is happening, I can't make proper decisions. The hypothesis is that there is some transition on the host->target line that gets mistaken for data. I don't have anything to measure this, but I can build something to measure it.

One possible route is to fix the logic analyser such that it is connected with the ICD2 serial port for the serial console, and the TTL serial port for data transfer. This does look like a lot of work.. Alternatively, we could try to get the PK2 logic analyser to work. It has higher bandwidth. I do have two of them.. The problem is that it has only a window of 1K samples for 3 channels.. So I do need my own, as I don't know yet what I'm looking for. Let's build it.

Next problem: there seems to be something wrong with chip erase.. I can program fine, but it won't reset the bits before programming. OK. There was at least something wrong with the SetAddress script. Now I get into trouble when I try to run (EXECUTE_SCRIPT (ConfigWrPrepScript)):

  car: expects argument of type <pair>; given ()

The other problem: when I (execute ProgMemWrPrepScript), the programmer hangs.
The script that hangs is 18F_PrgMemRd64, but the writing seems to have succeeded. It doesn't behave the same on subsequent runs.. Something fishy going on here.. WTF! adding (READ_STATUS) after each read/write command seems to do the trick. No.. back to square one. It won't verify if I change the source file, both for config mem and program mem. One thing that needs to be on is external reset. It should be possible to do this with turning target off though.. Entry: global variables Date: Thu Feb 12 10:48:34 CET 2009 What is actually a gobal variable? It's something of which there is only one instance. I.e. a machine constant, or an application constant. It's easier to use global variables for driver configurations. Driver code is highly parametric, but doesn't usually need multiple specializations per uC program. I.e. console baud rate. Currently "baud" global is used everywhere in console and serial init, but this needs to change so one can have a console AND a special purpose serial port with a different baudrate. In general I'm going to use this strategy: use both generic macros with arguments AND some default macro which uses a global variable. Entry: pk2cmd trace Date: Thu Feb 12 18:48:18 CET 2009 This is: ~/pickit/pk2cmdv1.20LinuxMacSource/pk2cmd -I -p PIC18F2620 -F staapler.hex -M >staapler.pk2.log -- W: A6 02 FE FD EXECUTE_SCRIPT VDD_OFF VDD_GND_ON W: A0 80 2A D2 SETVDD W: A1 40 DF 9C SETVPP W: AF R: 48 02 25 3C 0E 00 81 81 00 0F C0 0F E0 0F 40 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 SCRIPT_BUFFER_CHKSM W: AB CLR_SCRIPT_BFR W: A4 00 12 FA F7 F9 F5 F3 00 E8 14 F6 FB E7 7F F7 FA FB F6 E8 13 W: A4 01 08 FA F7 F8 F3 03 E8 0A F4 W: A4 02 2D DA 2A 0E DA 15 09 DA 00 00 DA F8 6E DA AA 0E DA 55 0A DA F7 6E DA AA 0E DA 54 09 DA 00 00 DA F6 6E EE 04 09 F2 00 F0 DA FF FF E9 09 01 W: A4 03 09 EE 04 09 F2 00 F0 E9 06 7F W: A4 05 1B EE 04 00 F1 F2 0E DA F6 6E EE 04 00 F1 F2 0E DA F7 6E EE 04 00 F1 F2 0E DA F8 6E W: A4 06 21 EE 04 00 F1 F2 0E DA F6 6E EE 04 00 F1 F2 0E DA F7 6E EE 04 00 F1 F2 0E DA F8 6E DA A6 8E DA A6 9C W: A4 07 23 EE 04 0D F1 F1 E9 05 1E EE 04 0F F1 F1 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 EE 04 0D F2 FF F2 FF W: A4 08 1F DA A6 9E DA A6 9C EE 04 00 F1 F2 0E DA A9 6E EE 04 00 F1 F2 0E DA AA 6E DB DA 99 0E DA F5 6E W: A4 09 21 DA A6 80 DA A8 50 DA F5 6E DA 00 00 DA 00 00 EE 04 02 F2 00 F0 DA A9 2A DA D8 B0 DA AA 2A E9 1E 1F W: A4 0A 1C DA F8 6A DA A6 9E DA A6 9C EE 04 00 F1 F2 0E DA A9 6E EE 04 00 F1 F2 0E DA AA 6E DB W: A4 0B 23 EE 04 00 F1 F2 0E DA A8 6E DA A6 84 DA A6 82 DA 00 00 E9 03 03 E8 01 DA 00 00 DA A9 2A DA D8 B0 DA AA 2A W: A4 0D 1B DA 30 0E DA F8 6E DA 00 0E DA F7 6E DA 00 0E DA F6 6E EE 04 09 F2 00 F0 E9 06 0D W: A4 0E 1E DA A6 8E DA A6 8C DA 00 EF DA 00 F8 DA 30 0E DA F8 6E DA 00 0E DA F7 6E DA F6 6E DB DB DB W: A4 0F 33 EE 04 0F F1 F2 00 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 DA F6 2A EE 04 0F F2 00 F1 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 DA F6 2A E9 30 06 W: A4 11 1B DA 20 0E DA F8 6E DA 00 0E DA F7 6E DA 00 0E DA F6 6E EE 04 09 F2 00 F0 E9 06 07 W: A4 13 2F DA 20 0E DA F8 6E DA 00 0E DA F7 6E DA F6 6E DA A6 8E DA A6 9C EE 04 0D F1 F1 E9 05 02 EE 04 0F F1 F1 EE 03 00 F3 04 E7 2F F3 00 F2 00 F2 00 W: A4 16 32 DA 3C 0E DA F8 6E DA 00 0E DA F7 6E DA 05 0E DA F6 6E EE 04 0C F2 3F F2 3F DA 04 0E DA F6 6E EE 04 0C F2 8F F2 8F DA 00 00 EE 04 00 E8 01 F2 00 F2 00 W: A4 17 32 DA 3C 0E DA F8 6E DA 00 0E DA F7 6E DA 05 0E DA F6 6E EE 04 0C F2 0F 
F2 0F DA 04 0E DA F6 6E EE 04 0C F2 83 F2 83 DA 00 00 EE 04 00 E8 01 F2 00 F2 00 DOWNLOAD_SCRIPT W: AF R: 48 02 25 3C 0E 00 81 81 00 0F C0 0F E0 0F 40 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 SCRIPT_BUFFER_CHKSM W: A6 02 EA 00 EXECUTE_SCRIPT SET_ICSP_SPEED W: A6 01 F7 EXECUTE_SCRIPT MCLR_GND_ON W: A6 02 FC FF EXECUTE_SCRIPT VDD_GND_OFF VDD_ON W: A9 A5 00 01 W: A9 A5 16 01 W: A9 A5 01 01 CLEAR_UPLOAD_BFR RUN_SCRIPT (SCR_PROG_ENTRY) CLEAR_UPLOAD_BFR RUN_SCRIPT (SCR_ERASE_CHIP) CLEAR_UPLOAD_BFR RUN_SCRIPT (SCR_PROG_EXIT) W: A6 02 FE FD EXECUTE_SCRIPT VDD_OFF VDD_GND_ON W: A6 01 F6 EXECUTE_SCRIPT MCLR_GND_OFF W: A6 01 F7 EXECUTE_SCRIPT MCLR_GND_ON W: A6 02 FC FF EXECUTE_SCRIPT GND_OFF VDD_ON W: A9 A5 00 01 CLEAR_UPLOAD_BFR RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 03 00 00 00 CLR_DOWNLOAD_BFR DOWNLOAD_DATA W: A9 A5 06 01 CLR_UPLOAD_BFR RUN_SCRIPT (SCR_PROGMEM_WR_PREP) W: A7 A8 3D 1F D0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF W: A8 3E FF FF FF E0 D0 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC W: A8 3E 6E 00 0E AF D0 A6 8E A6 9C 7D D8 F9 D7 A6 8E A6 9C 7C D8 F5 D7 0F 0B 82 D8 F2 D7 14 D0 0F D0 14 D0 4B D0 50 D0 EC D7 FF 00 28 D0 1C D0 3B D0 2F D0 4F D0 5A D0 E7 D7 EA D7 80 D8 7F D0 EC W: A8 3E 6E 01 0E 90 D8 8F D0 7A D8 DB D7 D3 DF D9 D7 76 D8 FF 0F ED 50 02 E3 72 D8 DE DF 12 00 F8 DF FE D7 6D D8 EC 6E 7F D0 FC DF E4 6E ED 50 EC 6E 09 00 F5 50 78 D8 E7 06 FA E1 E5 52 12 00 F1 W: A8 09 DF E4 6E ED 50 EC 6E DE 50 W: A9 A5 07 04 W: A7 A8 3D 6E D8 E7 06 FB E1 E5 52 12 00 55 D8 E4 6E ED 50 52 D8 F5 6E 0D 00 ED 50 E7 06 FA E1 E5 52 AD D7 4A D8 E4 6E ED 50 47 D8 DE 6E ED 50 E7 06 FB E1 E5 52 A3 D7 BF DF DA 6E ED 50 D9 6E ED W: A8 3E 50 9D D7 B9 DF F7 6E ED 50 F6 6E ED 50 97 D7 EC 6E 40 0E E4 6E FF 0E EC 6E 09 00 F5 50 ED 14 E7 06 FA E1 E5 52 AA D7 EC 6E E9 50 80 0F A6 D7 EC 6E F2 9E 55 0E A7 6E AA 0E A7 6E A6 82 F2 W: A8 3E 8E ED 50 12 00 A6 84 A6 88 F3 D7 A6 84 A6 98 0A 00 EF DF 09 00 12 00 EF 60 EF 6E ED 50 E8 44 FD 26 01 0B FE 22 ED 50 12 00 00 D8 EC 6E 57 0E E4 6E ED 50 E7 06 FE E1 E5 52 12 00 81 AC 01 W: A8 3E D0 FD D7 F4 DF F2 DF EC 6E 08 0E E4 6E 00 0E 81 AC 02 D0 D8 80 01 D0 D8 90 E8 30 E7 DF E7 06 F7 E1 E5 52 12 00 8A 9E E1 DF EC 6E 08 0E E4 6E ED 50 E8 30 02 E3 8A 8E 01 D0 8A 9E D7 DF E7 W: A8 09 06 F8 E1 E5 52 8A 8E ED 50 W: A9 A5 07 04 W: A7 A8 3D D1 D7 00 0E FC 6E 00 EE 7F F0 10 EE 8F F0 8B 68 EC 6E 70 0E D3 6E 93 8C 93 9E ED 50 5C D7 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF W: A8 3E FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF W: A8 3E FF FF FF FF FF C8 0F 12 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF W: A8 3E FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF W: A8 09 FF FF FF FF FF FF FF FF FF W: A9 A5 07 04 CLR_DOWNLOAD_BFR DOWNLOAD DATA ... 
| CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROGMEM_WR) | 3x W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 03 00 00 00 CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 0A 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_EE_WR_PREP) W: A7 A8 10 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 0B 10 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_EE_WR) W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 08 FF FF FF FF FF FF FF FF CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 13 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_USERID_WR) W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A6 02 FE FD EXECUTE_SCRIPT VDD_OFF VDD_GND_ON W: A6 01 F6 EXECUTE_SCRIPT MCLR_GND_OFF W: A6 02 FC FF EXECUTE_SCRIPT VDD_GND_OFF VDD_ON W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 03 00 00 00 CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 05 01 # SCR_PROGMEM_ADDRSET CLR_UPLOAD_BFR RUN_SCRIPT W: A9 A5 03 01 AC R: 1F D0 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: E0 D0 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC 6E 00 0E AF D0 00 W: A9 A5 03 01 AC R: A6 8E A6 9C 7D D8 F9 D7 A6 8E A6 9C 7C D8 F5 D7 0F 0B 82 D8 F2 D7 14 D0 0F D0 14 D0 4B D0 50 D0 EC D7 FF 00 28 D0 1C D0 3B D0 2F D0 4F D0 5A D0 E7 D7 EA D7 80 D8 7F D0 EC 6E 01 0E 90 D8 8F D0 00 W: AC R: 7A D8 DB D7 D3 DF D9 D7 76 D8 FF 0F ED 50 02 E3 72 D8 DE DF 12 00 F8 DF FE D7 6D D8 EC 6E 7F D0 FC DF E4 6E ED 50 EC 6E 09 00 F5 50 78 D8 E7 06 FA E1 E5 52 12 00 F1 DF E4 6E ED 50 EC 6E DE 50 00 W: A9 A5 03 01 AC R: 6E D8 E7 06 FB E1 E5 52 12 00 55 D8 E4 6E ED 50 52 D8 F5 6E 0D 00 ED 50 E7 06 FA E1 E5 52 AD D7 4A D8 E4 6E ED 50 47 D8 DE 6E ED 50 E7 06 FB E1 E5 52 A3 D7 BF DF DA 6E ED 50 D9 6E ED 50 9D D7 00 W: AC R: B9 DF F7 6E ED 50 F6 6E ED 50 97 D7 EC 6E 40 0E E4 6E FF 0E EC 6E 09 00 F5 50 ED 14 E7 06 FA E1 E5 52 AA D7 EC 6E E9 50 80 0F A6 D7 EC 6E F2 9E 55 0E A7 6E AA 0E A7 6E A6 82 F2 8E ED 50 12 00 00 W: A9 A5 03 01 AC R: A6 84 A6 88 F3 D7 A6 84 A6 98 0A 00 EF DF 09 00 12 00 EF 60 EF 6E ED 50 E8 44 FD 26 01 0B FE 22 ED 50 12 00 00 D8 EC 6E 57 0E E4 6E ED 50 E7 06 FE E1 E5 52 12 00 81 AC 01 D0 FD D7 F4 DF F2 DF 00 W: AC R: EC 6E 08 0E E4 6E 00 0E 81 AC 02 D0 D8 80 01 D0 D8 90 E8 30 E7 DF E7 06 F7 E1 E5 52 12 00 8A 9E E1 DF EC 6E 08 0E E4 6E ED 50 E8 30 02 E3 8A 8E 01 D0 8A 9E D7 DF E7 06 F8 E1 E5 52 8A 8E ED 50 00 W: A9 A5 03 01 AC R: D1 D7 00 0E FC 6E 00 EE 7F F0 10 EE 8F F0 8B 68 EC 6E 70 0E D3 6E 93 8C 93 9E ED 50 5C D7 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 03 01 AC R: C8 0F 12 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 
FF FF FF FF FF FF FF FF FF FF 00 CLR_UPLOAD_BFR RUN_SCRIPT (SCR_PROGMEM_RD) UPLOAD_DATA_NOLEN UPLOAD_DATA_NOLEN ... W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 03 00 00 00 CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 08 01 CLR_UPLOAD_BFR RUN_SCRIPT (SCR_EE_RD_PREP) W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 09 04 AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: AC R: FF FF FF FF FF FF FF FF FF FF FF FF FF 
FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 CLR_UPLOAD_BFR RUN_SCRIPT (SCR_EE_RD) UPLOAD_DATA_NOLEN UPLOAD_DATA_NOLEN W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A9 A5 11 01 AC CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_USERID_RD) R: FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A6 02 FE FD EXECUTE_SCRIPT VDD_OFF VDD_GND_ON W: A6 01 F7 EXECUTE_SCRIPT MCLR_GND_ON W: A6 02 FC FF EXECUTE_SCRIPT GND_OFF VDD_ON W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A7 A8 03 00 00 00 CLR_DOWNLOAD_BFR DOWNLOAD_DATA W: A9 A5 0E 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_CONFIG_WR_PREP) W: A7 A8 10 00 06 0F 0E 00 81 81 00 0F C0 0F E0 0F 40 A5 BF CLR_DOWNLOAD_BUFFER DOWNLOAD_DATA W: A9 A5 0F 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_CONFIG_WR) W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A6 02 FC FF EXECUTE_SCRIPT GND_OFF VDD_ON W: A9 A5 00 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_ENTRY) W: A9 A5 0D 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_CONFIG_RD) W: AA R: 0E 00 06 0F 0E 00 81 81 00 0F C0 0F E0 0F 40 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 UPLOAD_DATA W: A9 A5 01 01 CLR_UPLOAD_BUFFER RUN_SCRIPT (SCR_PROG_EXIT) W: A6 02 FE FD EXECUTE_SCRIPT VDD_OFF VDD_GND_ON PICkit 2 Program Report 12-2-2009, 18:44:57 Device Type: PIC18F2620 Program Succeeded. W: A6 01 F7 W: A6 02 FC FF W: A9 A5 00 01 W: A9 A5 02 01 W: AA R: 02 86 0C 0F 0E 00 81 81 00 0F C0 0F E0 0F 40 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 W: A9 A5 01 01 W: A6 02 FE FD W: A6 01 F6 Device ID = 0C80 Revision = 0006 Device Name = PIC18F2620 W: A6 01 F7 W: A6 02 FE FD W: A2 R: 05 00 0C 0F 0E 00 81 81 00 0F C0 0F E0 0F 40 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF 00 Operation Succeeded -- Keeping it in reset before turning on the voltage seems to make things behave better. No more 'uart hack'. But that's it.. I don't see anything wrong except that my scripts are not uploaded... Maybe they contain loops and this won't work? No difference.. It's really strange: the 1220 does erase but no program, while the 2620 doesn't do erase. Jeezes.. There is a spot with delay: I've added a delay in: (define (target-on) (CLR_UPLOAD_BFR) (EXECUTE_SCRIPT (VDD_GND_OFF) (VDD_ON)) (msleep 150) ;; from pk2cmd source (void)) But this gives all zeros.. There's only one thing to do: start from an exact copy of the trace and build abstractions around that. I don't know what I messed up, but it did work before.. http://www.auelectronics.selfip.com/pdfs/CB0703_PICKit2_Schematic.pdf Entry: more pk2 Date: Sat Feb 14 12:03:09 CET 2009 I don't understand... The dumps are almost exactly the same, but it doesn't work.. Somehow there's something wrong with the primitives. I'm loosing to much time with this. Isn't there a better way? What I want is to upload a program and then connect to the serial port, that's all. 
Here's a dump of ~/pickit/pk2cmdv1.20LinuxMacSource/pk2cmd -I -p PIC18F2620 -E >staapler.pk2.erase.log W: A6 02 FE FD W: A0 80 2A D2 W: A1 40 DF 9C W: AF R: 3F 00 F7 25 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC 6E 00 0E 00 W: AB W: A4 00 12 FA F7 F9 F5 F3 00 E8 14 F6 FB E7 7F F7 FA FB F6 E8 13 W: A4 01 08 FA F7 F8 F3 03 E8 0A F4 W: A4 02 2D DA 2A 0E DA 15 09 DA 00 00 DA F8 6E DA AA 0E DA 55 0A DA F7 6E DA AA 0E DA 54 09 DA 00 00 DA F6 6E EE 04 09 F2 00 F0 DA FF FF E9 09 01 W: A4 03 09 EE 04 09 F2 00 F0 E9 06 7F W: A4 05 1B EE 04 00 F1 F2 0E DA F6 6E EE 04 00 F1 F2 0E DA F7 6E EE 04 00 F1 F2 0E DA F8 6E W: A4 06 21 EE 04 00 F1 F2 0E DA F6 6E EE 04 00 F1 F2 0E DA F7 6E EE 04 00 F1 F2 0E DA F8 6E DA A6 8E DA A6 9C W: A4 07 23 EE 04 0D F1 F1 E9 05 1E EE 04 0F F1 F1 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 EE 04 0D F2 FF F2 FF W: A4 08 1F DA A6 9E DA A6 9C EE 04 00 F1 F2 0E DA A9 6E EE 04 00 F1 F2 0E DA AA 6E DB DA 99 0E DA F5 6E W: A4 09 21 DA A6 80 DA A8 50 DA F5 6E DA 00 00 DA 00 00 EE 04 02 F2 00 F0 DA A9 2A DA D8 B0 DA AA 2A E9 1E 1F W: A4 0A 1C DA F8 6A DA A6 9E DA A6 9C EE 04 00 F1 F2 0E DA A9 6E EE 04 00 F1 F2 0E DA AA 6E DB W: A4 0B 23 EE 04 00 F1 F2 0E DA A8 6E DA A6 84 DA A6 82 DA 00 00 E9 03 03 E8 01 DA 00 00 DA A9 2A DA D8 B0 DA AA 2A W: A4 0D 1B DA 30 0E DA F8 6E DA 00 0E DA F7 6E DA 00 0E DA F6 6E EE 04 09 F2 00 F0 E9 06 0D W: A4 0E 1E DA A6 8E DA A6 8C DA 00 EF DA 00 F8 DA 30 0E DA F8 6E DA 00 0E DA F7 6E DA F6 6E DB DB DB W: A4 0F 33 EE 04 0F F1 F2 00 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 DA F6 2A EE 04 0F F2 00 F1 EE 03 00 F3 04 E7 2F F3 00 E7 05 F2 00 F2 00 DA F6 2A E9 30 06 W: A4 11 1B DA 20 0E DA F8 6E DA 00 0E DA F7 6E DA 00 0E DA F6 6E EE 04 09 F2 00 F0 E9 06 07 W: A4 13 2F DA 20 0E DA F8 6E DA 00 0E DA F7 6E DA F6 6E DA A6 8E DA A6 9C EE 04 0D F1 F1 E9 05 02 EE 04 0F F1 F1 EE 03 00 F3 04 E7 2F F3 00 F2 00 F2 00 W: A4 16 32 DA 3C 0E DA F8 6E DA 00 0E DA F7 6E DA 05 0E DA F6 6E EE 04 0C F2 3F F2 3F DA 04 0E DA F6 6E EE 04 0C F2 8F F2 8F DA 00 00 EE 04 00 E8 01 F2 00 F2 00 W: A4 17 32 DA 3C 0E DA F8 6E DA 00 0E DA F7 6E DA 05 0E DA F6 6E EE 04 0C F2 0F F2 0F DA 04 0E DA F6 6E EE 04 0C F2 83 F2 83 DA 00 00 EE 04 00 E8 01 F2 00 F2 00 W: AF R: 48 02 25 3C 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC 6E 00 0E 00 W: A6 02 EA 00 Erasing Device... W: A6 01 F7 W: A6 02 FC FF W: A9 A5 00 01 W: A9 A5 16 01 W: A9 A5 01 01 W: A6 02 FE FD W: A6 01 F6 W: A6 01 F7 W: A6 02 FC FF W: A9 A5 00 01 W: A9 A5 02 01 W: AA R: 02 86 0C 3C 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC 6E 00 0E 00 W: A9 A5 01 01 W: A6 02 FE FD W: A6 01 F6 Device ID = 0C80 Revision = 0006 Device Name = PIC18F2620 W: A6 01 F7 W: A6 02 FE FD W: A2 R: 05 00 0C 3C 9E BA 01 D0 FD D7 AB A2 03 D0 AB 98 AB 88 F8 D7 EC 6E AE 50 AB A4 02 D0 ED 50 F2 D7 12 00 AC B2 01 D0 FD D7 AD 6E ED 50 12 00 05 00 22 D8 FE 6E ED 50 FD 6E ED 50 12 00 EC 6E 00 0E 00 Operation Succeeded # The only thing that matters here is: W: A6 01 F7 W: A6 02 FC FF W: A9 A5 00 01 # ENTER PROGRAMMING W: A9 A5 16 01 W: A9 A5 01 01 # LEAVE PROGRAMMING W: A6 02 FE FD W: A6 01 F6 W: A6 01 F7 W: A6 02 FC FF # I don't see any differences.. It's probably something outside this layer. 
Maybe USB doesn't get initialized properly? Maybe the PK2 needs an extra reset or so?

Entry: dropping pk2 support
Date: Sat Feb 14 12:31:52 CET 2009

It looks like it's best to drop programming support and focus on serial connectivity only. I have a hard time letting this go, but I'm really stuck and it doesn't look like there's help available. It might also be better to implement the serial + reset interface outside of staapl. So, let's do that.

Entry: pk2serial.c
Date: Sat Feb 14 19:00:57 CET 2009

In tools/pk2serial.c there's a stdio bridge for PK2 serial communication. This is useful as a general-purpose utility and might be included in gputils? Tested with loopback. The bitbanged IO for the PIC however is not really reliable. I might want to fix that first.

Entry: forget about bitbanged uart
Date: Sat Feb 14 19:41:36 CET 2009

Actually, at the expense of 2 pins I can just forget about bitbanged serial, connect the uart+icd pins together and get on with writing some software.. There is no general way to solve this: when the serial port is necessary for something else, the console comm needs to be incorporated into the whole app anyway, depending on the app's needs. Timer-less / interrupt-less bitbanged serial is really asking for trouble apparently..

Maybe it's easier to get synchronous communication working over the ICD2 connector. Now that I do have the basic PK2 protocol figured out and implemented in a C program, this might not be such a big step.. The simplest way to do this is to use the AUX pin as a slave->master interrupt. Upon reception, the master can clock out the reply. When the message is finished, the slave resets the AUX pin. Alternatively, after writing a message the data pin could be used as an interrupt, and the clock's falling edge would serve as an acknowledgement, with the rising edge sampling the first bit. But.. This is still a lot of work to get going. Probably not worth the trouble. One more thing: PGC should then be connected to the target's RX/DT and not the CK input, so synchronous slave mode can only be used if these wires are swapped.. Damn...

ICD2  serial                     color
1     /MCLR                      white
2     VDD                        red
3     GND                        black
4     PGD  RX (<- target TX/CK)  yellow
5     PGC  TX (-> target RX/DT)  orange
6     PGM

Unidirectional transfer with the PIC as slave seems simple enough though.. But that's not very useful. One last hack: After the host has transmitted its last message, the client asserts the data line low. The host starts reading out these low bits. When the client is done, it waits for a clock signal and sends a single high bit, then starts slave transmission of the message. The host reads the length byte, and the subsequent data bytes.

Entry: next
Date: Sat Feb 14 23:02:11 CET 2009

- report identifier errors with source location
- switch examples back to ICD + hardware uart
- fix upload + console with external apps
- start working on 18F2550 USB support

Entry: workflow
Date: Sun Feb 15 11:24:10 CET 2009

Maybe staapl should behave a bit more like an ordinary compiler / telnet. This works very well (staapl/app/Makefile):

  .PHONY: 452-40
  452-40:
  	staaplc -d /dev/ttyUSB0 452-40.f
  	pk2cmd -I -p PIC18F452 -f 452-40.hex -M -R
  	staapl 452-40.dict

Perfected it a bit with rules + programmer commands embedded in the .f files. It works on the 18f2550 too now.. Apparently one of the usbpicstamp boards is defective. Maybe the oscillator?

Entry: wait a minute..
Date: Sun Feb 15 13:14:52 CET 2009

The pk2 code might actually work. Maybe I just configured the wrong device? Exactly: trying to do a 2620 with a 1220 config.
Entry: gdb vs. forth
Date: Sun Feb 15 13:38:09 CET 2009

The Staapl/Forth approach is to see a program as a library with multiple entry points, which can be invoked during debugging. In gdb it's also possible to call functions. To debug a library of functions, simply add a dummy main() and instantiate the program. I.e. test.c:

  #include <stdio.h>
  void boo(void) { printf("boo!\n"); }
  int main(int argc, char **argv) { return 0; }

  tom@zzz:/tmp$ gcc -g test.c
  tom@zzz:/tmp$ gdb ./a.out
  GNU gdb 6.8-debian
  Copyright (C) 2008 Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later
  This is free software: you are free to change and redistribute it.
  There is NO WARRANTY, to the extent permitted by law.
  Type "show copying" and "show warranty" for details.
  This GDB was configured as "i486-linux-gnu"...
  (gdb) break main
  Breakpoint 1 at 0x80483c6: file test.c, line 7.
  (gdb) r
  Starting program: /tmp/a.out
  Breakpoint 1, main () at test.c:7
  7         return 0;
  (gdb) p boo()
  boo!
  $1 = void
  (gdb)

Entry: pk2serial.c
Date: Sun Feb 15 15:30:05 CET 2009

It seems to work fine, but when I connect it to the 2550, after a while its TX goes low.. ??? Let's decouple this problem from getting the USB to work.. Use USB + PK2 + serial connection. It would be nice to have a fully functional pk2 interface, so I'm not going to give up yet. The synchronous + interrupt scheme mentioned yesterday might be an interesting route and should not be too hard to implement.

Entry: old usb driver
Date: Sun Feb 15 18:19:41 CET 2009

It works "a bit".. I got stuck in May 2008 when trying to get it to work on a tight schedule.. The problem then was probably a compiler bug that I hope is gone by now.. The files I have are:

  ~/darcs/brood-4/prj/USBPicStamp/usb.f
  ~/darcs/brood-4/prj/USBPicStamp/cdc.usb
  ~/darcs/brood-4/host/usb.ss

These are now in staapl/pic18. The architecture is as follows:

* Each usb driver starts from a .usb file containing a high level version of the configuration data structure that will be queried by the host. From this a "compiled" version of this data structure is generated.

* Next to this there is a driver library that implements the USB logic and state machine.

There is one big decision to be made up front though: use the extended instruction set or not? In this mode, there is a "frame pointer". The first 0x60 bytes of the first page are interpreted relative to FSR2. The thing is: these might be convenient for C-style local variables or objects, but I'm not sure if this deserves a whole new architecture / ABI. Is it possible to make this just a simple extension? Or can we do without? I did use it extensively before..

At first sight, this does introduce a new problem: namespaces. When objects are used, they behave as namespaces, and the language simply doesn't have any straightforward mechanisms for it.. So.. Let's just keep this as an experiment and solve it using dynamic variables. Together with cooperative multitasking and clear data separation of the preemptive parts (interrupts), dynamic binding should work just fine.

Entry: dynamic binding
Date: Sun Feb 15 19:46:01 CET 2009

Since Forth already has its local mechanisms for function composition, adding dynamic binding (in the form of shallow binding) to implement limited-extent objects might not be such a bad idea. The big advantage here is ease of implementation:

* It doesn't require separate namespaces (local vs. global variables).
* It can use deterministic stacks for storage (the retain/return stack).
* No indirect addressing is necessary (expensive on PIC).

There are plenty of disadvantages though, most of them boiling down to "no closures". But, for a real-time language without parameter names, this is a really good mechanism, since it is 100% deterministic for memory allocation.
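Shallow binding boils down to this kind of pattern (a made-up sketch, not from the codebase; 'speed' and 'run' are hypothetical):

  variable speed
  : with-speed \ run 'run' with speed dynamically bound to a new value
      speed @ >r   \ save the old binding on the return stack
      speed !      \ install the new one
      run          \ anything called from here sees the new value
      r> speed ! ; \ restore the old binding on exit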
There are plenty of disadvantages though, most of them boiling down to "no closures". But for a real-time language without parameter names this is a really good mechanism, since it is 100% deterministic in its memory allocation.

Entry: 18F extended mode
Date: Mon Feb 16 13:22:52 CET 2009

Note however that this IS an interesting target for writing a more traditional Algol-style language without having to resort to register allocation.

Entry: linux usbmon
Date: Tue Feb 17 10:08:52 CET 2009

To enable usbmon in the 2.6.x kernel, follow these steps:

1) Compile the USB_MON module (Device Drivers / USB Support / USB Monitor) or compile it into the kernel
2) Enable debugfs in the kernel, rebuild the kernel and reboot using the new kernel
3) mount -t debugfs none_debugs /sys/kernel/debug
4) modprobe usbmon
5) Find the Bus to which the USB device connects (cat /proc/bus/usb/devices and look for the 'T:' line, which gives the bus number)
6) cat /sys/kernel/debug/usbmon/<Bus>t > /tmp/<Bus>.txt (NOTE: it's <Bus>t, not just <Bus>)
7) Use the USB device
8) Kill the 'cat' command
9) Examine the file /tmp/<Bus>.txt

Read /usr/src/linux/Documentation/usb/usbmon.txt to decode the output :-)

Peter

Entry: Macros and such
Date: Fri Feb 20 15:29:19 CET 2009

Some important ideas that need to be cleared up:

- Lisp vs. other syntax, and quasiquotation. As opposed to what I used to think, Lisp syntax is only _convenient_ for quasiquotation-based metaprogramming, but not at all _necessary_. This is illustrated by MetaOcaml's syntax extensions. One just needs to extend the parser, basically.

- Pre-checks: giving semantics to the macro INPUT instead of "specification by compiler". As explained in [1], untyped macros can be made much more useful when they can be combined with static semantics. This is a very powerful idea: bottom-up _typed_ language design.

- MetaOcaml approach: limited to bind and eval, no other constructs? In dynamic vs. static: a big advantage of Lisp/Scheme style macros is that ALL language constructs can be expanded to. In MetaOcaml only the EVALUATION TIME of nested expressions can be manipulated, but the expressions themselves are really only let, lambda and apply.

[1] http://www.cc.gatech.edu/~dfisher/ziggurat/icfp-ziggurat.tar.gz

Entry: next
Date: Sat Feb 21 11:24:39 CET 2009

I need to get organized.. It's difficult at this time to pick a target, so here's a list. First, the practical things that need to happen:

* USB driver + driver generator.
* Better error reporting.
* A logic analyser.

Then what I would like:

* C parser + prettyprinter in typed scheme.
* Use it to refactor PacketForth code.
* Staapl core in typed scheme.
* Understand the Ziggurat paper.
* Static semantics for Staapl forth.
* Understand Tony Garnock-Jones' packrat parser.
* Definitive state-machine abstraction.
* A pure stack VM for implementing Occam primitives.
* A macro-extensible static subset of (typed) Scheme that maps to C.

Entry: error messages
Date: Sat Feb 21 11:37:57 CET 2009

This is quite straightforward: for each error that's obscure, go fix it!

- Allow EOF terminated identifiers. Hacked by adding \#newline

- reference to undefined identifier: macro/SETUP.wLengthHigh
  This should state the place of reference. The problem is that this is due to namespace bindings.. Compiling as a module should work better? Hmm.. This is a can of worms.. About time it's getting fixed.
  FIXED: problem was in symbol prefixing: source info got lost.
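The mechanism behind that class of bug, as a sketch (prefix-id is a hypothetical name; the real fix, in the next entry, goes through the project's ->syntax helper):

;; Prefixing an identifier: string->symbol produces a bare symbol, so
;; both lexical context and source location have to be re-attached
;; explicitly. Omitting the third datum->syntax argument is exactly
;; how source location gets lost.
(define (prefix-id str id-stx)
  (datum->syntax
   id-stx                                       ;; lexical context
   (string->symbol
    (string-append str (symbol->string (syntax-e id-stx))))
   id-stx))                                     ;; source location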
Entry: require
Date: Sat Feb 21 12:58:05 CET 2009

The problem is that I don't remember how it works.. Let's go over it again. A "#lang planet zwizwa/staapl/pic18" will associate the forth reader and expand the current file to a module form. Let's add some debug print to that. The problem seems to be that files in the pic18/ directory are not in the load path when we invoke the forth language as a module by requiring it.

First problem to fix: a "load" that occurs in a required .f module should resolve relative to the location of that .f module. I'm confused.. Let's try to get the minimal things to work first.

OK. I've added another form in parser-tx.ss / require-tx:

(syntax-case code (planet staapl)
  ((_ planet module . code+)
   (next `(require (planet ,(p #'module))) #'code+))
  ((_ staapl module . code+)
   (next `(require (planet ,(p #'module) ("zwizwa" "staapl.plt"))) #'code+))
  ((_ module . code+)
   (next `(require ,(p #'module)) #'code+)))

This way "require staapl <module>" will work. Let's see if this can be propagated all the way.. The idea is to have some application that loads as a module. This requires all "parameters" to be defined.. (This is really getting a bit messy..)

Ok. so there's an example that can be loaded like this:

mzscheme -p zwizwa/staapl/pic18/mod-example

or:

(require (planet zwizwa/staapl/pic18/mod-example))

(I used an .ss extension to be able to use shorthand syntax.) What does this give you? After also requiring the support code

(require (planet zwizwa/staapl/pic18))

this gives you access to the compiled code: (all-bin)

So.. This enables us to track down undefined symbols. However, compare these two:

tom@zzz:~/staapl/staapl/pic18$ mzc -k test.ss
mzc v4.1.0.3 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
"test.ss":
making "/data/safe/tom/darcs/brood-5/staapl/pic18/test.ss"
making "/home/tom/staapl/staapl/forth/module-reader.ss"
making "/home/tom/staapl/staapl/pic18/lang/reader.ss"
making "/home/tom/staapl/staapl/pic18/lang.ss"
compile: unbound identifier in module in: macro/device-descriptor

tom@zzz:/tmp$ cat broem.ss
#lang scheme/base
broem
tom@zzz:/tmp$ mzc -k broem.ss
mzc v4.1.0.3 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
"broem.ss":
making "/tmp/broem.ss"
broem.ss:2:0: compile: unbound identifier in module in: broem

 === context ===
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:117:0: compile-zo*
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:201:0: compile-zo
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:235:2: do-check
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:281:4 for-loop

The source location information seems to get lost somewhere.. I have no clue why.. Too many layers. The problem seems to be in load-tx, I guess in file->forth-syntax. Nope.. All is ok there.. Looks like I found it: in dispatch-element in rpn-tx.ss, after mapping the symbol there seems to be no source info. The mapper for purrr is defined in coma/macro-tx.ss. This seems to trace back to ns-prefixed in scat/ns-tx.ss, which traces back to prefix in tools/stx.ss. YES! That's it. Fixed by adding a 3rd argument to ->syntax:
(define (prefix . names)
  (let ((orig-stx (car (reverse names)))) ;; use original name info
    (->syntax
     orig-stx ;; lexical context
     (string->symbol
      (apply string-append
             (map (lambda (x) (format "~a" (->datum x)))
                  names)))
     orig-stx ;; source info
     )))

ain't this pretty:

-*- mode: compilation; default-directory: "~/staapl/staapl/pic18/" -*-
Compilation started at Sat Feb 21 16:01:34

cd ~/staapl/staapl/pic18 && mzc -k test.ss
mzc v4.1.0.3 [3m], Copyright (c) 2004-2008 PLT Scheme Inc.
"test.ss":
making "/data/safe/tom/darcs/brood-5/staapl/pic18/test.ss"
making "/home/tom/staapl/staapl/pic18/lang.ss"
usb.f:207:4: compile: unbound identifier in module in: macro/device-descriptor

 === context ===
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:117:0: compile-zo*
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:201:0: compile-zo
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:235:2: do-check
/usr/local/plt-zwizwa/lib/plt/collects/compiler/cm.ss:281:4 for-loop

Compilation exited abnormally with code 1 at Sat Feb 21 16:01:36

Now, is it possible to somehow make the dynamic scripts in app/ behave as if they were static modules? This means "load" should know the path even if it's required.. Now, if you add this it works:

#lang planet zwizwa/staapl/pic18
path /home/tom/staapl/staapl/pic18

The idea is: somewhere in the expansion of the module the path should be added. Ok.. simplified it to "path staapl pic18". Is it possible (or desirable) to do this automatically?

Next step: do the same in staaplc: instead of loading the file into a namespace, do a toplevel "require". Let's leave this for later. There's some restructuring and simplification necessary to make this the default application development approach.

Entry: usb
Date: Sat Feb 21 17:53:45 CET 2009

The core code for usb seems simple. It needs rewriting with a bit smarter use of "dynamic" variables and run-time constructs instead of being littered with macros. The most problematic part is the code generator, which translates high-level descriptors to something that can be embedded in the code.

Entry: core design documentation
Date: Sat Mar 7 12:03:35 CET 2009

Maybe it's time to start abstracting the core design of the SCAT machine: exactly what is a state, how to extend and lift states, how it can be made imperative, and how to relate it to monads. A good starting point would be: what can you do when you type this in drscheme and press "run":

#lang planet zwizwa/staapl/pic18
: abc 123 ;

One example is (print-code).

Entry: hacking
Date: Sun Mar 8 09:20:34 CET 2009

Like I said before, I don't know where to move next. However, I'd like to concentrate on some practical things.. I still would like to give the synchronous ICD2 comm on PK2 approach a try, since this would give a very straightforward tethered development workflow. I wonder, how does this relate to AVR programming? I think AVR uses (a variant of) SPI. Let's try to re-incorporate the pk2 code back into the project, then see if I can make a transmission work.

ah crap.. let's do something else.. I really need to focus on applications, and learn to live with the incomplete state.

Entry: drscheme
Date: Sun Mar 8 12:44:15 CET 2009

Running the macro expander on forth code gives an error: apparently a source location can be a symbol too, so the instantiation macro needs extra quotation. FIXED.

Entry: Object code: types or other static information?
Date: Sun Mar 8 13:24:19 CET 2009

What information is attached to assembly language opcodes at compile time?
At this moment, they are nothing but tags used to guide pattern matching. Maybe this should contain some more static info? This is something I started working on in asm/static.ss but apparently stopped. This is one of the most important points of change: assembler instructions need to be properly typed.

Entry: evaluation hardware
Date: Sat Mar 21 09:50:36 CET 2009

- something with TI dsp, i.e. the davinci/C6000
- xilinx virtex, 500k gates

Entry: reading C code
Date: Sat Mar 21 09:51:32 CET 2009

Not strictly part of Staapl, but I'd like to comment on it here. Dave Herman recently released version 2.0 of his C parsing library, and suggested using his prettyprinter tools for generating C code. Goal: create a struct reference -> get/set method refactoring tool and let it loose on the PF source code.

http://planet.plt-scheme.org/display.ss?package=c.plt&owner=dherman
http://planet.plt-scheme.org/display.ss?package=pprint.plt&owner=dherman

Entry: FPGA
Date: Sat Mar 21 10:09:48 CET 2009

I'd like to start doing some VHDL experiments. What hardware to use? There are 2 big players: Xilinx and Altera. Both have limited free tools that work on linux. It's a bit arbitrary, but I'm going to stick with Xilinx. Circumstantial bias mostly points in that direction... There are mainly 2 classes of devices:

- high density: Virtex
- low density: Spartan

Dev boards: Spartan 3A - 400k gates. 49,-
http://www.xilinx.com/products/devkits/aes_sp3a_eval400_avnet.htm

Entry: DSP target
Date: Sat Mar 21 11:00:53 CET 2009

C64x+ fixed point DSP as a Staapl target. The most likely candidate is the Davinci chip, and the Neuros open settop box.

http://store.neurostechnology.com/neuros-osd2-platform-p-55.html

Texas Instruments DaVinci DM6446 SoC = ARM9 + C64x+
http://focus.ti.com/docs/prod/folders/print/tms320dm6446.html

For TI C67x+ floating point DSPs, start here:
http://focus.ti.com/paramsearch/docs/parametricsearch.tsp?family=dsp&sectionId=2&tabId=1948&familyId=1401
TI Home > Digital Signal Processing > Processor Platforms > C6000™ Floating-point DSPs

Entry: Forth and data structures
Date: Sun Mar 22 21:51:26 CET 2009

There is a clear link between lazy function application and data structure pattern matching in a functional programming language. The body of a pattern matching clause behaves as the body of a function: the data structure "transports" the parameters of data structure construction to the data structure deconstruction binding environment, much like the invocation of a function does. In Forth this isn't so clear because there are no lexical binding forms. To get a similar behaviour, a data structure would have to be "executable" and load a number of values on the parameter stack; however, this is impractical.

I have the impression that Forth is better at "process" oriented programming than at data structure manipulation: data structures are about postponing interpretation, while embedded processing is mostly about immediate action/reaction: there is more code than data. This is exactly the reason why I moved to Scheme to implement the compiler (compile time data structure interpretation). Can this be made into some kind of mantra?

    process <-> structure
    RUN         COMPILE

Or put differently: when writing highly specialized low-level code, try to avoid data structures, or use data structures at compile time only. Think of what would be the result of deforestation in an FP language, and write it directly, or pass the data structures between macros. Deforestation is essentially jit-compilation: eliminate intermediate (code) representation.
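A toy example of "data structures at compile time only" (hypothetical macro, plain PLT Scheme):

;; The coefficient list exists only during macro expansion; the
;; residual code is straight-line arithmetic with no run-time list
;; left in it -- deforestation by construction.
(define-syntax (dot stx)
  (syntax-case stx ()
    ((_ (c ...) (x ...))
     #'(+ (* c x) ...))))

;; (dot (1 2 3) (a b c)) expands to (+ (* 1 a) (* 2 b) (* 3 c))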
Entry: openfirmware
Date: Sun Mar 22 23:30:56 CET 2009

svn co svn://openfirmware.info/openfirmware
cd openfirmware/cpu/x86/pc/emu/build
make
cp emuofw.rom /usr/share/qemu/
qemu -bios emuofw.rom -hda fat:/tmp

Entry: eForth primitives
Date: Fri Mar 27 20:22:33 CET 2009

I'm implementing a machine to run eForth. Almost done with the primitives, except execution model and dictionary format. I had a very nice introduction to eForth as a pdf somewhere.. This looks OK: http://www.offete.com/files/zeneForth.htm

First: what is the point? I'd like to find a way to implement a stand-alone ANS Forth on top of Staapl primitives. But what I really need is a Forth that will run partly on the host and partly on the target. I'm trying to first implement it as is, and solve the bootstrapping problem. Probably translation to a standard binary image that can by itself be translated to a binary image of a real machine.

OK. Got primitive execution working.. The thing which always confused me is that on an emulated machine, I have this tendency to use 3 levels: threaded code, primitives represented as numeric opcodes, and real primitives (here Scheme thunks). But this isn't necessary. Forth primitives that end in a jump to NEXT can be represented by sequential scheme code ending in a NEXT tailcall.

Next: bootstrap threaded code. With a threaded interpreter, primitives can't be executed directly but need to be wrapped in highlevel words that contain the primitive instructions.

Entry: Threaded code + control primitives working
Date: Sat Mar 28 12:01:04 CET 2009

The only remaining parts are bootstrapping the rest of the Forth code and/or defining the structure of the dictionary. I'm thinking about bootstrapping the code by hand (writing an interpreter in scheme) instead of using an external interpreter. This requires breaking two feedback loops:

- parsing words
- macros

Once the forth is bootstrapped it can be used to generate images for other interpreters by changing the semantics. How to represent the dictionary? The bootstrap parser should work directly on the binary form of the dictionary.

It looks like the best approach is this: massage the Forth source file such that it is simple enough to be parsed by a non-reflective parser, but remains ANS compliant. Removed all CODE words from eforth.f. The remaining forth->machine code reference problem is CALL (and =CALL).

How to bootstrap:

* First make sure everything parses to lists of words.
* Then it's probably best to use the macros from the .f file to do the rest of the compiling. This requires them to be identified right after parsing.
* Build a dependency graph of the code and manually resolve all circular conflicts by replacing words with primitives.

Entry: RTL eforth (RISC portability through macros)
Date: Sat Mar 28 13:03:08 CET 2009

There are a lot of operations that are reused between primitives:

* stack push/pop
* register/memory store/fetch

It might be interesting to write eForth completely in terms of these primitives instead of the forth ones, or devise forth primitive MACROS for these. The main idea is that not WORDS should be the basis, but MACROS that generate primitive operations: write in terms of a generic register machine. With just memory access, optimizations could be made: register instead of ram, indirection, etc.. So the main idea is: how to _automatically_ map a RISC machine (memory + MIMO logic blocks) to a Forth VM.
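A sketch of what such register-machine macros might look like on the host (plain PLT Scheme; ram, SP and the prim-* names are all hypothetical):

;; A tiny register/memory model. Primitives are written against the
;; generic macros, not against other Forth words.
(define ram (make-vector 256 0))
(define SP 128)                        ;; data stack grows down

(define-syntax-rule (mem@ a)   (vector-ref ram a))
(define-syntax-rule (mem! a v) (vector-set! ram a v))
(define-syntax-rule (push! e)
  (let ((v e))                         ;; evaluate before moving SP
    (set! SP (- SP 1))
    (mem! SP v)))
(define-syntax-rule (pop!)
  (let ((v (mem@ SP))) (set! SP (+ SP 1)) v))

;; eForth primitives in terms of the register machine:
(define (prim-dup)  (push! (mem@ SP)))
(define (prim-plus) (push! (+ (pop!) (pop!))))

Mapping to a real RISC would then mean giving mem@/mem!/push!/pop! machine-specific expansions (register instead of ram, addressing modes, ..).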
EDIT: the problem with this is that all code needs to be implemented with registers, without using the "locals" construct that is so convenient. It requires knowledge of addressing modes and operator arity.

Entry: Loops and data structures
Date: Sun Mar 29 12:27:38 CEST 2009

A simplified view on functional programming: FP is about passing data structures around until they are ready to be reduced into pure behaviour. Ideally, you'd want to program subparts in terms of processors of intermediate data structures, and have the compiler use those data structures as a skeleton to produce code sequencing with the data structures eliminated. I can't express it yet -- the idea isn't formed completely, but it's about this: data structures don't make sense without code processing them, relating them to some kind of semantics derived from the real, physical world. In other words, they need to be _compiled_ to some form of physical machine code. Now, the simplified view:

    Code = serialized traversal/update of data structures.

What is the link between this and deforestation, the elimination of intermediate data structures? The classic example is unix pipes: instead of reading/writing from/to intermediate data files, processes can be connected using pipes. It looks like some of Oleg Kiselyov's ideas are starting to make sense to me: next I'd like to understand the relation between staging, deforestation and delimited control.

http://okmij.org/ftp/Computation/Continuations.html#context-OS
http://okmij.org/ftp/papers/context-OS.pdf
http://www.cs.rutgers.edu/~ccshan/zipper/poster.ps

Entry: The purpose of Coma
Date: Sun Mar 29 13:21:07 CEST 2009

Rehash of the ideas behind the to-be Coma layers:

- conditional + tail recursion for control flow
- first order functions (no lambda)
- higher order macros and structured parameterization.

I would like to take the approach of "mandatory deforestation": write a language+metalanguage with compile-time only data structures, and allow compilation of an expression _only_ when complete deforestation (elimination of _all_ data structures at compile time) is possible. This leads to the following slogan:

    Staapl eliminates lists and lambdas.

The target machine should then be simple enough that it does not need garbage collection (i.e. make it linear) and even further: not need stacks or linear trees either. If I'm correct this is somewhat related to the Hume project - a hierarchy of machines:

1. finite state machines
2. FSM + stacks (push down automata / linear structures)
3. cons cells + lambda (needs garbage collection to implement)

Lambda is really the code equivalent of a cons cell: lambda conses static code templates with an environment structure.

Entry: FL
Date: Sun Mar 29 16:26:51 CEST 2009

If I recall correctly, this is very much related to what John Nowak is working on (inspired by Backus' FP). I find it difficult to follow the posts on the concatenative [stack] list. Maybe this paper helps: FL is a kind of meta-FP.

http://www.cs.berkeley.edu/~aiken/ftp/FL.ps

Entry: Forth vs. C
Date: Sun Mar 29 16:29:02 CEST 2009

I'm trying to see why a single (indexable) stack for both continuations AND environment is so different from having the two separated. Also compare Forth's 2 stacks with the E and K stacks of the CEK machine.

http://www.cs.utah.edu/~mflatt/past-courses/cs6520/public_html/s00/secd.ps
http://www.cs.utah.edu/plt/publications/pllc.pdf 6.4 p73
http://lambda-the-ultimate.org/node/2423

Maybe I should implement a CEK machine and see how it behaves compared to Forth? I.e.
identify where data sharing is introduced. Maybe the key is in 'let': it introduces an element in the environment E, but leaves the continuation intact. Which operation leaves the environment intact but changes the continuation?

Entry: Shifting the Stage: Staging with Delimited Control
Date: Mon Mar 30 12:52:39 CEST 2009

This looks like an interesting next step:
http://okmij.org/ftp/Computation/Generative.html#circle-shift

In my own uneducated view, there seems to be a relation between staging and delimited control. The goal of staging is to re-order evaluation so some evaluations can be performed before final code is generated. Intermediate data structures (intermediate computations) are eliminated.

Now.. A zipper produces some (coroutine style) execution sequencing. It mainly turns things inside out, directed by the shape of a data structure: data is passed between processes in a way directed by the data structure, but if the eventual result is not necessary, re-constructing the structure can be avoided. This is essentially deforestation (elimination of intermediate structure, where intermediate computation is a form of deconstructed intermediate structure).

Now, that's a very vague notion. Let's try to make it more precise. Maybe it's the "left fold with abort" that's been mentioned in Oleg's writings. Let's just write a zipper based on a left (non-recursive) fold. This leads to:

(define-struct zipper (element state yield))

(define (collection/traverse->zipper collection init-state
                                     [fold traverse/fold])
  (reset
   (fold (lambda (el state)
           (shift k (make-zipper el state k)))
         init-state
         collection)))

(define (traverse/fold fn init-state lst)
  (let next ((s init-state) (l lst))
    (if (null? l)
        s
        (let ((s+ (or (fn (car l) s) s)))
          (next s+ (cdr l))))))

The contents of the struct:

- element: current focus
- state: state of the output (data structure)
- yield: state of the input (continuation)

Now, does it make sense to put the input and output state on the same ground? In the zipper both input and output state are contained in the continuation. Is it possible to write a zipper which has only an abstract representation of output state (compile-time only) but never produces any real data: all we're interested in is how code is sequenced in the end. That's another vague notion.. This probably relates to the trouble of having to write code in CPS when staging is involved.

Entry: normal numbers / autodiff -> memoized expressions
Date: Mon Mar 30 13:16:47 CEST 2009

Combining data structures reminds me of normal numbers and automatic differentiation: instead of producing a giant expression, autodiff produces a memoized version that has lower complexity than a fully substituted single expression. Differentiation doesn't really change complexity, but it is a "convolution" operation that increases only memory use. Maybe this can be generalized to the composition of any signal/image-processing operations: instead of working with intermediate representations, use deforestation to construct memoized expression iterators.

Entry: memoization
Date: Mon Mar 30 14:32:45 CEST 2009

http://okmij.org/ftp/Computation/staging/circle-shift.pdf contains an example of how to memoize a recursive function using a modified Y-combinator which leaves the (parameterized) function body unmodified. The paper is about memoization (let-insertion) and ... (if-insertion) in a safe way (preventing side-effects from creating unsoundness) using restricted use of delimited control.
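The plain, unstaged version of that memoizing combinator is easy to write down. A sketch (hypothetical names; the paper's version works on code, not values):

;; A fixpoint combinator that wraps memoization around an unmodified,
;; parameterized function body.
(define (memo-fix f)
  (let ((table (make-hash)))
    (define (self x)
      (hash-ref table x
                (lambda ()
                  (let ((v ((f self) x)))
                    (hash-set! table x v)
                    v))))
    self))

;; The body below doesn't know it is being memoized:
(define fib
  (memo-fix (lambda (self)
              (lambda (n)
                (if (< n 2) n (+ (self (- n 1)) (self (- n 2))))))))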
Entry: names in concatenative language
Date: Mon Mar 30 14:40:52 CEST 2009

What I still don't fully get is how to treat names in a concatenative language in a localized way. It has been quite clear to me for a while that names should _only_ reflect functions, not values. It's possible to construct complete static reasoning based on functions alone, with values (i.e. numbers and stacks) being restricted to run time only. But then.. What is a number? In lambda calculus it can be modeled by a function..

This remark is more about binding forms. I.e. with the ``|'' word used in macros, names get bound to values (actually constant functions), though there was doubt for a while whether to support a different binding form that did not perform this form of automatic abstraction.

Entry: Staapl vs. MetaOcaml
Date: Mon Mar 30 15:10:03 CEST 2009

Staapl is simpler due to the absence of binding problems in the generated language and the absence of static type guarantees. But can some of the static typing used in MetaOcaml be used to make some operations in Staapl better defined?

Maybe it's time to start splitting the project in two? One part moves along with Forth, stacks and macros and evolves towards some kind of type system or better characterization of the compile time computations, while the other uses MetaOcaml to target nested C expressions so lambda calculus can be used and low-level machine mapping is left to the C compiler. I'm really not so interested in register allocation and machine-specific data and control flow hacks. Ok for simple processors and Forth, but for RISC it's already been solved many times.

So where do I move from here? Probably typed scheme. The MetaOcaml part will be split off as http://zwizwa.be/darcs/ip

Entry: let and lambda
Date: Mon Mar 30 15:22:58 CEST 2009

In another post I talked about using traversal macros in C instead of traversal functions that take context objects, since macros allow the enclosing lexical environment to be used. The real point is the difference between ``let'' and ``lambda'': the latter ``forks'' the stack while the former does not. This leads to the following question: is it possible to replace lambda with fork completely? Single-stack tasks instead of higher order functions?

Entry: static Staapl : core problem?
Date: Mon Mar 30 16:20:59 CEST 2009

The compile time data types are already typed, but the data type processors are not: there might still be code transformer definitions that don't make sense. Because this is rather entangled, it might be interesting to just make the cut (turn assembly opcodes into structs instead of lists) and see where it starts to bleed. From staapl/pic18/asm.ss it looks like the best place to start is the instruction-set macro. Next: set up some regression tests.

Entry: bugfixes + regression tests
Date: Tue Mar 31 10:33:17 CEST 2009

Before changing anything in the compiler, testing needs to be made a bit more reliable. The simplest thing to do is to record the output of compiling all programs in app/ and require these to stay the same.
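Roughly, per program (a sketch; compile-app and the golden files are hypothetical, with-output-to-string and port->string come from scheme/port):

;; Compare a program's compiler output against a recorded "golden"
;; file; any difference is a regression (or an intentional change
;; that needs re-recording).
(define (golden-test src golden compile-app)
  (let ((out (with-output-to-string (lambda () (compile-app src)))))
    (unless (equal? out (call-with-input-file golden port->string))
      (error 'golden-test "~a: output differs from ~a" src golden))))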
bug: "load synth/synth.f" doesn't work: staaplc -d /dev/ttyUSB0 synth-icd-p18f1220.f current-load-relative-directory: not a complete path: "synth/" === context === /home/tom/staapl/staapl/forth/parser-tx.ss:287:0: load-tx /home/tom/staapl/staapl/forth/parser-tx.ss:287:0: load-tx /home/tom/staapl/staapl/forth/parser-tx.ss:194:5 /home/tom/staapl/staapl/purrr/forth-begin-tx.ss:87:2: code->toplevel-form /home/tom/staapl/staapl/purrr/forth-begin-tx.ss:46:0: forth-begin-tx make: *** [synth-icd-p18f1220.hex] Error 1 OK. got toplevel "make test" now which builds everything from scratch. Entry: instruction-set Date: Tue Mar 31 11:13:36 CEST 2009 The assembler contains two things: - static description of I/O - a 2-way function (equation) relating bit vector to parsed bit vector Let's look at typed assembler languages and see what the basic ideas are. Some form of annotation is required.. box> (instruction-set-tx #f #f #'((addwf (f d a) "0010 01da ffff ffff"))) (begin (begin (#f 'addwf (make-asm (lambda (f d a) (list (ignore-overflow f 8 (ignore-overflow a 1 (ignore-overflow d 1 9))))) '(addwf (f d a) ((9 . 6) (d . 1) (a . 1) (f . 8))))) (#f 9 6 (lambda (opcode) (match (cadr (chain `(,opcode ()) (dasm-step 'f 8) (dasm-step 'a 1) (dasm-step 'd 1))) ((d a f) (list 'addwf f d a))))))) This produces both an assembler and a disassembler. For static analysis however, these functions are better turned into interpretation steps mapping between binary and struct:asm Probably best to start from scratch since the current code is a bit entangled. So, basic idea: the assembler description is where machine and compiler meet. This should eventually be linked to a machine simulator for verification. ;; Eventually the assembler should include a complete model of the ;; machine's execution engine, which combined with a memory model ;; (including memory mapped io devices) can fully simulate code. ;; The primary relation we're interested in now is the equivalence ;; relation BI <-> SI for asm/dasm. ;; BI = binary instruction ;; SI = symbolic instruction ;; I = instruction (modulo equivalence, represented by SI) ;; asm : SI -> BI ;; dasm : BI -> SI ;; sim : SI -> machine -> machine To get to know typed scheme I'm trying to express the representation of the assemblers as ts types. (define-struct: Bitfield ([value : Number] [width : Number])) (define-struct: SI ([asm : Assembler] [dasm : Disassembler])) (define-type-alias Disassembler (Bitfield -> (Listof Bitfield))) (define-type-alias Assembler ((Listof Bitfield) -> Bitfield)) Now, what should addwf be? It needs to be an instance of a type since it will be used as part of pattern based tree-transformation. (define-struct: addwf ...) But each instance of such a struct will be linked to an assembler and disassembler function that map between this struct and Bitfields. So "addwf" is not just a type. (define-struct: addwf ()) (: addwf-asm (addwf -> Bitfield)) (: addwf-dasm (Bitfield -> addwf)) Note: Writing in a typed language it strikes me that in an dynamically typed language you get polymorphy for free: types are predicate functions. (define-struct: addwf ([f : (-> Bitfield)] [d : (-> Bitfield)] [a : (-> Bitfield)])) (: addwf-asm (addwf -> Bitfield)) (: addwf-dasm (Bitfield -> addwf)) The types of addwf's arguments are thunks. The important thing to note here is that they can't be numbers, since they might depend on addresses. This cyclic dependency is resolved later in the relaxation algorithm, where the thunks are re-evaluated. 
So, it might be best to include this in the typing too: a bitfield contains something that depends on a dictionary.

Next: parameterize opcode (for classes of opcodes with the same operands). This is where the simulator needs to be plugged in too. I.e. for (addwf f d a) we need:

- a class (fda f d a)
- behaviour in 2 stages: assembly and simulation

Simulation is necessary for partial evaluation (which then needs 2 separate semantics: bit-true and extended types).

Note: this is starting to make sense.. Requiring types for all this requires me to think harder about dependencies. Looks like figuring out this type system is the next stage for Staapl. I really needed to do this once to know what to write down now..

Next:

* make the current implementation of addwf correct
* add classes + 2-stage behaviour

Entry: defunctionalization
Date: Wed Apr 1 10:34:03 CEST 2009

http://www.kennknowles.com/blog/2008/05/24/what-is-defunctionalization/

Defunctionalization is transforming a program to eliminate higher-order functions. I ran into this before in the Gambit-C presentation by Marc Feeley:
http://www.iro.umontreal.ca/~boucherd/mslug/meetings/20041020/minutes-en.html

Entry: fine grained multitasking
Date: Wed Apr 1 21:06:32 CEST 2009

Here entry://20090322-215126 I wrote something about Forth being a ``control language'': something which is mostly sequencing of instructions and control/branching instead of numbercrunching or data structure processing. Is there a name for this class of programs?

Entry: ideas and confusion
Date: Fri Apr 3 12:54:07 CEST 2009

Things I'm working on that don't make complete sense yet:

TYPES:
- staged assembly language (machine semantics: compile / eval). eval = relative to machine, compile = relative to branch label allocation.
- fixing the macro specification "patterns" language to use typed scheme.

PEVAL:
- "total deforestation": use of intermediate high-level data structures for program specification and simulation + composition that allows total elimination of this intermediate data representation.
- iterators -> tasks using delimited continuations (actor model)

Entry: machine
Date: Fri Apr 3 13:02:23 CEST 2009

Before going anywhere, it needs to be defined what a "machine" is. It is important that the description of the machine (data structures and interpreter) is written in such a way that it can be easily specialized/compiled (to C code).

One important question: why use bottom-up operational semantics? Why not define an intermediate semantics (i.e. Forth)? The answer is: this is about (specializing) compiler correctness. The high level semantics themselves are trivial and could be used to verify the bottom-up semantics, but this requires complete simulation of the true (physical) semantics.

A machine is:

* a collection of registers / memory arrays
* MIMO functions

Note that the registers are _not_ single-assignment like in [LLVM], since we're trying to model exact machine semantics for simulation, and are not trying to map a generic language to a register machine. Should I take a hint from this? Should the machine model be completely functional so code can be transformed into data flow networks? Or can I write a model using assignments only, to avoid overspecifying operations (declaring outputs), and then use dataflow analysis to transform it to a different representation? I probably need to take a better look at Chapter 8 in [ACDI]. A lot of intuition is still missing..
The bottom line however: the machine description needs both:

- assembly (compilation)
- simulation (evaluation / compile time computation / type system).

The description of the semantics should be compilable as fast and standalone code (read: C). I can do this (practically) only for simple machines, or subsets of larger machines.

[ACDI] http://books.google.be/books?id=Pq7pHwG1_OkC
[LLVM] http://llvm.org/

Entry: staapl/comp
Date: Fri Apr 3 20:53:12 CEST 2009

;; Code to build structured assembly code graph from forth code. This
;; uses an extension to scat's 2-stack model to represent
;; concatenative macros with a Forth-style control stack.

I'm wondering if it's possible to completely eliminate this part. Implementing coma control on top of comp's control is maybe not the right approach. The code in staapl/comp needs to be documented properly, or simplified in a bottom up approach. It's ok to have fallthrough and multiple entry and exit points, but the way it is implemented is not quite clear.

Entry: transposed notation
Date: Sat Apr 4 09:50:59 CEST 2009

(define-syntax-rule (update ((formal next) ...))
  (lambda (formal ...)
    (values next ...)))

What about using this macro to write down machine operations? One advantage is that it makes the implementation faster because it doesn't need dynamic memory for the arguments/return values. (Can I test this assumption?) Let's stick to syntactic abstraction first and leave implementation for later.

Entry: machine specification
Date: Sat Apr 4 11:05:53 CEST 2009

* The machine's namespace should be partly accessible. I.e. if the machine has registers A B C, it should be possible to define an update using only a subset of the registers, with the other registers left untouched.

* Implementation of sequencing should be completely abstract (on the macro level). This way it can later be changed from struct -> struct to values -> values to list -> list or whatever suits the needs.

So, what is necessary?

NAMESPACE: some way of specifying the machine's namespace. This could be implemented as a struct, list, vector, threaded values... This needs compile-time identifiers.

UPDATE: parallel assignment defined for the abstract namespace.

Roadmap: first transform the current machines (i.e. in comp/instantiate.ss and forth/parser-tx.ss) to a transposed update form, then replace that with a fixed namespace. To do this, perform automatic transformation of the state-lambda form in staapl/scat/stack.ss:

1. trivial syntax-rules -> syntax-case
2. pretty print old form

Complications:

* 'update is used explicitly in the state-lambda syntax, probably to allow for more flexible exit points.
* pattern matching is used on the registers, so the name-based matching cannot be used, making the two mechanisms (machine namespace / positional binding + deconstruction) incompatible in the current form.

Example:

(define merge-store
  (state-lambda compiler
                ('()  ;; empty asm
                 (list-rest (list asm rs (struct dict (current chain store))) ctrl)
                 (struct dict (#f '() store+))
                 '()) ;; empty rs
                (update asm
                        ctrl
                        (make-dict current chain (append store+ store))
                        rs)))

What about this form, annotated with some redundant visual marker syntax:

;; (field pattern expr)
((asm  : '() -> asm)
 (ctrl : (list-rest (list asm rs (struct dict (current chain store))) ctrl) -> ctrl)
 (dict : (struct dict (#f '() store+)) -> (make-dict current chain (append store+ store)))
 (rs   : '() -> rs))

The advantage here is that it is immediately clear which registers are modified. A clause (r :) would mean (r : r -> r). Now..
instead of using fn -> values, isn't it better to use tail recursion (CPS), and use delimited control to terminate a sequence of operations?

Entry: values vs CPS
Date: Sat Apr 4 13:37:57 CEST 2009

For "functional update" of a state machine, it is simplest to use Scheme's built-in tail recursion. The idea is that heap data is not allocated for building argument lists - instead this is implemented using a stack with reclaimable memory. For mzscheme the place to look seems to be scheme_values() defined in src/fun.c:

if (p->values_buffer && (p->values_buffer_size >= argc))

then values get copied to the current thread's values buffer, otherwise the "slow" procedure is used, which allocates a new values buffer. Then there's stuff in the JIT. Apparently the number of function arguments is limited to 3. Anyways: implementation is for later.

The important question is: do we model individual transition functions (machine state update using external sequencing) or do we include the machine's continuation model in the transition model? I.e.

(define (instr R1 R2)   (values (+ 1 R1) R2))

vs.

(define (instr R1 R2 k) (k (+ 1 R1) R2))

It's probably best to use the latter since it is more general.

Entry: composition and compile time information
Date: Sat Apr 4 14:07:15 CEST 2009

Composing machine operations can be simplified by including compile time information about the registers modified. This allows simpler lifting of machines (i.e. single stack -> dual stack).

Entry: ARM vs. MIPS
Date: Tue Apr 7 08:57:20 CEST 2009

The two 32 bit architectures for which it makes sense to try to write native support. As a general ``getting acquainted'' move, how do these two architectures compare?

http://www.embeddedrelated.com/usenet/embedded/show/74090-1.php

Entry: machines
Date: Wed Apr 8 08:21:23 CEST 2009

The interface consists of:

1. register namespace
2. register content representation (for registers containing stacks)

The first part is necessary to decouple the cardinality and order of the registers in the representation from the specification of functionality. For the second I'm not sure whether data structures should be limited to stacks only, in which case the pattern matching becomes simpler. In all use cases in staapl (*) these are the only two.

(*) registers only: real machine simulation (w. stacks implemented in memory)
    stacks:         all scat based machines + forth parser machine

So, the macros are factored in the following way:

- convert to normal form: this introduces and/or completes clauses
- convert to binding form

For the latter there should be more than one: simple lambda forms and match expressions. So, I have a basic lambda form now:

(machine-lambda #'values            ;; continuation
                '(X Y)              ;; identifier namespace + order
                #'((X -> (car X)))) ;; update function
=>
(lambda (X Y) (values (car X) Y))

Entry: imperative module system
Date: Wed Apr 8 18:55:58 CEST 2009

Explain why the current "imperative" module system is not a good idea and how it should be changed to implement something better. Relate this to Ocaml's functors.

The problem I'm trying to solve is the following conflict: one wants the freedom to change _all_ elementary code generating functions, but one does not want the burden of declaring them as replaceable. This is about _defaults_ and about declaring how we deviate from the defaults. The problem is: with unconstrained mutation of behaviour, locality is lost. We really do want to change global behaviour in some cases (i.e. a different machine), but we'd like some semantics to stay invariant. What is the proper way to change this?
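One possible shape, sketched with hypothetical names: let each generator instance take its overridable parts as keyword arguments with defaults, so a deviation is declared locally at instantiation time instead of by global mutation (a poor man's functor):

(define (default-dup asm)  (cons '(dup) asm))
(define (default-drop asm) (cons '(drop) asm))

;; Instantiate a code generator; only what deviates from the default
;; is mentioned, and the change is local to this instance.
(define (make-codegen #:dup  (dup  default-dup)
                      #:drop (drop default-drop))
  (lambda (word asm)
    (case word
      ((dup)  (dup asm))
      ((drop) (drop asm))
      (else   (cons (list word) asm)))))

;; (make-codegen #:dup pic18-dup) overrides one generator and keeps
;; the rest invariant.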
Entry: assembler incremental changes
Date: Wed Apr 8 21:04:58 CEST 2009

With the machine/vm.ss macros working it should be straightforward to start building the assembler/simulator structure and gradually translate the macro language to a typed one.

EDIT: simply start adding _some_ form of static information to assembler instructions and build other machinery on top of this: start at the asm-register! and dasm-register! macros.

Entry: stacks
Date: Thu Apr 9 08:13:58 CEST 2009

The big problem however is the static implementation of stacks. This requires a model for memory, but I'd like to keep memory and machine separate.

Entry: asm hygiene problems
Date: Thu Apr 9 09:28:40 CEST 2009

Looks like there are a couple of problems with the current assembler implementation: when introducing symbols, some lexical information apparently gets lost. Time to clean up the lot.

Hygiene problems seem to be solved, but now the app/ tests have changed code. I'm going to revert the assembler sources to isolate the change that breaks the tests. See the offending change below. I'm going to introduce just lexical correctness, and see where it goes.

hunk ./staapl/asm/asmgen-tx.ss 28
+  "../tools/stx.ss"
hunk ./staapl/asm/asmgen-tx.ss 39
+(check-set-mode! 'report-failed)
hunk ./staapl/asm/asmgen-tx.ss 55
-;; (bitstring->list "0101 kkkk ffff ffff")
+;; Parse bitstring.
hunk ./staapl/asm/asmgen-tx.ss 61
+(check (bitstring->list "0101 kkkk ffff ffff")
+       => '(0 1 0 1 k k k k f f f f f f f f))
hunk ./staapl/asm/asmgen-tx.ss 65
-;; (bin->number '(1 1 0 0))
hunk ./staapl/asm/asmgen-tx.ss 68
+(check (bin->number '(1 1 0 0)) => 12)
hunk ./staapl/asm/asmgen-tx.ss 72
-;; (combine-bits '((k . 1) (k . 1) (k . 1)))
hunk ./staapl/asm/asmgen-tx.ss 79
-
+(check (combine-bits '((k . 1) (k . 1) (k . 1)
+                       (l . 1) (l . 1))) => '((k . 3) (l . 2)))
hunk ./staapl/asm/asmgen-tx.ss 93
+(check (split-opcode '(1 0 1 0 k k k k l l)) => '((10 . 4) (k . 4) (l . 2)))
hunk ./staapl/asm/asmgen-tx.ss 97
-(define (parse-opcode-proto str)
-  (split-opcode
-   (bitstring->list str)))
+(define (parse-opcode-proto str-stx)
+  ;; After parsing the string, the lexical information needs to be restored.
+  (define restore-lexical-info
+    (match-lambda
+     ((name . bits)
+      (cons
+       (if (symbol? name)
+           (datum->syntax str-stx name)
+           name)
+       bits))))
+  (map restore-lexical-info
+       (split-opcode
+        (bitstring->list
+         (syntax->datum str-stx)))))
hunk ./staapl/asm/asmgen-tx.ss 118
-(define (binary->proto row)
-  (match row
-    ((name proto . binary)
-     (append (list name proto)
-             (map parse-opcode-proto binary)))))
+(define (binary->proto row-stx)
+  (syntax-case row-stx ()
+    ((name proto . binary)
+     (append (list #'name ;; preserve name's lexical info
+                   #'proto)
+             (map parse-opcode-proto (syntax->list #'binary))))))
hunk ./staapl/asm/asmgen-tx.ss 126
-(check (binary->proto '(xorwf (f d a) "0001 10da ffff ffff"))
-       => '(xorwf (f d a) ((6 . 6) (d . 1) (a . 1) (f . 8))))
+(check (->sexp (binary->proto '(xorwf (f d a) "0001 10da ffff ffff")))
+       => `(xorwf (f d a) ((6 . 6) (d . 1) (a . 1) (f . 8))))
hunk ./staapl/asm/asmgen-tx.ss 159
+      (->sexp
hunk ./staapl/asm/asmgen-tx.ss 166
+       )
hunk ./staapl/asm/asmgen-tx.ss 198
-(define (instruction-set-tx asm! dasm! instructions)
-  (let ((protos
-         (map
-          binary->proto
-          (syntax->datum instructions))))
+(define (instruction-set-tx define-assembler dasm! instructions)
+  (let ((protos (map binary->proto (syntax->list instructions))))
hunk ./staapl/asm/asmgen-tx.ss 207
-            (#,asm!
-             '#,name
+            (#,define-assembler
+             #,name
hunk ./staapl/asm/asmgen.ss 36
-    ((_ asm! dasm! instructions ...)
-     (instruction-set-tx #'asm!
+    ((_ define-assembler dasm! instructions ...)
+     (instruction-set-tx #'define-assembler
hunk ./staapl/asm/asmgen.ss 42
-  (iset asm-register!
+  (iset define-assembler
hunk ./staapl/asm/asmgen.ss 50
-(let ((asm #f)
-      (dasm #f))
-  (let ((asm! (lambda (name fn) (set! asm fn)))
-        (dasm! (lambda (opc bits fn) (set! dasm fn))))
-    (iset asm! dasm!
-          (testopc (a b R) "1010 RRRR aaaa bbbb"))
-    (parameterize
-        ((current-pointers #hasheq((code . (-1)))))
-      (check (asm 4 2 -1) => '(#xAF42))
-      (check (dasm #xAF42) => '(testopc (a . 4) (b . 2) (R . -1)))
-      (void))))
+'(let* ((testopc #f)
+        (dasm #f)
+        (dasm! (lambda (opc bits fn) (set! dasm fn))))
+   (iset set! ;; define-assembler
+         dasm!
+         (testopc (a b R) "1010 RRRR aaaa bbbb"))
+   (parameterize
+       ((current-pointers #hasheq((code . (-1)))))
+     (check (testopc 4 2 -1) => '(#xAF42))
+     (check (dasm #xAF42) => '(testopc (a . 4) (b . 2) (R . -1)))
+     (void)))
hunk ./staapl/asm/dictionary.ss 23
-  asm-register! asm-find
+  ;; asm-register! asm-find
+  define-assembler
hunk ./staapl/asm/dictionary.ss 31
-  define-asm
+  define-asm ;; FIXME: get rid of this
hunk ./staapl/asm/dictionary.ss 47
+(define-syntax-rule (define-assembler name fn) (asm-register! 'name fn))
hunk ./staapl/tools/stx.ss 15
+;; FIXME: doesn't handle all cases (i.e. vectors..)
hunk ./staapl/tools/stx.ss 21
+  ((pair? x) (cons (->sexp (car x)) (->sexp (cdr x))))
+  ((null? x) '())

Entry: hygiene bug
Date: Thu Apr 9 10:34:34 CEST 2009

The minimal patch I can find that breaks the test is this:

hunk ./staapl/asm/asmgen-tx.ss 92
-(define (parse-opcode-proto str)
-  (split-opcode
-   (bitstring->list str)))
+(define (parse-opcode-proto str-stx)
+  (map (match-lambda ((param . bits) (cons (datum->syntax str-stx param) bits)))
+       (split-opcode (bitstring->list (syntax->datum str-stx)))))
hunk ./staapl/asm/asmgen-tx.ss 101
-(define (binary->proto row)
-  (match row
-    ((name proto . binary)
-     (append (list name proto)
-             (map parse-opcode-proto binary)))))
+(define (binary->proto row-stx)
+  (syntax-case row-stx ()
+    ((name proto . binary)
+     (append (list #'name #'proto)
+             (map parse-opcode-proto (syntax->list #'binary))))))
hunk ./staapl/asm/asmgen-tx.ss 108
-(check (binary->proto '(xorwf (f d a) "0001 10da ffff ffff"))
+'(check (binary->proto '(xorwf (f d a) "0001 10da ffff ffff"))
hunk ./staapl/asm/asmgen-tx.ss 178
-(define (instruction-set-tx asm! dasm! instructions)
-  (let ((protos
-         (map
-          binary->proto
-          (syntax->datum instructions))))
+(define (instruction-set-tx asm! dasm! ins-stx)
+  (let ((protos (map binary->proto (syntax->list ins-stx))))

Let's have a look at the disassembly. They're all 'bra 'bnz 'rcall instructions. The other code is intact. I suspect this is in assembler composition.
tom@zzz:~/staapl/app$ make 1220-8.diff
< 000000: d020 bra 0x42
< 000040: d101 bra 0x244
---
> 000000: d01f bra 0x40
> 000040: d0e0 bra 0x202
4,5c4,5
< 000044: d024 bra 0x8e
< 000046: d021 bra 0x8a
---
> 000044: d001 bra 0x48
> 000046: d7fd bra 0x1042
7c7
< 00004a: d029 bra 0x9e
---
> 00004a: d003 bra 0x52
10c10
< 000050: d021 bra 0x94
---
> 000050: d7f8 bra 0x1042
14c14
< 000058: d02f bra 0xb8
---
> 000058: d002 bra 0x5e
16c16
< 00005c: d021 bra 0xa0
---

Entry: refactoring
Date: Thu Apr 9 12:54:54 CEST 2009

Factored out asm-lambda.ss / asm-lambda-tx.ss implementing the DSL for assembler/disassembler specification: now separate from symbol binding. Next: fix the disassembler as a lazy list parsing step.

First step done: a generic disassemble word (works also for arbitrary literal fields) and a macro that expands to something like this:

box> (disassembler-body #'diff #'(k s) #'((118 7) (s 1) (k 8) (3 2)))
(lambda (w)
  (let-values (((field_0 s k field_3)
                (disassemble/values '(7 1 8 2) w)))
    (and (= field_3 3)
         (= field_0 118)
         (list 'diff k s))))

Entry: transposes
Date: Thu Apr 9 20:12:44 CEST 2009

Loop transformations.. These are really just about permutations of indices. Anyways.. I'd like to transform this:

(((a 1) (b 2)) ((c 3) (d 4))) => (((a b) (c d)) ((1 2) (3 4)))

which is the (outer) index transpose (i j k) -> (k i j).

Fixed. Transposition is really best handled with syntax-case ellipsis.

box> (disassembler-body #'foo #'(s k) #'(((118 7) (s 1) (3 2)) ((k 12))))
(lambda (temp54 temp55)
  (let-values (((temp56 s temp57) (disassemble/values '(7 1 2) temp54))
               ((k)               (disassemble/values '(12) temp55)))
    (and (= temp57 3) (= temp56 118) (list 'foo s k))))

The macro:

(define-syntax-rule (push! stack x) (set! stack (cons x stack)))
(define-syntax-rule (lambda* formals . body)
  (lambda (a) (apply (lambda formals . body) a)))
(define (generate-temp) (car (generate-temporaries #'(#f))))

(define (disassembler-body opcode formals body-stx)
  (define literals '())
  (define (fix-names! stx)
    (for/list ((n (syntax->list stx))
               (i (in-naturals)))
      (let ((_n (syntax-e n)))
        (if (number? _n)
            (let ((__n (generate-temp)))
              (push! literals (list __n _n))
              __n)
            n))))
  (let ((ws (generate-temporaries body-stx)))
    (syntax-case (list ws body-stx) ()
      (((w ...) (((name bits) ...) ...))
       #`(lambda (w ...)
           (let-values
               ;; Transpose it
               #,(for/list ((stx (syntax->list #'((w (name ...) (bits ...)) ...))))
                   (syntax-case stx ()
                     ((w ns bs)
                      #`(#,(fix-names! #'ns)
                         (disassemble/values 'bs w)))))
             (and #,@(map (lambda* (name value) #`(= #,name #,value))
                          literals)
                  (list '#,opcode #,@formals))))))))

Looks like I'm getting the hang of combining the highlevel ellipsis based pattern matching where possible with lowlevel macros where needed.

Entry: combined asm dasm
Date: Sat Apr 11 13:35:40 CEST 2009

From the same form both asm and dasm can be generated:

box> (asm/dasm-lambda-tx #'(add (a b) "0101 aaaa" "bbbb bbbb"))
(values (lambda (a b) (list (asm+ a 4 5) b))
        (lambda (temp70 temp71)
          (let-values (((temp72 a) (disassemble/values '(4 4) temp70))
                       ((b)        (disassemble/values '(8) temp71)))
            (and (= temp72 5) (list a b)))))

next: cleanup tools.ss dictionary.ss

Now it's important to not mess up the way the disassembler is integrated. 'dasm-find is no longer possible (we can't distinguish based on a single word). This probably means that the following lineage needs to disappear:

disassemble->word (asm/tools.ss)
tsee (live/tethered.ss)

Let's start at dasm-register!
Ok: need to repair dasm-find and dasm-register!
Let's have them introduce toplevel bindings, but also register these in a dynamic namespace (or use eval?). I'm losing track.

Entry: Berkeley CS61C Machine Structures
Date: Sat Apr 11 21:44:41 CEST 2009

http://webcast.berkeley.edu/course_details.php?seriesid=1906978500

3-op instruction format: rator rand,rand,rand. The $zero register enables assignment:

add $r1, $r2, $zero

The lecture series is rather silly and slow, but MIPS is a nice ISA.

Entry: disassembler types
Date: Sun Apr 12 10:34:52 CEST 2009

Should the disassembler know about the signedness of values? Yes, but where do we do that? Upper case are signed values, R is PC relative.. This needs a better spec: type info needs to travel somehow. It's probably best to see the assembler as a 2-way function (equation) and provide a composition and contract mechanism.

Entry: more fixes
Date: Sun Apr 12 11:01:18 CEST 2009

word-ref: expects type as 1st argument, given: #; other arguments were: 1

 === context ===
/data/safe/tom/darcs/brood-5/staapl/asm/dictionary.ss:72:0: proto->asm-error-handler
/data/safe/tom/darcs/brood-5/staapl/asm/assembler.ss:249:0: resolve/assemble
/data/safe/tom/darcs/brood-5/staapl/asm/assembler.ss:276:0: nop core
/data/safe/tom/darcs/brood-5/staapl/live/state.ss:56:0: assemble-chains
/usr/local/plt/collects/scheme/private/map.ss:44:11: for-each

Apparently, asm procedures are wrapped in a struct that contains meta info. Let's keep this for now (see struct:asm in staapl/asm/dictionary.ss).

Entry: magic
Date: Sun Apr 12 11:18:07 CEST 2009

The problem with improperly compiled jumps seems to have disappeared.. Maybe it was a consequence of lingering non-hygienic constructs that simply disappeared with the cleanup? Somehow I don't believe this, so let's try to execute one of the images. Seems to run just fine. (cd app ; make picstamp.live)

Entry: static assembler data
Date: Sun Apr 12 11:23:04 CEST 2009

The next thing to figure out is how to attach "type" info to the assembler. Need to think about:

* assembler primitives
* assembler composites
* asm code processors (operate on structs)

Another thing: all operand processing (signed/unsigned, relative/absolute) should be somehow attached to the disassembler. These are enough to start thinking about representation:

* operands are not bit vectors: they have types
* assembly, disassembly and opcode construction/deconstruction should be handled.

Entry: restating the question
Date: Thu Apr 16 20:24:59 CEST 2009

So the real question is: can we build a typed metaprogramming system from the ground up?

- types + abstract interpretation
- a module system for instantiating generic code
- some formalism to think about names (forth + functors?)

    instantiation
    module --------------> package

    application
    function ------------> value

Entry: sequence tools
Date: Fri Apr 17 09:56:30 CEST 2009

Instead of letting staapl depend on zwizwa/plt I'm going to simply copy the files (symbolic links won't work). Later this could be factored out as a separate planet package, whenever I publish something else.

Entry: asm/dasm cleanup
Date: Fri Apr 17 12:32:10 CEST 2009

So, the idea is that asm and dasm become integral parts of the language tower, to allow compile-time checks (type checking) and partial evaluation based on abstract interpretation. The challenge is going to be to get a better idea of what is static and what is dynamic. The confusion is rooted in the type of application: it is a code transformer, so should all code transforming code be scheme compile-time code or not?
next:

* referential transparency for asm and dasm (replace parameters with explicit environment objects).
* syntax cleanup for signed/unsigned, relative/absolute addressing

It's probably best to focus on dasm for now, since it's simpler. Asm also contains a link to the target dictionary.

next:

* add address input to dasm
* add type convertors to dasm

Type converters: the main decision is run time (interpretation) or compile time conversion. Let's stick to run-time, and move to compile time when necessary. What is needed is the analogue of this:

(define (paramclass->asm name)
  (case name
    ((R) #'asm+/pcr)  ;; used for relative jumps
    (else #'asm+)))   ;; assemble value ignoring overflow

Ok. Using this:

(define (paramclass->dasm name)
  (case name
    ((R) #'dasm/pcr)
    (else #'dasm/unsigned)))

we will do parameter class translation at compile time:

box> (dasm-lambda-tx #'foo #'(a R) #'(((123 10) (a 1) (R 5))))
(lambda (temp12)
  (let-values (((temp13 a R) (disassemble/values '(10 1 5) temp12)))
    (and (= temp13 123)
         (list 'foo (dasm/unsigned a) (dasm/pcr R)))))

Entry: don't mix side-effects and streams..
Date: Sat Apr 18 12:56:55 CEST 2009

The problem is the context-dependency of the disassembler on the current code pointer, which probably doesn't work well with parameterization.. The result of disassembly seems to be correct, but I doubt it really does what I think it does.. Maybe the parameter should be pushed down? No, let's do this properly: all dasm words get passed the current code location as a first argument.

An arbitrary choice: should the PC that's plugged into a disassembler be the location of the instruction, or the value of the PC when the instruction executes (pointing _past_ the instruction)? Let's stick as close as possible to the machine definition: PC should point after the instruction.

next:

* assume that the PCR addressing mechanism is the same for all architectures of interest. If not, this should be parameterized in operand.ss
* make the assemblers referentially transparent.

Problem: when invoking the assembler, the next position of the PC is not known, so we can't use this! The transformation from instruction address -> PC is done in the macro, which knows instruction sizes.

Entry: next problem
Date: Sat Apr 18 15:19:01 CEST 2009

The pattern matcher checks the arity of the assembler functions. There's a problem when moving from (formals) -> (address . formals) for the asm prototype. Maybe this should be temporarily disabled? ok. stupid typo.

Entry: disassembly address?
Date: Sat Apr 18 17:14:14 CEST 2009

Should the disassembler contain the address? Maybe not, since it's already converted to absolute addressing..

Entry: further cleanup
Date: Sat Apr 18 17:37:52 CEST 2009

define-lowlevel-asm: needs "define". Original:

(define-sr (define-lowlevel-asm (name addr . formals) . body)
  (asm-register! 'name
                 (make-asm (lambda (addr . formals) . body)
                           '(name formals))))

There are more problems here: is the '(name formals) info really necessary at run time? -> no. This is supposed to be filled with the prototype from the assembler generator.

Entry: name structure
Date: Sat Apr 18 18:50:26 CEST 2009

I find there are a lot of subtleties due to the hash-table based (mutable) name structure. To solve this properly will require quite an overhaul.. Where to start? Let's first take the files apart, and figure out what is necessary at compile time to perform pattern matching. Hmm.. too much junk still.. Let's eliminate the run-time pattern first. What is it used for?
Easy to see by changing the accessor interface: asm-prototype ->
asm-proto

It is only used in the error handler. Let's take it out completely.

  proto->asm-error-handler : assembler.ss

Ok.. Now, what information is necessary at run time about an
instruction assembler? For testing purposes it might be linked to its
disassembler, but that's about it.. All other info is best checked at
compile time (i.e. in pattern matching).

Entry: moving towards hash-less assembler
Date: Sun Apr 19 11:08:25 CEST 2009

This requires the 'patterns macro to be adjusted to take names from
the module namespace instead of the hash table. Note that this is part
of a more ambitious project of eliminating _all_ mutation of names
(also for macros) and replacing them with directed acyclic name
structures (modules).

The problem here is that in a lot of places, symbols get quoted
directly, so it's a bit spread all over.

Maybe I'm not quite ready yet. I don't completely understand _what_ an
assembler should be. This is a lack of theoretical knowledge and of a
good model of what I'm doing. It is:

  - syntax to be manipulated (object language)
  - syntax to be translated to machine code
  - syntax to be interpreted (simulator)

Can we model these as functions?

  - macro    : [asm] -> [asm]
  - assemble : [asm] -> [bin]
  - simulate : [asm] -> (m -> m)

Where [] = list of
      asm = assembly instructions
      bin = binary instructions (bit vectors)
      m   = machine state object

There are others:

  - primitive-assemble : asm -> [bin]
  - disassemble        : [bin] -> [asm]

Instead of poking around in the current code, it might be wiser to
start building the core from the ground up, to see what exactly can be
done with it.

Entry: what is an assembly instruction?
Date: Sun Apr 19 11:27:37 CEST 2009

SYNTAX: [asm] -> [asm]

An abstract data type to be manipulated by coma macros. This type has
expansion-time information that checks constraints on the syntax: name
and arity (and later possibly type).

SEMANTICS: (asm,here,operands) -> [bin]

Assembly instructions can be _compiled_ to binary machine code,
relative to the location at which they are supposed to be placed. Note
that operands need to be bit vectors. This function is called a
primitive assembler. The (partial) inverse of this is called a
primitive disassembler.

(asm,here,operands) -> (m -> m)

The primitive simulator is merely a different representation of the
semantics, accessible from the syntax manipulation phase. It
translates a fully specified instruction into a machine state update
function, enabling the instruction to be _interpreted_.

GRAPH: ([asm],dict,start) -> [bin]

Instances of asm structures can contain symbolic information which
eventually depends on the size of the [bin] output of primitive
assemblers. This circularity is broken by successive approximation
using a relaxation algorithm.

Anything that translates to [bin] loses information: in this stage,
all symbolic/parameterized information is gone.

The interesting parts are the coma [asm] -> [asm] translation and the
interpretation steps (asm,here,operands) -> (m -> m). If the latter
can be made to perform abstract interpretation, a lot of checks can be
constructed.

It looks like the important conclusions are:

  - flexibility of representation of instruction semantics (m -> m)
    will determine how the information encoded in these functions can
    be fed back to [asm] -> [asm] transformers (i.e. to generate
    peephole optimizations). however, this can be added later on, on
    top of basic static structure.
  - practically: information about the instruction's type needs a
    compile-time binding.

Entry: compile time bindings
Date: Sun Apr 19 12:07:59 CEST 2009

let's change the asmgen macro to add a syntax binding containing the
assembler prototype.

Ok.. it's not what it needs to be, but the basic defining form is
there. Time to give things a name. What should we call the data
structure that, when collected in stacks, is symbolic machine code?
Let's call it ``op''. Then:

  - coma manipulates ops
  - the assembler translates ops into binary code or simulator code.

So.. Now that the stage is set, what am i going to do with it? I have
a little bit more information than just the op's arity: types and
rel/abs addressing are available too. Are these interesting? Not
really... These only matter when the operands are numerical -- not
during the [op] transformation phase. So.. Until there is some form of
abstract interpretation possible, we can't do much more than simply
checking existence and arity.

So.. Is it necessary to re-invent structures? Will I use something a
static struct can't carry? Yes.. It's exactly the "type" information
associated with the op that can lead to automatic transformer
derivations.. Let's just see if the static structure is in place to
actually verify existence first. In pattern-tx.ss

In the pattern expansion, tags are now wrapped in a verify struct
which looks up the tag in the (op) namespace.

Next: unify all definition syntaxes.

  ;; Main definer body for asm/dasm/op namespaces.
  (define-syntax-rule (define-asm/dasm/op static asm-body dasm-body)
    (begin
      (define-syntax-ns (op) name static)
      (define-values-ns (((asm) name)
                         ((dasm) name))
        (let ((asm asm-body)
              (dasm dasm-body))
          ;; backwards compat (later, use reflective operations for this)
          (asm-register! 'name asm)
          (dasm-register! #f #f dasm)
          (values asm dasm)))))

OK.. got arity checking working in pattern-tx.ss :

  (define (check-ins type ins)
    (syntax-case ins ()
      ((rator rand ...)
       (if (not (identifier? #'rator))
           (printf "warning: can't verify parametric instruction template ~a\n"
                   (syntax->datum ins))
           (let* ((id (ns-prefixed #'(op) #'rator))
                  (op (syntax-local-value id (lambda () #f))))
             (if (not op)
                 (printf "warning: unchecked ~a: ~a\n" type (syntax-e id))
                 (if (= (op-arity op)
                        (length (syntax->list #'(rand ...))))
                     (void)
                     ;; (printf "ok: ~a\n" (syntax->datum #'rator))
                     (raise-syntax-error #f "incorrect arity" ins))))))))

Some things can't be checked yet, which produces some warnings. It's
safe to remove the old run-time checking method now.

OK: i've left the symbolic representation as-is, and used the static
info just as a check: it emits warnings when it can't find prototypes.

Now, is it possible to eliminate the current warnings? It should be
possible to separate declaration and implementation of certain
instructions... Somehow circularity needs to be broken. The problem is
that it is possible to define op manipulations (macros) without having
the ops defined, with the implementation following later. Maybe we
should allow single assignment of functionality? This keeps things
declarative, but gets rid of the ordering of particular definitions.

I'm a bit confused.. What's the right question to ask? The problem is
best understood with an example:

  (patterns (macro)
    (([cw a] primitive-exit) ([jw a]))
    ((primitive-exit)        ([exit])))

This is from core.ss  The problem is that core.ss needs to know the
definitions of cw, jw and exit before this makes sense.
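A minimal sketch of one way out, using PLT units (all names here are
hypothetical, not staapl's actual interface): declare the ops once in
a signature, let the generic core import the declaration, and let each
machine export an implementation.

  #lang scheme/base
  (require scheme/unit)

  ;; ops are _declared_ once:
  (define-signature machine^ (cw jw exit))

  ;; generic code imports only the declaration:
  (define-unit core@
    (import machine^)
    (export)
    (printf "tail call: ~a\n" (jw 123)))

  ;; a machine provides the implementation:
  (define-unit pic18@
    (import)
    (export machine^)
    (define (cw a) (list 'cw a))
    (define (jw a) (list 'jw a))
    (define (exit) (list 'exit)))

  ;; link and run:
  (invoke-unit
   (compound-unit (import) (export)
                  (link (((M : machine^)) pic18@)
                        (()               core@ M))))

This only links run-time values though; the pattern matcher also wants
compile-time info, which is where it gets interesting.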
So, instead of using late binding (which is why the assemblers are in
a hash table right now) this could be solved using parameterized
modules: if you want to use core.ss, you need to first provide
meaningful definitions. The fact that i want to export compile-time
values makes this problematic.

Entry: the real problem
Date: Sun Apr 19 19:45:44 CEST 2009

Currently I'm using late binding for some behaviour, both for the
assembler and for macros: they behave as mutable nodes. I'd like to
turn this into a declarative structure.

Maybe i should have a look at PLT Scheme's unit facility to tackle
this one..

  Units organize a program into separately compilable and reusable
  components. The imports and exports of a unit are grouped into a
  signature, which can include "static" information (such as macros)
  in addition to placeholders for run-time values.

Is this different from before? If they do export static information,
it's basically what I'm looking for.

  Compiler template + Machine description -> Compiler

http://download.plt-scheme.org/doc/html/guide/units.html

Could be signatures (plug-in interfaces):

  - machine model (cw,jw,qw + basic macros)
  - compiler core (control flow graph construction vs. "label")

Entry: big change
Date: Mon Apr 20 10:12:20 CEST 2009

question is, should i dive right in? eliminate all mutation and
rebuild? it doesn't seem there is a gradual way.. let's try. start
with the machine model.  (staapl/coma/core.ss)

Entry: ns refactoring
Date: Mon Apr 20 10:28:11 CEST 2009

Taking it out and placing it in a separate submodule. Separate from
scat code and simplify the prefix tricks.

Entry: all pseudo
Date: Mon Apr 20 12:00:18 CEST 2009

What about this: simply require some basic ops to be pseudo
instructions (which cannot be overridden) and require the elimination
of these in the machine definition. That way there is no "inner"
dependency on the machine's assembler: the machine's compiler will
eliminate all virtual code in terms of which the core compiler and
macro language are defined.

So.. Let's define the virtuals once and for all in coma/core.ss

Aha. The problem is that when in different modules, names _can_ be
redefined, since the modules are independent. It is only visible
globally in the asm register.

So, I'm confident I can pull this off and change the pattern matching
and assembler so they don't need the hash any more. Dependencies will
just roll out.

Next: change the instruction representation. Maybe it's best to just
stick with replacing the symbol with an abstract rep. Maybe just the
'asm struct.

Matcher + op transformers compile. Matchers seem to work too:
coma.ss :

  box> (macro> 1)
  [# #x1]
  box> (macro> 1 2)
  [# #x1]
  [# #x2]
  box> (macro> 1 2 +)
  [# (1 2 +)]

moving into assembler.ss now -> getting rid of asm-find

hmm.. the assembler needs to know (ns (asm) nop) -> set to 0000 for
now.

so.. looks like the basic change works, just some rules won't match.

ha.. silly match bug: matches never failed.

ok. now removing asm-register! and asm-find

Entry: now the same for macros?
Date: Mon Apr 20 18:41:48 CEST 2009

That's a different kettle of fish! This needs a bit more thought.
There are 2 forms of mutation:

  - what to do with "extensions": augmenting pure compile-time
    constructs with compilable code. (super)
  - override: complete re-implementation (extension without delegate).

Entry: update: we're still symbolic
Date: Mon Apr 20 18:45:01 CEST 2009

The assembler transformation still uses just lists; however, the tags
are now identifiers.
It's still possible to move around tags though, so these aren't
algebraic types. It would be interesting to see exactly why not.

  * Polymorphism: it's possible to parameterize tags and work with
    instruction prototypes only. This could be properly abstracted
    into instruction class objects.

  * Non-exhaustiveness: operators act on only a very small portion of
    the possible operations, with the default being simple
    concatenation.

Entry: Macro transformers
Date: Tue Apr 21 10:04:36 CEST 2009

Now there are m : [op] -> [op] transformers. Is it possible to deduce
from these [m] -> [m] transformers?

Entry: Mutation
Date: Tue Apr 21 10:06:34 CEST 2009

Now it's time to see how we can get rid of mutation. Only redefine
behaviour by explicit renaming. This will break .f code, so it's best
to do it in two stages: first the core code, then the .f code.

Entry: today..
Date: Tue Apr 21 11:18:25 CEST 2009

I'm going to take it easy.. let this rest a bit. Let's pick some
low-hanging fruit.

next: ns syntax. Is it possible to do all of this with a single ns
syntax?

Entry: expand-to-top-form
Date: Tue Apr 21 15:14:40 CEST 2009

  (define-syntax (ns stx)
    (syntax-case stx ()
      ((_ (ns ...) expr)
       (let ((prefixed (lambda (n) (ns-prefixed #'(ns ...) n))))
         (if (identifier? #'expr)
             (prefixed #'expr)
             (let ((id=? (lambda (stx symb)
                           (eq? (syntax->datum stx) symb)))
                   (prefixed-list (lambda (stx)
                                    (map prefixed (syntax->list stx))))
                   (exp (expand-syntax-to-top-form #'expr)))
               ;; (printf "top: ~a\n" (syntax->datum exp))
               (syntax-case exp ()
                 (((form (((n1) (form1 names e ...))) n2) i ...)
                  (id=? #'form 'letrec-values)
                  #`((form (((n1) (form1 #,(prefixed-list #'names) e ...))) n2)
                     i ...))
                 ((form b . e)
                  (id=? #'form 'let-values)
                  #`(form #,(for/list ((n (syntax->list #'b)))
                              (syntax-case n ()
                                ((names e)
                                 #`(#,(prefixed-list #'names) e))))
                          . e))
                 ((form names e)
                  (or (id=? #'form 'define-values)
                      (id=? #'form 'define-syntaxes))
                  #`(form #,(prefixed-list #'names) e)))))))))

This doesn't work.. I get undefined references to let-values /
define-values ... Maybe it's best to not invoke the transformer?

OK.. got it working with an explicit preprocessing macro:

  (define-syntax (ns stx)
    (syntax-case stx ()
      ((_ (ns ...) expr)
       (let* ((prefixed (lambda (n) (ns-prefixed #'(ns ...) n)))
              (prefixed-list (lambda (stx)
                               (map prefixed (syntax->list stx))))
              (prefixed-binders
               (lambda (p)
                 (lambda (binders)
                   (for/list ((b (syntax->list binders)))
                     (syntax-case b ()
                       ((n e) #`(#,(p #'n) e))))))))
         (if (identifier? #'expr)
             (prefixed #'expr)
             (let ((form? (let ((form (car (syntax->datum #'expr))))
                            ;; (printf "form = ~a\n" form)
                            (lambda (symb) (eq? form symb)))))
               (syntax-case #'expr ()
                 ((form (name . a) e)
                  (or (form? 'define) (form? 'define-syntax))
                  #`(form (#,(prefixed #'name) . a) e))
                 ((form name e)
                  (or (form? 'define) (form? 'define-syntax))
                  #`(form #,(prefixed #'name) e))
                 ((form names e)
                  (or (form? 'define-values) (form? 'define-syntaxes))
                  #`(form #,(prefixed-list #'names) e))
                 ((form binders e)
                  (or (form? 'let) (form? 'letrec) (form? 'shared))
                  #`(form #,((prefixed-binders prefixed) #'binders) e))
                 ((form binders e)
                  (or (form? 'let-values) (form? 'letrec-values))
                  #`(form #,((prefixed-binders prefixed-list) #'binders) e)))))))))

Managed to delete a whole lot of code that's no longer used with this
simpler approach.

Entry: the real deal
Date: Wed Apr 22 12:26:23 CEST 2009

Now for the biggies:

  * how to get rid of mutation of macros?
  * internal state monad?
  ;; \__ target
  ;; \__ scat
  ;; \__ coma
  ;;        macro::jump
  ;; \__ control
  ;;        macro::sym
  ;;        macro::label:
  ;;        macro::exit
  ;; \__ comp
  ;; \__ asm
  ;; \__ forth
  ;; \__ live
  ;;        macro::+
  ;;        macro::/
  ;;        macro::*
  ;;        macro::-
  ;;        macro::dup
  ;;        macro::drop
  ;;        macro::swap
  ;;        macro::,
  ;;        macro::or-jump
  ;;        macro::not
  ;;        macro::then
  ;; \__ purrr
  ;; \__ pic18
  ;;        macro::TRISC
  ;;        macro::STATUS
  ;;        macro::FSR2L
  ;;        macro::FSR2H
  ;;        macro::PLUSW2
  ;;        macro::PREINC2
  ;;        macro::POSTDEC2
  ;;        macro::POSTINC2
  ;;        macro::INDF2
  ;;        macro::FSR1L
  ;;        macro::FSR1H
  ;;        macro::PLUSW1
  ;;        macro::PREINC1
  ;;        macro::POSTDEC1
  ;;        macro::POSTINC1
  ;;        macro::INDF1
  ;;        macro::WREG
  ;;        macro::FSR0L
  ;;        macro::FSR0H
  ;;        macro::PLUSW0
  ;;        macro::PREINC0
  ;;        macro::POSTDEC0
  ;;        macro::POSTINC0
  ;;        macro::INDF0
  ;;        macro::PRODL
  ;;        macro::PRODH
  ;;        macro::TABLAT
  ;;        macro::TBLPTRL
  ;;        macro::TBLPTRH
  ;;        macro::TOSL
  ;;        macro::TOSH
  ;;        macro::TOSU
  ;;        macro::C
  ;;        macro::Z

So... I tried to remove the redefine! syntax, but it seems to depend
on (rpn-map-identifier) for finding already existing bindings. Can
this be moved to static namespace names?

Upgraded ns (ns-tx) so it can handle require/provide forms too. Now
replaced redefine!-ns in (compositions ...) with a proper define, and
i'm trying to use (ns-in (macro) (except-in ...)) to explicitly
re-define things.

Entry: hierarchical language
Date: Wed Apr 22 14:24:04 CEST 2009

So.. Why should coma/language.ss have jump? Maybe coma/language.ss
should just be about partial evaluation: all control flow stuff should
be gone. This means it should not know about CW JW EXIT

But.. Let's fix that later. Now take out things that conflict. It's
probably best to make the languages include each other, and later
separate out disjoint parts:

  core < language < coma < control < comp

Ok.. This compiles, but there are still problems with the toplevel
namespace using staaplc. Maybe i should just continue and take the
mutation out of the forth code as well?

Another thing (now that i've re-discovered the "run" button): how to
test comp.ss?

Seems there are still some problems.. It's probably best to make this
into a linear chain of extensions, to make sure a parent module
doesn't import a non-modified deeper core module. This should include
the assemblers.

Entry: then <-> declarative macros
Date: Wed Apr 22 15:09:41 CEST 2009

"then" is a problem because it uses a plug-in optimization: macros
defined in terms of "then" in the lower language layers will not
change behaviour. this is a point where we have to give up flexibility
due to the absence of hooks.

  hook = hole in module

this needs to be solved later when i do have a way to put holes in
modules. but overall it's probably best to stick to a more static
bottom-up code structure.

in general: hooks in functional programming can usually be solved with
higher order functions (create holes with lambda). i can probably do
the same here too.

EDIT: it's worse than that. "label:" and "sym" have the same problem.
Looks like it's time for a unit.

Entry: static dasm
Date: Wed Apr 22 16:15:28 CEST 2009

There's still "dasm-register!". Maybe disassembly should be turned
into a reflective operation. Something that collects all the names and
seals up the dasm (could probably be done statically..)

Entry: parameterized control compiler
Date: Wed Apr 22 16:49:35 CEST 2009

control.ss is parameterized by the code's control graph
representation. in the current version there is either a flat
assembler list, or a control flow graph structure.
these come from the implementation of "sym" and "label:"

It's time to parameterize these into a unit.

  procedure application: expected procedure, given: #;
  arguments were: # #

Looks like this interferes with the procedure? predicate used in the
rpn parser.. OK. ignoring set!-transformer?

Units seem to work now too. Next: what is purrr.ss ?

hmm... it's confusing.

  - i'd like to be able to use the pic18 compiler without the baggage
    of the CFG, so it needs to be a unit too?

Entry: a lot of backpatching
Date: Wed Apr 22 23:21:42 CEST 2009

just ran into this:

  staaplc -d /dev/ttyUSB0 1220-8.f
  asm-pattern: match failed for: not, asm: [bit? #xf9e #x5 #x1]

this is because "not" is the one from coma/coma.ss

i'm using the wrong abstraction. units are a good start, but this must
be made easier to use. small embedded programs can use a flat
namespace. building a specialized program is really building this
namespace. also, each word should list where its implementation comes
from. let's implement that first, then it will be clear why some
functionality won't work..

Entry: units as basic block
Date: Wed Apr 22 23:42:50 CEST 2009

It's starting to look better: units are probably a better way to
construct applications. At least it's going to be a better way to
organize all the circular dependencies in the core code..

Entry: taking it apart again.
Date: Thu Apr 23 08:30:07 CEST 2009

so, let's start in coma/core.ss

separate out virtual.ss : virtual instructions for use in the macro
evaluator.

next: how to parameterize control with the proper branching mechanism?
the important part is to be able to tell where the exit points are:
exit or or-jump.

Entry: compilation state representation
Date: Thu Apr 23 10:13:45 CEST 2009

Is it possible to get rid of the parameters in coma? And is this
really desirable?

  - macro-state-check
  - macro-eval-init-state

Who determines what the compilation state is? The compiler. The real
question is, why does the macro evaluator need to know the internal
compiler state? Because it compiles of course.. This is quite a deep
feedback loop that's hard to explicitly propagate outward for linking
resolution.

But.. Since the compilation states are all subtypes
(stack < 2stack < comp), it should be possible to automatically
upgrade when necessary. Fix this later. First get test-comp.ss to run.
It's probably more important to get the parameterization in the
compiler to work properly.

Entry: de-parameterize comp/instantiate.ss
Date: Thu Apr 23 10:41:12 CEST 2009

parameters:

  compile-literal
  compile-word
  compile-exit
  compile-mexit
  semi
  target-postprocess

ok.. getting rid of the parameters should not be too difficult.
however, fixing the primitive compilation steps (especially mexit)
might be a challenge. mexit is probably best implemented not with a
parameter, but by placing functions on the rs stack..

Entry: next
Date: Thu Apr 23 17:00:58 CEST 2009

first i need to get the full code gen + tests back online, then it's
time to clean up the CFG compiler.

had to fix 'org and introduced machine^ containing the cell size.

  ;; This is for printing only.. Maybe keep the parameters? Otoh,
  ;; there should be a simple way to do this properly too: fill in
  ;; these params deeper in the code..
  (target-code-unit 2)  ;; a code word is 2 bytes
  (target-code-bits 16)
  (target-address-size 24)

let's see if we can get the test to run

next problem: forth-begin-tx.ss depends on the compiler through the
wrap-macro/... functions. the reason is that forth-begin contains
instantiation. this probably needs an extra interface.

ok..
i got it i think.. next: postprocessing.

OK works.

Entry: final test
Date: Fri Apr 24 08:54:27 CEST 2009

Cleaning up the forth-begin code to make it parameterizable. What i
really want is a sort of intermediate form to display Forth files in a
similar way to the 'compositions macro. Basically, .f parsing should
get rid of all the parsing words, but leave the rest intact. This will
probably expose some hairy bits..

I'm getting rid of the rpn-map-target-identifier parameter: let's
stick to the '(target) namespace in the forth-begin macro. If not,
later simply pass the namespaces to the forth-begin macro.

Ok.. time for the final test.

  compile: unbound identifier in module in: asm/qw

This is defined in op/asm.ss

Need to rename things: jump-cfg doesn't really compile a cfg: just a
list of lists of chunks

anyways, in the meantime i fixed the command line interface to the
compiler. it's again possible in snot to simply type forth code and
have the resulting assembly code displayed.

next prob:

  /home/tom/staapl/staapl/comp/unit-jump-cfg.ss:225:6: opcode ``cw'' won't match opcode with same name.
  /home/tom/staapl/staapl/pic18/unit-pic18.ss:599:14: opcode ``exit'' won't match opcode with same name.
  /home/tom/staapl/staapl/pic18/unit-pic18.ss:600:4: opcode ``exit'' won't match opcode with same name.
  /home/tom/staapl/staapl/pic18/unit-pic18.ss:609:4: opcode ``save'' won't match opcode with same name.

seems like there are different instances of the asm? objects due to
unit instantiation.. let's make sure they are linked by putting them
in a separate unit.

now.. instead of rushing to a maybe-solution, is there a better way to
match? the problem is that (asm: foo) produces different results
depending on whether it is executed inside a unit or outside. i don't
see a way to fix this without telling the instantiation process to
share instances.. let's test this.

Entry: syntaxes and units
Date: Fri Apr 24 16:24:24 CEST 2009

What I don't understand is why syntax transformer definition has to be
part of a unit's signature. Maybe it helps to understand how units are
implemented... So, an identifier in a unit is a rename transformer.
What better than to look at the expansion?

  (define-signature foo^ (foo (define-syntaxes (baz) '(1 2 3))))
  (define-signature bar^ (bar))

  (expand #'(unit (import foo^)
                  (export bar^)
                  (define bar (+ 123))
                  (define-syntax (print-baz stx)
                    (printf "baz: ~a\n" (syntax-local-value #'baz))
                    #'456)
                  (print-baz)))

PRINTS:

  baz: (1 2 3)

VALUE:

  (#%app
   make-unit
   'eval:42:0
   (#%app vector-immutable
          (#%app cons 'foo^
                 (#%app vector-immutable (#%top . signature-tag))))
   (#%app vector-immutable
          (#%app cons 'bar^
                 (#%app vector-immutable (#%top . signature-tag))))
   (#%app list)
   (let-values ()
     (let-values ()
       (lambda ()
         (let-values (((temp51) (#%app box undefined)))
           (#%app
            values
            (lambda (import-table)
              (let-values (((temp50)
                            (#%app vector->values
                                   (#%app hash-table-get
                                          import-table
                                          (#%top . signature-tag))
                                   '0 '1)))
                (let-values ()
                  (let-values ()
                    (letrec-values (((bar) (#%app + '123)))
                      (#%app set-box! temp51 bar)
                      '456)))))
            (#%app make-immutable-hash
                   (#%app list
                          (#%app cons
                                 (#%top . signature-tag)
                                 (#%app vector-immutable
                                        (lambda () (#%app unbox temp51))))))))))))

Judging from this, units are completely compiled once defined. This
consumes the signature info. All the information that's necessary at
compile time needs to be provided _in the signature_. The compile-time
info that patterns-tx depends on thus needs to be embedded in the
signatures also.

Looks like it's best to define the operations themselves as
signatures.
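Roughly what carrying static op info in a signature could look like,
following the foo^/baz pattern above. A minimal sketch: the signature
name, the op, and the '(name arity) info format are invented here, not
staapl's actual code:

  ;; the run-time assembler travels as a variable, its prototype as
  ;; compile-time data (a define-syntaxes in the signature, like baz):
  (define-signature pic18-ops^
    (addwf
     (define-syntaxes (addwf-info) '(addwf 3))))

  ;; an importing unit can inspect the prototype at expansion time:
  (define-unit use-ops@
    (import pic18-ops^)
    (export)
    (define-syntax (addwf-arity stx)
      #`'#,(cadr (syntax-local-value #'addwf-info)))
    (printf "addwf arity: ~a\n" (addwf-arity)))

Same mechanism as print-baz above, just aimed at what patterns-tx
needs: name + arity, later maybe types.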
What is necessary is the separation of interface and implementation
for the assembler opcodes. Opcodes should be _declared_ somewhere;
then, when _defined_, the declaration should be verified (or possibly
created).

Maybe the signature can be stored _inside_ the static info?

  (define-syntax (define-op-signature stx)
    (syntax-case stx ()
      ((_ name^ (name arg ...) ...)
       _ _ _)))

Entry: making qw parametric
Date: Fri Apr 24 19:03:32 CEST 2009

This goes really deep.. In macro-tx.ss the qw opcode is used to define
the immediate semantics of the macro language. Considering that this
is implemented with parameters defined at compile time, i think i'm
stuck.

Maybe the scat languages themselves should also be implemented as
units.. Then multiple instances can be defined similarly.

Well.. I guess the idea was to make everything declarative, so this
probably includes the parameters at the very core of staapl... I'm not
sure if it's possible to do this gradually. Let's first establish some
working point to get back to.

Entry: mark: test-pic18.ss / pic18.ss work with console
Date: Fri Apr 24 19:20:41 CEST 2009

i will now start breaking things to turn scat into a unit that can be
re-instantiated multiple times, instead of using compile-time
parameters to change the behavior of a single compiler core. this will
probably be quite some effort..

now, instead of stupidly starting to break things, isn't it possible
to write a unit interface on top of rpn-tx.ss ??

Entry: rpn compiler
Date: Fri Apr 24 20:09:03 CEST 2009

maybe it's time to rewrite the whole thing... one of the problems i
had was having to work around limitations of the parser.. maybe it
should be made to parse forth in the first place, then limited to be
able to do simpler things too?

the basic problem is a parse from linear code -> dictionary. this is
solved using prefix parsing keywords. so, what is needed is a
representation of a dictionary, somewhat like the chunk-collecting
compiler in comp/compiler.ss

another thing to mention is (delimited) continuations. up to now this
was problematic because of the way code was mapped to lambda
expressions. it might be better to work directly in cps, since that's
a lot closer to the way forth works.

let's see.. the code fragment (a b c)

start in rpn/new.ss

Entry: the new scat
Date: Fri Apr 24 20:44:03 CEST 2009

So.. What is a stack language? It's like a functional language where
the environment is replaced with an explicit stack.

ok, it went fast for a bit.. trying this now:

  (define-syntax-rule (op-2->1 op)
    (lambda (d r)
      ((car r)  ;; call continuation
       (cons (op (car d) (cadr d))
             (cddr d))
       (cdr r))))

  (define (done)
    (list (lambda (d r) d)))

  (define-syntax-rule (run program stack)
    ((ns (rpn) program) stack (done)))

Basically, each fn takes 2 arguments, the data stack and the
continuation stack. It performs a primitive computation and passes the
(d r) to the popped continuation.

Next step is to define a parser, basically an expansion-time
interpreter. This will lead us to something very close to the CEK
machine.

  (lambda (p r c) ...)

  p : parameter stack
  r : continuation stack (return stack)
  c : code stack
  d : dictionary

the p and r can be eliminated during parsing.. probably re-introduced
later. the continuation stack can be abstracted as a procedure. it
never needs to be implemented as a stack. it might also be faster to
run functions from a list than to construct syntax for them.

DAMN.. almost there, but i'm fading out..
i'm confused about the tension of implementing the continuations as an
explicit stack, or as an abstract function. for functions, i need to
figure out how to write "compose":

  (define (a p k) ...)
  (define (b p k) ...)

  (compose (a b)
           (lambda (p k1)
             (a p (lambda (p k2)
                    (b p k1)))))

Now, how does a primitive look?

  (define-syntax-rule (op-2->1 op)
    (lambda (p k)
      (k (cons (op (p-car p) (p-cadr p))
               (p-cddr p)))))

Entry: list interpreter
Date: Fri Apr 24 23:31:28 CEST 2009

Why not use a forth-style list interpreter? Why does the code have to
be abstracted as a nested lambda expression? What about CPS? I'm a bit
confused now.. Let's see.. The goal is to be able to run the code
without excessive consing. This means either nested expressions:

  (lambda (stack) (c (b (a stack))))

List interpretation:

  (lambda (stack) (do-list (list a b c) stack))

Or CPS:

  (lambda (stack k)               ;; L1
    (a stack (lambda (stack1)     ;; L2
      (b stack1 (lambda (stack2)  ;; L3
        (c stack2 k))))))

With a,b,c primitives.

[1] http://en.wikipedia.org/wiki/Continuation-passing_style

Funny, writing the CPS as follows makes it easier to see the Haskell
"do" notation (with arrows reversed).

  (define (add p k)
    (k (cons (+ (car p) (cadr p)) (cddr p))))

  (define (add3 p k)
    (add p (lambda (p1)     ;; add p -> p1
      (add p1 k))))         ;; return p1

  (define (add4 p k)
    (add p (lambda (p1)     ;; add p -> p1
      (add p1 (lambda (p2)  ;; add p1 -> p2
        (add p2 k))))))     ;; return p2

  (define (comp a b)
    (lambda (p k)
      (a p (lambda (p1) (b p1 k)))))

So.. I see no advantage in CPS. Does PLT optimize this to a simpler
form? The closures are really no better than the naive consed return
stack used earlier. It's probably best to stick to the native
representation.

So.. Nothing better. But there's one optimization that should be
possible. Instead of using 1->1 functions, it might be possible to use
n->n functions, which eliminates construction of the state vector.

Entry: n-ary instead of unary ops
Date: Sat Apr 25 00:24:04 CEST 2009

the problem here is that the simple nesting can't be used:

  (lambda (stack) (c (b (a stack))))

instead we have

  (lambda (x y z code)
    (let next ((x x) (y y) (z z) (c code))
      (if (null? c)
          (values x y z)
          (call-with-values
           (lambda () ((car c) x y z))
           (lambda (x y z) (next x y z (cdr c)))))))

it's probably not worth fussing about this, since i don't know what's
going on inside the compiler anyway. maybe i'm just looking for
'compose?

  (define (pp a b)
    (values (cons (car b) a) (cdr b)))

  ((compose pp pp pp) '() '(a b c d e f g h i j k))
  => (c b a) (d e f g h i j k)

Entry: collecting
Date: Sat Apr 25 01:27:52 CEST 2009

This isn't really parsing. It's just collecting elements of a
dictionary. The same pattern happens several times in the code:

  * the forth parser: collects (name type body) code
  * the CFG compiler: collects (target code) chunks
  * the binchunk parser

The binchunk parser uses the stack-of-stacks abstraction. It should be
quite straightforward to do the same for the other dictionaries. The
operations:

  - make-sos
  - sos->list
  - sos-push
  - sos-collapse

Entry: basic parser
Date: Sat Apr 25 02:13:51 CEST 2009

It is working.. Now to see if it can do the fancy stuff like locals.

The problem is: how to parameterize this? It can't be done with just
macros. I.e. we need to get at the local syntax value of the real
identifiers after mapping, to determine whether it's a transformer
that has to be called. Is it enough to have the mapping namespace
parameterized?

Also, the representation of the dictionary which is passed between
macros should be abstract. However, it might be enough to keep this
fixed at a list of 2 stacks.
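For reference, a minimal sketch of what the stack-of-stacks could look
like; the concrete representation (a non-empty list of lists, current
entry first) and the exact meaning of sos-collapse are my guesses, not
the code in the binchunk parser:

  (define (make-sos)         '(()))
  (define (sos-push sos x)   (cons (cons x (car sos)) (cdr sos)))
  (define (sos-collapse sos) (cons '() sos))  ;; seal entry, open next
  (define (sos->list sos)    (reverse (map reverse sos)))

  (sos->list
   (sos-push (sos-collapse (sos-push (make-sos) 'a)) 'b))
  ;; => ((a) (b))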
Actually, the only real problem is to parse the flat code into
something which can easily be parsed with syntax-rules macros. Now...
Maybe the sos is enough.

This is what i get out of the new CPS-style rpn-parse:

  (define-syntax-rule (rpn: code ...)
    (rpn-parse (quote (rpn) p-apply p-cons) code ...))

  box> (rpn: 123 : abc 1 2 + : 100 -)
  (()
   (((name 100) (p-apply rpn/-))
    ((name abc) (p-cons 1) (p-cons 2) (p-apply rpn/+))
    ((p-cons 123))))

This is actually colorForth ;)

Entry: mixing pattern names and variable names
Date: Sat Apr 25 12:10:12 CEST 2009

This doesn't work:

  (define-syntax rpn/:
    (make-rpn-transformer
     (lambda (w d k)
       (let ((name (cadr w))
             (w+   (cddr w)))
         (k w+
            (d-compile #`(name  ;; <- this
                          #,name)
                       (d-close d)))))))

The marked "name" is not an introduced identifier that will later
match a literal "name" pattern, as in:

  (define-syntax rpn-define
    (syntax-rules (name)
      ((_ (name n) (type param) ...)
       (ns (rpn) (define n (rpn-lambda (type param) ...))))
      ((_) (void))  ;; ignore empty code
      ))

Entry: parameterization
Date: Sat Apr 25 12:32:44 CEST 2009

So the question is mostly: does the current lexical parameterization
work for solving all possible syntax combinations? Let's see.

All works pretty well, except for quasiquote. I don't understand how
it works any more.. Ok.. Quasiquote serves to build data structures
containing functions:

  `(1 2 ,+)     -> (list 1 2 xxx/+)
  `(1 2 ,(+ +)) -> (list 1 2 (xxx: + +))

This is a bit ad-hoc, and I don't believe it is used much..

Looks like the basic parser is done.

Entry: macro continuations
Date: Sat Apr 25 16:20:07 CEST 2009

Trying to avoid low-level macros, I do run into occasions where I need
to resort to CPS to be able to properly separate concerns (use macros
to "preprocess" other macros' input). Usually this can be done by
passing a macro continuation. But what if this continuation takes
other arguments? Maybe what is needed is a "macro curry". I've been
playing with this when I tried writing the parser completely in
syntax-rules. What I'm looking for here is a simple mechanism that
catches most of my specialization problems.

This is a bit more insidious than i thought. My current approach
doesn't compose well... Let's just stick with simple one-continuation
macros.... maybe local syntax would help? Something to fake "lambda"
for macros?

The thing is: in CPS, primitives take an extra continuation argument.
However, composing primitives requires the creation of new
continuations (in the form of closures). This doesn't work for macros,
so the best way is to create local syntax.

It works with local syntax. This is a bit tricky due to (... ...)
which i never got to intuitively understand very well (it seems upside
down).

  (define-syntax-rule (rpn-begin code ...)
    (let-syntax
        ((tx-mk
          (syntax-rules (name)
            ((_ (()          ;; empty anonymous list
                 ((name n)   ;; entries need tagged name
                  . tagged-code)
                 (... ...)))
             (begin
               (ns (rpn) (define n (rpn-lambda . tagged-code)))
               (... ...))))))
      (rpn-specialized-parse tx-mk code ...)))

Entry: scat in terms of new rpn syntax
Date: Sat Apr 25 17:53:24 CEST 2009

rpn-scat.ss is quite simple now

let's start taking out the entire old parser and work up from here.

I don't remember what rpn-wrap is about. Taking out shift/reset.

  (ns (scat)
      (define-syntax reset
        (rpn-wrap (expr) #`(reset-at scat-prompt #,expr))))
  (ns (scat)
      (define-syntax shift
        (rpn-wrap (expr)
                  #`(shift-at scat-prompt k
                              #,((rpn-immediate) #'k expr)))))

For the rest it seems to work fine. next is coma.
From macro-syntax.ss i don't know what this is about:

  (define-sr (define-procedure name)
    (ns (macro) (define name (postponed-word 'name))))

now, looks like tscat: isn't so trivial.. let's see. indeed.. What it
does is figure out whether a variable is locally bound (i.e. not a
module import or toplevel) and depending on this it will perform some
action. this is the worst kind of reflection there is.. how can this
be re-incorporated?

Hmm.. it's not so bad. It's about the input of tscat: itself, not some
inside behaviour mod. Can probably just be copied:

  ;; Operate on rpn code body, processing lexical and other variables.
  (define (rpn-lex-mapper fn-lex [fn-no-lex (lambda (x) x)])
    (lambda (stx-lst)
      (map (lambda (stx)
             (if (and (identifier? stx)
                      (lexical-binding? stx))
                 (fn-lex stx)
                 (fn-no-lex stx)))
           stx-lst)))

Entry: next
Date: Sat Apr 25 19:08:40 CEST 2009

Turning coma into a unit in terms of basic ops - or, maybe it's best
to dive into the parser first... Ok. let's try that then. Let's do
locals-tx.ss first.

One of the problems i saw coming was the use of semantics macros in
the parsing words... Simply, there is no way to do this without
passing some parameters into the transformer.. Let's see.. Instead of
doing this in macro code, try it in plain rpn first. The syntax is:

  (| a b | b a)

The difficulty here is in finding the end of the code. Once this is
done it's quite straightforward:

  (define-syntax-rule (rpn/locals: (a b) prog ...)
    (lambda (p)
      (let ((a  (lambda (x) (immediate (p-car p) x)))
            (b  (lambda (x) (immediate (p-cadr p) x)))
            (p+ (p-caddr p)))
        (function (rpn: prog ...) p+))))

The convoluted nature of the old implementation seems to be mostly
about not being able to terminate the code properly. As an
s-expression, it's really trivial.

So.. What about solving this in the postprocessor? Can't do that,
since some identifiers might be interpreted as compile-time actions..
So probably the current way it works is already wrong: compile-time
bindings won't get shadowed! There's only one way: the body needs to
be expanded inside a context that has expanded the surrounding let
form.

Hmm.. One thing: I'm going to change the local syntax stuff to
explicitly named transformers to make the expansions a bit less
verbose.

This is the locals transformer:

  (define-syntax (rpn-parse-locals stx)
    (syntax-case stx ()
      ((_ (namespace function immediate program: pop-values)
          (param ...) code ...)
       (let ((plist (syntax->list #'(param ...))))
         #`(lambda (p)
             (let-values (((p+ param ...)
                           (pop-values p #,(length plist))))
               (ns namespace
                   (let #,(for/list ((p (syntax->list #'(param ...))))
                            #`(#,p (program: ',#,p)))
                     (function (program: code ...) p+)))))))))

Entry: identifier -> source path
Date: Sun Apr 26 00:15:15 CEST 2009

From the PLT list:

  (define (definition-source id)
    (let ([binding (identifier-binding id)])
      (and (list? binding)
           (resolved-module-path-name
            (module-path-index-resolve (car binding))))))

So.. next problem: how to parse this in forth code? It would be great
if all code could be segmented first, then properly parsed one
definition at a time. This is a limitation of Forth syntax, which is a
consequence of its imperative nature.. So let's stick to the
imperative nature and solve it.

Entry: compile time is compile time
Date: Sun Apr 26 01:47:00 CEST 2009

Q: The problems I'm trying to solve at this moment are about run-time
values not matching across unit boundaries. What about avoiding these
run-time values and turning the macros into proper macros (executing
at scheme compile time)? Would that solve anything?
Really, I should find a simple test case first.. Rewriting the RPN
compiler was (is) fun and necessary for clarity, but units aren't
modules when it comes to macros..

The following example shows otherwise: values are properly shared, the
module is instantiated only once.  (eq? it1 it2) => #t

  ;; it.ss
  #lang scheme/base
  (provide it)
  (printf "instantiating it\n")
  (define it "it")

  ;; sig.ss
  #lang scheme/base
  (require scheme/unit)
  (provide sig1^ sig2^)
  (define-signature sig1^ (it1))
  (define-signature sig2^ (it2))

  ;; top.ss
  #lang scheme/base
  (require "unit1.ss" "unit2.ss" "sig.ss" scheme/unit)
  (define-syntax-rule (define/invoke (sig ...) (unit ...))
    (begin
      (define-compound-unit/infer combined@
        (import)
        (export sig ...)
        (link unit ...))
      (define-values/invoke-unit combined@
        (import)
        (export sig ...))))
  (define/invoke (sig1^ sig2^) (unit1@ unit2@))

  ;; unit1.ss
  #lang scheme/unit
  (require "sig.ss" "it.ss")
  (import)
  (export sig1^)
  (define it1 it)

  ;; unit2.ss
  #lang scheme/unit
  (require "sig.ss" "it.ss")
  (import)
  (export sig2^)
  (define it2 it)

So.. what is the problem then? Is it because values are somehow
wrapped? Is this related to the remark that eq? doesn't work across
unit boundaries? Maybe I should accept that the eq? is bad practice.
The assembler instance can't be used for pattern matching. Maybe it
should just match the name then..

Entry: Forth parsing
Date: Sun Apr 26 03:01:08 CEST 2009

The trouble is that instead of a list with a hole in it (the
dictionary's compile point = cursor), it would be nice to be able to
put holes in nested applications. This is exactly what the previous
implementation did. Instead of being able to process the whole locals
expression at once, it incrementally builds an inside-out structure,
where the outside is represented by a (stack of) wrapping procedures.

So, there are 2 options:

  * create a zipper-like dictionary for incremental compilation
  * do multi-pass parsing using delimiters

Some trick needs to be used.. The thing is this:

  : test | ;
  : | ;

What is this supposed to mean? Introducing the local variables kills
all hope of using either ; or : as a delimiter, since they could be
redefined.

Ha, looks like the Factor syntax changed too :)

Let's go look at Forth for inspiration. I found Gforth's syntax:

  { l1 l2 ... -- comment }

The problem is that I'm using non-local exits. I like them, and they
are used for both macros and Forth words. The only real solution is to
keep it like it is:

  * a new definition terminates the old one
  * locals installs a wrapper around the remainder of the code.
  * parsing words cannot be used as argument names,

the last one because they are not recognized as lexically bound during
parsing, only when the parsed source is expanded after passing it to
the dictionary compiler. this is also a good thing, since it would
otherwise allow redefinition of words like : as a local variable,
messing up the code structure.

it's probably best to have a separate 'locals tag that is recognized
by the -> scheme compiler. actually, this doesn't look all that bad:

  : foo | a b | b a 1 - -

  ->  ((name foo)
       (locals (a b))
       (fn a) (fn b) (im 1) (fn -) (fn -))

  1. tag source and execute parsing words
  2. compile tagged source to scheme (including some special forms)

now, can 'locals be implemented as a simple "cons" macro?

Entry: regular expressions
Date: Sun Apr 26 09:41:26 CEST 2009

Maybe i should just move on to regular expressions for all this lexing
activity (essentially "table parsers"). It's ok to write one state
machine, but it gets old quite fast..
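For the simple whitespace-splitting part, PLT's built-in regexps
already go a long way; a minimal sketch (tokenize is a made-up name,
and real lexing would also want source locations):

  (define (tokenize str)
    (filter (lambda (s) (positive? (string-length s)))
            (regexp-split #px"[ \t\r\n]+" str)))

  (tokenize ": foo 1 2 + ;")
  ;; => (":" "foo" "1" "2" "+" ";")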
Entry: sharing data between macros
Date: Sun Apr 26 09:59:19 CEST 2009

Problem: individual rpn-transformers should be able to tag words with
semantics just like the core transformer does. One thing is to always
pass the semantics macro environment around in a hash table or so. But
i wonder if it isn't a lot simpler to re-resort to parameters, in this
case syntax parameters.

Maybe I should learn to take a hint and see the use of a literal 'name
symbol as exactly the problem I'm trying to solve. The parser is
mainly about translating Forth code to simple s-expressions (in the
form of tagged atoms) so it is trivial to parse later _with_
information provided by the parsing words themselves.

  (: abc 123 +)

  -> ((name abc) (lit 123) (apply +))

  -> (name abc ((lit 123) (apply +)))

Hmm.. i lost perspective a bit.. It looks like there's a very simple
and elegant solution hidden somewhere, but I can't find it. The clue
seems to be "fold": there is a data structure parsed up to a certain
point; how does the next token add to this?

A new slogan: Forth parsing = fold.  To be continued..

Entry: ns -> rename transformer
Date: Sun Apr 26 10:05:46 CEST 2009

not using this is probably the reason why syntax checking won't work
for lexical variables in scat quotations.

hmm.. not really what i thought.

Entry: parser idea
Date: Sun Apr 26 10:42:44 CEST 2009

The thing is: what if rpn-parse does _not_ translate identifiers if it
deems they are not transformer bindings? The idea being that parsing
words only serve to translate a flat file into forth syntax,
annotating each atom with a compiler semantics.

There is really no way around using some kind of cursor into the final
data structure = the scheme expression.

An empty dictionary is equivalent to:

  (begin (lambda (p) p))

With the cursor made explicit:

  (begin (lambda (p) (C p)))

compiling 123:

  (begin (lambda (p) (C (push 123 p))))

compiling a reference to a in ns (foo):

  (begin (lambda (p) (C (apply foo/a (push 123 p)))))

compiling the name tag 'bar in namespace (foo):

  (begin (lambda (p) (apply foo/a (push 123 p)))
         (define foo/bar (lambda (p) (C p))))

now, compiling the locals '(a b)'

I find it so incredibly difficult to switch between the nested lambda
representation and a CPS/monadic style representation. But, in words,
what does the locals transform do? It transforms the current lambda
expression:

  - it replaces it with a new lambda expression that applies the
    expression collected up to now to its input stack.
  - it pops values off the stack and binds them lexically to constant
    functions.
  - it expands the rest of the code inside this new context.

  (before ...) (local ...) (after ...)

  (lambda (p)
    (let ((p-in (apply (program: before ...) p)))
      (let-values (((p-popped local ...) (pop-stack p-in)))
        (apply (program: after ...) p-popped))))

How does this translate to a flat representation?

  - grabbing the (before ...) list isn't that difficult: it's already
    in the dictionary.
  - closing it needs access to its closing procedure.

Entry: zipper dictionary
Date: Sun Apr 26 12:39:23 CEST 2009

So I abstracted the pattern into an abstract data structure. The key
seems to be that instead of postponing the "collapse" operation to a
postprocessing step, it should be accessible to users of the library,
so the current entry can be collapsed and tagged.

Let's apply this to locals to see what's necessary:

  - grab the current code and return it as an expression
  - install a new collapse routine, which uses the default collapse as
    an embedded operation.
So the dictionary needs to store the default collapse.

It's probably best to make sure the internal state of the dictionary
is never observable: users of the dictionary can only put stuff in,
never take it out. The semantics function (packer) itself should be
observable, just like the result of its application to the current
entry.

  ;; Locals. Transform object + semantics into new semantics.
  (define (locals-obj+pack->pack obj pack)
    (lambda (instructions)
      `(lambda (p)
         (let ((p+ (apply ,obj p)))
           (let ((top (car p+))
                 (p++ (cdr p+)))
             (apply ,(pack instructions) p++))))))

It looks like this works. Time to use it in the rpn parser.

First, remove code tagging. That's about how the structure is used,
and it can be hidden in the semantics.

Wait: nested locals. It should have a proper semantics.

Hmm... it doesn't do what I expect. Using the default semantics will
not nest the locals properly (earlier ones are no longer visible).
Using the current replaced semantics will apply the first program
twice.

Using the current semantics:

  (lambda (p)
    (let ((p+ (apply
               (lambda (p)
                 (let ((p+ (apply (program: 10000) p)))
                   (let ((*OUTER* (car p+))
                         (p++ (cdr p+)))
                     (apply (program: 20000) p++))))
               p)))
      (let ((*INNER* (car p+))
            (p++ (cdr p+)))
        (apply
         (lambda (p)
           (let ((p+ (apply (program: 10000) p)))
             (let ((*OUTER* (car p+))
                   (p++ (cdr p+)))
               (apply (program: 30000) p++))))
         p++))))

This is not correct, as (program: 10000) gets applied twice. To make
sure, let's alpha-convert:

  (lambda (p0)
    (let ((p1 (apply
               (lambda (p2)
                 (let ((p3 (apply (program: 10000) p2)))
                   (let ((*OUTER* (car p3))
                         (p4 (cdr p3)))
                     (apply (program: 20000) p4))))
               p0)))
      (let ((*INNER* (car p1))
            (p5 (cdr p1)))
        (apply
         (lambda (p6)
           (let ((p7 (apply (program: 10000) p6)))
             (let ((*OUTER* (car p7))
                   (p8 (cdr p7)))
               (apply (program: 30000) p8))))
         p5))))

Damn.. i miss the intuition and i can't make the algebra make sense..
I'm probably confusing two things. The problem is that normal code
nesting reverses the order, but locals nesting is the same order as in
scheme.

So.. Is it possible to somehow change the code so that (a b c)
corresponds to (a (b (c _))) instead of (c (b (a _)))? Yes.. It's CPS.

So, maybe I should use cps.. When I do that, I do want to know how
well things get optimized away though, to make sure the representation
is a bit decent.

Entry: CPS + optimization
Date: Sun Apr 26 16:41:40 CEST 2009

http://docs.plt-scheme.org/mzc/decompile.html

Using:

  mzc -vk comp.ss ; mzc --decompile compiled/comp_ss.zo

comp.ss :

  (define (inc x k) (k (+ 1 x)))
  (define (inc3_ x k)
    (inc x (lambda (x)
      (inc x (lambda (x)
        (inc x k))))))

Inspecting the output gives something like this:

  (define (inc3 p k)
    (let* ((p1 (+ '1 p))
           (p2 (+ '1 p1)))
      (k (+ '1 p2))))

So it does look like this rep might be valuable as an abstraction
mechanism. Let's see if it still works with more complicated
functions.

  (define (add p k)
    (k (cons (+ (car p) (cadr p)) (cddr p))))
  (define (add4 p k)
    (add p (lambda (p1)
      (add p1 (lambda (p2)
        (add p2 k))))))

  =>

  (define (add4 p k)
    (let* ((p1 (cons (+ (car p) (cadr p)) (cddr p)))
           (p2 (cons (+ (car p1) (cadr p1)) (cddr p1))))
      (k (cons (+ (car p2) (cadr p2)) (cddr p2)))))

So the function gets inlined, and applied closures are converted to
let expressions. Now, instead of using cps, it might be easier to put
the compositions in let form directly. The goal eventually is to add
other let bindings to a list!
So, the let*-transformation works like:

  (a b c) ->

  (lambda (p)
    (let* ((p (a p))
           (p (b p))
           (p (c p)))
      p))

Or better, its nested form, which works better with other
let-transformers, which can be spliced right in. This then makes the
locals problem completely trivial.

  (lambda (p)
    (let ((p (a p)))
      (let ((p (b p)))
        (let ((p (c p)))
          p))))

So, let's switch the rpn language to a "nested let" representation
(EDIT: administrative normal form[1]). This is quite trivial:

  ... foldl ... #`(apply atom #,expr)
  ->
  ... foldr ... #`(let ((p (apply atom p))) #,expr)

Tests pass without trouble. Ha! for once i have the feeling i know
what i'm doing!

Entry: zipper-dict
Date: Sun Apr 26 18:45:49 CEST 2009

After the CPS / nested-let story I forgot about the zipper dict. But
is it really necessary? The semantics is no longer needed: the problem
is solved with nested reps -- it was only used for locals.

So, I've closed the abstraction: there is now only:

  d-open     ;; create a new dictionary
  d-compile  ;; compile an instruction
  d-start    ;; start a new definition
  d->forms   ;; close dictionary and output forms in order

[1] http://en.wikipedia.org/wiki/Administrative_normal_form

Entry: the forth parser
Date: Sun Apr 26 21:13:58 CEST 2009

now need to fix the forth parsers one by one. the running state might
be problematic though:

  - current mode
  - toplevel forms

maybe use a syntax-parameter for this?

Entry: more forth parsing
Date: Mon Apr 27 13:15:42 CEST 2009

fixed the 'include' parser this morning. this means i'm done, except
for the parsing state, which will probably need a parameter. maybe
it's best to use the same trick as in the include parser: install a
dynamic context and continue the parser, then at the end collect all
data. they can probably be written in terms of each other..

Entry: dictionary "extra" state
Date: Mon Apr 27 16:20:16 CEST 2009

It's not really elegant to augment the purely functional parser with
parameters and imperative stacks. However, it is quite isolated and
should be fixable by making the dictionary object itself extensible.
The prime objective now is to get it to work again. That last piece of
cosmetics is for later.

I do wonder though how this would be solved using monads: if you have
a piece of threaded state, how do you extend it?

The current implementation confuses me. The dictionary probably needs
type tags: the semantics for compiling an entry: lambda-tx should be
passed in!

Ok. This works + i made testing a bit easier. test-pic18.ss works
again too.

Entry: forth bootstrap with new rpn parser
Date: Mon Apr 27 18:14:50 CEST 2009

with the new parser infrastructure it should be quite straightforward
to bootstrap a standalone forth compiler: map the dictionary to the
machine VM semantics. as long as the Forth code defining the
interpreter doesn't use a feedback loop to define lexing/parsing/macro
words, this should work just fine.

Entry: fixing pic18 forth parsing
Date: Mon Apr 27 18:52:17 CEST 2009

next: need to fix the parser to make embedding toplevel scheme
expressions work, and to allow some state to be maintained during
parsing.

actually.. it's not that incredibly difficult: just add semantics to
the name -> this tells how it should be defined. toplevel forms can
then just be added as anonymous definers. or better yet, simply use a
list of macros, which means the form can be executed immediately once
it is done. i.e.

  (register-rpn name rpn-lambda code ...)

to make require forms work better, they could be expanded in-line.
Let's forget about arbitrary scheme forms, but let's have a good look
at how require can be made to import syntax. The problem until now has
always been that while it was possible to add require forms in the
module body, the forth syntax would already be completely expanded
before these require forms could introduce new bindings..

What is necessary: whenever a 'require form is encountered, that
module should be instantiated such that its syntax bindings are
visible.

Anyways. Let's first make sure that a dictionary can be expanded to
any kind of top-level form. Ok. For 'rpn-begin the dictionary compiler
is just 'begin, so any expression can be inserted. Let's fix this
first for scheme forms.

Entry: prefix macros : term rewriting in Forth code
Date: Mon Apr 27 23:43:05 CEST 2009

Basic prefix macros are working; they behave exactly the same as
Scheme syntax-rules macros.

  (define-syntax-rule (rpn-syntax-rules (literal ...)
                                        ((pattern ...) (template ...))
                                        ...)
    (make-rpn-transformer
     (lambda (w d k)
       (syntax-case w (literal ...)
         ((pattern ... . w+)
          (k (syntax->list #`(template ... . w+)) d))
         ...))))

Entry: next
Date: Tue Apr 28 01:07:38 CEST 2009

TODO:
  - current mode
  - source location

Current mode can copy the mode from the last dictionary item. Source
location requires an updated rpn-parse, and can be added later.

Entry: top level coma/forth parser
Date: Tue Apr 28 09:49:24 CEST 2009

This needs a new name. Actually, it should be part of "forth" since
what it actually does is multiplex 3 semantics in one file: variable,
word and (concatenative) macro.

Entry: cleanup
Date: Tue Apr 28 10:31:33 CEST 2009

salvaged from rpn/cps.ss :

  (define-syntax (cps stx)
    (let ((cps-fns (cdr (syntax->list stx))))
      #`(lambda (k)
          #,(foldr (lambda (fn k)
                     #`(lambda (p) (#,fn p #,k)))
                   #'k cps-fns))))

  (define-syntax (cps-let stx)
    (let ((cps-fns (cdr (syntax->list stx))))
      #`(lambda (p)
          #,(foldr (lambda (fn sub)
                     #`(let ((p (#,fn p))) #,sub))
                   #'p cps-fns))))

  (define-syntax-rule (macro form)
    (syntax->datum (expand-once #'form)))

  (check (macro (cps a b c))
         => '(lambda (k)
               (lambda (p)
                 (a p (lambda (p)
                        (b p (lambda (p)
                               (c p k))))))))

  (check (macro (cps-let a b c))
         => '(lambda (p)
               (let ((p (a p)))
                 (let ((p (b p)))
                   (let ((p (c p)))
                     p)))))

Entry: instantiation
Date: Tue Apr 28 10:43:12 CEST 2009

What is necessary first is an instantiation syntax that's independent
of Forth syntax. Then later translate Forth to it using the new
parser.

ok: basic macros are in place + "mode" works by inspection of the last
dictionary element.

Entry: require + define-syntax
Date: Tue Apr 28 13:44:38 CEST 2009

Now, when 'require and 'define-syntax are encountered in a dictionary,
it is probably best to expand them before parsing the rest of the
code. This can be done by generating a recursive call to the top begin
form.

I'm happy with this new representation: things are much easier to
express and the problem actually looks simple now -- as it should,
since it is already solved. It really _looks_ like a forth compiler
too now, with the only exception that all mutation is replaced with
some functional counterpart.

Entry: what is what..
Date: Tue Apr 28 15:30:21 CEST 2009

now i'd like to be able to parameterize the forth so it doesn't depend
on macro: and (macro) but that doesn't seem to be possible. so what is
it part of? it's more like an emerging thing..

  rpn = forth-style dictionary compiler used to implement pure
        concatenative languages in s-expression syntax and forth
        prefix style parsing.
  comp = macro instantiation + post-proc opti

since instantiation is a big part of what comp is, the forth should be
part of comp.

Entry: target:
Date: Tue Apr 28 16:09:30 CEST 2009

Forgot about that one..  (How come I only now see it not compiling?
something wrong with deps..)

This is part of the live.ss stuff and broken now. Let's focus on
pic18.ss = just the compiler.

Entry: separate parser namespace
Date: Tue Apr 28 16:22:17 CEST 2009

maybe the prefix parsers should live in a separate namespace? the
problem however is that they are bound to primitive semantics macros.
but these could be parameterized.

well, i'm not 100% convinced this will work with units due to the
amount of syntax juggling. let's keep it as it is. it's always
possible to define a non-hygienic parser macro. in fact i'm already in
trouble! now i'm really confused.. time to give it some rest.

Entry: tscat:
Date: Wed Apr 29 10:23:57 CEST 2009

This uses some special trick in the name mapping. Will the simpler
infrastructure be able to handle it? The problem is this:

  (define (map-id id)
    (let ((target (ns-prefixed #'(target) id))
          (macro  (ns-prefixed #'(macro) id)))
      (cond
       ((identifier-binding target) target)
       (else #`(target-simulated #,macro)))))

Name mapping in the new rpn code cannot be made procedural. To get the
same functionality, all macros need to be wrapped in the target space
as well.

So, instead of seeing a namespace as a collection of objects, what
about seeing it as a _language_ first, with the objects being part of
the implementation. So, add a macro to map from (macro) -> (target)
for each target word.

No.. The cleanest solution is to make the namespace syntax functional
by setting

  (define-syntax-rule (macro . form) (ns (macro) . form))

and figuring out how to get to the local identifiers. Let's try in rpn
first.

Entry: namespace mapping
Date: Wed Apr 29 11:42:49 CEST 2009

I'd like to expand namespaces to something more abstract: a generic
identifier mapping mechanism. This is to implement the 'target: macro
which takes elements from a different namespace if they are not
transformer bindings.

The problem is that I can't seem to figure out how "local-expand" can
be used to turn the abstract mechanism back into a concrete identifier
mapping, so rpn-transformer instances can be looked up at compile time
through syntax bindings.

Now, to keep a bit of sanity, I found that staying away from deep
macro system internals is generally a good idea. There's a reason why
these things are hidden: they are quite complicated. So, I'm going to
adopt the following convention:

  * At compile time, (namespace ... id) is directly interpreted as
    (ns (namespace ...) id) and cannot be overridden.
  * For run time identifiers the form (namespace ... id) can have
    arbitrary meaning.

So later, when I figure out how to properly implement this, it could
be changed.

Entry: units and macros
Date: Wed Apr 29 18:00:49 CEST 2009

There is one thing i didn't understand when starting with units: it is
not so straightforward to use transformer bindings. Maybe it isn't the
abstraction I'm looking for after all. It's a bit obvious that you
can't separately compile things and later fill in compile-time
dependencies.

I'm missing some intuition again. Where do I put the parsing words and
the type info? I need another abstraction. Now, the wordlists were a
good idea. Can they be combined with some other macro-based linking
form?..
After reading this[1] again: "Each id in a signature declaration means that a unit implementing the signature must supply a variable definition for the id. That is, id is available for use in units importing the signature, and id must be defined by units exporting the signature. "Each define-syntaxes form in a signature declaration introduces a macro that is available for use in any unit that imports the signature. Free variables in the definition’s expr refer to other identifiers in the signature first, or the context of the define-signature form if the signature does not include the identifier. It seems quite clear that, yes, units only link _variable_ declarations, and inside the unit's body it is possible to use syntax depending on some of the variable bindings, but this syntax needs to be tied to the signature's compile time data. Now, how does this relate to parsing words? When are they necessary? Only in .f code. S-expressions don't need them, as they can use pure concatenative code. So.. * The basic composition part in Staapl is still the module, but it is possible to create modules from units. * For ops the signature contains compile-time data for type checking so this problem should be solved. * Abstracting macros becomes difficult. [1] http://docs.plt-scheme.org/reference/creatingunits.html Entry: Abstracting parsing words. Date: Wed Apr 29 18:29:37 CEST 2009 Let's try to abstract parsing words in units too. Since they are always written in terms of scat words (which are values), units _can_ be used. However, it does seem there is no way to then get at this syntax by requiring a module.. Only by importing the signature. That seems to answer the question: * forth files are units, because they need _syntax_ which depends on functionality provided by other units. * all parsing words are part of signatures if they depend on external functionality. So, I'm confident I can find some way to organize all code, so let's start with where I left off wrt. the operator signatures, and work up from there. Entry: ops and op matching Date: Wed Apr 29 19:07:44 CEST 2009 This is a tough one.. What I'm really trying to do is to: * Check syntax of pattern matching at compile time. This is currently only existence of op + arity. * Implement matching at run-time (using tagging). * Associate the op instance with semantics. (avoid symbolic lookup / interpretation here). I'm currently confusing tags with implementation instances: it is possible to have an instance (a tagged list) without associated semantics, or to have multiple instances of the same type (op) with different semantics. So basically * _matching_ should just check the symbol, not the semantics. since we're taking the instruction apart, we really don't care about what it would do if it were still there. * _construction_ needs associated semantics So matching needs: * compile time data * type predicate Construction needs: * compile time data * constructor This allows the insertion of operations that are not known except for the local scope, but will properly assemble to machine code. That's good. Let's use (op info) and (op predicate). Entry: namespaces are properties Date: Wed Apr 29 19:26:34 CEST 2009 Actually it's much more reasonable to use the namespaces like this: (ns (op addwf info)) ;; compile time information (ns (op addwf ?)) ;; run time predicate (ns (op addwf asm)) ;; run time semantics (ns (op addwf dasm)) ;; inverse of semantics Then they ring like properties of addwf.
There's a complication though: current implementation takes the lexical context from the last element, so this won't work without extra syntax to indicate from what context you want to pick a symbol... So it becomes: (op info addwf) (op ? addwf) (op asm addwf) (op dasm addwf) (Note: row store vs. column store) Entry: macro types Date: Wed Apr 29 19:40:25 CEST 2009 But but! When I want to add types to macros, units no longer suffice! Or.. macro types should then again be part of the signature. Which makes sense: it's ok to use a different _implementation_ but the _specification_ (of which the I->O type is a part) should be static. So again units seem to be the proper abstraction. Entry: from dynamic to static Date: Wed Apr 29 20:47:27 CEST 2009 So basically a certain instance of the asm struct behaves as a parameterized type. The name tag determines the main variant, but the implementation can be different. It's a pain in some way to go through it like this, but it _is_ all starting to make more sense. Static typing is hard.. The help the compiler gives doesn't come free! It will probably take a couple of iterations but it looks like i'm at least going to end up somewhere. Entry: forth signature Date: Wed Apr 29 22:27:06 CEST 2009 The deal is now: I have a file full of macro definitions, all in terms of instantiate^ with as much as possible of the compile-time code factored out in a separate module. This should then be part of the instantiate^ signature, right? Alternatively it can be instantiated outside of signatures. Probably best with (extends instantiate^) Now, before writing a Forth-only form for instantiation, let's first concentrate on an s-expression. (forth-variables (name ...)) (forth-words (name code ...)) (forth-macros (name code ...)) ;; Scheme forms for instantiation. Maybe these can better be ;; avoided.. It's probably simpler to define in terms of the wrap-xxx ;; functions directly. (define-syntax-rule (forth-macros (name code ...) ...) (begin (define-macro wrap-macro name code ...) ...)) (define-syntax-rule (forth-words (name code ...) ...) (begin (define-forth wrap-word name code ...) ...)) (define-syntax-rule (forth-variables (name ...)) (begin (define-forth wrap-variable name 1 allot) ...)) Entry: the signature Date: Thu Apr 30 09:06:19 CEST 2009 This probably needs a macro to not have to use define-syntaxes only. Entry: levels again.. Date: Thu Apr 30 10:24:31 CEST 2009 When I require the module, the rpn-syntax-rules form works, but the prefix-parsers rule doesn't work. (define-syntax-rule (rpn-syntax-rules (literal ...) ((pattern ...) (template ...)) ...) (make-rpn-transformer (lambda (w d k) (syntax-case w (literal ...) ((pattern ... . w+) (k (syntax->list #`(template ... . w+)) d)) ...)))) (define-syntax-rule (prefix-parsers namespace ((name arg ...) template) ...) (ns namespace (define-syntaxes (name ...) (values (rpn-syntax-rules () ((_ arg ...) template)) ...)))) This is because it is used at compile time. So 'rpn-syntax-rules will probably need to be defined in parser-tx.ss. In addition, the file that defines prefix-parsers needs to (require (for-syntax "parse-tx.ss")) Entry: postprocessing macro using expand-to-top-form Date: Thu Apr 30 11:10:51 CEST 2009 Maybe I should give it a try again, to make ns-tx a postprocessing macro. expand-to-top-form The problem is: expansion needs to stop before any bindings are introduced in body code. It looks like expand-syntax-to-top-form is necessary since the lexical environment needs to be left intact.
Maybe do it in this way: use expand-syntax to figure out which identifiers are in binding position so they can be mapped. OK.. I find it strange why I can't get this to work (won't match): (syntax-case top-form (define-values) ((define-values (name ...) expr) #`(#,(datum->syntax stx 'define-values) #,(prefixed-list #'(name ...)) expr)) And have to resort to something like this: (syntax-case top-form () ((form (name ...) expr) (form? 'define-values) #`(#,(datum->syntax stx 'define-values) #,(prefixed-list #'(name ...)) expr)) The datum->syntax takes 'define-values from the caller's context.. Maybe that's not a good idea, and it should be our context. The expanded form's tag itself is not visible in the caller's context, so we can't just re-insert it. Hmmm.. I got it to work, but I don't understand why 'define still shows up.. Thought that was not a primitive form? (-> define-values) Anyways, where was I. define-signature for forth parsers... this won't work with expand-to-top-form in ns-tx.. probably best to do it in a separate macro. Entry: sometimes it doesn't expand Date: Thu Apr 30 14:34:45 CEST 2009 This won't expand ns forms unless there is a (require (for-syntax "ns.ss")) in the definition place of this macro. Expansion happens in the transformer environment, so it needs bindings for the forms. ;; Collects all syntax definitions. (define-syntax (define-signature-begin stx) (define (tx-forms forms) (for/list ((form (in-stx forms))) (if (identifier? form) form (let ((top-form (expand-syntax-to-top-form form))) ;; (pretty-print (syntax->datum top-form)) (syntax-case top-form () ((form names expr) (and (eq? 'define-syntaxes (syntax->datum #'form))) #`(define-syntaxes names expr))))))) (syntax-case stx (extends) ((_ id^ . forms) #`(define-signature id^ #,(tx-forms #'forms))) ((_ id^ extends id-super^ . forms) #`(define-signature id^ extends id-super^ #,(tx-forms #'forms))))) Entry: syntax signatures Date: Thu Apr 30 14:56:51 CEST 2009 now how to get the syntax out of the signature, back into a module namespace? you can't: it only makes sense inside a unit, because it binds unit imports. as a consequence, .f files can never be modules since their most basic syntax requires syntax transformers depending on the compiler instantiation code. as a result, each compiled .f file then needs to be linked with the compiler to be able to produce code. i'm still not convinced.. need to sleep on it.. anyways, apart from the "is it possible" thing here, it's probably enough (and better) to have units only for .f files. so next: create a unit in s-expr syntax. ok.. i don't understand it.. maybe it was a bad idea to put the forth syntax in a unit. let's try to use macro-defining macros. Entry: units vs. modules Date: Thu Apr 30 15:38:14 CEST 2009 * modules serve to properly handle macros: they provide bottom-up language design. * units serve to identify separately compiled components. Entry: just cut & paste Date: Thu Apr 30 17:13:49 CEST 2009 i can't figure out how to abstract it: too many levels at once make my head hurt.. so let's first try to get the bugs out and then try again. ;; TEST (macro-forth-begin : abc 1 + : def 1 +) (define cfg (pic18-compile->cfg (list inline/abc inline/def))) (print-target-word (car cfg)) This creates a CFG. So it looks like the full circle works.. Now clean it up. Entry: code collection Date: Fri May 1 09:31:09 CEST 2009 Is trivial once the registration macro gets passed to the definer words.
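For the record, a toy standalone version of the registration idea (plain Scheme, made-up names; the real definer words are the wrap-xxx forms and the register! macro above): a definer form both binds an identifier and records the definition in a global list which an instantiation step can traverse later.

    (define *code* '())   ;; collected definitions, newest first
    (define (register-word! name val)
      (set! *code* (cons (cons name val) *code*)))

    ;; Definer word: normal binding plus registration as a side effect.
    (define-syntax-rule (define-word name expr)
      (begin
        (define name expr)
        (register-word! 'name name)))

    (define-word abc (lambda (x) (+ x 1)))
    (define-word def (lambda (x) (* x 2)))

    ;; "Instantiation" can now walk the collected code in definition order.
    (for-each (lambda (entry) (printf "word: ~a\n" (car entry)))
              (reverse *code*))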
Entry: expand-to-top-form again Date: Fri May 1 12:27:04 CEST 2009 I suspect I have another problem with expand-to-top-form.. The identifier 'new-lambda' shows up in a place i don't understand. It's not one of mine. In the PLT code i find this: tom@zzz:/plt/collects$ grep -Ire 'new-lambda' * scheme/private/kw.ss: (#%provide new-lambda new-λ scheme/private/kw.ss: (define-syntaxes (new-lambda new-λ) scheme/private/kw.ss: (let ([new-lambda scheme/private/kw.ss: (values new-lambda new-lambda))) scheme/private/kw.ss: (normalize-definition stx #'new-lambda #t #t)]) scheme/private/pre-base.ss: (rename new-lambda lambda) Ok. After restoring the old ns-tx all works fine. Entry: local Date: Fri May 1 13:02:03 CEST 2009 A small tangent. With (require scheme/local) : box> (local [(pic18-begin : abc 123)] foo) coma/macro-forth.ss:50:7: local: not a definition at: (register! inline) in: (local ((pic18-begin : abc 123)) foo) It would be nice to restore that to be able to perform isolated parsing. Probably needs a separate 'let form. Entry: jump^ Date: Fri May 1 17:07:17 CEST 2009 I'd like to override jw/false in PIC18 but it is already exported in the compiler. The reason being that PIC18 performs a lot of optimization on jumps and bit tests (about the only thing the PIC18 is good at). But, there's a conflict here: should this optimization be done as a 2nd pass optimization, or can we do it at once? + is this second pass necessary (or should the first pass do opti?) Entry: things to fix Date: Fri May 1 18:23:42 CEST 2009 - pic18 conditional jump optimization - pic18 org - pic18 full dictionary export - CFG without mutation - compiler cleanup - local exit without parameter Most of these need to be done with test enabled, which means the cjump has priority. Entry: jw/false Date: Sat May 2 11:00:30 CEST 2009 The simplest solution is to take it out of compiler-unit.ss and jump^. Now, delegation to run-time words won't work any more. So how to implement this: ?? (([qw l] jw/false) (macro: ~>z ,(insert `([bpz 0 ,l])))) ;; STUB (([qw l] jw/false) ([decf WREG 0 0] [drop] [bpc 0 l])))) ;; STUB Entry: what is a .fm ? Date: Sun May 3 09:06:05 CEST 2009 In other words: should it already export compiled code or leave that to the loader? I'm in favour of keeping modules at the macro level, and leaving instantiation to the client. But first: * instantiation checks * staaplc Ok. After fixing a bug, instantiation works for the 452-40.f example. Not checked yet if it produces the same code as before. So.. A forth module. How to make it more declarative. The problem with the global state is that multiple modules will use it to attach to. I'd like to get rid of this final bit of state. Problems: - If modules are made to export a symbol, this will clash. - If 2 modules use a 3rd, how will instantiation work? The latter is probably the key: as long as code is not doubly instantiated we're ok. Fixed some bugs, now it's clear to me how it should work: it's ok to instantiate code on require, as long as * the original postponed macro stack is left intact * the cfg's are collected in sequence Instantiator can get at the code by requiring the correct module.
Entry: trouble with locals probably Date: Sun May 3 10:00:59 CEST 2009 Some matching error somewhere. This works: (state-pop (make-state:stack `((,op/asm/qw 123))) 1 op/?/qw) This too: (macro-pop (make-state:stack `((,op/asm/qw 123) (,op/asm/qw 123))) 1) Ah, no: it returns the original stack: 123 #(struct:stack # ((#(struct:asm # qw) 123) (#(struct:asm # qw) 123))) looks like a prototype mismatch: stack is expected in first position but given last. ok fixed. Entry: parser macros again Date: Sun May 3 11:19:45 CEST 2009 Hmm.. because 'expand uses the 'forth-begin linked to the compiler implementation, instantiating the forth words like it's done now won't work unless the dependencies are propagated through: (define-forth ;; defines parser invocation from scheme + Forth parser words ... (forth-begin forth macro : :macro :forth :variable) ;; ... in terms of registration and compiler wrappers. (register! wrap-macro wrap-word wrap-variable)) So I'm going to make this macro non-hygienic to solve the dependency problem. This seems to work: (begin-for-syntax (define $ syntax-local-introduce)) ;; Primitive parsing word transformers coupled to compilation forms. (define-syntax (define-forth stx) (syntax-case stx () ((_ (forth macro : :macro :forth :variable) (reg wrap-macro wrap-word wrap-variable)) #`(begin (define-syntax-rule (#,($ #'forth-begin) . code) (forth-begin/init (forth-word reg wrap-word #f rpn-lambda) . code)) ;; (*) (ns (macro) (define-syntax :macro (with-mode #'macro-word #'reg #'wrap-macro))) (ns (macro) (define-syntax :forth (with-mode #'forth-word #'reg #'wrap-word))) (ns (macro) (define-syntax :variable (with-mode #'forth-word #'reg #'wrap-variable))) (ns (macro) (define-syntax : (last-mode #'reg #'forth-word #'wrap-word #'macro-word #'wrap-macro))) (prefix-parsers (macro) ((forth) (:forth #f)) ((macro) (:macro #f))))))) Now to be honest I'm getting confused by the levels of quoting / unquoting here. That code is derived by successive transformation of correct code, but I've lost the intuitive understanding. Maybe by introducing the identifiers once this can be made more readable. Ok: this abstracts it: ;; Non-hygienically introduce a collection of identifiers. (define (datum->syntax-list stx lst) (map (lambda (x) (datum->syntax stx x)) lst)) (define-syntax-rule (syntax-introduce-identifiers stx lst body) (syntax-case (datum->syntax-list stx 'lst) () (lst body))) Entry: require + recursive expand Date: Sun May 3 12:58:15 CEST 2009 This works now. Care needs to be taken to preserve the proper context for 'require forms though: (define-syntax (require-file-id stx) (syntax-case stx () ((_ id) (datum->syntax #'id ;; Note that the whole form should have the caller's context. `(require (file ,(path->string (stx->path #'id)))))))) Entry: compiler Date: Sun May 3 14:07:42 CEST 2009 So, with modules working, shouldn't compilation become trivial? Yes for module -> binary, but there's also the .dict file which requires reflection. I wonder what 'init-namespace did.. Let's make compilation only work for modules. This way there are never any missing identifiers. (require "app/452-40.fm" (planet zwizwa/staapl/code) ;; code->binary (planet zwizwa/staapl/port/ihex)) ;; write-ihex (write-ihex (code->binary)) This produces almost correct code: diff 452-40.hex /tmp/test.hex 2c2,3 < :0E00000000260F0E000181000FC00FE00F4020 --- > :10000000000026000F000E0000000100810000002B > :0C0010000F00C0000F00E0000F004000D7 The difference is in ',' performing word compile instead of byte compile. fixed.
Entry: recursive expansion Date: Sun May 3 15:41:43 CEST 2009 messes up all parameter context, so "load" won't work properly. the solution is of course to dump out the context in the ..-begin continuation. Entry: minor problems to fix Date: Sun May 3 17:04:39 CEST 2009 - context for recursive expansion (maybe save context in the dictionary in the first place?) - staaplc But first something more inspiring. Abstracting some of the parser words into a simpler syntax. The CPS and explicit CDR on the stream make it hard to read. -> ok, but not many chances to use it.. Entry: staaplc Date: Sun May 3 19:01:02 CEST 2009 Simplified it: - the file needs to be a module, which means it knows its own language and can be a scheme module which exports binary code. the most important change is that a module can now be compiled before it is instantiated: enabling static checks (like identifier bindings) to happen at this stage. this solves the problem of not knowing where undefined identifiers are referenced. - no more connection to the programmer or device. staaplc is offline only. all the live code is in staapl/live.ss Problems: - some reflection issues : solved - simulation : fixed (problem with exporting base language namespace) - disassembler : mostly fixed (just add byte addressing) - macro evaluator start state disassembler needs to be coupled to the _instance_ of the assembler. currently only the forms get built together. maybe the disassembler form should be a curried expression that can produce the disassembler. done Entry: prj Date: Sun May 3 20:10:08 CEST 2009 now i forgot: what was this prj thing about? multiple namespaces i think.. i'm going to leave it in, but it's currently not used because everything is module-based now. the only interesting code that i can remember is sharing code for running multiple instances with the same core compiler instance. Entry: disassembler Date: Sun May 3 20:46:26 CEST 2009 what about turning disassembly into a reflective operation: query the namespace instead of building state. DONE: define-dasm-collection but conceptually what is the difference with collecting global code? Entry: macro exit without parameters Date: Mon May 4 01:09:04 CEST 2009 Seems to be quite straightforward. Removed the parameter and replaced it with this function: ;; The ";" word inspects the macro return stack. If there's context, ;; execute mexit. Otherwise we're in straight line code and can ;; execute primitive-exit. (define (semi state) (if (null? (compiler-rs state)) ((ns (macro) primitive-exit) state) ((ns (macro) mexit) state))) Entry: compiler-unit.ss cleanup Date: Mon May 4 01:26:36 CEST 2009 Now.. using the machine.ss notation i worked on before, clean up the compiler-unit.ss file so it's actually readable. Entry: non-splitting return Date: Mon May 4 01:35:15 CEST 2009 For constructing jump tables I now use the word '.' instead of '.,' while ';' will return and mark the following code non-reachable. Entry: using machine/vm-stx.ss for compiler Date: Mon May 4 08:51:36 CEST 2009 In [1] there is some explanation of the syntax. The guts of the syntax transformer are in the function 'machine-nf which translates a specification syntax to a normal form given a list and order of registers. (syntax->datum (machine-nf '(A B C) #'((A) (B -> (cons A B))))) => ((A A A) (B B (cons A B)) (C C C)) ;; Convert machine definition form to a symbol-indexed dictionary. ;; Use hash table for usage marking and duplicate checks.
(define (form->clauses form) (define hash (make-hash)) (for-each (lambda (clause) (match (syntax->list clause) ((list-rest name expr) (let ((key (syntax->datum name))) (when (hash-ref hash key (lambda () #f)) (raise-syntax-error 'duplicate-name "Form contains duplicate name" clause name)) (hash-set! hash key clause))))) (syntax->list form)) hash) (define (clauses-ref/mark-defined! clauses r) ;; Hygienically introduce default (identifier not reachable from body code). (define (default) (list (datum->syntax #f r))) (let ((clause (hash-ref clauses r default))) ;; Mark it used. (hash-set! clauses r #f) clause)) (define (clauses-check-undefined dict) (hash-map dict (lambda (key notused) (when notused (raise-syntax-error 'undefined-register "Undefined register" notused (datum->syntax notused key) ))))) ;; Convert machine definition clauses to normal form, completing ;; clauses if necessary, and sorting them in the correct order. (define (machine-nf registers stx) (let* ((dict (form->clauses stx)) (nf (datum->syntax stx (for/list ((r registers)) (syntax-case (clauses-ref/mark-defined! dict r) () ;; Annotated syntax. This makes it easier to use the same ;; language for clauses with and without pattern matching. ((reg -> expr) #`(reg reg expr)) ((reg : pat -> expr) #`(reg pat expr)) ;; Non-annotated. ((reg) #`(reg reg reg)) ((reg pat) #`(reg pat reg)) ((reg pat expr) #`(reg pat expr)) ))))) (clauses-check-undefined dict) nf)) So what's next? Make this work on structure fields. How does scheme/match do this? It uses syntax certifiers to access the struct type. Now, it doesn't look like the original field names are preserved, only the accessor and mutator names. This means that the namespace has to be provided externally, possibly by generating both the struct and the update form at the same time. Ok.. basic form is working: (define (machine-update-struct i struct-id registers stx) (let* ((info (extract-struct-info (syntax-local-value struct-id))) (make-struct-id (cadr info))) (printf "constructor: ~a\n" (syntax->datum make-struct-id)) (syntax-case (machine-nf registers stx) () (((reg pat expr) ...) #`(match #,i ((struct #,struct-id (pat ...)) (#,make-struct-id expr ...))))))) Now to find some good names. The function is updated to just copy non-defined fields: (define (machine-update-struct-tx i struct-id registers-stx stx) (let* ((info (extract-struct-info (syntax-local-value struct-id))) (make-struct-id (cadr info)) (size (length (cadddr info))) (registers (syntax->datum registers-stx))) (when (< size (length registers)) (raise-syntax-error #f "Too many fields" registers-stx)) ;; Pad fields if there aren't enough. (let ((registers (append registers (for/list ((n (in-range (- size (length registers))))) (string->uninterned-symbol (format "R~s" n)))))) (syntax-case (machine-nf registers stx) () (((reg pat expr) ...) #`(match #,i ((struct #,struct-id (pat ...)) (#,make-struct-id expr ...)))))))) [1] entry://20090408-082123 Entry: using the machine update macro in compiler-unit.ss Date: Mon May 4 11:39:49 CEST 2009 The only problem is that an update function can't call another update function since the clauses are parallel. I'm going to see if this is a problem. So far it can easily be solved by leaving continuations in the asm stream. Entry: making all examples compile Date: Mon May 4 12:50:27 CEST 2009 Works well up to the synth. Apart from some undefined symbols I run into a problem with 'variable' which needs 'allot'.
geo-seq : this metaprogrammed word could probably be included properly once require/load is working. problem here is name binding though.. solved now with parameterizing the macro: planet zwizwa/staapl/pic18/geo : geo-seq ' ,, compile-geo-seq ; This does mess up the load path so I need to fix that now. Ok. At least fixed for all 'require statements that come _before_ 'load statements. Maybe 'load should be re-implemented to use a decent serializable structure. Otoh: 'load re-enters the parser so maybe not such a good idea. It might be simpler to use explicit stacks. Entry: weird error Date: Mon May 4 14:50:12 CEST 2009 abort-current-continuation: continuation includes no prompt with the given tag: # This was due to not guarding target word evaluation. Fixed. Entry: missing code Date: Mon May 4 15:10:49 CEST 2009 the synth compiles nicely, but there's a whole chunk of code simply missing. 0248-392 is not there. the rest is what it should be. this looks like a problem with org. Composition order is wrong here, leading to dead code. (close-chain ,(make-target-split #f) ,(state-update ((dict -> (dict-terminate dict))))) Ok. test now passes. Wow.. What a marathon. Entry: load vs. units Date: Mon May 4 15:17:01 CEST 2009 now, i'm already really happy with the current single assignment structure. however, units are neater. in the case of forth code however, they can probably be automatically derived: simply take the defined words and the referenced words with library subtracted. so, i'm going to leave that like it is right now. the only benefit is separate unit compilation, which isn't really necessary yet. Entry: next Date: Mon May 4 20:08:27 CEST 2009 the rest is mostly cosmetics. * fix the disassembler so it gives a more user-friendly printout. * change the init-state behaviour for macro-eval. * fix recursive loading interaction with expand then the next step is static information about the macros, and possibly a simulator: move towards more static checks. find redundancies and fix them in rules. eliminate fancy tricks - simplify all semantics (there's still quite a lot of this). i've been thinking about Forth assembler generators but the current code with the 'patterns-class macro is already general enough. Entry: optimizer / semantics Date: Tue May 5 09:16:31 CEST 2009 Now pic18-unit.ss contains a series of ad-hoc rules for peephole optimization. Is there a way to separate this from the core semantics? A good example is this: (patterns-class (macro) ;;-------------------------------- (word l-opcode s-opcode) ;;-------------------------------- ((+ addlw addwf) (and andlw andwf) (or iorlw iorwf) (xor xorlw xorwf)) ;;--------------------------------------------------------------- (([qw a ] [qw b] word) ([qw (tscat: a b word)])) (([l-opcode a] [qw b] word) ([l-opcode (tscat: a b word)])) (([qw a] word) ([l-opcode a])) (([save] [movf a 0 0] word) ([s-opcode a 0 0])) ((word) ([s-opcode POSTDEC0 0 0]))) This contains a couple of concerns interwoven: * compile time evaluation (tscat: ...) * stack manipulation optimization * use of unary (one literal) and binary target ops Entry: Futamura Date: Tue May 5 09:25:25 CEST 2009 Now, the question is: is this a specializer that can take itself as input? Do the works of mister Futamura have any significance here? Can the theory of partial evaluation be used to shed some light here? The following is from [1] page 43.
PE is a partial evaluator, prog is a program, const is its static data and data is its dynamic data:

  PE(prog,const)(data) = prog(const,data)

Here PE(prog,const) = prog' is a specialized program. The 3 Futamura projections are then about prog being an interpreter int, with the static data const = prog. The first projection combines the interpreter and the program into a compiled program.

  PE(int,prog)(data) = int(prog,data)
  +----------+
   compiled program

When this specialization happens a lot with int = constant, the PE itself can be specialized to the interpreter resulting in a compiler.

  PE(PE,int)(prog)(data) = int(prog,data)
  +--------+
   compiler
  +--------------+
   compiled program

Now, if we have to do this for a lot of different interpreters, we can specialize the PE(PE,int) invocation to the static PE giving a compiler generator PE(PE,PE).

  PE(PE,PE)(int)(prog)(data) = int(prog,data)
  +-------+
   compiler generator
  +------------+
   compiler
  +------------------+
   compiled program

Anyways, I'd like to read some more of this [2]. It seems to contain some answers about why partial evaluation isn't trivial. On page 42 in [1] there is some simple answer to the non-triviality: partial evaluators need to be conservative to ensure termination. [1] http://thyer.name/phd-thesis/ [2] http://www.itu.dk/people/sestoft/pebook/ Entry: rules Date: Wed May 6 10:10:06 CEST 2009 So instead of the rules in [1], isn't it better to explain what addlw means in terms of qw and then derive the rule? in other words, instead of defining rules for packing, define them for unpacking too. (([addlw a] unpack) ([qw a] [cw +])) As an intermediate step towards more insight, it might be best to find which rules or clauses within a rule are invertible. Once there are inverses, it should be possible to start optimizing by search. The thing is: once subsets of code have better isolated properties, transformations on them could be done on a higher level. In general, code concatenation is a monoid [2] : - closure - associativity - identity element Identifying invertible elements would maybe make it possible to find a subgroup in the monoid. Then finding commutation relations could construct an abelian group. Actually, the 'unpack macros are disassemblers. I'm thinking that a mechanism on top of the current macros is necessary for this. The problem with rules and general rewriters is that they are algorithmically complex. However, it might be possible to devise a couple of passes of eager macros from a set of more general rules and a bunch of training data in the form of programs. The trick is going to be to link together the semantics of the transformers defined in terms of how they act on code, and the algebraic structure defined by the transformers alone, without this semantics attached. [1] entry://20090505-091631 [2] http://en.wikipedia.org/wiki/Monoid [3] http://en.wikipedia.org/wiki/Transition_monoid Entry: verifying a transformation rule given concrete machine semantics Date: Wed May 6 11:05:43 CEST 2009 Maybe it's best to find a way to verify rules first. Or better, individual clauses. ([addlw a] [qw b] +) -> ([addlw (a b +)]) This particular one is simpler when we're able to unpack addlw: ([qw a] [cw +] [qw b] [cw +]) -> ([qw (a b +)] [cw +]) or in concatenative code compiled to the generic vm with partial eval removed: ( a + b + ) -> ( a b + + ) Entry: proof that [ a + b + ] == [ a b + + ] Date: Wed May 6 11:20:19 CEST 2009 Funny, it's been a while since I did this kind of stuff.
But maybe trying to do this unprepared will reveal some tricks. To prove this, let's try to prove [ x a + b + ] == [ x a b + + ] first. Note that a, b, x need to be values (self-quoting operators), not generic operators (which might be non-invertible). To prove this we relate the semantics of '+ in the concatenative code to the semantics of '+ in a nested expression: ([qw a] [qw b] +) -> ([qw (+ a b)])

  LHS:
  x  [qw x]
  a  [qw x] [qw a]
  +  [qw (+ x a)]
  b  [qw (+ x a)] [qw b]
  +  [qw (+ (+ x a) b)]

  RHS:
  x  [qw x]
  a  [qw x] [qw a]
  b  [qw x] [qw a] [qw b]
  +  [qw x] [qw (+ a b)]
  +  [qw (+ x (+ a b))]

By unification we now need to prove (+ (+ x a) b) == (+ x (+ a b)) Which is easier to see with infix notation: (x + a) + b == x + (a + b) This is true by the associative property of '+. Generalizing this to all 'a we could attempt (*) to derive the property [ + b + ] == [ b + + ] which should be read as [ + (b +) ] == [ (b +) + ] Then chopping off the last '+ gives [ + b ] == [ b + ] In general, this only works in one direction:

  [ + b ] --->  [ b + ]
  [ b + ] -/->  [ + b ]

because the former replaces a strong typing constraint (need two input values) with a weaker one (need one input value). So (*) introducing variables to be able to prove relations makes typing stricter. This is a significant insight. So, the associativity law in concatenative code looks like: [ a b c + + ] == [ a b + c + ] Equivalent and with a 2-value type constraint: [ x + ] == [ + x ] Associativity is commutation of variable quote and apply. Actually this is not so strange: associativity is about changing the order of function application. To write this without variable quote we get, typed with 3-value arguments: [ + + ] == [ >r + r> + ] or with dip notation [ + + ] == [ [ + ] dip + ] The variable quote form is simpler: '+ commutes with single-result functions. Entry: lazy partial evaluation Date: Wed May 6 18:16:37 CEST 2009 Interesting article here [1]. I'm getting a faint hint at what this signifies. It is related to reordering applications as mentioned above. I need to read a bit more about terminology. "Evaluation under lambda": first comment in [2]. Reducing expressions inside abstractions, instead of leftmost outermost. First, let's go back to Felleisen's comment about reduction strategies vs. calculi. It's better to define what a reducible expression is and always reduce the leftmost outermost expression. "Normal order and applicative order are failed attempts to explain the nature of call-by-name programming languages and call-by-value programming languages as models of the lambda calculus. Each describes a so-called _reduction strategy_, which is an algorithm that picks the position of next redex BETA that should be reduced. By 1972, it was clear that instead you want different kind of calculi for different calling conventions and evaluation strategies (to the first outermost lambda, not inside). That is, you always reduce at the leftmost-outermost point in a program but you use either BETA-NAME or BETA-VALUE." [3] What is the equivalent of reduction inside abstractions for a concatenative language? Probably reduction of subprograms. I wonder: is there any difference between reducing from left to right and reducing in arbitrary order? Also, the Staapl macros behave as higher-order abstract syntax: they describe terms instead of being terms (machine code) [5].
[1] http://lukepalmer.wordpress.com/2009/05/04/lazy-partial-evaluation/ [2] http://lambda-the-ultimate.org/node/3217 [3] http://list.cs.brown.edu/pipermail/plt-scheme/2009-February/030354.html [4] http://en.wikipedia.org/wiki/Supercombinator [5] http://en.wikipedia.org/wiki/Higher-order_abstract_syntax Entry: higher order macros Date: Thu May 7 09:14:16 CEST 2009 I believe it's time for implementing control structures as higher-order macros. This will probably stir up the biggest problems with the current partial evaluator: swapping code and data.

  law          swap
  -----------------
  ASSOCIATIVE  code
  COMMUTATIVE  data

Entry: specializer Date: Thu May 7 09:30:31 CEST 2009 Maybe it is better to see the macros as specializers. Take the definition for '+ for example: (([qw a ] [qw b] +) ([qw (tscat: a b +)])) (([addlw a] [qw b] +) ([addlw (tscat: a b +)])) (([qw a] +) ([addlw a])) (([save] [movf a 0 0] +) ([addwf a 0 0])) ((+) ([addwf POSTDEC0 0 0]))) This is defined as a function that combines original syntax 'qw with specialized syntax 'addlw 'movf 'addwf. The advantage is that it can be example-based: if you see some pattern in target code that you didn't expect, it can usually just be added to the rules of some specializer. The problem is that this is quite low-level and difficult to understand because it mixes 2 conceptual levels: real machine code and pseudocode directly representing a trivial compilation of concatenative/Forth code. Another problem is that this is eager: always reduce in the same place: at the top of the code stack. There is no "reduction under lambda". What I wonder is if this actually is a limitation. So. Next task: figure out a way to compile high-level rewrite rules (which operate only on macro syntax) into eager low-level ones. I.e. the commutative law: (swap +) = (+) Something related to Luke Palmer's post: what if it is possible to keep functions that return a single value wrapped as promises? The idea is that concatenative code (a b +) can be replaced with (b a +) as long as 'a and 'b return a single value. Using some kind of type system it is possible to identify "pseudoconstants" that will make moving code around a lot easier. I.e. Suppose there is some occurrence of (a b +) where 'a will not recombine with the code before it, but 'b will, then it is better to swap them and evaluate the whole bunch into a value. Actually, the type system could be _implemented_ as something that wraps into 'qw. I.e. macro results should be left as quotations as long as possible, and when finally evaluated should be memoized. This requires a representation of "pop the runtime stack" as a quoted macro. More specifically: reference into the runtime stack. The stack could then be pushed/popped on every call. Bottom line: more laziness -> better partial evaluation. To organize this it might be best to keep the eager and lazy parts separate. I.e. the current PIC18 macros could implement the final eager specializer (the optimizing compiler) while a lazy specializer is written on top of it. Some examples: @ always should bind directly [m1 a] -> [m1 (macro: a @)] + binds lazily with a check for its two arguments [m2 ab] -> [m1 (best (macro: ab +) (macro: ab swap +))] [m1 a] [m1 b] -> [m1 (best (macro: a b +) (macro: b a +))] Now this 'best thing in there.. That's the reason the compiler needs to be purely functional - so we can easily fork 2 different states and pick the best one to continue. One thing about this scheme: it's not clear when to actually force the expression.
I think this is the same as Luke means with "I am still not totally sure when to introduce an abs" in [1]. It looks like a good time is whenever there are more items packed together in a single macro return than the next macro takes as input. Michael Thyer's lambda animator [2] contains examples of both eager and lazy specialization. [1] http://lukepalmer.wordpress.com/2009/05/04/lazy-partial-evaluation/ [2] http://thyer.name/lambda-animator/ Entry: lazy partial evaluation Date: Fri May 8 14:44:26 CEST 2009 I'm trying to find the proper type of lazy evaluation in Staapl, inspired by [1]. Something like this: (define-virtual-ops (mw a)) (patterns (macro) (([mw a] [mw b] +) ([mw (macro: ,a ,b _+)])) (([qw a] delay) ([mw (macro: ',a)])) (([mw m] force) m) ) Here _+ is the strict one from pic18.ss. To use this in the strict (macro: ...) form do something like: 1 delay 2 delay + force Now, what I wonder is how this can be used to separate all compile time computations (manipulation of qw's) from target compilation (using target machine code only). There is one thing though that I didn't do yet from [1]: I made closures, but not abs's. "Indirections represent a logical reordering of the stack, and are used to model bound variables." What I don't quite understand is why this needs to pop the element referred to. Anyways: a basic idea is that if you use a stack and start to reorder operations on it, you had better keep track of the original positions of the operations. [1] http://lukepalmer.wordpress.com/2009/05/04/lazy-partial-evaluation/ Entry: lazy coma Date: Sat May 9 09:51:51 CEST 2009 So what does the lazy language look like from the point of view of the programmer? First, it does not distinguish between instantiation and macro definition. That is what the specializer handles. Ok.. so there's an rpn language lazy: now, which has a completely separate namespace (lazy) and defines in terms of (macro). I.e. (pic18-begin ,(lazy: 123 123 +) force) The next thing to figure out is 'dup. If a quoted macro ends up being forced more than once, we (probably) want to memoize it into a word. Since it's not possible in general to replace a forced macro with a word later, this needs some form of backtracking. Something like: default behaviour = instantiate. Whenever this turns out to be problematic (recursion or instantiation of large macros) turn the macro into a run-time abstraction. Entry: Michael Thyer's Thesis about Lazy Specialization Date: Sat May 9 11:02:51 CEST 2009 At page 18 of [1] there is a ``knot-tying'' version of the interpreter. This fully lazy technique seems to be the key to lazy partial evaluation. Time to get familiar with it. The evaluator from page 18:

  evalProg prog env = (lookup "main" prog') env
    where
      prog' = map (\(label,stmts) -> (label,evalStmt stmts)) prog
      evalStmt [] env = env
      evalStmt (stmt:stmts) env =
        case stmt of
          SAssign var exp -> evalStmt stmts (update (var, evalExp exp env) env)
          SGoto label     -> (lookup label prog') env
          SIf cond yes no -> if evalExp cond env /= 0
                               then evalStmt yes env
                               else evalStmt no env

Here prog' is the dictionary of evaluated expressions which is built in terms of evalStmt which in turn refers back to prog' in the interpretation of SGoto. If there is some sequence of reductions that will build the data structure, lazy evaluation will find it. See also [2].
[1] http://thyer.name/phd-thesis/ [2] http://calculist.blogspot.com/2005/07/circular-programming-in-haskell.html Entry: peephole and stack Date: Sun May 10 09:04:13 CEST 2009 The peephole optimizer works well for a stack formulation because of the way locality is encoded. This is especially so for the PIC18 which tunnels everything through the WREG. However, for true partial evaluation the program structure needs to be exposed on a more abstract level. On the other hand, the stack gives a very natural way to bind static data early: by placing it on top. So the real question is: how to steer static data towards the top of the stack? How to keep a simple reduction mechanism, but write logic on top of it such that this works best? Examples. With ifte ( cond true false ) arguments, how to specialize this: : when-done [ [ ready-flag high? ] i ] dip [ ] ifte ; This will perform some re-arrangement of the arguments then call ifte. I don't see any problem. This will just yield 3 [mw .] arguments to ifte. Now: ifte needs to know if the condition is available at compile time. It can do this by forcing the condition macro and checking if it reduces to a value. If so, one of the branches can be eliminated. If not, the conditional branch construct needs to be passed to runtime. This doesn't seem so special. So where _is_ the problem? I think it's the coupling between the run time stack and the compile time stack. At some point a quotation needs to be instantiated. At this point one also needs to know the input and output argument structure. Somehow it seems to be really natural to do eager optimizations, but quite a leap of faith to do it lazily. Messing with the run-time stacks is not trivial.. 'dup and specialization. When a quotation that's big enough gets instantiated more than once, there's an opportunity for sharing. This needs extra knowledge to be filled in later: what is "big enough"? But apart from that, it needs its type to be able to use it as a function call! Inlined macros don't need a type: they can just be executed, but once abstracted the type is necessary both at the call site and at the definition site to decouple both. next: implement run time abstraction from a pure lazy macro definition of a language. Entry: macro->data / macro->code Date: Sun May 10 10:29:54 CEST 2009 Removed these and replaced them with state->data / state->code. Currently only the 'unwrap' function in coma/code-unit.ss uses ad-hoc macro evaluation in the core compiler, next to the proper instantiation. The rest is in interaction code only, which is now also attached to (state:compiler). However, some tests are not passing any more. One problem is gpdasm: some bug got fixed which gives different dasm output. Another is probably at a deeper level. Let's first try to rewind gpdasm, or try to update to the new gpdasm with old staapl code. Entry: eta-reduction on rpn-lambda Date: Sun May 10 14:26:08 CEST 2009 Making abstractions with rpn-lambda with a single term at this moment introduces abstraction overhead. I wonder if this can lead to memoization problems due to the fact that (macro: m) != (macro m). To do this purely generatively instead of by postprocessing requires a different approach. Currently the information is lost: what you want is to fish out the function that is applied, not the application. Maybe it isn't really a problem: performance-wise i don't think it would matter since the compiler is probably smart enough to eliminate the abstraction.
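A quick sketch of the eta issue in plain Scheme (compose1 is a made-up stand-in for a one-term rpn-lambda, not the actual implementation): the wrapper is extensionally equal to the wrapped function but not eq? to it, which is exactly what would defeat identity-based memoization.

    ;; Stand-in for a single-term composition.
    (define (compose1 . fns)
      (lambda (state)
        (foldl (lambda (f s) (f s)) state fns)))

    (define m  (lambda (s) (cons 'm s)))
    (define m* (compose1 m))            ;; like (macro: m)

    (equal? (m '()) (m* '()))           ;; => #t, same behaviour
    (eq? m m*)                          ;; => #f, different objects

    ;; Eta reduction at construction time: return a single term as-is
    ;; instead of building a new abstraction around it.
    (define (compose1/eta . fns)
      (if (and (pair? fns) (null? (cdr fns)))
          (car fns)
          (apply compose1 fns)))

    (eq? m (compose1/eta m))            ;; => #t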
Entry: tower of interpreters Date: Mon May 11 09:44:49 CEST 2009 In [1] on page 44 it is mentioned that Futamura projection practically requires a partial evaluator that is both powerful enough to specialize itself and simple enough to be specialized by itself. Now, what is the relation between writing interpreters in Staapl's macro forth to be specialized by its partial evaluator, and writing them directly as towers of macros? I'm confused by terminology here, but I'm thinking practically about protocol parser specification etc.. [1] http://thyer.name/phd-thesis/ Entry: minilisp Date: Mon May 11 18:34:30 CEST 2009 A quick aside. I found this[1] when going through some djvulibre code. It might be interesting to keep in mind whenever I'm going to target scat to C. [1] http://leon.bottou.org/projects/minilisp Entry: telling people things Date: Mon May 11 22:37:54 CEST 2009 You can't. You need examples. [1] http://thefarmers.org/Habitat/2004/04/you_cant_tell_people_anything.html Entry: full laziness Date: Wed May 13 11:06:48 CEST 2009 On page 55 of [1] "full laziness" is defined as giving each term minimal scope. Then it is remarked that: It is possible to achieve full laziness in a lazy language by transforming the syntactic representation of functions so that the scope of every function is minimal. [1] http://thyer.name/phd-thesis/ Entry: next: documentation update Date: Wed May 13 11:51:13 CEST 2009 * website: this needs to be simplified a bit. * scribble docs: they are probably broken. * papers: less important but best moved to scribble The emphasis should be on the following: * it works: - you can compile .f -> .hex / .dict - you can interact with target using .dict * it's simple: bottom-up macro system using PLT's module system. Then there should be examples, examples and examples. Entry: pe reordering Date: Wed May 13 16:34:50 CEST 2009 With eager evaluation (not evaluating under lambda) the trick is to make sure that reducible expressions are exposed in the right order. How to do this in Coma? Entry: interpreting generated code as a stack Date: Wed May 13 20:01:51 CEST 2009 Is this the essential difference compared to expression-based partial evaluation? I.e. the Scheme form (+ a b) could be implemented as a macro + which inspects its arguments, and performs an addition if it finds two numbers. In general (without the stack stuff) Staapl is a compiler for a composition language: point-free style -> imperative code. It works on the analogy that machine code is actually a composition of functions acting on the machine state. This in itself is not so useful when the only state that can be manipulated is the machine state, but it is very useful when the state is somehow limited. Using grids instead of stacks could lead to an APL-like language. I am missing something however. I really don't want to operate on machine code. I want typed lazy values, not evaluated ones. There are two problems I'm solving: [qw a] [qw b] + -> [qw (+ a b)] [addlw a] [qw b] + -> [addlw (+ a b)] The first one is generic, the second one is quite specific to the machine. Is it possible to implement partial evaluation in a more direct style, without having to include specific clauses into each machine-specific transformer? There is one very big problem with my evaluator: it _needs_ to be strict because it is intertwined with the implementation of the run-time parameter stack. This "side-effect" is OK as long as parameter dereference doesn't get postponed and reordered.
Basically, it's possible to delay the popping of the data stack and wrap this action in a quoted macro, but then this popping fixes the order in which these macros have to be evaluated. I have a hunch that if I can describe this problem in a proper way, I can find a solution and move to a lazy evaluator. However, when it is possible to use random parameter access, this mechanism could be used to implement an evaluator for an applicative language, essentially giving Luke's approach. This stuff is over my head. I just went over the rest of Thyer's thesis and I'm quite confused. The problem doesn't seem to be trivial at all. I keep coming back to "lazy pops". Entry: when is eager matching not working? Date: Wed May 13 23:34:51 CEST 2009 It would be interesting to find cases where the eager partial evaluation scheme breaks down. One clear case is the interleaving of orthogonal computations. I.e. 1 dup @ 101 ! 2 + 102 ! This will not evaluate "1 2 +" to "3" because the "1" is not visible at "2 +": it is hidden behind the machine code generated by "dup @ 101 !". The proper way to solve this is to perform the following rewritings:

  1 dup @ 101 ! 2 + 102 !
  1 1 @ 101 ! 2 + 102 !      ; evaluate 'dup'
  -> 1 @ 101 ! 1 2 + 102 !   ; commute '1' and '1 @ 101 !'
  -> 1 @ 101 ! 3 102 !       ; evaluate '1 2 +'

Let's evaluate this using lazy macros

  ()
  1   -> ([mw (1)])
  dup -> ([mw (1)] [mw (1)])
  @   -> ([mw (1)] [mw (1 @)])
  101 -> ([mw (1)] [mw (1 @)] [mw (101)])
  !   -> ([mw (1)] [mw (1 @ 101 !)])

Now, at this point the two macros need to swap, so (2 + 102 !) can combine with [mw (1)]. Basically, (1 @ 101 !) should be annotated as being independent of the run time stack, and so can be tucked away. Wait. This "tucking away" is similar to the self-removing abstractions in Luke's idea. The real problem here is side-effects.. I'm still comparing apples and oranges.. Entry: uniqueness types Date: Thu May 14 01:20:39 CEST 2009 Sounds like an interesting concept. Used in the Clean language. [1] http://en.wikipedia.org/wiki/Uniqueness_type Entry: syntax properties Date: Thu May 14 02:23:53 CEST 2009 Apparently check syntax uses the 'disappeared-use syntax property. Find this in the manual. (Got it from the Typed Scheme paper [1]) [1] http://www.ccs.neu.edu/scheme/pubs/scheme2007-ctf.pdf Entry: only macros? Date: Thu May 14 09:13:51 CEST 2009 Two questions for today: * can QW be replaced entirely by MW? * how to massage a composition of macros before evaluating them. At first sight there doesn't seem to be a problem with making all literal values lazy macros that produce a single literal value. The only restriction is that these macros are not allowed to produce code that alters the run-time parameter stack. Maybe I should fix the implementation of the eager evaluator, since it is easy to understand, and consider a preprocessing step that aims at providing a reordering of the composition? Also, I'm still annoyed by this: [qw a] [qw b] + -> [qw (a b +)] [addlw a] [qw b] + -> [addlw (a b +)] The latter should really be [mw (a +)] [qw b] + -> [mw (a + b +)] The direction to go in seems obvious: I'm not going to get anywhere without processing of macro compositions. This means that each macro should carry with it a description of its I->O behaviour. The problem is that I don't really have a substrate for thought here..
Given a function composition (a b c ...) the goal is to re-order the operations such that the eager peephole compiler can produce better code. Can this maybe be put into mutual feedback? I.e. stick with eager evaluation, but allow backtracking to search through the space of possible reorderings. So let's fix the stage for now: - Peephole optimizer stays what it is. This works well for manual code writing where there is usually "only one thing going on at a given time", but is non-optimal in general when independent operations are interleaved. - Try to find a way to preprocess. Maybe it's simplest to get rid of the current control flow compiler for this. See it as a side-track of the Forth compiler. Entry: more algebra Date: Thu May 14 09:59:38 CEST 2009 Is there some way to turn the monoid into a group? Defining operations as invertible would make things a lot more elegant probably. For arithmetic this isn't too hard: add a 2nd "kill stack" that carries information to perform the undo. Alternatively, add the undo information to the values themselves. I.e. something like: (1 2 +) -> (3), (2 -) There should be some application for constraint-based programming here.. Anyways, let's make the basic structure first. Entry: functional PIC18 language Date: Thu May 14 10:07:13 CEST 2009 To make this work in practice, this needs jumps. The control flow analyser and control stack are not necessary so let's keep it simple. What we need are: - function call / return instructions - conditional jump (ifte) Ok: separated out pic18-control-unit.ss making pic18-unit.ss independent of the control^ macro set. Entry: low-hanging fruit Date: Thu May 14 10:59:03 CEST 2009 This partial eval stuff probably needs a rest. It will involve a lot of thinking, but what I need now is some easy success to boost motivation.. Maybe it's time to start porting to the dsPIC. I have some parsed manual lying around somewhere to get to the instruction set. However, it might be best to do it manually. There is one thing I'd like to change though: the assembler should be more compositional. Right now there is a single table with all opcodes, but it's probably best to separate out all function prototypes. This leads me to think that an opcode is really just an argument to a more general instruction. Anyways... Hmm.. dsPIC isn't that low-hanging due to a different instruction set architecture. It would be cool as a real platform though. Chips are PDIP and small. Maybe I should take this as a hint: find a generic approach to bootstrapping a new architecture. Entry: new documentation Date: Fri May 15 18:56:19 CEST 2009 - website should contain only a 1-sentence explanation, with an immediate link to a scribble doc. - the basic idea is a peephole optimizer for forth, based on the observation that forth code looks like a stack. - staying close to the way Forth is usually implemented, the core idea is kept and turned into a functional framework. - on top of this a lot of uC-specific optimizations can be built Entry: terminology : partial evaluation Date: Sat May 16 21:41:48 CEST 2009 I think I need to distinguish partial evaluation (specialization of _functions_) and peephole optimization. They are very much related (if you look at an instruction like [addlw ..] as a specialized version of [addwf ..]) but I think it's better to use partial evaluation for a language that doesn't distinguish between macros and functions.
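Coming back to the "kill stack" idea from the more algebra entry above, a throwaway sketch (plain Scheme, made-up names, nothing like this exists in the tree): fold (1 2 +) down to (3) while recording enough on a second stack to undo the step.

    ;; The data stack has its top first: '(2 1) is the result of "1 2".
    ;; fold-+ evaluates '+' and pushes undo information on the kill stack.
    (define (fold-+ stack kill)
      (values (cons (+ (car stack) (cadr stack)) (cddr stack))
              (cons (car stack) kill)))

    ;; unfold-+ inverts the step using the kill stack.
    (define (unfold-+ stack kill)
      (let ((b   (car kill))
            (sum (car stack)))
        (values (cons b (cons (- sum b) (cdr stack)))
                (cdr kill))))

    (fold-+ '(2 1) '())    ;; => (values '(3) '(2))
    (unfold-+ '(3) '(2))   ;; => (values '(2 1) '())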
Entry: types are calling conventions Date: Sun May 17 09:07:16 CEST 2009 http://lambda-the-ultimate.org/node/3319 Entry: removed front page stuff again.. Date: Sun May 17 10:04:27 CEST 2009 I keep on doing this.. Why? Anyways, the new introduction says almost the same as this, but with examples.

Basic Elements

Metaprogramming is about manipulating programs as data. A key element here is the representation of programs. Being a compiler, Staapl has two languages to deal with: a high-level input language and a low-level output language. Both languages are concatenative.

  • Output language programs are represented as lists of instructions [op].
  • Input language programs are represented as output language program transformer functions [op]->[op].
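
For instance, with symbolic instructions (a sketch; the concrete op representation is machine specific):

    '((movlw 1) (return))                   ; an [op] program
    (lambda (ops) (cons '(movlw 1) ops))    ; an [op]->[op] transformer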

A meaning can be attached to high-level concatenative source code using the following compilation algorithm:

  1. Read the tokens from the input program source, and associate each of them with an [op]->[op] transformer.
  2. Compose all [op]->[op] transformers using function composition, in the order their tokens appear in the input program source.
  3. Apply this function to the empty program [] to obtain the final output program.

From this one could infer that each [op]->[op] function appends a small machine code fragment to the code accumulator, essentially behaving as an assembler macro. However, the function is free to modify the accumulated code in its entirety, performing any optimization it sees fit.
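
As a sketch in plain Scheme (the semantics lookup is hypothetical; the real token semantics live in Staapl's macro namespaces), the three steps are literally a fold:

    ;; semantics : token -> ([op] -> [op])                      (step 1)
    ;; Composing the transformers in source order and applying the
    ;; result to the empty program is the same as folding application
    ;; over the token list.                                     (steps 2 and 3)
    (define (compile-source tokens semantics)
      (foldl (lambda (token code) ((semantics token) code))
             '()
             tokens))

    ;; Trivial example semantics: every token appends itself, yielding
    ;; the accumulated code with the newest op first.
    (compile-source '(dup +)
                    (lambda (tok) (lambda (code) (cons tok code))))
    ;; => (+ dup)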

Using this representation, the task of building code generators can be split into these parts:

  • High level: Composing generators/transformers is expressed using concatenation. I.e. the Scheme form (macro: 123 +) creates a composite code transformer, built from the code transformers 123 and +. The macro: form behaves as quasiquote, facilitating template programming. I.e. the previous transformer is equivalent to (let ((x 123)) (macro: ',x +)). (A short sketch follows this list.)
  • Low level: Creating language primitives as [op] processors, and possibly defining a machine instruction set op that has the double function of representing target semantics and serving as a representation of compile time data.
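As a hypothetical example of the template style this enables, a Scheme function can manufacture a transformer for adding a given constant by unquoting the compile-time parameter into a macro: form. This is a sketch, assuming a module context where the macro: form and the + macro are available:

  ;; sketch: n is a Scheme value spliced into concatenative code.
  (define (add-constant n)
    (macro: ',n +))

  ;; (add-constant 42) is an [op]->[op] transformer equivalent to
  ;; the composite macro "42 +".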

Macro Forth

By interpreting the code list [op] as a stack and adding an op instruction (QW value) that represents loading of a literal value on the parameter stack at run time, it is possible to implement Forth with eager partial evaluation.

I.e. the function associated with the token + would normally only compile machine code that pops 2 numbers from the run-time stack, adds them and pushes the result back. Instead it could be made to inspect the code it is passed for any instructions that load literal values on the runtime stack, remove those, perform the addition at compile time, and generate code for a constant instead.
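Here is that idea as a sketch in plain Scheme, with a hypothetical (qw n) literal-load instruction and the code list ordered most recent first. Staapl's real primitives are written with a pattern-matching specification form; this only shows the mechanics, with approximate opcode spellings.

  #lang scheme
  (require scheme/match)

  (define (plus code)
    (match code
      ;; both operands known: add at compile time, emit one literal.
      (`((qw ,a) (qw ,b) . ,rest) `((qw ,(+ a b)) . ,rest))
      ;; one operand known: emit an add-literal instruction.
      (`((qw ,a) . ,rest)         `((addlw ,a) . ,rest))
      ;; general case: run-time addition.
      (_                          (cons '(addwf POSTDEC0) code))))

  (plus '((qw 2) (qw 1)))  ; => ((qw 3))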

Note that the basic structure of [op] lists and [op]->[op] transformers is more general than stack languages. In fact, it could be used to implement partially evaluating macro languages for any kind of state machine.

Entry: previous introduction
Date: Sun May 17 10:09:46 CEST 2009

It's sort of ok, but I think the new one is better because it uses examples. Let's keep it around.

Entry: i can't read documentation
Date: Sun May 17 12:31:01 CEST 2009

I just don't have the patience.. Trying to figure out how to build my scribble docs locally to see how they will show up on the planet server, but there are several inconsistencies in my understanding of how setup-plt works. Most importantly: the collects dirs need to be clean! It compiles everything it finds, so junk gets in the way.

Entry: releases
Date: Sun May 17 17:15:21 CEST 2009

Currently "planet-fileinject" is the best way to test the planet package before upload.

  ~/.plt-scheme/planet/300/4.1.5.5/cache/zwizwa/staapl.plt/1/8/planet-docs/

Entry: fruit
Date: Sun May 17 22:35:40 CEST 2009

HA! I know about some low-hanging fruit to compensate for all this seriousness: an automatic scheme -> rpn converter, so rpn can be packaged separately to be integrated with fluxus.

Entry: preparing for release
Date: Mon May 18 10:49:14 CEST 2009

Everything seems to be working, except for some minor issues with the documentation (the red underline). Time to upload it to PLaneT. So, before releasing, do the following:

  * check bin/version
  * make test               # run the test suite
  * make planet-fileinject  # install the package locally

The "release" target will do these.

Entry: unit filters
Date: Tue May 19 12:12:41 CEST 2009

How would one build a unit which filters an API? This would be a very interesting way to override macro behaviour. Maybe what I'm really looking for is objects..

Entry: inline macros vs. partial evaluation
Date: Wed May 20 16:21:50 CEST 2009

The thing is: you want sharing. Currently, when using macros, they are always inlined, even when there are compile time computations going on. It looks like a good first attempt at proper PE would be to decide when _not_ to inline.

Entry: applications
Date: Wed May 20 17:00:37 CEST 2009

I need a metronome. And a clock for rummikub. And maybe something that responds to sound? I found the 2 8x8 led matrix displays. This looks like a nice target for some application. Needs 24 pins to make it work comfortably.

Entry: ltu thread on the future of programming
Date: Wed May 20 20:50:29 CEST 2009

I'm tired so time for wild associations.. But these two posts I found rather striking:

http://lambda-the-ultimate.org/node/1439#comment-16473

  Since I asked the question.... ...you may have guessed I had a few
  answers myself but wanted to see what else I could shake loose from
  you guys first....

  Moore's Law and Feeping Creaturism implies ever larger and more
  complex systems. The solution to which is not ever larger and more
  complex languages, but simpler languages, both syntactically and
  semantically that permit tools to assist more in...

  * Refactoring.
  * Testing.
  * Code Generation.
  * Reengineering, understanding and visualization of very large
    systems.

  Thus

  * The ease of analysis, rewriting and refactoring of the Joy
    language will make the Joy language the "parent/inspiration"
    language for the next generations of languages.
  * Syntax and semantics will become simpler. ie. Towards a simpler
    version of Lisp/Scheme or Forth.
  Some other beliefs I have stemming from Moore's Law and creeping
  featurism driving the exponential growth in software complexity...

  * Linear logic will supplant GC.
  * Lazy evaluation is not an optimization (or hard to implement nice
    to have). It's mathematically the only way to have a consistent,
    analizable and manipulatable system.
  * Single Assignment or Joy / Forthlike no assignment will become
    the norm.
  * Cyclic dependencies at one or more levels (package / file /
    class / ...?) will be explicitly disallowed by the tools.
  * Static typing will lose to Dynamic Typing, and Dynamic Typing
    will lose to compiler inferred Static Duck Typing.
  * Languages will provide a transparent "each object has one thread
    inside it, every method invocation is just another event in that
    object's event queue" view to the programmer.
  * OO hierarchies are too reminiscent of pre-Codd hierarchical
    databases. And SQL ignores Codd on too many points. I expect to
    see a language that...
    ** Is fully relational ala Codd Law's.
    ** The basic Object model is also in at least 4th normal form.

  Can I prove any of this? No. Not now. Not yet. But it's what I keep
  my mind occupied with when I'm not earning my bread.

  John Carter

http://lambda-the-ultimate.org/node/1439#comment-40602

  The solution to the concurrency problem... Will be a two language
  solution:

  Language A: be a sequential language designed from the ground up to
  be easily parallelizable at the meta level. It will not be a
  parallel language with parallel programming constructs, neither
  message passing nor multi-threading nor explicit effects. It will
  have a lot of restrictions that may seem strange to the sequential
  programmer but they will all be there for a good reason which is...

  Language B: which is a declarative meta-language which will allow a
  different programmer to examine and reason about both language A
  and it's potential run-time environment and then re-factor it's
  code to run in parallel on the chosen platform(s). At first most
  programs in language B will be heuristic and specialized for
  particular applications, but as more experience is gained with
  particular algorithms (in language A), and with particular hardware
  configurations, patterns will begin to emerge and language B will
  be able to support automatic parallelization across certain
  categories of algorithms on certain categories of hardware.
  Language B will have all the nice syntactic sugar parallel
  programmers love, whether it be in the language of threading,
  effects, messages or something else.

  David Minor

More gems: http://lambda-the-ultimate.org/node/1439#comment-16479 (canvas vs. http://www.info.ucl.ac.be/people/PVR/flopsPVRarticle.pdf (4 x a 4-layer language)

Entry: interaction
Date: Thu May 21 10:56:59 CEST 2009

With all this module stuff I actually forgot to test if the interaction is still working. Apparently not. Had to fix a thing or two in the macro evaluation code. Now it seems to work again. "empty" is broken though. FIXED.

Entry: simpler live commands
Date: Thu May 21 11:44:41 CEST 2009

Currently the way the target: language is constructed is a bit convoluted. In rpn-target.ss there's an unnecessary double indirection. Ok, removed it. The semantics are now a bit clearer. There are 3 cases for identifiers:

- target words (executed)
- macros (simulated)
- prefix parsers

Entry: target stack access
Date: Thu May 21 12:53:29 CEST 2009

Another thing which has annoyed me for a while is the way target macro simulation works: it slurps the entire stack.
How can we know how many elements should be popped? This really could be done lazily, but would require some modification to the code representation (allow lazy stacks). I already have a mechanism to do this: target-value. Just adding thunks that will force stack pops should be enough.

This works, but picking out instructions to skip turns out to be more difficult than expected (i.e. "dup" and "swap" don't work because they don't wrap the target words). This should be done by comparing input and output.

-- This problem isn't nearly as trivial as it seems. Before evaluation starts, all pops need to be forced. Even this isn't enough: we need to find _references_.

Ok, the trick is this:

- first compare in/out to see which operations do not need to be executed.
- dummy eval the rest to trigger the lazy pops.
- use the size of the input stack to determine how many instructions of out need to be taken.

Entry: lazy bootstrapping
Date: Thu May 21 12:56:27 CEST 2009

If laziness is the ``natural'' way to deal with circular dependencies, isn't there a way to solve a compiler bootstrapping problem using circular programming tricks?

Entry: tethered.ss
Date: Thu May 21 17:08:50 CEST 2009

I have the impression that there is a lot of redundancy in live.ss that can be eliminated by flattening some layers. There are functions that are accessible as scheme functions, scat functions and interaction macros. Is this really necessary? I don't really use scat all that much except as a vehicle for other things..

Entry: usb
Date: Thu May 21 18:09:28 CEST 2009

After three years maybe it's time to finally write the damn driver, no? Funny how this has been problematic for a while. Using picstamp.fm from app/ to get it going.

First: getting old code to compile. Simply invoking "load usb.f" gives:

  include "/data/safe/tom/darcs/brood-5/staapl/pic18/usb.f"
  include "/data/safe/tom/darcs/brood-5/staapl/pic18/shift.f"
  reference to undefined identifier: macro/device-descriptor

This identifier is present in pic18/usb.ss, but it still uses an old symbolic interface. Let's see if that module can be revived on its own. The main function is 'usb-compile-device. I have one description file: pic18/cdc.usb. Let's change the interface such that this file becomes a module which includes some scheme code and uses the 'define-usb-device form.
OK, it generates this:

  (: device-descriptor f-> 18 |,| 18 |,| 1 |,| 16 |,| 1 |,| 0 |,| 0 |,| 0 |,| 64 |,| 216 |,| 4 |,| 1 |,| 0 |,| 0 |,| 0 |,| 4 |,| 3 |,| 2 |,| 1 |,|
   : string0 f-> 23 |,| 23 |,| 3 |,| 68 |,| 101 |,| 102 |,| 97 |,| 117 |,| 108 |,| 116 |,| 32 |,| 67 |,| 111 |,| 110 |,| 102 |,| 105 |,| 103 |,| 117 |,| 114 |,| 97 |,| 116 |,| 105 |,| 111 |,| 110 |,|
   : string1 f-> 19 |,| 19 |,| 3 |,| 68 |,| 101 |,| 102 |,| 97 |,| 117 |,| 108 |,| 116 |,| 32 |,| 73 |,| 110 |,| 116 |,| 101 |,| 114 |,| 102 |,| 97 |,| 99 |,| 101 |,|
   : string2 f-> 5 |,| 5 |,| 3 |,| 48 |,| 46 |,| 48 |,|
   : string3 f-> 10 |,| 10 |,| 3 |,| 85 |,| 83 |,| 66 |,| 32 |,| 72 |,| 97 |,| 99 |,| 107 |,|
   : string4 f-> 28 |,| 28 |,| 3 |,| 77 |,| 105 |,| 99 |,| 114 |,| 111 |,| 99 |,| 104 |,| 105 |,| 112 |,| 32 |,| 84 |,| 101 |,| 99 |,| 104 |,| 110 |,| 111 |,| 108 |,| 111 |,| 103 |,| 121 |,| 44 |,| 32 |,| 73 |,| 110 |,| 99 |,| 46 |,|
   : config0 f-> 25 |,| 9 |,| 2 |,| 25 |,| 0 |,| 1 |,| 0 |,| 0 |,| 160 |,| 50 |,| 9 |,| 4 |,| 1 |,| 0 |,| 1 |,| 3 |,| 1 |,| 1 |,| 1 |,| 7 |,| 5 |,| 128 |,| 160 |,| 8 |,| 0 |,| 0 |,|
   : string-descriptor 5 route/e string0 |;| string1 |;| string2 |;| string3 |;| string4 |;| string-error |;|
   : configuration-descriptor 1 route/e config0 |;| config-error |;|)

OK, now to make the code a bit more modern. I need an abstraction for building tables. Something that can interleave a bunch of atoms with a compile function. I.e.:

  begin-table , 1 2 3 4 5 end-table

Maybe with a more concise syntax? Something like:

  { , 1 2 3 4 5 }

Wait, since "[" and "table[" are independent tokens, square brackets could be used for any kind of structured data, allowing s-expressions to occur in Forth style code. Alternatively, "scheme" and "table" could be prefix parsers that allow arbitrary s-expression transformation.

  scheme: [ define foo [ + 1 2 ] ]  \ scheme expression
  table: , [ 1 2 3 4 5 6 ]          \ constant table
  [ 1 2 + ]                         \ quoted function

Actually, why invent new syntax if s-expressions do just fine.. The reason not to have s-expressions in Forth is that then you don't need to represent them at compile time. Additionally, Forth control words allow themselves to be composed in strange ways.. I don't like to throw that away. But in scheme it's really easy to do this:

  (define-syntax-rule (table: separator (item ...))
    (macro: ,(macro: item separator) ...))

So for defining tables I do think that Forth-style prefix parsers might not be the right way to go. It's probably easier to write Scheme macros and provide some s-expression based syntax to access them from flat Forth. So, what is necessary is a generic way to create prefix parsers that take an s-expression as an argument and pass it to a scheme macro to produce an opaque coma macro (with "{" and "}" reserved for module level expressions).

  (snarf-prefix-macro macro:)

Entry: Factor stack effect checker
Date: Fri May 22 08:30:47 CEST 2009

http://docs.factorcode.org/content/article-inference.html

Entry: usb cont
Date: Fri May 22 08:48:06 CEST 2009

So what should usb-cdc.ss produce? It generates a number of Forth procedures, but it does so on top of the flat Forth syntax. It's probably easier to have it define s-expressions. I need a form that abstracts these: register!, wrap, compile. It probably is in fact easier to build it on top of the Forth syntax.. It has a more direct interface to the control flow graph compilation process (the fallthrough feature).
The usb code should eventually become a unit, but for now I only have the PIC18F2550 to work with, so let's stick with a module that requires pic18.ss. Since this is to be used as an abstraction around generated names, it's probably also best to only provide access to the final table words.

Ok. Separated parsing and code generation. Now I need to think a bit more about how to abstract generation of Forth code to scheme. Most flexible is the generation of flat Forth code, since it has full access to all control flow features.

I think I found a nice hack: generate flat Forth code, but do it as a nested s-expression. This keeps the intended substructure intact but can be easily flattened down. Think of this as using the associativity of function composition to annotate code that belongs together.

Though, for bootstrapping the purely functional concatenative language on top of the imperative Forth core, it might be interesting to provide some means of escape. In the case of the usb code generator it doesn't make sense, since it already uses lower-level language constructs like jump/value tables.

TODO: name hygiene for usb.ss

OK. Looks like it's working now. usb-cdc.ss now defines 3 names.

Entry: prefix parsers vs. concatenative macros
Date: Fri May 22 10:42:02 CEST 2009

* concatenative: deal with compile-time function composition and intermediate code transformation only.
* prefix: these are module-level composition tools: they can abstract over more general syntax structure that manipulates name binding.

Entry: renaming x -> r
Date: Fri May 22 12:54:14 CEST 2009

Entry: dynamic image based vs. static source based development
Date: Fri May 22 13:05:59 CEST 2009

I really like PLT scheme's macro/module system. Sometimes it gets in your face, imposing acyclic dependencies. However, I've found this usually to be a good thing. I've learned to trust the consistency it brings while developing Staapl. However, my debugger/editor emacs _needs_ the dynamic approach: I'm living inside it with lots of state attached whenever I want to change some functionality. How can we combine the easy correctness of a static tool like drscheme with the convenience of an all-dynamic no-restart environment like emacs?

Entry: The f and a pointers -- dynamic scope
Date: Fri May 22 13:20:54 CEST 2009

It is really convenient to be able to use the a and f pointers to RAM and ROM respectively. Should this come at the cost of reducing the abstraction level? I.e. are the registers to be saved before use or during interrupts etc? The real question is more general: find a proper style for using dynamic binding. Since there is no lexical scope, dynamic scope is the only alternative. The other question is: do you expose dynamic scope in an interface, or do you keep things referentially transparent? More specifically: the usb descriptor compiler constructs 3 words that provide pointers to binary records. Should they provide them in the 'f' register (where it will have to end up eventually) or on the top of the stack? The latter is probably better practice.

Entry: composability
Date: Fri May 22 13:32:24 CEST 2009

Designing Staapl is mostly an exercise in not losing composability. Maybe that's what language design is about? Introducing features that don't clash; keeping them orthogonal so they can be composed at will. I find that the simpler I make basic principles, the better this works.
The current hurdle is fighting the macro/instantiate divide: I'm writing abstractions that need library functionality (basically, things that use the hardware stack).

Entry: disappeared-use
Date: Sat May 23 06:48:05 CEST 2009

I'm trying to get check-syntax to work for prefixed identifiers. These don't seem to do what I think they do:

  (syntax-property (ns-tx #`(_ (namespace ...) name)) 'disappeared-use #'name))
  (syntax-property (ns-prefixed #'(namespace ...) n) 'disappeared-binding n)))

Let's ask:

  Hello,

  In the following, is there a way to instruct Check Syntax to
  recognize the #'x identifier in reference and binding position as
  in case 1, or in its relation to the #'_x identifier as in case 2?

  ----
  #lang scheme
  (require (for-syntax scheme))

  (define-for-syntax (underscore stx)
    (datum->syntax stx (string->symbol (format "_~a" (syntax->datum stx)))))

  (define-syntax (u stx)
    (syntax-case stx ()
      ((_ (name val) . body) #`(let ((#,(underscore #'name) val)) . body))
      ((_ id) (underscore #'id))))

  ;; case 1
  (u (x 123) (u x))

  ;; case 2
  (let ((_x 123)) (u x))
  (u (x 123) _x)
  ----

  Cheers, Tom

ANSWER:

  Delivery-date: Sat, 23 May 2009 23:35:03 +0200
  From: Chongkai Zhu
  To: Tom Schouten
  CC: Sam TH, plt-scheme@list.cs.brown.edu
  Subject: Re: [plt-scheme] Check Syntax & mangled identifiers

  Just keep the srcloc and the original prop of the identifier seems
  to be enough for me.

  Chongkai

  #lang scheme
  (require (for-syntax scheme))

  (define-for-syntax (underscore stx)
    (datum->syntax stx (string->symbol (format "_~a" (syntax->datum stx))) stx stx))

  (define-syntax (u stx)
    (syntax-case stx ()
      ((_ (name val) . body) #`(let ((#,(underscore #'name) val)) . body))
      ((_ id) (underscore #'id))))

  ;; case 1
  (u (x 123) (u x))

  ;; case 2
  (let ((_x 123)) (u x))
  (u (x 123) _x)

Entry: fixing "load"
Date: Sat May 23 08:41:37 CEST 2009

Problem: 'load doesn't mix with 'expand. There is a simple fix for this: make 'load behave as a preprocessor only, which means it is a reserved word that cannot occur anywhere else in the source code. Maybe this is a bit restrictive though.. It's probably simpler to flatten the current recursive calls for load.

Now.. Do I really need the current-load-relative-directory parameter? I'm not using 'load anywhere.. It's probably OK to dump info in the forth search path parameter. Hmm.. Let's quick-fix it for now, built on the existing structure: the first item in the forth path will be the current directory. This is simply changed on begin/end of a loaded sequence. Better: abstract the value of forth-path to include both current directory and search path.

OK. It seems to work. There's one thing that's going to bite me later though: the mode (macro/forth) isn't saved.. Maybe this should be implemented as parse-time state also? Just have a single struct that can be easily dumped as an abstract transformer? Anyways.. The road is open to do it properly + it should now be possible to use 'require inside 'load-ed files.

Entry: usb cont.
Date: Sat May 23 13:07:40 CEST 2009

Next problem: get the code I already have to compile and upload.

Entry: load/require interference
Date: Sat May 23 13:27:15 CEST 2009

Problem: target code generated by a module might be emptied. Since a module won't be instantiated again, this will introduce dangling references.. Crap.. It's not easy! The real problem is that a module should not have an instantiation side-effect. Or, we should make it so that module code cannot be erased. Or, 'empty' should clear the namespace. Maybe the latter is the best approach. That way modules will get re-instantiated.
So.. Application development is separated into 2 parts:

- kernel development (as self-contained .fm)
- scripts that can accumulate

Upon reload the target should be cleared from the point that's marked as the start of the script buffer.

Entry: moving stuff to modules
Date: Sat May 23 14:29:02 CEST 2009

Ok, with this approach it is possible to have on-demand loading of code using require instead of load. Let's start porting kernel code to modules. Doing this, at least the constants need to somehow be defined in two steps, instead of at application level. Ok. This is going to re-arrange the test code, so I need to be careful. Moving the test to kernel code, not library code. Serial.f contains some code that can be separated out.

Entry: the monitor
Date: Sat May 23 16:45:52 CEST 2009

I'd like to put the monitor code in a module, but it needs to be parameterized by read and write. However, this code is not critical in any way, so let's turn read/write into dynamic variables. With vectors in RAM this loses a bit of robustness, but nothing a reset can't fix. Also, the kernel size will grow so it won't fit in the 512 bytes any more.. This isn't such a problem either, since I'm giving up on programmer-less operation for small devices. Maybe for the usb sticks later, but they have bigger protected boot blocks.

Ok, I messed up the current code: I thought I was testing it but forgot to upload.. So it was still running the old code. Stupid...

There's a problem: "org" doesn't work properly with the way modules generate code. This can be fixed by turning boot.f into a module too, to make sure it gets instantiated first. So.. This doesn't work.. org = side effect = not compatible with non-sequential load.

Entry: order of instantiation
Date: Sun May 24 09:27:03 CEST 2009

It is important to:

1. keep assembler-like behaviour for low-level things without actually going to the assembler.
2. keep high-level module behaviour.

So, how can this be made more painless? Now wait. There is nowhere in the assembler where "org-pop" is actually executed. Ok, it's in "with-pointer". The problem seems to be here:

  #x0040 org-push : boot-40 warm ; org-pop

The ":" messes up the chain.. Since I can't get this right and can't see the error straight away, there is probably something wrong with the architecture. Ok. Removing the ':' fixes the problem:

  #x0000 org-push #x0020 jw ; org-pop
  #x0040 org-push warm ; org-pop

So, this worked with "org" but not with "org-push". Can't use "org" any more because the order of instantiation is not predictable. The initial compilation point is defined in pic18.ss and can essentially not be changed. Ok, so I'm going to take the "org!" out of the code: only relative access allowed.

OK. Now, why doesn't ':' work inside an org-begin .. org-end? It calls make-target-split. No idea.. I've worked around the problem by generating labels dynamically using ">label" and "label:".

Entry: todo
Date: Sun May 24 11:08:12 CEST 2009

FIXME: jw/cw Forth words need to take byte addresses

Entry: vectorized receive/transmit
Date: Sun May 24 12:29:20 CEST 2009

There's a problem because the interpreter uses the a and f registers, and vectorized access uses them too. No it doesn't. It's a silly bug in tethered (f! instead of a!).

Entry: stack size
Date: Sun May 24 12:35:09 CEST 2009

Instead of having the interpreter tell the stack size, it's probably better to allow inspection of the current stack pointer, so the host can determine stack size by knowing bottom and direction.
This is to decouple the interpreter from the hardcoded "ds-bottom" macro.

Hmm.. I messed something up again.. I get protocol errors. Something with this returning void sometimes:

  (define (ts-copy)
    (let ((it (a>/b (stacksize) #x80)))
      (if (void? it) '() it)))

Looks like (a>/b 0) is not valid. It instead requests a 256 byte string, because "0 for ... next" behaves as "#x100 for ... next".

Entry: where is the first element on the data stack?
Date: Sun May 24 17:26:02 CEST 2009

The problem is that with the top of stack in WREG, the stack can't be empty. There is always at least one element, which will be written to the bottom of the reserved memory when a new word is loaded. This element however will be ignored when the stack is displayed.

  (define (ts-copy)
    (reverse (a>/b (stacksize) (+ 1 (stackbottom)))))

Entry: icd2 serial
Date: Sun May 24 18:53:01 CEST 2009

I'm moving away from the icd2 serial port. It never really worked well. It's not too much of a problem to have both a programmer and a serial cable attached. The programmer flashes a lot faster too.. So, the preferred way of working for PIC18:

* pk2 attached to flash kernel code (which is a .fm module)
* hardware serial port for interaction, high baudrate
* .dict holds kernel code, scripts don't get retained
* script buffer junk erased on terminal connect

This means: no permanent incremental development. Permanent code can only be flashed as an entire image. It is possible however to construct some kind of fusing mechanism based on interaction.

Entry: the synth
Date: Sun May 24 18:56:55 CEST 2009

Time to port the synth to module code. With some minor changes it still seems to work. Playing is for another time. Maybe when fixing the docs?

Entry: words in namespace
Date: Sun May 24 20:54:12 CEST 2009

Moving everything to modules does create the problem that not all words are visible in the toplevel namespace. A simple require will fix that, but still.. There should be a better way..

Entry: interaction macros: not easily composable
Date: Mon May 25 00:26:54 CEST 2009

Make some syntax for this. It's currently not straightforward to work on a project and add some live words. Also, think about the role of scat: in this. It's a bit of an unnecessary middleman, no? Is there a way to piggyback target interaction on simulation? I.e. create an opcode with semantics to perform a host -> target remote procedure call? Maybe it should be more dynamic (do-what-i-mean), since it is a debugging tool after all. I've got all the static tight-assness I want in the module system, so let's let it rip in the interaction.

Entry: removed old vm commands
Date: Mon May 25 09:34:13 CEST 2009

;; quoted here for later reference. this code is probably broken.

;; Entry point for (syntax-only!) live interaction -> prj code
;; transformation.
(define (live->prj code)
  (define default
    (predicates->parsers
     (number? ((n) (n tlit)))
     (symbol? ((w) ('w tinterpret)))))
  (apply-parsers-ns/default '(live) default code))

;; Append a line to a log of lines.
(define (log-line str stack)
  (if (or (null? stack)
          (not (equal? str (car stack))))
      (cons str stack)
      stack))

;; DIRECT
(provide vm->native/compile live/vm->prj)

(define (underscore stx)
  (->syntax stx (string->symbol (string-append "_" (symbol->string (->datum stx))))))

(define (vm->native/compile code)
  (define default
    (predicates->parsers
     (symbol? ((w) (|'| #,(underscore #'w) |'| _compile macro/default)))
     (number?
      ((n) (n _literal)))))
  (apply-parsers-ns/default '(compile-vm) default code))

(named-parsers (compile-vm)
  (0cmd ((w) (w)))
  (|:| ((_ name) (: #,(underscore #'name) enter)))
  (|;| ((_) (_exit))))

(named-parser-clones (compile-vm)
  (0cmd pa clear))

;; FIXME: abstract out ns/default thingy
(define (live/vm->prj code)
  (define default
    (predicates->parsers
     (symbol? ((w) ('#,(underscore #'w) tf _tlit 'dtc tfind texec/w)))
     (number? ((n) (n _tlit)))))
  (apply-parsers-ns/default '(live-vm) default code))

;; FIXME: find a way to extend the other live commands.
;; map these to their '_' counterpart
;; FIXME: commands that take no args can be simply mapped.
;; (define (_command? x) (element-of x '(ts tss tsx cold ping)))
(named-parsers (live-vm)
  (0cmd  ((w) (w)))                    ;; just use same as native
  (_0cmd ((w) (#,(underscore #'w))))   ;; special
  (1cmd  ((w) (_t> #,(underscore #'w)))))

(named-parser-clones (live-vm)
  (0cmd  commit clear pa ppa cold ping)
  (_0cmd ts tss tsx)
  (1cmd  p ps px kb))

Entry: do what i mean
Date: Mon May 25 09:51:01 CEST 2009

Let's change the semantics of the console interaction as follows. The (target) namespace has the semantics:

* target prefix parsers -> expand
* target words -> execute
* concatenative macros -> simulate
* scat: infer type + run
* scheme: infer type + run

The problem with this is the "infer type" part. For scheme -> scat it's not too difficult to do dynamically using rpn-wrap-dynamic. Ok, added the form (scat-dwim id). The general idea is that for code you want static features, but for interaction/debugging you really want maximum flexibility.

Entry: usb.f and indexed addressing
Date: Tue May 26 09:12:13 CEST 2009

Is this really necessary? It is tempting to use, but it will make supporting code for PIC18 difficult. The relative addressing is quite an extensive change. The real problem however is that "struct" addressing needs namespace support in the language. This will be a bigger hurdle. I talked about this before[1]: there is something to say about structs vs. applications in functional programming languages. Can we do the same for Forth? Use the data stack as a constructing/deconstructing device?

[1] entry://20090322-215126

Entry: MetaML and future of Staapl
Date: Tue May 26 09:51:27 CEST 2009

Removing this from the introduction page:

  Related Work

  Compared to MetaML, which seems to be the current reference point
  for staged programming systems, Staapl contains functionality
  related to MetaML combined with abstract interpretation. However,
  Staapl is quite different, as it is a non-homogeneous two-stage
  system based on flat combinators and dynamic typing, while MetaML
  is a homogeneous multi-stage system based on the lambda calculus
  and static ML-style typing.

It's not so clear.. The real difference is that Staapl is a bridge between something that behaves as a Forth macro system and the scheme procedure/macro system. The concatenative macros are special in that they do not deal with names, so they could be compared to the constrained code manipulation that's possible in MetaML. But Staapl also contains a complete scheme-like macro system that can abstract over names (the prefix parsers), bringing it far outside the reach of the static analysis allowed by MetaML. It would be nice to get some discussion going with Walid Taha or one of his students about this. I tried contacting him in an informal way but got no reply.
Anyways, I'd like to be able to understand the real differences between macro systems and staging, but as far as I can see in the literature, the bridge between them is still being built. Dave Herman's work is interesting in this respect, as is the Ziggurat system. Now, my goals are humble. As arrogance and achiever mentality start to fade, I see what I did in perspective. It's not really rocket science. I'm glad I've got the bridge to PLT Scheme worked out, but there are still things that I don't really like that much:

* Inability to integrate true partial evaluation. The basis is there; I just don't see the light yet. What I do understand is that PE is more of an art than a science, since it is mainly about avoiding code explosion due to inlined recursive calls. That problem is related to the halting problem. The field seems to be mostly about "bags of tricks", requiring a lot of study to get a good idea about what people have tried and what works and what doesn't.

* Separate machine peephole optimizations and generic ones. I.e. the behaviour of '+' should be extendable, to be handled by the main compiler if both arguments are available, and by the target compiler for 1 and 0 available compile time arguments.

* A type system. A lot is to be gained by a proper type system that would enable processing of concatenative macro code _before_ handing it over to the eager evaluator. I.e. trying different permutations that are possible due to commutativity of operations. This requires building an algebraic system of combinators that can perform simplifications at compile time. It would probably also help with lifting the language semantics to a bit higher level, and simplifying the peephole optimizer.

* Run-time and compile-time stack interaction. If pops and pushes are made lazy, it is possible to perform data-flow analysis on words with fixed stack effect. However, I currently don't know how to mix the side effects of pushing/popping with re-arranging machine instructions. I'm already using something like this in the live simulator, where it is clear how it should work.

Entry: standard forth
Date: Tue May 26 17:03:38 CEST 2009

What would be the most useful way to incorporate standard Forth into the project? The reason you'd want standard Forth is to use already existing Forth code. There are several problems that prevent standard Forth use at this moment. The most severe ones are:

* 8-bit cell size -> some intermediate layer that implements 16-bit access is necessary.
* non-standard parser.

The latter isn't such a problem. Once it's clear what kind of VM we want for the 16-bit forth, a self-hosted Forth should be bootstrappable with the already existing parser. The question is then: what kind of VM? A subroutine threaded 16-bit VM would work better with the already existing architecture, but an indirect threaded Forth is easier to implement in a machine-independent way. Interoperability is more important than reduced implementation complexity, so let's start by creating a subroutine threaded compiler in Forth.

To write a self-hosting compiler, the form of the dictionary should be made explicit. I need at least this:

  [ link | name | CT | XT ] code ....

Where the dictionary either contains a pointer to the code, or the code inlined. Probably a pointer is better, since then the dictionary could be stored somewhere else (and stripped).
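To pin that record layout down, here is a small host-side model in Scheme; all names are hypothetical, and the struct accessors play the role of the projection words proposed next:

  #lang scheme

  ;; [ link | name | CT | XT ], with the code reached through xt.
  (define-struct rec (link name ct xt))

  ;; walk the link chain for a matching name.
  (define (find dict nm)
    (cond ((not dict) #f)
          ((string=? nm (rec-name dict)) dict)
          (else (find (rec-link dict) nm))))

  (define dict
    (make-rec (make-rec #f "dup" 'ct-dup 'xt-dup)
              "drop" 'ct-drop 'xt-drop))

  (rec-xt (find dict "dup"))  ; => 'xt-dup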
So what about this:

- FIND returns a dictionary record
- REC>XT returns the interpretation semantics
- REC>CT returns the compilation semantics

A dictionary structure is then

  [ link | CT | XT | name ]

where the name is a padded pascal string. Do strings have a special representation? Also, I want a Forth with space-safe tail calls.

Looks like the most difficult part is string comparison. I wouldn't know how to do that without a huge amount of stack shuffling (pointers and sizes). Let's see how this is usually done. I'm starting to think that writing a self-hosted Forth that needs to be bootstrapped isn't such a good approach. It's probably easier to write it top-down and just bootstrap the dictionary.

Entry: comparing two strings -> generators
Date: Tue May 26 20:03:38 CEST 2009

This is one of the examples which is really hard to do in the current PIC18 Forth. It's a memory to memory operation which doesn't fit well in Forth, simply because it uses a couple of arguments (size + address) with some of them double length (address). In scheme, there is always tail recursion for "parallel assignment": updating all the elements in a state vector simultaneously in terms of the previous values. In Forth, due to the absence of random access names, this is more difficult, and has to be serialized into a sequence of operations using stack juggling. It's probably best to use an abstraction for this: generators.

Entry: more libraries
Date: Wed May 27 19:24:28 CEST 2009

Now that I'm going over it again, I see how much I've been writing assembler in Forth, using tricks and escapes to get to small and efficient code. One of the examples is the "=" operator... There isn't one! The reason is that I've been using flag based condition macros everywhere. The PIC18 is good at this. Now that I'm writing a string comparison routine where the _only_ point is a destructive test (not a test followed by another operation), I find that I'm lacking proper operations. Let's build some. Actually, they did exist. I discovered them right before the deadline in the Waag project.

Entry: rc files
Date: Wed May 27 20:21:15 CEST 2009

Extend the staaplc compiler to include an instruction in the .dict file to load the .rc file corresponding to the project. This could then be used to contain interaction scripts.

Entry: standard forth
Date: Thu May 28 07:36:23 CEST 2009

Before I lose them again, some useful links [1][2]. So, I have most of the difficult decisions made:

* Forth is subroutine threaded
* @ and ! are RAM, flash has separate access words
* dictionary and code are in separate flash regions
* dictionary uses Pascal strings
* , (comma) goes to RAM first, into a circular buffer

Still to solve:

* FIND
* terminal input
* large branches

There is one thing I'm not convinced about: the threading model. Maybe it's better to stick to some form of address interpretation. It will make things a lot slower, but won't have double sized call words or chained jumps. Anyways, I already did a lot of thinking about this, the result of which can be found in the pic18/vm-core.f code.

I'm again stuck in this loop of not knowing what to choose: fast or flexible? Putting it like this, it should of course be flexible. Can't I have both? It should be really simple though: for speed, use the 8-bit Forth. It's always going to be faster. For the one on top of that, use anything that has compatible primitives _and_ has small code size and maximum flexibility otherwise. The VM in pic18/vm-core.ss uses threaded code with an exit bit to implement tail recursion.
It also sees the return stack as code.

Some links:

[1] http://astro.pas.rochester.edu/Forth/forth-words.html
[2] http://lars.nocrew.org/dpans/dpans.htm

Entry: partial evaluation is proper commutation
Date: Thu May 28 07:44:28 CEST 2009

It's all about order. I had this dream where it was very clear to me. Of course, as dreams go, they only tell you what you believe, not necessarily what's true or how to get there. Makes sense though. I've seen the argument made multiple times that laziness and good PE are quite related. Laziness gives you some kind of optimal evaluation order.

Entry: debugger
Date: Thu May 28 14:40:31 CEST 2009

- scripts
- require on command line
- arbitrary code compilation on command line (1)
- variables crash

(1) This is best done by interpreting ':' as a switch to compile mode for the rest of the line. It needs a special prefix parser. Problem is: this is not composable (you can't do anything _after_ compiling the code), so it probably needs a continuation hook to be able to build stuff on top. Let's define the "compile" word to take another macro as continuation.

  compile commit . . . .

Ha, I worked myself into a corner here.. How to call Scheme code from a macro? It's probably better to do it in two steps. I made some abstractions:

- slurp the rest of the line: slurp receiver code ... -> receiver (code ...)
- perform side effects wrapped as scat functions for calling (eval '(forth-compile-tokens c))
- switch to compile mode for a limited number of words for variable, 2variable, require, ...
- switch to compile mode for the rest of the line

All compile mode switches immediately upload code.

Entry: happy
Date: Thu May 28 17:42:52 CEST 2009

I'm quite happy with the way everything composes now. There seem to be no arbitrary limits on composition, so expressive power multiplies into beautiful flowers. So there is scat that glues everything together, scat's unquote to be able to turn anything into scat code, scheme for stuff that needs local lexical names, and the rewrite system (based on scheme's syntax pattern matching) for any kind of "exceptions" that might arise. The late-bound interaction language also makes things a lot easier to implement. It looks like it's ready to be finalized.

Entry: issues to solve
Date: Thu May 28 23:25:07 CEST 2009

- command completion
- regain flat namespace (import all modules) (1)
- snot

(1) is now done in code.ss -- it is essentially a reverse name lookup, independent of the forward name resolution that is performed using identifiers only. This dictionary could then be used at the debug console.

I'd like to fix the disassembler too. Maybe it's better to fix the CFG rep first. I've never really liked the way code lists are linked to labels. It feels artificial. So, is there a way to solve this problem? Maybe the CFG should be defined in a more textbook manner. Looking at it from the pov of the assembler:

- labels: point to code, have an address that can be modified
- code: points to labels

So, there is a static part (labels + assembly code) which is a graph. To this graph another data structure is associated which maps labels to addresses and assembly code to binary lists. Maybe this should just be abstracted into some datatypes. Ok, this is probably what is necessary: the default data structure on which the compiler operates, and for which there exist printing routines, is the CHAIN, which is a list of (label code) lists.
In addition, each LABEL (target-word) points to its associated code and its next instruction; however, these are only there for reference and might be implemented separately. Both during compilation and disassembly these might be invalid.

Entry: list split
Date: Fri May 29 12:58:19 CEST 2009

There is one simple parsing step that keeps occurring over and over in dealing with flat things: introduce structure by grouping, based on a predicate over one of the lists. I.e.:

  (x x x x x x x x x)
  (1 0 0 0 0 0 1 0 0)  ->  ((x x x x x x) (x x x))

This has to have some kind of name.. This is basically "regexp split", but then done on multiple sequences. What this needs is two functions:

- match
- combine

Entry: vm stuff
Date: Fri May 29 17:47:55 CEST 2009

OK.. It doesn't work any more. Somewhere something went wrong. I just switched the Forth to use byte addresses for "address". Let's try to debug it. OK, it was storing byte addresses, but expecting word addresses.

Now, how to distinguish words from macros? This requires some restructuring. The problem is trivial when using compilation tokens, but I have an inverted dependency somewhere.. Hmm.. I tried restarting 5 times.. There's a problem here since we're using the same namespace. Try again..

The problem is this: my macros already have compilation semantics. The distinction between word/macro is made in the dictionary compilation process. Overriding this is asking for trouble: it's essentially unhiding something that's already abstracted. There has to be a simpler way to do this. I removed pic18/rpn-macro-double.ss to start over. There is already a mechanism for this, but it's also not composable: the way forth-begin is defined in pic18.ss.

Conclusion: the forth definition code is not re-entrant, because it is tied to the (macro) namespace: can't define Forth on top of Forth. However, it is possible to hide this behind namespaces, but I'm not sure if this is worth the trouble. It would get quite confusing too.. The only way to do this is to use nested namespaces..

Entry: forth on forth
Date: Sat May 30 09:05:05 CEST 2009

The proper way to have _both_ an anonymous compiler and a dictionary compiler requires some juggling. This is an energy sink.. It's clearly not anticipated in the design, so I wonder if I should pursue it. The irony is that a threaded Forth compiler is really trivial compared to one based on composition of macros. What should this be used for?

- A standard Forth: yes
- A standalone Forth: yes
- Use inside native code: maybe

The important observation is that a standard Forth doesn't need the "macro:" layer, because it can be written entirely in the interaction mode. This is how it was implemented before. In other words: the toplevel namespace module is better suited to a standard Forth than the declarative module one. Let's not waste too much time on this.

Conclusion: there is no way around writing a full-fledged Forth parser, which either runs on the target or on the host. It's probably possible to make some minor hacks in target-double: but that's what they'd be: hacks. Anyway, it really only needs ":" and "variable". Then, maybe there really is no reason to first write a tethered standard Forth and then make it stand-alone? Probably best to have only the standalone version.

Entry: Forth mode
Date: Sat May 30 11:10:21 CEST 2009

Traditionally, Forth has 2 modes: interpret / compile. The Staapl Forth is modeless. It instead has two languages: "macro:" is the compiler language, while "target:" is the command line interpreter language.
Both models are similar. The main difference is that in Staapl "macro:" doesn't know anything about "target:", while in Forth words are words..

Entry: byte addresses
Date: Sat May 30 11:20:30 CEST 2009

I'm losing too much time on byte/word addresses. The original idea was to use only word addresses to make the assembler simpler, but that's inconvenient in Forth if there is also to be byte data access. Let's keep the assembler as is, but make the change at the interaction level. The problem seems to be in the function "tfind", which treats code and data the same. This needs to be split into two functions:

  find-code, find-data :: symbol -> byte.ptr

These should then be defined in the toplevel namespace so the interaction code can pick them up there.

Entry: Forth VM next
Date: Sat May 30 14:07:36 CEST 2009

Need to check if the macros work with the vm: language. Currently the hack to have a bootstrapping Forth console for the 16-bit VM seems to be simple and effective. Ok. Macros work.

But.. I simply can't have it that I spent all this time making the Forth layer abstract enough to be able to reuse it for other architectures, but I can't seem to make it work for building Forth on Forth. As mentioned before: I'm probably not going to use that layer much. To make the self-hosted Forth possible, all macros need to be implemented as on-target words. That would be the next step.

Now, instead of always looking at host -> target RPC, what about enabling target -> host calls? This would make bootstrapping a self-hosted interpreter simpler: it would enable gradual offloading. That seems to be the real problem to solve!

Entry: target-directed to-host offloading
Date: Sat May 30 14:51:37 CEST 2009

Figure out a way to make the tethering protocol bi-directional. The target should be able to request a computation to take place on the host. It's better to use the target's code sequencing to debug code than to offload this as simulation to the host. The simplest way to do this is to use a symbolic interface: the target sends a string requesting a command to execute, and waits for an ack.

Ok, so what needs to be done is to find a way to implement the target Forth's immediate words both as macros (using the host compiler) and as target words. This by abstracting the dictionary and the compile stack.

Conclusion: The difficulty in writing a Forth in Forth is to abstract the dictionary, compilation stack and threaded code representation in such a way that a single specification of all immediate words (mostly control words) can be used in different compilers.

So, can a very classical way of implementing the control words be implemented in coma? The problem is that coma uses abstract labels, not target addresses, and a real forth would use forward jump patching. But there are only two forms: forward and backward jumps. Both will save some data on the compilation stack and consume it later; the compilation stack starts out empty.

                  PUSH          POP
  forward jump    jump + hole   here -> hole
  backward jump   here          jump

From [1]:

  Bill Muench designed the original eForth 1.0 for simplicity and
  portability. It had only 30 words written in assembler and used
  only BEGIN_UNTIL BEGIN_WHILE_REPEAT IF_ELSE_THEN and FOR_NEXT in
  its source. The second release reduced the number of code words to
  28 and removed the FOR_NEXT constructs from the source code and
  replaced them with BEGIN constructs. I was pleased to learn this
  when I was designing a meta compiler to generate a version of
  eForth for a new target.
  It meant that there were fewer IMMEDIATE words that were needed in
  the meta compiler. The meta compiler no longer needed to compile
  FOR_NEXT constructs.

[1] http://www.ultratechnology.com/meta.html

Entry: niche
Date: Sat May 30 16:00:25 CEST 2009

Well put:

  Programming philosophy: don't bother with any higher-level language
  -- just write in and extend the operational semantics directly by
  adding new virtual machine instructions. You'll be forced to think
  more clearly about what your programs do. You'll encounter fewer
  "impedence mismatches" where you have to fight against the
  programming language to say what you mean (e.g., tail calls in C).
  You'll probably come up with results that are much, much more
  economical in all but time to market.

There's something interesting about modularity in forth:

  In forth, each word's author is expected to be cognisant of and
  responsible for the whole state of the machine. For the price of
  assuming cooperation and trust between components, you get enormous
  flexibility and power.

[1] http://lambda-the-ultimate.org/node/2319#comment-34864

Entry: This week
Date: Sun May 31 08:51:21 CEST 2009

With the interaction problems fixed, it's time to get the usb driver going. This has been the subject of procrastination for far too long. Let's start with this idea of Forth, explained in the LtU thread I quoted in the previous post: build a state machine that can solve your problem, and violate locality where necessary (imposed co-operation). Basically this is component based design. Electronics. Forth is about writing code as you would write hardware: the finiteness is central to the idea. Singletons with interfaces.

On the theory side, it's important to start acknowledging the difference between concatenative macros and prefix parsers. So, goals for this week:

- main: usb driver
- on the side: target -> host rpc

Entry: the 14-bit VM
Date: Sun May 31 15:22:19 CEST 2009

I'm pasting the source code of the more exotic VM in this post as a backup. Code will be modified in-place to move to a simpler DTC architecture. There are 3 files:

  dtc-control-i.ss   on-target immediate words (untested)
  dtc-control-m.ss   same, but using host macros
  dtc.ss             core interpreter

----------- dtc-control-i.ss -----------

#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

\ On-target immediate words implementing the control words.

staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute
staapl pic18/dtc

\ This needs "comma" and a way to back-patch words. The idea is to
\ compile to a RAM buffer first, and transfer it to FLASH when it's
\ done.

staapl pic18/double-comma

macro
: _address word-address lohi ;
forth

: _mask    #x3F and ;
: _lmask   _mask #x40 or ;
: _compile _mask _, ;   \ takes word address as 2 bytes
: _literal _lmask _, ;
: _0       0 0 ;

\ These compile unconditional and conditional jump.
: _jump,   ' _run _address exitbit _compile ;
: _0=jump, ' _0=run; _address _compile ;

\ Jumps are proper primitives. They take a single argument which we
\ compile as a literal.
: _hole  _here@ _0 _literal ;
: _lpack _>> _lmask ;                \ pack byte address as literal
: _then  _>r _here@ _lpack _r> _! ;  \ patch hole
: _if    _hole _0=jump, ;
: _else  _>r _hole _jump, _r> _then ;
: _begin _here@ ;
: _again _lpack _, _jump, ;
: _until _lpack _, _0=jump, ;

\ COMPLICATIONS: because of the exit bit, jump targets need to be
\ protected so the previous instruction doesn't get exit-tagged.
\ See -m.ss.

---------- dtc-control-m.ss -----------

#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

\ Macros implementing the control words. For a self-hosted
\ interpreter these need to be replaced by immediate words.

staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute
staapl pic18/vm-core

macro

\ note: XT need to be word addresses, since i have only 14 bit
\ literals. return stack still contains byte addresses though, so for
\ now it's kept abstract.

\ create a jump label symbol and duplicate it (for to and from)
: 2sym>m  sym >m m-dup ;

\ jumps are implemented as literal + primitive (instead of reading
\ from instruction stream)
: m>jmp    m> literal ' _run compile _exit ;
: m>0=jmp  m> literal ' _0=run; compile ;

: _begin 2sym>m m> label: ;               \ back label
: _again m>jmp ;                          \ back jump
: _until m>0=jmp _space ;                 \ conditional back jump

: _if    2sym>m m>0=jmp ;                 \ c: -- label1
: _else  2sym>m m>jmp m-swap m> label: ;  \ c: label1 -- label2
: _then  m> label: _space ;               \ c: label --

: _space ' _nop compile ;  \ necessary when 'return' needs to be isolated.

\ : _for  _2sym>m m> label ' do-for compile ;   \ c: -- label
\ : _next _m>literal ' do-next compile _space ;

: _for  ' _>r compile _begin ;
: _next ' do-next compile m>0=jmp ' _rdrop compile _space ;

------------ dtc.ss -------------

#lang planet zwizwa/staapl/pic18 \ -*- forth -*-
provide-all

staapl pic18/double-math
staapl pic18/double-pred
staapl pic18/execute

\ ************************************************************************
\ A direct threading composite code interpreter. It has a number of
\ small differences to standard Forth. The idea is this will run a
\ version of forth without parsing control words, but using quoted
\ code instead.

\ *** CONTINUE resumes the execution of the VM, more specifically the
\ program pointed to by IP. A program is an array of primitive
\ instructions. Primitive instructions are primitive code (word)
\ addresses + a continuation discard bit (EXIT bit). IP is
\ implemented by TBLPTR (the f register).

\ *** I want to express iteration using TAIL RECURSION. This means
\ the caller needs to pass the proper continuation to the callee on
\ the RETURN STACK, discarding the current thread if necessary. For
\ this purpose, one 'EXIT' bit will be reserved in the instruction
\ field, and the interpreter loop will pop the stack before calling
\ the next primitive.

\ *** A continuation can be invoked by RUN, so there is no distinction
\ between programs and continuations. A continuation takes a data
\ stack as argument, just like ordinary programs. RUN is the dual of
\ forth's EXECUTE, which is used here to invoke primitives.

\ *** The machine return stack is reserved for the underlying STC
\ forth / machine code. The VM uses the STC retain stack as return
\ stack, to limit interference.

\ *** To treat composite code as a primitive, an array of primitive
\ instructions needs to be prefixed by a machine code element 'CALL
\ enter', which will save the current continuation (IP) and invoke a
\ new one. This 'enter' could be duplicated if a large address space
\ is spanned, so a short branch can be used.

\ *** The interpreter is explicit: this is done so that primitives do
\ not need to end in NEXT, as is done traditionally, enabling the use
\ of native/STC primitives. All 16-bit primitives are prefixed with
\ '_' (underscore) so they are easily mapped and debugged in STC
\ forth.

\ TODO: some modifications.
\ - all data sizes used (literals, primitives, composite) fixed at 14bit
\ - interpreter runs on top of memory model: composite code in ram possible
\ ************************************************************************

\ IP + RS

\ instruction pointer manipulation. only the ones that affect the
\ machine return stack and machine flags need to be macros. the rest
\ can be functions for ease of debugging.

macro
: @IP+ @f+ ;  \ read bytes from the instruction stream
forth

: _IP! _<< fh ! fl ! ;  \ store to IP

: enter  \ asm (rcall ENTER) wraps composite code in prim
    _IP>r
    TOSL fl @!  \ TOS cannot be movff dst, but src is ok
    TOSH fh @!
    pop ;

: _>r    >r >r ;
: _r>    r> r> ;
: _rdrop rdrop rdrop ;

\ These 2 govern the format in which threaded addresses are stored on
\ the return stack. For return stack tricks to work, this is taken to
\ be word addresses.

: _IP>r  \ save current IP to VM RS
    clc
    fh @ rot>>c +r !
    fl @ rot>>c +r ! ;

: _r>IP  \ pop IP from VM RS
    clc r- @ rot<carry and LIT->negative flags.

: exit?    c? ;
: literal? n? ;

: prim@/flags  \ fetch next primitive from composition
    \ clc  \ low bit is ignored by PIC
    @IP+ rot< exitbit ,, ;

: _; _exit ;

\ utility macros
: _c>> rot>>c 2nd rot>>c! ;
: _<IP then         \ c -> perform exit
    literal? if 14bit ; then  \ n -> unpack literal
    execute/b continue ;      \ execute primitive

: 14bit  \ interpret doubleword [ 1 | 14 | x ] as a signed value.
    _c>>                          \ [ x | 1 | 14 ]
    #x3F and                      \ high bits -> 0
    1st 5 high? if #xC0 or then   \ high bits -> 1
    continue ;

: _bye pop  \ quit the inner interpreter
: _nop ;

\ trampoline entry. 'interpret' will run a dtc primitive or primitive
\ wrapped program.
: bye>r enter ' _bye compile _exit

: interpret  \ ( lo hi -- )
    bye>r       \ install continuation into dtc code "bye ;"
    execute/b   \ invoke the primitive (might be enter = wrapped program)
    continue ;  \ invoke threaded continuation

\ CONTROL FLOW WORDS

\ 'run' is the dual of 'interpret'. it takes threaded code addresses.
\ in combination with the exit bit, this can be used to implement
\ conditional jumps.

: _run  \ word-addr --
    _IP>r _IP! ;

\ : _0=run \ flag addr --
\     _run
\     or nz? if _r>IP then
\     drop ;

\ "go" = "run ;"

\ i don't want to use the word 'jump', but conditional jump is not
\ the same as conditional run.

: _0=run;  \ ? program --
    _>r or nz? if _rdrop else _r>IP then drop ;

forth

: do-next  \ -- ?
    _r> _1- _dup _>r _0= ;

Entry: Simpler DTC
Date: Sun May 31 15:39:24 CEST 2009

The 14-bit VM is clever, but it's not simple. It might be used for something else later. The fact that the return stack is executable code might lead somewhere. However, the 14-bit constants are a pain, and the extra effort to make tail recursion work is not worth it. Right now what is important is to get the self-hosted Forth to work and make it reasonably portable. Let's go back to a simple threaded Forth. I'm not sure what the name is for the method I'm using: it's direct threading[1], with primitives that do not use NEXT, but instead use an explicit interpreter loop. (NEXT = procedure return.)

[1] http://en.wikipedia.org/wiki/Threaded_code#Direct_threading

Entry: buffered compile working
Date: Sun May 31 20:12:45 CEST 2009

In the end it turned out to be simple, with the right primitives. However, to get there I had to tone down my enthusiasm..
The only way to write lowlevel Forth is this:

  - write primitives for your problem
  - test the primitives
  - write the high-level code

Even for the simplest problems (like moving a buffer from ram to
flash after extending its bounds) it pays off to leave the muddy
waters of low level machine state as soon as possible.  But
inevitably this step has to be performed.  The testing is what makes
this all work.  If it wasn't so easy to compile a word and test it,
working like this would be quite difficult..  Once the primitives
were right, the stuff on top was really obvious.

Entry: usb and debugging
Date: Mon Jun 1 07:21:38 CEST 2009

Let's get target->host communication working to at least make debug
print statements work.  The idea is this: whenever a command gets
executed, the host waits either for an ACK (zero length message) or a
command to execute.  For now let's just stick to display.  It is
quite trivial.  Apparently "emit" was already defined as

    : ack1 1 transmit transmit ;

The host side then is simple: on every execute, expect printouts
before the ack (empty message).

    (define (tslurp)
      (let ((reply (target-receive/b)))
        (unless (null? reply)
          (display (list->bytes reply))
          (tslurp))))

    (define (texec/b addr)
      (~texec/b addr)
      (tslurp))

Entry: usb
Date: Mon Jun 1 10:44:27 CEST 2009

So, how to tackle USB.  On the PIC this boils down to dealing with
endpoint buffers, so let's write some abstractions to deal with
those.  We can use the simplest scheme: no double buffering, one
endpoint and fixed buffers for IN and OUT.

An endpoint buffer descriptor is a 4-byte structure:

    0  STAT  status register
    1  CNT   buffer elements
    2  ADR   buffer address (2 bytes)

This descriptor resides in the USB RAM, which is a dual ported memory
bank accessible by the MCU (microcontroller unit) and the SIE (serial
interface engine).  Ownership is governed by the UOWN bit in the STAT
register for each buffer.  The buffer descriptor addresses are mapped
to buffer descriptor registers when the UEPn bit is set (endpoint
enable), or to RAM when the endpoint is disabled.  The STAT
register's contents depend on whether the MCU or SIE owns the
endpoint buffer.

              7     6    5     4       3      2       1    0
    SIE mode  UOWN  -    PID3  PID2    PID1   PID0    BC9  BC8
    MCU mode  UOWN  DTS  KEN   INCDIS  DTSEN  BSTALL  BC9  BC8

Jun 1 13:21:15 zzz kernel: [415980.152170] usb 4-1.3: new full speed USB device using ehci_hcd and address 95
Jun 1 13:21:15 zzz kernel: [415980.224150] usb 4-1.3: device descriptor read/64, error -32
Jun 1 13:21:15 zzz kernel: [415980.400121] usb 4-1.3: device descriptor read/64, error -32
Jun 1 13:21:15 zzz kernel: [415980.576090] usb 4-1.3: new full speed USB device using ehci_hcd and address 96
Jun 1 13:21:15 zzz kernel: [415980.648072] usb 4-1.3: device descriptor read/64, error -32
Jun 1 13:21:15 zzz kernel: [415980.824042] usb 4-1.3: device descriptor read/64, error -32
Jun 1 13:21:15 zzz kernel: [415981.000135] usb 4-1.3: new full speed USB device using ehci_hcd and address 97
Jun 1 13:21:16 zzz kernel: [415981.408010] usb 4-1.3: device not accepting address 97, error -32
Jun 1 13:21:16 zzz kernel: [415981.480547] usb 4-1.3: new full speed USB device using ehci_hcd and address 98
Jun 1 13:21:26 zzz kernel: [415991.888009] usb 4-1.3: device not accepting address 98, error -110
Jun 1 13:21:26 zzz kernel: [415991.888255] hub 4-1:1.0: unable to enumerate USB device on port 3

It seems to try 5 times to reset the device.  Apparently we need to
send something back.
In [1], section 8.3.3 (usb enumeration) I find this:

  - host sends USB RESET
  - host sends GET DESCRIPTOR to find out

After URSTIF (6) there's ACTVIF (4).  It looks like we need to send
something back, but what?  Maybe [2] will help.  It contains the USB
stack for PIC18.

[1] http://www.elsevier.com/wps/find/bookdescription.cws_home/714114/description#description
[2] http://www.microchip.com/stellent/idcplg?IdcService=SS_GET_PAGE&nodeId=2680&dDocName=en540668

Entry: usb cont
Date: Mon Jun 1 16:50:48 CEST 2009

Problem was incorrectly initialized buffer descriptors (swapped CNT
and STAT) and not setting UEP0 to #x16 (had #x14) to enable
CONTROL+IN+OUT.

Next: handling transactions.  Get the PID from the buffer ID.  This
can only be

    0001  OUT
    1001  IN
    1101  SETUP

Ok, the next step is to handle all the requests.  What needs to be
done is to make a simple interface that maps the requests to Forth
functions.  How to make this readable?  This makes for some
incredibly boring code...  The SETUP buffer contains the following
values:

Entry: accessing structures
Date: Tue Jun 2 09:51:41 CEST 2009

I'm moving back to the previous approach, which is a pattern I've
seen a lot in Forth code:

  - set "current context"
  - operate on it

This isn't so bad, given that there is no other way to have any form
of data locality.  For this we're not going to use the extended
instruction set: just some uniquely named accessors that operate on
the current object in the a register, without changing the pointer.

OK.  Replied to GET_DESCRIPTOR, now there's a SET_CONFIGURATION
coming in.  The problem now is that I'm probably again not properly
acknowledging this request, since there is a STALL coming in, and the
host gives timeouts.  It would have been so much simpler if they'd
just made it into a single flat namespace and a uniform RPC mechanism
instead of all this if-whatever-then-set-this-else-do-that crap.  I
guess it doesn't get much muddier than this.  Interfacing with
hardware that's designed for minimal _hardware_ complexity sucks.
You get all the shit..

So it looks like I don't understand replies yet..  When to set the
DATA toggle for instance.  Enough for today..  Anyways, it looks like
this problem is general enough to be solved with humility and
acceptance[1].  It's one of those problems that seems not to be
there.  Maybe it's so hard because it actually does something really
important: throwing away the right information.

[1] http://zwizwa.be/ramblings/staapl-blog/20090602-110800

Entry: syntax-directed translation
Date: Tue Jun 2 12:10:26 CEST 2009

In [1] chapter 9 there is a section on syntax directed code generator
generation (a.k.a. Graham-Glanville).  It seems what I'm doing is
related to this, only Staapl uses RPN instead of PN, and pattern
matching is ordered (eager).

[1] http://www.elsevier.com/wps/find/bookdescription.cws_home/677874/description#description

Entry: usb next
Date: Tue Jun 2 15:30:26 CEST 2009

Simple: build better abstractions.  The first thing to do is to
abstract the buffers better.  Each buffer should be an object +
methods:

  - claim / release vs. send / receive
  - buffer chunking
  - data toggle
  - interrupt acknowledge (= transaction request queue)

Second, it might be interesting to write some highlevel interface on
top of bit access.  It's important to have the low-level interface
for when speed counts, but in general initializations are not
speed-critical, and they comprise the bulk of the (tediously
explicit) code.
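Something like the following could serve as that high-level layer: a
minimal host-side Scheme sketch, not actual Staapl API.  The UEP0
field names are from the PIC18 datasheet; everything else here
(make-register, the interface shape) is invented for illustration.

    ;; Describe a register's fields once: (name bit-position width).
    ;; The result is a function that packs named bindings into an
    ;; init value, replacing magic constants like #x16.
    (define (make-register . fields)
      (lambda bindings   ; each binding: (name value)
        (for/fold ((word 0)) ((b bindings))
          (let* ((spec  (assq (car b) fields))
                 (pos   (cadr spec))
                 (width (caddr spec))
                 (mask  (- (arithmetic-shift 1 width) 1)))
            (bitwise-ior word
                         (arithmetic-shift
                          (bitwise-and (cadr b) mask) pos))))))

    ;; UEP0 = #x16 then reads as:
    (define uep0
      (make-register '(EPHSHK 4 1) '(EPCONDIS 3 1)
                     '(EPOUTEN 2 1) '(EPINEN 1 1) '(EPSTALL 0 1)))

    (uep0 '(EPHSHK 1) '(EPOUTEN 1) '(EPINEN 1))  ; => 22 = #x16

The point is that the symbolic form documents itself: "handshake on,
OUT enabled, IN enabled" instead of #x16.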
Entry: documentation
Date: Tue Jun 2 16:30:54 CEST 2009

file:///usr/share/plt/doc/scribble/srcdoc.html

Might be better for writing docs..  Much of the code in Staapl is
quite straightforward and readable.  Especially for macros it might
make more sense to just look at the expansion instead of some
description.

Entry: more standalone forth
Date: Tue Jun 2 19:14:44 CEST 2009

NEXT: the dictionary.  I already settled on using a 2-part dictionary
with metadata and code stored separately.  Things to figure out:

  * Where to put it
  * What to put in it
  * How to store head pointers.

Note that recursion is difficult in standard Forth, because the
word's semantics is only known _after_ the definition is compiled.
One could say that immediate words don't really need recursion, so
storing the address at the invocation of ":" makes recursion work.  I
don't think the standard says anything about this..  Actually, I'm
wrong.  RECURSE is used to make sure that words can be redefined
while keeping the previous behaviour reachable for delegation.

Let's just keep it simple and use the abstract "_," to create the
dictionary entries too.  This leads to reasonably portable code.

[1] http://www.taygeta.com/forth_intro/recurse.htm

Entry: VALUE and TO
Date: Tue Jun 2 21:21:17 CEST 2009

Reading issue 1/4 of Forth Dimensions, which explains the TO concept.
This can be combined with VALUE to create words that dereference by
default but can be escaped for assignment.  Doing this in Staapl
requires a bit of modification to the way variables are wrapped.  I'm
not sure if it's actually possible, since the reflection that TO
performs might not be available.  Names are associated with a single
behaviour.

Entry: smarter bootstrapping
Date: Tue Jun 2 21:46:51 CEST 2009

So, now that I have an idea about how to make the primitives work,
maybe it's possible to modify eForth to use the buffered compiler,
then bootstrap using gForth.  Once this works, the same could be done
for a set of primitives written in Scheme.  This could then be
extended into a working ANS Forth that runs in Scheme, which can be
used to bootstrap standard Forths for other architectures on top of
the Staapl Forth.  Summarized:

  - eForth + buffered compilation
  - bootstrap self-hosted Forth for PIC18 using eForth86/gForth
  - bootstrap eForth on top of Scheme for bootstrapping the
    microcontroller Forths without eForth86/gForth

It does look like eForth is manually bootstrapped: there is just an
ASM file which contains manually compiled threaded code.  This means
it can't be metacompiled easily.  I'd like to make this a bit more
convenient.  With the new parser architecture it might be possible to
bootstrap both the target forth and the metacompiler directly from
the same source code using lazy circular programming, manually
breaking cycles in .f code if they occur.  That sounds like an
interesting challenge.

The control words don't seem to be such a problem.  The parsing words
are.  Closing the loop there is the core of the problem.  However,
there is a neat trick in eForth:

    : COMPILE ( -- ) R> DUP @ , CELL+ >R ; COMPILE-ONLY

This makes it possible to avoid parsing words in the compiler, but it
requires the code to be threaded, not natively compiled.  This makes
me think that trying to bootstrap by instantiating a 16-bit binary
image in Scheme might be feasible.  Once it is resolved and relaxed,
it can be directly transferred by mapping the primitives.  The parser
can segment the code such that names can be mapped to tokens.
From this, immediate words need to be identified so they can be used
in the compilation of the code.  This seems to be the essential
circle to break.  Parsing words defined in .f code are then no longer
part of the circle, and are written bottom-up in threaded code.

The "R>" trick might interfere with bootstrapping though..  Unless
the whole memory is lazy such that "@" effectively compiles the next
word in line..  There is probably going to be a problem where
laziness and state (here) will interfere.  In this respect it seems
that a 2-pass algorithm is simpler:

1. Construct an interpreted version of the compiler as a code graph
   based on Scheme functions that string together the primitives.
   This means the compiler cannot inspect its threading mechanism
   (it's not there!).

2. Use this version to compile the source again.

Hmm..  Direct execution might not work though.  Maybe simulating the
threading is better..  I do wonder if it's possible to use the second
pass to go over the tokens one by one and instantiate them.  If the
immediate words are runnable, they can generate the correct _number_
of tokens, but might not yet be able to resolve them.  The lazy
approach might work after all.

Wait...  If the Forth could somehow use single assignment, lazy
bootstrapping would work just fine.  The .f file is then a
specification of a string of bytes.  Maybe all this needs is a
re-interpretation of "!" and "@" ?

Entry: more bootstrapping
Date: Wed Jun 3 09:48:28 CEST 2009

Ok, I have mcf.ss parsing the code.  Now, how to fire it up?  Can it
be done fully circularly?  I.e. do "literal" and "," need to be
unrolled, or can they be taken from the source?  Let's try to solve
the first problem: to simply interpret some code.  I've implemented
the eForth[1] primitives before.  They simulate a DTC binary machine.
Now the two need to be hooked together.

[1] http://www.baymoon.com/~bimu/forth/

Entry: it's a trap!
Date: Wed Jun 3 10:32:21 CEST 2009

Combining the Staapl language tower approach and the self-hosted
Forth with immediate words is confusing stuff.  I keep seeing ways to
bootstrap more directly using the compiler infrastructure, but this
requires unrolling the immediate words.  Now, in essence, that is not
so difficult.  There are only a couple.  What keeps escaping me
however is how to do this automatically.  Given a single (standard)
Forth file, lift the immediate words (and their dependencies!) to the
macro side.  What this requires is a shift in perspective: Forth
macros are lowlevel macros, so they should really correspond to
_scat_ words instead of _macro_ words.

So to keep things in perspective:

  * circular bootstrapping seems interesting, but might not be
    necessary if the forth code can be unrolled just enough to
    compile itself.  the challenge here is to find the right
    primitives (prefix parsers and compiler words) to do this..

Entry: old comments about bootstrapping
Date: Wed Jun 3 10:42:16 CEST 2009

I'm running in circles..  Good thing I write things down; now I just
need to read it back from time to time!

--

;; I've been looking for a long time to find a solution to writing a
;; frontend that's ANS Forth compatible.  I'm still not sure whether it
;; is really useful at this point, but if it is not too difficult, it
;; might be a nice addition that enables the inclusion of Staapl into
;; a more traditional Forth based project.

;; The problem in itself isn't very difficult:
;;
;; * find a Forth written in Forth + a small set of primitives
;; * implement the primitives.
;; * bootstrap the compiler
;;
;; However, I'd like to do it in a way that enables some more
;; flexibility.

;; After pondering this for a while, I think this might be an
;; interesting approach: Write a simulated Forth, use it to generate a
;; memory image, and translate the compiled threaded code to run on
;; top of Scat's Forth / Coma.

;; Doing this in a way that enables gradual offloading to the target
;; is not that simple.  Wanting more control over dictionary format
;; and execution model (I'd like to use STC primitives) makes things
;; quite challenging.

;; http://lars.nocrew.org/dpans/dpans.htm
;; core wordset:

;; ! # #> #S ' ( * */ */MOD + +! +LOOP , - . ." / /MOD 0< 0= 1+ 1- 2!
;; 2* 2/ 2@ 2DROP 2DUP 2OVER 2SWAP : ; < <# = > >BODY >IN >NUMBER >R ?DUP
;; @ ABORT ABORT" ABS ACCEPT ALIGN ALIGNED ALLOT AND BASE BEGIN BL C! C,
;; C@ CELL+ CELLS CHAR CHAR+ CHARS CONSTANT COUNT CR CREATE DECIMAL DEPTH
;; DO DOES> DROP DUP ELSE EMIT ENVIRONMENT? EVALUATE EXECUTE EXIT FILL
;; FIND FM/MOD HERE HOLD I IF IMMEDIATE INVERT J KEY LEAVE LITERAL LOOP
;; LSHIFT M* MAX MIN MOD MOVE NEGATE OR OVER POSTPONE QUIT R> R@ RECURSE
;; REPEAT ROT RSHIFT S" S>D SIGN SM/REM SOURCE SPACE SPACES STATE SWAP
;; THEN TYPE U. U< UM* UM/MOD UNLOOP UNTIL VARIABLE WHILE WORD XOR [ [']
;; [CHAR] ]

;; The problem with implementing this in a way that it can be
;; simulated on the host and moved to the target lies in 3 parts:
;; * INPUT: ACCEPT
;; * DICTIONARY: FIND WORD VARIABLE CONSTANT CREATE POSTPONE
;; * THREADING:

Entry: lifting immediate words to staapl
Date: Wed Jun 3 10:44:49 CEST 2009

I get these words with the simplistic code up to now:

    compile ! >buf hole r> >r here@ ?jump , ; jump

The "compile" hack probably needs to go.  So..  This needs a compiler
that compiles threaded code.  The wrapper macros will compile
(dw addr).  More later.  It's confusing me again..  This is something
for the morning (16:53).

Entry: TTL video
Date: Wed Jun 3 14:43:47 CEST 2009

Time to get the hands dirty.  Too much programming and thinking
lately..  Here's one of the things I'd like to do.  I have 2 or 3 old
TTL monochrome monitors that I'd like to give a new life.  The
pinouts are in [1].  This could use the approach I used before for
TV, but with syncs on separate lines.  I used the SPI output port for
sending video data.  On a 452 this is pin 24 (RC5/SDO).

The D-SUB Female:

      5 4 3 2 1
       9 8 7 6

    1 GND
    6 INTENSITY
    7 VIDEO
    8 HSYNC positive
    9 VSYNC negative

Let's start with connecting VIDEO to the SDO, and the hsync/vsync to
two other ports.  I used RA4 for Composite.  Ok, sending out stuff on
SDO gives some signal.  Let's figure out syncs.  The horizontal
frequency is supposed to be 18.43kHz.  That's 542 ticks of a 10MHz
clock.  The divisor should thus be 137 * 4.  Ok.  Hsync works with a
pulse about 1.5 us wide.  It would be nice to be able to get at some
timing data to see how the vsync works (duration and number of
lines..)

[1] http://pinouts.ru/Video/mono_ttl_pinout.shtml
[2] http://en.wikipedia.org/wiki/Color_Graphics_Adapter
[3] http://en.wikipedia.org/wiki/IBM_Monochrome_Display_Adapter
[4] http://www.seasip.info/VintagePC/mda.html
[5] http://en.wikipedia.org/wiki/Motorola_6845

Entry: simpler debugging
Date: Wed Jun 3 16:34:38 CEST 2009

Some things that need to be figured out:

- profiling.  Why is loading the (compiled) image so slow?  Is it
  really the scat vm?

- find a proper way to recover from chip lockup.  Currently it will
  exit the app after "cold" fails (relic from pk2 external reset).
  -> fixed: won't exit now.
- give up on dictionary as "top" files.  it hurts composition
  (frameworks don't work for debugging: you want top control to
  script things).  maybe do this: when generating a .dict, also
  generate a .live file which simply loads the .dict file.  the user
  can then use this file to add debug commands, and it could be used
  to add a sandbox etc..

- figure out a simple way to make the disassembler display the right
  addresses.  currently it interprets everything as code addresses.

Entry: structured procrastination
Date: Wed Jun 3 23:14:12 CEST 2009

I think I've been doing this[1] for a long time now.  For me it's
always been network system administration, improving debugging tools,
starting new projects from crazy ideas and generally structured
talking out of my ass to see what it brings..  Actually, apart from
the net admin it's maybe all true procrastination..

Lately I've been wanting to study more PL theory, but I still find it
more interesting to re-invent the wheel following subtasks in Staapl.
The Forth bootstrap has been quite a funny example here..  It's an
interesting problem.  Not only for nostalgic reasons (yeah, those
8-bit thingies), but I like building non-trivial things on simple
systems - to think, where mere brute-force programming pattern
duplication would be much more economical with time..  But that is
accepting defeat.

Anyway..  What am I saying..  This is not a blog..  It's about the
code, no?  The real problem I'm avoiding at this moment is partial
evaluation and the functional concatenative language.  I also
promised myself to write the USB driver first, and to write the
documentation.  But instead I'm bootstrapping a standard Forth on top
of a system that is distinctively non-standard for very good reasons.
Why?  Because it's fancy and shows off the flexibility of the system.

The PE is difficult..  Maybe mostly because the problem is not
well-defined.  It's difficult to define "optimal".  People do seem to
have tried (and maybe succeeded?).  Hence I didn't read too much yet
either..

The USB driver is difficult for a different reason: it's extremely
tedious and error-prone to write in a direct style.  I'm trying to do
it a bit more abstractly so I can learn a thing or two for the next
complicated driver I need to write..  I'm thinking about specializing
in driver writing, and writing some tools for that.  It's a difficult
problem worthy of some attention.  It's also particularly
unglamorous, so it might make me some money in the process.

Writing documentation is difficult.  Writing clearly, full stop, is
difficult.  I'm starting to gain more and more respect for good
teachers.  And for good manuals.  The PLT Scheme manual is a good
example.

So, what for tomorrow?  Maybe finish the bootstrapping.. :)

[1] http://www.structuredprocrastination.com/

Entry: multiple interpretations
Date: Thu Jun 4 00:54:32 CEST 2009

The "unrolling" bootstrap compiler points to a problem in the rpn
semantics interface: it is currently impossible to attach more than
one interpretation.  Wait.  Why not simply use two rpn-parse forms?
Ok, that's silly simple..

Entry: TI chips
Date: Thu Jun 4 02:43:39 CEST 2009

On the wikipedia[1] page it says the C6000 floating point processor
is code-compatible with the C62x.  I'm not sure how that works
though..

C6000 Series

* TMS320 C6000 series, or TMS320C6x: VLIW based DSPs.
  o TMS320C62x fixed point / 2000 MIPS / 1.9 Watts
  o TMS320C64x fixed point - code compatible with TMS320C62x
  o TMS320C67x floating point - code compatible with TMS320C62x

However, it does please me that I didn't get the DM6446 (C64x+) based
system for nothing.  Wow..  This is not a simple chip.  Maybe I
should stick with the TI toolchain and libraries.  I think my time is
best spent elsewhere..

[1] http://en.wikipedia.org/wiki/TMS320#C6000_Series

Entry: hardware want
Date: Thu Jun 4 03:38:50 CEST 2009

After the USB interface is done, it's time to move to different
architectures..  I want them all of course, but what would be best?

- LLVM.  Shouldn't be too hard.  Ideal for the 32-bitters.  ARM THUMB
  and MIPS shouldn't be too hard either.  I don't have much use for
  this though..

- dsPIC.  The architecture is not too complicated - very PIC like,
  with same/similar peripherals.  The DSP part however is more
  RISC-like with lots of registers - might warrant a special
  sublanguage.  Best for personal use as I have lots of samples.  Low
  pin count PDIP packages.

Entry: mcf and syntax parameters
Date: Thu Jun 4 11:35:26 CEST 2009

Expanding with multiple semantics is maybe best solved with syntax
parameters.  So..  The lifting: instead of fishing out dependencies
to see _what_ needs to be defined in the (scat) namespace, it might
be simpler to just define everything and provide stubs.  This might
also help for more general source-level Forth simulation and
analysis.

[1] http://docs.plt-scheme.org/reference/stxparam.html

Entry: USB sucks
Date: Thu Jun 4 14:01:29 CEST 2009

The problem with USB is that it is a bit of an all-or-nothing
protocol.  Incremental development is hindered by not being able to
access the host side's primitives directly.  You just have to set it
up and let it go through a sequence.  It has this in common with
physical real-time systems.  Because you can't stop or slow time,
often you just have to set up a test rig and log the behaviour for
later analysis.

Now, simulation does make this easier..  Is it possible to run the
Linux side of the chain in a step-by-step way using qemu or so?
Seems to be difficult to set up.  Probably logging[1] is the better
approach.

[1] entry://20090217-100852

Entry: dsPIC
Date: Thu Jun 4 14:55:53 CEST 2009

The overall PIC24/PIC30 architecture isn't so different from the
PIC18, so most code should work similarly.  However, the assembler
will be quite different due to the presence of addressing modes that
were previously unavailable.  Solving this will also make porting to
different architectures simpler.  The first problem to solve is how
to express them syntactically.  The Staapl PIC18 assembler uses a
flat syntax which is no longer sufficient.  See table 4A, page 5 of
the migration guide[1].

It's probably best to implement addressing modes as argument
transformer functions that live in the Scheme namespace.  However,
this requires them to fill multiple fields.  Maybe the complexity of
the assembler should be raised a bit to be able to access multiple
fields from a single argument.  I.e. start with the PIC18 GOTO
opcode.  How does LLVM express addressing modes?
[1] http://ww1.microchip.com/downloads/en/DeviceDoc/39764a.pdf

Entry: next actions
Date: Thu Jun 4 17:38:03 CEST 2009

    USB: usbmon trace
    MCF: build primitives, try to compile, make memory model
    DOC: port old pic18 forth doc to new doc
    DSP: addressing modes in assembler
    TTL: find MDA vsync timings
    SNT: metal box + drill stand

Entry: MDA
Date: Thu Jun 4 17:40:56 CEST 2009

MDA timings:

    pixel clock:    16.257 MHz  (882)
    line frequency: 18.432 kHz  (370)
    refresh:        49.81 Hz

    visible 720x350

This timing data comes from some obscure .h file I found googling for
"720x350".  This is probably correct as the pixel clock is standard,
and the total w x h seems to correspond to other documents.  Looking
for 16.257 MHz shows that this is a standard rate.  It is referenced
here[1], which gives 883 pixels/line:

    PC MDA (Mono Display Adaptor)
    B&W character-only display 80x25 with 9x14 font, so 720x350 pixels
    80x25 (=2000, not quite 2k) chars of each 2byte (1 char, 1 attrib) = 4k RAM
    50 full-frames/s of 368 lines (18.43kHz/54.3478us), 350 shown
    uses non-square pixels, clocked 16.257MHz, 883pixels/line, 720 shown
    not compatible with TV technology so can deviate in signalling,
    uses pure TTL 2bit digital brightness, and 2 separate 1bit H and V sync
    3 gray levels (BI: 00=black, 10=normal (light gray), 11=bright (white))
    DB9 connector 1+2 GND, 3+4+5 nc, 6 Intensity, 7 Brightness, 8 HSync, 9 VSync
    HSync positive, VSync negative active

I checked with the monitor that a 1.5 us hsync pulse works.  How long
should the vsync pulse be?  I found timings for VGA text mode
(640x350) which gives hsync 3.77 us, vsync 60 us.

Ok.  I got a stable image.  Problem was a typo which meant D3 didn't
get actuated.  I have it working now with 380 lines + 1 line for the
vblank pulse.

Next: interrupt operation to make it jitter-free.  Turn the code into
a state machine, and call it from the ISR.  This requires some robust
way to set the isr vector.  As used before for TV, the hold time of
the hsync pulse could be used to compute state machine dispatch for
the timer interrupt.  Since I'm interested in making a dedicated
(black box) circuit for driving some TTL/VGA/TV monitors, this
behaviour could be appropriate.

[1] http://neil.franklin.ch/Projects/SoftVGA/Design/Video_Signals

Entry: PIC CRT display controller
Date: Thu Jun 4 19:11:56 CEST 2009

This requires the following elements:

- High-priority ISR for jitter-free screen updates.
- Background task for manipulating the frame buffer + communication.

The problem is that the shift register I'd like to use for
sprite/character drawing is also used for normal UART comm, so it
would be necessary to construct an SPI circuit.  This is an
interesting test case for building a distributed system, where the
debugging console is daisy-chained.

Maybe use an 18F2620 for the video driver?  It has a full 4k RAM and
a lot of ROM for characters and graphics.  It can be an I2C/SPI slave
and drive the universal serial output as a synchronous master port at
the same time.  Maybe using a CAN interface will be better, since I'd
like to build such an interface for my car anyway.  With CAN, the
device would have to be an 18F2680.  It has the same EUSART and
should be able to use CAN in parallel.  Probably best to go for I2C
first as it is quite a bit simpler.

Entry: multiple devices
Date: Thu Jun 4 20:04:47 CEST 2009

Before getting into other busses, I need a way to access 2 separate
chips from the same debugging session.  I tried communication
protocols before, but without a proper way to control _both_ sender
and receiver, testing becomes quite difficult.
Some observations:

* Async serial ports simply work.  There is no substitute for
  interfacing to a PC.

* Staapl monitor communication is host-directed rpc / half duplex.
  Slaves are quiet unless addressed.  This means all slave outputs
  could be wired together.

* Bit-banging serial _output_ data is a lot simpler than input.  The
  debugger could send out a protocol extension which sends the slave
  mask before the message.

* The router could perform the reply OR in software as its inner
  loop.  It doesn't need to understand the protocol, just combine
  bits.

Roadmap:

- build the router.
- use it to bootstrap the monitor protocol over I2C.

Entry: Staapler
Date: Thu Jun 4 20:25:01 CEST 2009

The objective has changed.  The Staapler is no longer a PIC
programmer, but a serial <-> network interface to distribute the
monitor protocol to different chips.  The current architecture has an
18F1320 which should be adequate.  No..  It doesn't have xtals..
Let's take a comfortable 18F2620 @ 40 MHz.

Let's bootstrap this incrementally, just like one would do in an
experimental setup.  The setup is the pk2 connected to the staapler.
Is it possible to use the pk2 to program other chips, with the
staapler acting as a router?

I'm sort of back to square one here..  For convenience I got rid of
bootloaders, at least I got rid of "standard" bootloaders.  It's
simpler to always start from scratch, and the monitor code just
doesn't seem to stabilize..  Plus you have full control over the boot
block without a chance to mess things up.  Because routing the
programmer signal is less trivial, I might have to go back to
bootloaders for multi-PIC experiments.  Is that so?  The problem is
that 12/14 bit non-self-programmable PICs are impossible to use
then..

The first task would be to shield the staapler from the program
signal.  This could be done using a passive switch.  To route the
programming would require active switches.  Hmm..  I'm lost already.
Too many ill-specified conflicting requirements..  KISS.  PIC18 only.
No programmer routing, unless for all-equal code.  What about this:
start with multi-PIC projects where the code is the same so the PICs
can be driven in lock-step.  (SIMP ;)

Entry: dictionary files
Date: Fri Jun 5 09:20:10 CEST 2009

Problem: two conflicting behaviours wanted.

- from a terminal, "mzscheme project.dict" should fire up a readline
  Forth terminal.

- from a test framework, the dictionary should be accessible as data.

Maybe it's time to learn something from Taha and friends: once you
call something "code" it should really be opaque.  Writing out a
Scheme file that's supposed to be executed, and then inspecting it
afterwards, is not a good idea..  If it is data (open to different
interpretations) it should be written as such: tagged + unevaluated.

Slogan: Data is code parameterized in its interpreter.  Or the more
obvious one: Code is data bound to its interpreter.

Entry: metacircular forth unrolling
Date: Fri Jun 5 10:44:08 CEST 2009

Ha..  Again, it's not as simple as I thought: Scat needs to be
extended to make the macros based on "branch" and "?branch" work.  I
believe here lies the "almost a new thing" part of unifying the
prefix parser and the inner interpreter.  I've been trying to make
this explicit for a while.  The idea is this:

* Parsing words read the next word from the input stream before the
  normal interpreter has a chance to interpret it in the default way
  (lookup + execute/compile)

* "doLIT", "branch" and "?branch" implement control flow exceptions
  for the inner interpreter.
Note that in both cases a proper quoting mechanism can reduce the
number of words that do this to one.  This is what the (quote _) form
does in the rpn syntax: it loads an atom from the input stream onto
the stack, upon which normal semantics can be used to manipulate it
further.

So..  There's something simple hidden here.  It's all about stacks.
Prefix parsers in Staapl use the input stream as a stack.  Forth's
parsing words don't do this, and neither does the inner interpreter.
The reason Scat seems to not need a stack is because it implicitly
uses a tree instead (Scheme's closure structure).  What is a
procedure call?  It prefixes the current continuation (a code list)
with a code list.  There is something disturbingly non-circular about
this..  A trap I fell into many times before..  There are really 3
stacks, and they correspond to the 3 registers of the CEK machine:

* Code (input threaded code)
* Environment (data stack)
* Kontinuation (return stack)

But there is this interesting relation between C and K.  Anyway..
I'm getting confused and need to re-establish contact with the
concrete.  Practically - how to unify rpn-parse with rpn-lambda,
writing parsers in terms of scat code that accesses the code stack?
It is important to see that (prefix-parsers ...) is also a stack
language.  The problem however is that it is really arbitrary what
function these stacks have..  It's only the number of stacks that is
important, and whether they are used as stacks or as streams.

This story is really about parsing and rewriting..  Is there some
theory about this?  Is a 2-stack machine fundamentally different from
a 1-stack machine or a 3-stack machine?  One of Chuck Moore's slogans
is: you need stacks, and you need at least two.  So why is one
different from two?  (One answer from automata theory: a single stack
gives you a push-down automaton, while two stacks are enough to
simulate a Turing machine tape, so the jump from one to two is a real
jump in power; beyond two it's only convenience.)  I'm missing a lot
of theoretical knowledge to know where not to look..  I'd say in
general the introduction of an extra stack would help you to save
whatever you _were_ doing with N-1 stacks on the Nth stack, and solve
a subproblem that needs N-1 stacks.

It might be an interesting problem to define all functionality in
scat in terms of an N-stack machine.  It is already very much like
that, just not explicit:

- prefix parsers (rewrite rules) use the input stream as a stack.
- prefix parsers use the dictionary (a stack of stacks)
- the CFG compiler uses a set of stacks of stacks.

the basic structure of Forth is the interpreter:

    D D E x x x D D D D

where "D" are tokens to be interpreted in the default way and "E"
exception tags that change the meaning of subsequent terms.  By
interleaving the "D" with their semantics you get:

    (C D) (C D) (E x x x) (C D) (C D)

where "C" is the default compilation action.  But I'm moving too far
away into the abyss..

Entry: don't lift everything
Date: Fri Jun 5 13:10:31 CEST 2009

Ok, I see what the problem is: I'm trying to implement _all_ of a .f
file describing the compiler as a compile-time entity.  That is the
mistake.  The compile-time functionality usually does _not_ contain
any conditional code, so straight-line execution is enough.

Can this be made more precise?  The phase separation bootstrapper for
metacircular Forth compilers can be made a lot simpler if it makes
the assumption that immediate words, implemented in terms of other
code, _do not_ use any immediate words themselves, directly or
indirectly through their dependencies.  In other words: the .f file
should be expressible in two phases.  The reason for this is that
control words are incompatible with the purely compositional
semantics of the scat language.
It wouldn't be impossible to unroll more phases, but this would
require phase 1 (scat) to support assumptions made about code
threading, which isn't the case.  The primitives "doLIT", "branch"
and "?branch" are reflective: they know about threaded code and
change the nature of the interpreter.

Entry: is rewriting lowlevel or not?
Date: Fri Jun 5 14:24:05 CEST 2009

Funny how the Scheme pattern-based rewriting mechanism is considered
highlevel, but if you try to use it to do anything complicated you
miss composition.  Essentially, you're using a low-level machine
without procedure calls.  Then what are called lowlevel macros in
Scheme (those that get their hands dirty with manipulating syntax as
data) do have composition, so they comprise a highlevel machine.
They are something like different local projections of the space of
composition of programming methods.

The thing is: the human brain (at least mine) has trouble juggling
multiple orthogonal abstractions, so languages usually tend to limit
the number.  However, expressive power is related to the number of
abstractions you can multiply.  This is a basic idea in math, and
maybe the biggest reason why I am not a mathematician: I need to keep
my feet in the mud: I'm not willing to lift my feet off the ground to
be able to fill my head with abstractions.  I want to _see_ what they
do.

Entry: CTM
Date: Sat Jun 6 08:35:42 CEST 2009

Section 3.4.3 p.140 mentions the use of Definite Clause Grammars
(DCG)[1] to hide explicit threading of accumulators.  Immediately
after, it is mentioned that they no longer use this and prefer
explicit state instead.

Section 3.4.4 p.141 mentions difference lists[2].  These can be used
to prevent consing when manipulating the head of some list.  It is
also mentioned that this can be used to append in constant time when
the tail of the difference list is an unbound variable.  Apparently
used a lot in Prolog programming.

[1] http://en.wikipedia.org/wiki/Definite_clause_grammar
[2] http://en.wikipedia.org/wiki/Difference_list

Entry: Multiple PICs
Date: Sat Jun 6 10:03:18 CEST 2009

Plan: I2C support for the monitor.  Probably I2C is best for
debugging since it's a bus, while SPI is a pipe.  Goals:

* Use the eusart shift register on the PIC18 for video generation.
  This requires communication over a different channel.

* Multiple target debugging for testing other network code.

The simplest way to get multiple targets connected is probably
daisy-chaining them.  This would require protocol extensions to
include addressing.  The simplest way is to use decrement addressing,
and send replies to #xFF.  Next: implement this.

Ok, that wasn't too hard.  Now, can I make a cable / bus that
attaches to the standard serial port connector?

Entry: bitbanged serial
Date: Sat Jun 6 11:24:36 CEST 2009

So it looks like daisy-chaining should be feasible.  However, I'm
actually more interested right now in getting bitbanged serial to
work.  Both for debugging and for MIDI apps.  It doesn't seem too
hard to have a single channel going, but how do you do multiple?
Looks like busy-looping + oversampling is the best approach.  This
way each channel can have its own phase counter.

Now, since I'm just playing anyway, maybe the E2 protocol should be
revived?  A low-bandwidth single channel 4-phase protocol:

    1234
    WR10

    3 provides power
    3->4 is the sync edge
    1 write time
    2 read time

I did have some trouble getting this to work though..  What might be
better indeed is a simple modulated serial line.
This would allow standard receiver hardware to work with just a
modification to receive and transmit to (de)modulate the signal.  The
idle signal should be encoded such that frame errors can be detected
when you plug into an active line.  Luckily idle=1, so this should be
no problem.  Simply using the odd / even bits should work.

Entry: CRT terminal board
Date: Sat Jun 6 12:45:30 CEST 2009

It would have to be a 2620 with oscillator.  The 1220 doesn't have an
I2C/SPI device.  So let's build it.

Entry: debugger interface
Date: Sat Jun 6 12:51:14 CEST 2009

Problem: I've got my boards standardized on the 1x6 serial port
header for the FTDI.  I'd like to keep this working because it's damn
convenient.  Now the problem is, 6 pin headers are not really
standard..  You can find 2x5 everywhere.  And I have plenty of
flatcable to go with it..  How to combine?  A 2x5 header could
exhibit these 10 signals:

    POW  VDD
    POW  GND
    SER  TX
    SER  RX
    ICD  MCLR
    ICD  PGD
    ICD  PGC
    ICD  PGM
    I2C  SD
    I2C  SC

SPI is best carried over a separate channel since it has 3+1 lines.
It's more for fast comm anyway..  This bus is for debug.  Can this be
made compatible with the ICD and TTLSERIAL pinouts by plugging them
in some fashion?  It can be made to work for either the SER or the
ICD, but not both.  Ser is probably more important.  I can use an
adaptor for ICD.

    GND   PGD
    MCLR  PGC
    VDD   PGM
    RX    SD
    TX    SC

Can the brown line (MCLR) be driven from the TTLSERIAL?  This would
enable target reset.  It's CTS (see [1]) which is an input, so this
won't work.  OTOH, this means that it won't be asserted, so it's ok
to connect it to a reset pullup.  Doing it like this would enable the
use of a 2x3 header to convert to the ICD connector:

    x x   ICD
    x x
    x x
    . .
    . .

    x .   SERIAL
    . .
    x .
    x .
    x .
    o

This looks quite acceptable.

[1] http://www.ftdichip.com/Images/ttl232rsch1.jpg

Entry: Neil Franklin's OS ideas
Date: Sun Jun 7 19:05:40 CEST 2009

This[1] contains a description of a microcontroller OS.  If it
weren't for the comments about not wanting to use forth postfix
notation, I'd say he's been reading my blog ;)  What Neil is
describing comes quite close to my objectives.  Though Staapl's
stress is on getting the Scheme side right (first).

[1] http://neil.franklin.ch/Projects/Sketches/Microcontroller_Oper_System

Entry: profiling
Date: Sun Jun 7 19:15:33 CEST 2009

Maybe have a look at the mzscheme profiler.  It would be nice to
speed up compilation a bit.

    (require errortrace)  ;; these parameters live in errortrace

    (define (start)
      (instrumenting-enabled #t)
      (profiling-enabled #t)
      (profiling-record-enabled #t))
    (start)

    (define (stop)
      (output-profile-results #t #t))

    (start)
    (require (file ...))  ;; make sure there are no compiled files!
    (stop)

This gives more info:

    (syntax->datum
     (syntax-case (get-profile-results) () (stuff #'stuff)))

Probably best to get the source loc info from that..  Now, how to up
the sample rate?

Entry: I2C
Date: Sun Jun 7 19:51:07 CEST 2009

Is there a problem with running the controller in slave mode all the
time?  I.e. the host doesn't know _when_ a reply will arrive, only
that one _will_ arrive.  Maybe it's best to switch ownership.  Ok,
I2C supports bus arbitration[1], but this is probably not necessary.

[1] http://en.wikipedia.org/wiki/I2C

Entry: more bootstrapping
Date: Mon Jun 8 12:58:39 CEST 2009

This is extremely difficult to pin down!  But it's a good exercise to
understand the dependencies better and make Staapl simpler, preparing
for multiple targets.  Let's see how far we get with simple examples.
Ha, there was something missing in the garbage collector.  Let's
abstract GC first, it might be useful later.  Ok..
This took some time: made it reusable..  Next: build it to figure out
the dictionary prototypes necessary..  This idea needs a break
though.

Entry: Emulators & Data Flow
Date: Mon Jun 8 16:01:52 CEST 2009

What about emulators?  This idea disappeared when I was rewriting the
parser and module structure.  What this needs is a way to specify a
"mother machine" that can emulate anything, and a translation from
assembler to this machine language.  This translation could be
interpreted or compiled.  To start, primitives in Scheme should
suffice.

The hardest part is probably the ALU.  Memory emulation is really
just arrays, and memory-mapped I/O can be implemented using channels.
The real challenge is in combining the PE mechanism with a simulator.
However, this would kill the infinite precision types used.  (Which
means the semantics "projection" should be better defined.)

Specifying the ALU and other hardware should be done in a declarative
single-assignment language like Oz[1], equipped with a parallel +
blocking statement semantics.  This is closer to reality and closer
to concrete HDLs (which are closer to reality)..  Next: build an
interpreter in Scheme that can execute this parallel code.

[1] http://en.wikipedia.org/wiki/Oz_(programming_language)

Entry: DFL
Date: Mon Jun 8 17:53:54 CEST 2009

Started staapl/machine/dfl.ss -- the essential idea is to use
Scheme's scoping mechanism to build a DFG, and then simply execute
that graph.  Compilation can be performed by recording a successful
trace.  Actually, the code can be split up even more:

1. recursively create and connect nodes
2. serialize (try to resolve the dependencies)
3. abstract

The first two steps can be done statically.  The structure this
produces (a bunch of nodes and a sequential program) can be
abstracted into a state update function that can then be optimized to
use registers more effectively.  But: this solves the real problems:

- PRIMITIVES are implemented as primitive scheme n->1 functions.

- COMPOSITION is graph construction (where nodes are lexical
  variables), reducible to a scheme procedure.

Hard to express, but what I find remarkable is that for a DFL,
compilation seems to be natural.  Maybe because composition naturally
decomposes into a BIND and a RUN phase, because of its graph nature.
Graphs expressed linearly (as a list of ops) always require 2 passes
to close the references.

Note that currently, macro-expanding everything isn't necessary: it
is possible to re-use subnets by abstracting them.  The registration
should thus be captured by the "abstraction" construct.  In other
words: how to turn a composite back into a primitive.

Entry: more DFL
Date: Thu Jun 11 10:48:02 CEST 2009

Let's factor it out a bit: computations don't need to be executed
during the resolution phase.

1. BIND (construct the network)
2. RESOLVE (find a workable serialization)
3. EXECUTE (run the serialized program as a function)

More specifically: primitives only need to be defined up to signature
(number of in/out).  Ok.  I've simplified it a bit and come to the
conclusion that the composition mechanism should yield Scheme
functions (thus serialize all computation).  Also, the dependency
resolution is now separated from computation so it can in principle
be done at expansion time.  This poses an interesting problem
however: the syntax uses Scheme's lexical variable binding, but at
expansion time this isn't available yet.  So syntax should be lifted
to a next stage.  I ran into this pattern before.  There is probably
something interesting hidden here.
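To make the BIND / RESOLVE / EXECUTE split concrete, here is a
minimal sketch in plain Scheme.  None of these names are the actual
dfl.ss API: nodes are just boxes, a primitive op records the nodes it
reads and writes, RESOLVE is a naive topological sort over those
declarations, and EXECUTE runs the serialized list.

    (require scheme/list)  ; for partition

    ;; A node is a mutable cell.
    (define (node) (box 'undefined))

    ;; BIND: an op is (function input-nodes output-nodes); a network
    ;; is a list of ops, in arbitrary order.
    (define (op fn ins outs) (list fn ins outs))

    ;; RESOLVE: repeatedly schedule all ops whose inputs are already
    ;; defined (network inputs or outputs of scheduled ops).
    (define (resolve ops inputs)
      (let loop ((pending ops) (defined inputs) (ordered '()))
        (if (null? pending)
            (reverse ordered)
            (let-values
                (((ready rest)
                  (partition
                   (lambda (o)
                     (andmap (lambda (i) (memq i defined)) (cadr o)))
                   pending)))
              (when (null? ready)
                (error 'resolve "cycle or missing input"))
              (loop rest
                    (append (apply append (map caddr ready)) defined)
                    (append (reverse ready) ordered))))))

    ;; EXECUTE: run the serialized program.
    (define (execute ordered)
      (for-each
       (lambda (o)
         (let ((fn (car o)) (ins (cadr o)) (outs (caddr o)))
           (let ((vals (call-with-values
                        (lambda () (apply fn (map unbox ins)))
                        list)))
             (for-each set-box! outs vals))))
       ordered))

    ;; Example: c = a+b, d = c*c, declared in the "wrong" order.
    (define a (node)) (define b (node))
    (define c (node)) (define d (node))
    (define net (list (op * (list c c) (list d))
                      (op + (list a b) (list c))))
    (define prog (resolve net (list a b)))
    (set-box! a 1) (set-box! b 2)
    (execute prog)
    (unbox d)  ; => 9

Note how the two passes over the linear op list show up exactly as
claimed above: one to close the references (resolve), one to run.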
Entry: DAN 60
Date: Thu Jun 11 23:10:18 CEST 2009

http://www.cs.indiana.edu/dfried_celebration.html

Last year I really enjoyed the talk by Anurag Mendhekar[1],
"Aspect-Oriented Programming in the Real World".  I'm watching it
again.  Some points to take home:

* You can't write generalized weavers.  That's a bit of the point of
  aspect-oriented programming: your design (the aspects) determines
  the implementation of the weaver.

* Practical advice: your users will think language = syntax.  Use XML
  to trick them into thinking they already know what you're telling
  them.  Next to a standard syntax, also try to give your language a
  dumbed down semantics.  (This is the rule of least power[2]).

* Be _really_ careful about the abstractions you build.  Know your
  stuff (language design, macros, source tx, compiler design).

* Specific techniques: abstract interpretation, monadic semantics and
  Hindley-Milner type systems (constraint solver).

I don't find any introductory work on Modular Monadic Semantics.
Googling for it does yield some papers though..

About this syntax vs. semantics thing.  Indeed, I've run into this
too.  Semantics seems to be something that is "hidden" when people
not experienced with how computers (interpreters) work think about
programming.  The direct contact is with the language syntax.  They
will absorb the semantics through example.  It seems that this
implicit notion of language semantics is quite central to the way
natural languages are used: semantics is created by your _own_ brain,
by observing how _another brain_ interprets the syntax.  Starting
from an explicit model of semantics is really confusing, as it puts
abstraction before experience[3].

[1] http://video.google.com/videoplay?docid=1875973989673234551&hl=en
[2] http://www.w3.org/2001/tag/doc/leastPower-2006-02-23.html
[3] entry://../quack/20090613-140830

Entry: DFL + generating C code
Date: Wed Jun 24 19:40:05 CEST 2009

Maybe it's time to start using this dfl language together with some
mechanism to generate processors that can be embedded in Pure Data.

Entry: sheep synth in a cherry ps2 keyboard
Date: Wed Jun 24 22:49:03 CEST 2009

I didn't find a proper box, so I thought I'd give it a try to put it
inside a Cherry keyboard.  But what to do with the keyboard itself?
Should I leave the controller in and try to use ps2 commands, or
should I build a new scanner?

    4 pins: CLOCK, DATA, GND, +5V

The clock frequency is 10-16.7 kHz.  11 bit protocol:
start - 8 data - parity - stop.  Data changes on the falling edge and
can be sampled on the rising edge.

[1] http://www.computer-engineering.org/ps2protocol/

Entry: TTL mono
Date: Thu Jun 25 12:23:03 CEST 2009

I have a bunch of 13.5 MHz xtals that I'm not going to use for
anything.  Maybe they can serve for the TTL project?  54MHz should
work for the 48MHz parts.

Entry: IDC debug pinouts
Date: Thu Jun 25 13:12:14 CEST 2009

So, what's the standard way of numbering a 2x5 ribbon cable?  Judging
from the blades, pin 1 is either in the lower left or upper right.
I'm taking the layout I find in standard IDE cables, which puts pin 1
in the lower left with the gap on the bottom.

    ---------------
    |  2 4 6 8 10 |
    |  1 3 5 7 9  |
    ------   ------

The connector that's mounted on top of this maps the pins to this
pattern to connect to the ribbon cable, with pin 1 marked Red and the
other pins Grey.

    2   4   6   8   10
    1 | 3 | 5 | 7 | 9 |
    | | | | | | | | | |
    R G G G G G G G G G

Let's fix it like this: pin 1 of the IDC is mapped to pin 1 (GND) of
the TTL serial.  The 1x6 connector seems to fit in the 2x5 ridged
header plug.
    1  GND           2  PGM
    3  MCLR          4  PGC (green)
    5  VDD           6  PGD (blue)
    7  RX            8  SD
    9  TX            10 SC

Maybe one more modification.  Having the programming lines accessible
through a 2x3 is maybe not as important as having the power + I2C
lines available?

[1] entry://20090606-125114
[2] http://en.wikipedia.org/wiki/Insulation-displacement_connector

Entry: Component Order Futurlec
Date: Thu Jun 25 16:52:57 CEST 2009

    IDC 2x5 connectors + sockets.
    90deg angle 1xn male headers
    1xn female headers (stamp board)

http://www.futurlec.com/ConnIDC.shtml

Entry: logic analyser
Date: Fri Jun 26 13:50:18 CEST 2009

It shouldn't be too difficult to perform the sampling, but the
problem is getting the data to the host.  Currently I have about
200kbit for the serial line..  My previous conclusion was that USB is
the only viable option as it can transfer 12Mbit.  Maybe I should get
that going again..

Entry: why?
Date: Sat Jun 27 10:48:58 CEST 2009

I got hit by a sudden jolt of loneliness in doing all this.  But hey,
it's what I wanted: to have time to try things out for myself.  I ran
into this[1] last night and started to read up on plan9[2],
inferno[3] and oberon[4].  Then I ran into a blog post about
low-level programming[5], which made me see why I really enjoy that
kind of work: simple, possible to get it right, room for cleverness
hidden behind clean interfaces.  The fact that hardware needs some
extra massaging I guess isn't so bad compared to having to deal with
bloated towers of arbitrary brokenness :)

[1] http://www.loper-os.org/?p=8
[2] http://en.wikipedia.org/wiki/Plan9
[3] http://en.wikipedia.org/wiki/Inferno_(operating_system)
[4] http://en.wikipedia.org/wiki/Oberon_operating_system
[5] http://www.yosefk.com/blog/low-level-is-easy.html

Entry: PIC overclocking
Date: Sat Jun 27 14:00:34 CEST 2009

Picstamp is configured with XTPLL, but the 2550 for TTL mono
generation has a 13.5 MHz XTAL attached, so it needs to be HS.
Doesn't work with the 2550.  Could be I'm doing something wrong with
my config, but I guess it's because the PLL is fixed frequency.
Tried with 4x HSPLL on a 2620 and this seems to work, so I'm no
longer trying the 2550.

Entry: further bus bootstrapping
Date: Sat Jun 27 14:46:38 CEST 2009

The connector seems to work.  In order to get the ttlmono board to
work with I2C I need to bootstrap that first using daisy-chained
serial.  Serial daisy-chaining works too.  I've added a debug
connector to the 452-40 board.  A little tweaking on the host side is
still necessary to keep everything lock-step + add some error
recovery.

Now, how to manage different images on the host?  As long as they
have the same macros this isn't such a problem.  Images could then
carry symbolic data so they can be mapped to bus ids.

Entry: fix PLaneT package
Date: Mon Jun 29 23:59:16 CEST 2009

I forgot it's broken.  Just tried an install under Windows, which
works fine except that it doesn't compile .hex files.  Ok..  I get a
strange error building the docs.  Same one that happened when moving
32 -> 64 bit.  Maybe some files need a recompile?

    tom@zni:~/staapl/doc/scribblings$ make
    scribble --pdf ../../staapl/scribblings/staapl.scrbl
    match: no matching clause for #
    make: *** [staapl.pdf] Error 1

I think this is different versions of structs..  Something wrong with
the paths.  Ok, there's only one proper way I can see to fix this:
catch the "match" error and translate it to an error which prints the
current context somehow.
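Something along these lines could do it.  A minimal sketch, assuming
exn:misc:match? (the predicate scheme/match uses for match failures)
is what gets raised here; with-match-context is a made-up name:

    (require scheme/match)

    ;; Wrap a thunk so that a match failure is re-raised with some
    ;; context attached.  'context' is whatever identifies the
    ;; document or struct being processed.
    (define (with-match-context context thunk)
      (with-handlers
          ((exn:misc:match?
            (lambda (e)
              (error 'match "~a\n  while processing: ~s"
                     (exn-message e) context))))
        (thunk)))

    ;; usage:
    ;; (with-match-context doc-path (lambda () (render doc)))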
I've isolated the problem to this:

    @section{PIC18 Forth}

    @forth-ex{ planet zwizwa/staapl/pic18/route }
    @ex[()(code)]

    @forth-ex{ forth
               : one 1 ;
               : two 2 ;
               : three 3 ;
               : four 4 ;
               : interpret-byte route one . two . three . four ; }
    @ex[()(code)]

It seems to confirm the idea: the code required by the forth
statement re-instantiates the compiler struct..  Let's test that idea
again by adding a print statement near the struct def.  Running just
the demo module there is indeed a double instantiation.

    box> module demo.ss (/home/tom/staapl/staapl/pic18/demo.ss)
    define-struct compiler
    box> (forth> "planet zwizwa/staapl/pic18/route")
    define-struct compiler

Entry: more errors and warnings
Date: Tue Jun 30 11:53:43 CEST 2009

It's high time this gets automated..

    setup-plt: WARNING: duplicate tag: (mod-path "(planet zwizwa/staapl/macro)")
    setup-plt:  in: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/10/scribblings/staapl.scrbl
    setup-plt:  and: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/9/scribblings/staapl.scrbl
    setup-plt: WARNING: duplicate tag: (mod-path "(planet zwizwa/staapl/pic18/demo)")
    setup-plt:  in: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/10/scribblings/staapl.scrbl
    setup-plt:  and: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/9/scribblings/staapl.scrbl
    setup-plt: WARNING: duplicate tag: (index-entry (mod-path "(planet zwizwa/staapl/macro)"))
    setup-plt:  in: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/10/scribblings/staapl.scrbl
    setup-plt:  and: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/9/scribblings/staapl.scrbl
    setup-plt: WARNING: duplicate tag: (index-entry (mod-path "(planet zwizwa/staapl/pic18/demo)"))
    setup-plt:  in: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/10/scribblings/staapl.scrbl
    setup-plt:  and: /home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/9/scribblings/staapl.scrbl
    setup-plt: rendering: /zwizwa/staapl.plt/1/10/scribblings/staapl.scrbl
    setup-plt: error: during making for /zwizwa/staapl.plt/1/10/comp
    setup-plt:  define-unit: undefined export macro/address in: (define-unit machine-test@ (import) (export machine^) (compositions (macro) macro: (code-size 1)))
    setup-plt: error: during making for /zwizwa/staapl.plt/1/10/pic18
    setup-plt:  default-load-handler: cannot open input file: "/home/tom/.plt-scheme/planet/300/4.2.0.5/cache/zwizwa/staapl.plt/1/10/pic18/dtc.ss" (No such file or directory; errno=2)

Ok..  added an 'allss' target which compiles all .ss files in the
staapl/ tree.  This should make it easier to create a garbage-free
dir.

Entry: blink-a-led and non-interactive code
Date: Tue Jun 30 12:08:55 CEST 2009

The chip macros depend on "baud".  I've removed these but it will
probably break the rest..  I just tried whether this worked:

    0 org-begin
    : def1 1 2 3 ;
    : def2 123 ;
    org-end

Apparently it does.  I'm surprised.  This is an indicator that I need
to simplify the CFG generator, since I don't think this should
work...  Ok, what I think this does is create a single block that
cannot be reallocated.  This is the correct behaviour.  Omitting an
org statement leaves the flash allocation to the compiler (later;
currently it's just concatenated at 0x44 for PIC18 = boot block +
jump over init code).

Fixed the dependency on "baud" and "fosc".
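The allocation rule above is simple enough to state as code.  A toy
Scheme sketch of the idea, with invented names (this is not the
Staapl allocator): org-pinned blocks keep their address, everything
else is concatenated after the boot block.

    ;; blocks: list of (address-or-#f . size), in compilation order.
    ;; #f addresses are filled in by concatenating after the boot
    ;; block, e.g. #x44 on PIC18.
    (define (allocate blocks (boot-end #x44))
      (let loop ((bs blocks) (here boot-end) (out '()))
        (cond
          ((null? bs) (reverse out))
          ((caar bs)   ; pinned by org-begin: keep its address
           (loop (cdr bs) here (cons (car bs) out)))
          (else        ; relocatable: concatenate
           (loop (cdr bs) (+ here (cdar bs))
                 (cons (cons here (cdar bs)) out))))))

    ;; (allocate '((#f . 10) (#x2000 . 4) (#f . 6)))
    ;; => ((#x44 . 10) (#x2000 . 4) (#x4e . 6))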
Entry: improving build times
Date: Tue Jun 30 14:50:35 CEST 2009

On zni, building all objects takes time:

    make clean all-modules

    real    0m46.524s
    user    0m43.923s
    sys     0m1.708s

I suspect the reason for this is too many "for-syntax" requirements.
Hmm..  maybe not..  Note that I did remove "poke" (say that takes one
second), but the result is now:

    real    0m42.222s
    user    0m40.039s
    sys     0m1.544s

A little, but not so much..

Entry: fixing the disassembler
Date: Tue Jun 30 17:11:40 CEST 2009

The dasm needs info about the type of the operations to perform
proper reverse lookups.  Where can these be inserted?  Damn, it's too
hot to think..

Entry: Against the Toplevel
Date: Wed Jul 1 10:39:18 CEST 2009

I find myself quite in the middle!  I agree wholeheartedly that
programs should be cast in stone, with a clear structure where all
dependencies (all links) are explicit.  But.  For quick-and-dirty
stuff in a world that's not perfect (electronics), a little extra
power does wonders.

[1] http://calculist.blogspot.com/2009/01/fexprs-in-scheme.html

Entry: parsers
Date: Sat Jul 4 23:41:14 CEST 2009

One of the things that needs to change in the next iteration is the
unification of the prefix parser with the code CFG generator.  Both
do essentially the same thing: transforming a linear structure into a
list of things (parsing: definitions, CFG: basic blocks).

Entry: complex hardware
Date: Mon Jul 6 16:35:27 CEST 2009

Why are computers (hardware/software) so complex?

- standards: adhering to standard interfaces makes it easier to use
  component based design, which enables the economy of scale.
  however, there are _lots_ of standards, and some of them go back a
  long time.

- optimization: a good example is the memory hierarchy.  fast memory
  is simply too expensive to have a lot of, and cheap memory is
  complex to read and write.  if you can eliminate those two, you end
  up with really simple things.

Because the big enemy in the long term is complexity, is there a way
to eliminate these two evils?  Especially on the hardware side,
reducing complexity might yield lower production cost for smaller
components.

Entry: small projects
Date: Tue Jul 7 22:25:44 CEST 2009

Get a small Scheme running on the wrt boxes, something with a simple
FFI.  Then metaprogram it from PLT.  Using tinyscheme.  I had to
simply substitute mipsel-linux-gcc from the OpenWRT buildroot for
gcc.  Let's see if the same works for scheme shell.  The problem is
that there are multiple bootstrap stages.

I wonder if qemu can be used to run openwrt binaries.  Currently it
chokes on finding the libraries..  This made me discover scratchbox2,
a QEMU based simulator build system, built around autotools.

Entry: multi-pic
Date: Sat Jul 11 15:11:22 CEST 2009

It looks like the simplest approach is going to be to use the _same_
language for all chips involved, which means that the set of macros
is shared.  However, each chip should have its own _dictionary_.  It
is ok for one or more chips in the chain to use a _different_
language, as long as it is turned off.  Now..  Maybe it's better to
make sure that this isn't necessary: to be able to use one language
per host, and simply share the macro namespaces if necessary.

Entry: C parsing needs the preprocessor
Date: Tue Jul 14 14:46:20 CEST 2009

I'm thinking about solving a problem that needs solving: PLT Scheme
needs to be able to parse C code and CPP code.  This involves:

1. understanding Dave Herman's c.plt package
2.
implementing CPP.

In general I would like to keep on refining Staapl to bring it closer to integration with standard tools (mostly GCC and binutils) and figure out ways to bring gradual typing to a dynamic language. On the other hand I would like to write a system that is not disruptive in a GCC toolchain. Write something that is immediately useful without having to move to Forth and Scheme. Keep the metaprogramming in the C domain, but provide a decent library interface and make sure popular scripting languages can benefit from this.

Entry: Bootstrapping through simulators
Date: Wed Jul 15 17:35:03 CEST 2009

Check out this[1] and this[2]. It might be an interesting way to get going with the Spartan-3A, figure out a good way to make porting Staapl to other architectures simpler, and get Staapl some tailwind. Next steps:

- figure out how to build System09 using the xilinx tools. it looks like there are makefiles provided, so now i just need to get the general idea of how an fpga design is built up from source files.

- get a 6809 simulator[5] to hook into Staapl for testing a code generator. the Forth machine model could be taken from MaisForth[4].

Looks like[6] Hans is looking into porting to the Spartan-3A board! Actually, this is old news. It is ported (as can be read on [1]). A big fat bone to chew on!

[1] http://code.google.com/p/rekonstrukt/
[2] http://members.optushome.com.au/jekent/system09/index.html
[3] http://en.wikipedia.org/wiki/6809
[4] http://home.hccnet.nl/anij/xedni.html
[5] http://koti.mbnet.fi/~atjs/mc6809/
[6] http://netzhansa.blogspot.com/2009/03/rekonstrukt-progress-midi-drum-machine.html

Entry: HCC - AVR Forth (dutch)
Date: Wed Jul 15 17:58:41 CEST 2009

http://www.forth.hccnet.nl/forthvanafdegrond12.html

Entry: Assembler with addressing modes
Date: Wed Jul 15 19:11:34 CEST 2009

PIC asm is quite simple due to its lack of addressing modes. I need to find a way to support addressing modes for supporting the 6809 and dsPIC architectures. First step: get inspiration for syntax: find an s-expression based representation and write it as code modifiers.

Let's think about this a bit. What is an addressing mode? It is a register list semantics modifier. It doesn't necessarily modify a single register, but can also modify groups (i.e. relative addressing). I don't have enough data points, so this needs to be implemented using a relaxation approach: just do it and see where it ends up.. There are two ways to look at it:

- how to represent it textually
- how the modes are encoded

Typically, the encoding is a bit vector that expresses the type of the arguments, i.e. for dsPIC. The reason I find this so difficult is that I currently have assemblers modeled as _functions_, while this doesn't work for addressing modes, unless you see an addressing mode as an opcode modifier instead of an argument modifier. In some sense, I need to undo some of the cleverness that's used in the design of the assembler language. Now, in general there seems to be not so much structure, so it looks like a simple expansion step on top of the current implementation is what is necessary.

Entry: dsPIC instruction encoding
Date: Thu Jul 16 08:25:03 CEST 2009

HA! I didn't see this one before: in the dsPIC programmer's reference, table 6-2 on page 6-10 contains the 4bit x 4bit instruction encoding matrix.
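Coming back to the addressing mode question: a minimal sketch of the `opcode modifier' view (all names and encodings below are made up, not dsPIC or 6809). Each mode selects a different encoder for the same mnemonic, so the assembler remains a collection of functions and modes become a simple expansion step on top:

  ;; hypothetical 16-bit encodings: high nibble = opcode + mode tag
  (define ld-modes
    `((immediate . ,(lambda (n) (bitwise-ior #x1000 n)))    ; ld #n
      (indirect  . ,(lambda (r) (bitwise-ior #x2000 r)))))  ; ld (r)

  (define (asm-ld mode operand)
    (cond ((assq mode ld-modes) => (lambda (m) ((cdr m) operand)))
          (else (error 'asm-ld "unsupported mode" mode))))

  ;; (asm-ld 'immediate 5) => #x1005
  ;; (asm-ld 'indirect 3)  => #x2003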
Entry: 'instruction-set macro
Date: Thu Jul 16 08:37:10 CEST 2009

Updating the 'instruction-set macro to include expansions so this:

  ;; byte-oriented file register operations
  (addwf  (f d a) "0010 01da ffff ffff")
  (addwfc (f d a) "0010 00da ffff ffff")
  (andwf  (f d a) "0001 01da ffff ffff")

can be written as:

  (fda (o f d a) "oooo ooda ffff ffff")
  (addwf (f d a) (fda "0010 01" f d a))
  (andwf (f d a) (fda "0001 01" f d a))

or in curried form:

  (fda (o f d a) "oooo ooda ffff ffff")
  (addwf (f d a) (fda "0010 01"))
  (andwf (f d a) (fda "0001 01"))

Entry: instruction decoder (asm - dasm - sim)
Date: Thu Jul 16 08:46:30 CEST 2009

This assembler business is a serious pain in the ass.. What I really want is:

- assembler (s-expr + forth RPN syntax)
- disassembler + standard dasm pretty-printer
- simulator

All generated from the same piece of code. In some sense, the manufacturer's asm syntax is disposable. As long as it can be pretty-printed there is really no need for it. What counts is the binary machine code syntax. The main reasons to include the simulator in the loop are:

- development of sim+asm/comp itself is made testable
- partial evaluation will no longer need ad-hoc semantics

Bottom line: currently Staapl is completely defined in terms of machine semantics and ad-hoc infinite precision eager evaluation at compile time. It needs some semantics to at least provide a safety net for this cavalier way of dealing with compile time computations. Basically, there should be only one '+' in the whole chain.

A central piece in this is the instruction decoder. If it is possible to write this as a bijective function, a 1-1 map between parsed opcodes and a binary vector, the rest is just connecting up logic elements to registers.

Entry: Bijective functions
Date: Thu Jul 16 09:07:13 CEST 2009

I've seen invertible functions before in PLT: in the web server: stuffer[1].

  (struct stuffer (in out))
    in  : (any/c . -> . any/c)
    out : (any/c . -> . any/c)

A stuffer is essentially an invertible function captured in this structure. The following should hold:

  (out (in x)) = x
  (in (out x)) = x

Then it defines a composition operation for these.

Another thing to think about is constraint based programming from SICP[2], and in the Guy Steele thesis[3][5]. See also the thread on LtU[4]. The CP approach in a nutshell:

- define primitive constraints as a collection of directed functions.
- define constraint composition (network)
- find a way to transform an undirected constraint network into a directed data flow graph

So it looks like the current dataflow graph code in Staapl could be re-used for representing the more general constraint problem. What would be the primitive constraints for a decoder? Essentially, how can it be factored?

[1] http://docs.plt-scheme.org/web-server/stateless.html
[2] http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-22.html#%_sec_3.3.5
[3] http://www.ai.mit.edu/publications/pubsDB/pubs.doit?search=AITR-595
[4] http://lambda-the-ultimate.org/classic/message4990.html
[5] md5://4226c96cd6ac34fd4eea1c38de1ecad4

Entry: DAG sorting
Date: Thu Jul 16 10:37:47 CEST 2009

I wrote down an algorithm for this in the car a couple of weeks ago that uses only vector element transpositions. example:

  1 -> 4 -> 3 -> 2

  1 2 3 4
  2 3 4 . .
  3 2 4 . .
  4 2 3 . .
  3 2
  ---------
  1 4 3 2

Entry: PIC18 decoder
Date: Thu Jul 16 10:47:10 CEST 2009

The way to approach this is to start with the smallest opcode field. For PIC18 this is the top 4 bits of the 16 bit instruction word.
(Which becomes the top 3 in my representation, as bcf/bsf are modeled as a single parameterized instruction to ease the optimization of logic complement.)

  ;; p: 0/1
  (bpf   (p f b a) "100p bbba ffff ffff")  ;; bsf/bcf
  (btfsp (p f b a) "101p bbba ffff ffff")  ;; btfss/btfsc
  (btg   (f b a)   "0111 bbba ffff ffff")

The question is then: how to specify this as an equation (bidirectional function) and how to attach semantics to the fields to build the simulator. Currently the letters used as parameter names have a semantics: this should be defined better as connections to machine state (i.e. R means PC-relative word addressed). These should all be equations.

This really hints at the proper solution though: all this is so intimately related that it should be solved as a whole. It is really not much more than a single morphism between the two spaces:

  ( machine-statevector, binary-instructions, computation-logic )
  ( msv-rep, highlevel-instruction-syntax, simulator-primitives )

The problem is representing all this in a form that is accessible for metacomputation. Maybe it's best to start with a ball-of-mud and then disentangle it piece by piece?

Entry: split bit fields
Date: Thu Jul 16 10:59:41 CEST 2009

First, I'd like to fix this:

  (_call (s l h) "1110 110s llll llll"
                 "1111 hhhh hhhh hhhh")

This could be written as

  (call (s (h l)) "1110 110s llll llll"
                  "1111 hhhh hhhh hhhh")

where the concatenation of the bit fields is automatic. But this notation should not interfere with type specs.

Entry: constraints -> dataflow
Date: Thu Jul 16 11:16:56 CEST 2009

Is it possible to implement a constraint network such that it can mostly be used as a directed dataflow network, but the possibility is kept open to run some functions in reverse? The current compile-time DAG sorter could be generalized to a constraint sorter, but then the primitive functions need to be constraints.. What it does look like though is that this can probably be extended without too much trouble.

Entry: the relative conditional jump
Date: Thu Jul 16 11:21:55 CEST 2009

Start with the most difficult instruction: one that uses machine state and a typed operand (an addressing mode): a relative conditional jump. Branch if carry flag is set:

  bc R    1101 0010 RRRR RRRR

elements:

  C    carry flag
  R    operand type / instruction format
  PC   program counter

  PC <- (if C (+ PC (* n 2)) (+ PC 2))

So, what is "R"? It is some simplified representation of the semantics "PC <- (if C (+ PC (* n 2)) (+ PC 2))". This is a very important question to answer. Distinguish:

- code syntax
- simplified semantics (type)
- concrete semantics (simulator)

Entry: curry-howard correspondence
Date: Thu Jul 16 11:37:15 CEST 2009

I have a chance here to understand some theory better. Type systems are simplified semantics: the morphism between typed programs and proofs is a well-specified version of simplification. In Pierce 9.4 p.109:

  linear logic <-> linear types
  modal logic  <-> partial evaluation and run-time code gen

I did not know this for modal logic[1]. Probably a point to expand on a bit..

[1] http://en.wikipedia.org/wiki/Modal_logic

Entry: roadmap
Date: Thu Jul 16 14:05:17 CEST 2009

I got a pretty big stack now, started with the idea to port to the 6809.. Let's disentangle (goal followed by dependent goal indented).

* 6809 port + rekonstrukt on the Spartan-3A avnet evl
  * asm - dasm - sim specification (would also enable dsPIC port)
    * constraint / dataflow specification of a CPU arch (would enable
      proper Staapl semantics useful in static processing: checks + PE)

So, how to tackle this?
As I already mentioned before, an example of the problem I'm facing concretely is how to unify types (very ad-hoc metadata for asm parameters) with the concrete machine semantics. I should clarify this and grow the disassembler into a simulator by figuring out how to represent the two stages: assembly/binary instruction syntax and run-time machine state operations.

Entry: type checking vs. abstract interpretation
Date: Thu Jul 16 14:35:00 CEST 2009

I would like to understand the connection between abstract evaluation and type systems[1][2][3]. Damn.. LtU is always a good place to get re-humbled..

"Conventional type-checking is big-step evaluation in the abstract: to find a type of an expression, we fully `evaluate' its immediate subterms to their types. Our paper[2] describes a different kind of type checking that is small-step evaluation in the abstract: it unzips an expression into a context and a redex."

[1] http://lambda-the-ultimate.org/node/2208
[2] http://okmij.org/ftp/Computation/#small-step-typechecking
[3] http://okmij.org/ftp/Computation/#teval

Entry: machine specification
Date: Sun Jul 19 09:09:57 CEST 2009

It looks like this diverging/converging semantics problem really needs a good way to specify bottom line operational semantics. Can this bottom line be expressed in terms of a proto Forth machine instead of lisp or dataflow? I am very confused atm...

It seems this is all somewhat straightforward: I don't see any _real_ problem other than encoding the vague description in a collection of data structures.. However, I don't seem to see how to start. This makes me think I'm underestimating a certain element. Let's pick up the data flow language again, and build an ALU simulator to see how exactly the specification would go for the dasm <-> alu connection to make the simulator.

Entry: DFL language implementation : Summary
Date: Sun Jul 19 09:25:47 CEST 2009

See here [1][2][3] for front-line notes about developing the DFL implementation in Staapl. I got distracted by the trick used (using "eval" at expansion time).

The goal of this language is to have a functional / combinatorial description of a machine (in terms of a connection of functions), to ease the propagation of machine semantics to higher levels. The main goal is to use this for code analysis, starting with compiler correctness and possibly guiding the development of application-specific type systems.

Funny.. I got distracted again! Don't worry, I'm employing the method of structured procrastination[4].

[1] entry://20090608-175354
[2] entry://20090611-104802
[3] entry://20090608-160152
[4] http://www.structuredprocrastination.com/

Entry: Abstract Interpretation and Higher Order Macros
Date: Sun Jul 19 11:16:28 CEST 2009

Let's define HOMs as macro generating macros, not macro versions of higher order functions (comprehensions?). The problem with alternative interpretations and macros is that there really should be one evaluation step (syntax -> semantics). Can this be done by making a macro generate both code and a macro?

I can't reach it at the moment. The abstraction level is too high... I wrote to the plt-list about this.

Entry: next
Date: Sun Jul 19 18:48:59 CEST 2009

Try to understand Matthew's reply[1] and see how to apply this to other problems (the DFL compiler, the Forth bootstrapping, the Scheme snarf for Scat, ...)

Entry: using the macro-generating-macro trick for dfl
Date: Mon Jul 20 13:05:01 CEST 2009

I think I understand the point, but I don't quite see how to put it in the DFL macro.
I guess this is for after the break. Going to do some more laid-back stuff to recover from the last couple of days' sprints and insomnia.

Entry: back to basics
Date: Tue Jul 21 09:11:58 CEST 2009

So what happened? Things were going great, and all of a sudden I get overly ambitious because I start explaining (I'm looking for clients atm, and prospects want to know wtf I'm doing), and because I'm looking for direct application.

I think I have to be honest and re-focus on the small things for the next sprint. Forget about C and LLVM. It's better to solve that problem when there is an actual need, as there is plenty of material and people to get me through it. I need to stick to the original idea:

* stick to Forth (small machine) and Scheme (untyped lambda calculus) as a _simple_ substrate for the idea (a beefed up macro assembler)

* incorporate abstract interpretation / static analysis / verification on top of this.

Entry: parsing
Date: Tue Jul 21 09:14:30 CEST 2009

The rewrite in March-April this year solved some problems. The parser is now static and composes better. However, there is still duplication at several points. The idea of dictionary and prefix parsing is so central to forth that it shows up everywhere - especially when you unroll some reflection - so maybe this deserves a better abstraction? Also, the current parser seems too verbose to me: it should be written in Forth rather than CPS Scheme.

Another point is that it might be better to perform control flow analysis in a second step. Doing everything at once complicates matters, and for targets that do it themselves (later: LLVM) it's completely unnecessary. Should I just remove it, and go back to a more straightforward low-level semantics? The way compilation works atm, it needs to solve the following problems:

- implement `org-begin' .. `org-end'
- macro local exit

The former allows you to place blocks of code at fixed addresses. I could try this pretty much in isolation, since the compiler is a unit..

Looking at it, the current implementation is certainly not too complicated. The `state-update' function helps a lot to make it more readable. The trouble however is that the control macros do things behind the scenes. Another problem is that conditional branches are not counted, making the resulting data structure not a proper CFG. Maybe finding a proper naming scheme and a better central data type would help? Maybe it's just too low-level..

Another problem is the later use of assignment to build a "non-standard" graph structure (as opposed to a flat tree). A flat structure can be turned into a tree using names to be bound, i.e. a lambda expression. Can this be used to represent the data structure better?

Entry: semantics and static analysis
Date: Tue Jul 21 11:43:02 CEST 2009

The more I see the structure of what I'm trying to do, the more "normal" it all becomes. This is good. Looks like I'm about to find a bit more connections to existing approaches.

1. a well-defined operational semantics (relative wrt a single abstract point, not the target's concrete machine language) will help for "interpretation" based analysis: either full simulation or simplified abstract interpretation.

2. this can be used to assist (by verification and test) in building locally-specified higher level semantics for sublanguages, consistent with the operational semantics.

The idea is to start from the machine, and simply increase the abstraction level at certain points + provide a set of tools to do this.
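To make point 1. concrete, a minimal sketch (not Staapl code; the sign domain is the textbook example): a single evaluator parameterized by a semantics, run once with concrete numbers and once with abstract signs. The abstract run then covers all concrete runs at once.

  ;; ops = (list inject add mul), concrete or abstract.
  (define (ev ops env e)
    (cond ((symbol? e) (cadr (assq e env)))
          ((number? e) ((car ops) e))
          ((eq? (car e) '+) ((cadr ops)  (ev ops env (cadr e))
                                         (ev ops env (caddr e))))
          ((eq? (car e) '*) ((caddr ops) (ev ops env (cadr e))
                                         (ev ops env (caddr e))))))

  (define concrete (list (lambda (n) n) + *))

  ;; abstract domain: the signs +, 0, - and T (unknown)
  (define (sign n) (cond ((> n 0) '+) ((< n 0) '-) (else 0)))
  (define (s+ a b) (cond ((eqv? a 0) b) ((eqv? b 0) a)
                         ((eqv? a b) a) (else 'T)))
  (define (s* a b) (cond ((or (eqv? a 0) (eqv? b 0)) 0)
                         ((or (eqv? a 'T) (eqv? b 'T)) 'T)
                         ((eqv? a b) '+) (else '-)))
  (define abstract (list sign s+ s*))

  ;; (ev concrete '((x 3)) '(* x (+ x 1))) => 12
  ;; (ev abstract '((x +)) '(* x (+ x 1))) => + : holds for _any_ positive x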
Entry: growing higher level semantics
Date: Tue Jul 21 12:42:23 CEST 2009

The overall problem is that it is quite straightforward to go from high -> low by mapping a high level construct onto a low level one, and proving this map preserves the algebraic/logic structure. Going the other way is not possible in general, because a lot of constructs will lie outside of the range of the mapping. (I.e. the lower-level semantics does not have a structure to hold on to..) So, in general, how do you build higher level semantics given an operational one?

Entry: macro assembler
Date: Tue Jul 21 12:38:10 CEST 2009

Going further with this: if Staapl is to be a beefed up macro assembler speckled with application-specific local static semantics, then it needs to be able to talk to existing assemblers. Maybe I've been focussing on the wrong foe? It's not C I need to target, but machine assemblers. It's probably best to hide the staapl asm/dasm/sim behind an interface that can be easily implemented by external tools.

Entry: RIDL
Date: Tue Jul 21 14:20:12 CEST 2009

Real-Time Functional Reactive Programming

[1] http://www.texbot.org/

Entry: daisy chain buggy?
Date: Tue Jul 21 20:23:32 CEST 2009

i get dropped bytes.. shouldn't happen really.. maybe a bad wire? lowered to 115200: still problems.. hmm.. maybe it's a buffering problem? i changed `chunk-receive' to #x20 buffers, now it works?? weird.

Entry: multi-pic debugging
Date: Tue Jul 21 23:01:56 CEST 2009

instead of switching between target consoles, it might be simpler to start working in a host + client approach and go to client consoles when checking local things. the monitor should have the semantics of an RPC server.

Entry: dfl-compose without eval
Date: Wed Jul 22 12:32:40 CEST 2009

(define-syntax (dfl-compose stx)
  (syntax-case stx ()
    ((_ formals body ...)
     (let-staged ((nodes (dfl-graph formals body ...)))  ;; (*)
       #`'(dfl-sequence formals
           #,(dfl-sort-graph nodes (syntax->list #'(body ...))))))))

;; The construct (*) behaves as:
;;
;;   (let ((nodes (eval #'(dfl-graph formals body ...)))) ___)
;;
;; but doesn't use `eval'. Instead it uses a 2nd macro stage to
;; trigger evaluation of the syntax form. The function
;; `dfl-sort-graph' takes both a syntax and a value `nodes' derived
;; from it and produces a sorted program.

Entry: DFL next
Date: Wed Jul 22 12:41:53 CEST 2009

Implement primitives. Ha! primitives are multivalued functions :)

The basic idea is this: I'm not going to bother with making dataflow networks that can be processed as data structures. For this it's better to use techniques similar to dfl-sort-graph that operate on source code directly. The point is eventually to be able to run the DFL statements, so it looks like it's simplest to make sure the dfl-compose / dfl-sequence operation maps scheme functions to a scheme function, so DFL programs mesh well with other scheme programs. Which is exactly what it already did..

Entry: Scheme's call/cc and C
Date: Thu Jul 23 09:07:11 CEST 2009

[1] http://lambda-the-ultimate.org/node/1422
[2] http://repository.readscheme.org/ftp/papers/sw2000/feeley.pdf

Entry: Extreme Forth
Date: Wed Jul 29 09:26:11 CEST 2009

Reminds me of the dataflow vs. `thread oriented' programming idea mentioned before. In order to make DSP work for Forth, some kind of local naming needs to be used to 1) make the random access work, and 2) introduce concurrency in a serial language. I'd say there is really only one way to properly program DSP code, and that is using a dataflow language (DFL).
The point is that using Forth as a _frontend_ syntax isn't such a bad idea. The syntax allows _programmers_ to perform extreme factoring. DFL code is tedious to write due to explicit naming of outputs. See the following table:

              named-IN   named-OUT
  forth          /          /
  expression     x          /
  dataflow       x          x

So, if you add some kind of primitive to forth that represents (local) DFL nodes as names (one could use `constant' semantics for inputs and the `->' operator as node single assignment):

  A B + -> C

Looking at the C18 described in [1], the local names are actually _streams_, or more concretely, autoincremented memory access. The rest of the article talks about distributing an app over multiple cores.

[1] http://www.ddj.com/hpc-high-performance-computing/210603608

Entry: Forth and `singletons'
Date: Wed Jul 29 10:20:29 CEST 2009

Related to `electrical engineers love macros' and the use of global structures inside Forth abstractions for embedded programming. If I'm allowed to simplify greatly: most embedded programming can be thought of as consisting of two steps:

- building abstractions
- instantiating + connecting

Because the latter part contains a significant amount of _you really only need one_, the abstractions can usually be macros _that make this assumption_. Write something intelligible about this.

Entry: electronics is the world of leaky abstractions
Date: Wed Jul 29 17:18:18 CEST 2009

However much we'd like the world to be perfect, when it gets physical there is no way around hitting the wall from time to time. A lot of embedded programming is essentially about resource management and getting the order of things right. Due to cost issues, contemporary hardware often isn't smart enough to not step on its own toes. In many cases it is the driver code that is responsible for most arbitration. (Simply put: most driver code makes sure you don't turn on this thing before you turn off that other thing.)

Considering the way the human brain deals with such a mess of details, there seems to be little that can be done other than `test driven development'. TODO: expand this idea a bit more.. Basically: I'm convinced any kind of code can be made pretty in some sense by using the right abstraction. Driver code is usually highly stateful and ugly: most useful information is hidden behind the order of operations. Can this be done better, i.e. expressed in terms of the actual dependencies, i.e. a state-flow graph cf. a data flow graph? Maybe I'm just looking for CSP? Concretely: can the PIC USB driver be used as a test case for this?

Entry: NEXT: study
Date: Fri Jul 31 12:37:40 CEST 2009

I'm in a quite confused position right now. Frankly I'm lost. I need to read up on some ideas that are key to the further evolution of Staapl. At this moment the current core is a dynamically typed macro system which has _names_ right: scheme's lexical scope + hygienic macros. This makes Staapl into an assembler with a fancy preprocessor. The next stage is semantics of stuff to build on top of the untyped/dyntyped infrastructure. The focus should be on _examples_ of how to tackle a particular problem using constrained metaprogramming (typed DSLs).

1. bottom up semantics (ziggurat) - typed macros (dherman)

   Figure out what the central idea is and try to see the connection with writing DSLs on top of the macro forth.

2. partial evaluation - abstract evaluation - staging - semantics

   http://okmij.org/ftp/Computation/#teval
   http://redex.plt-scheme.org/

3. incorporate some special purpose static semantics (i.e.
CSP/occam)

Entry: generators
Date: Fri Jul 31 15:40:11 CEST 2009

Some ideas that are coming together:

* Gluing C and Scheme code: it is often data aggregation that gets in the way: creating scheme lists from C makes it difficult to automatically wrap the C and Scheme worlds. (idea: replace lists by generators).

* Universal traversal interface[1].

* Lots of threads and channels[3]: eliminate datastructures (which are really "snapshots") by always directly connecting consumers and producers. There should be a link between this and linear data structures.

* Can data allocation be completely eliminated from C primitive code? I.e. write in single assignment style[2]?

[1] http://okmij.org/ftp/Streams.html#enumerator-stream
[2] http://en.wikipedia.org/wiki/Oz_programming_language
[3] http://en.wikipedia.org/wiki/Occam_programming_language

Entry: linear data structures vs. dataflow communication channels
Date: Fri Jul 31 15:52:01 CEST 2009

A linear data structure can be consumed only once. This seems suspiciously similar to threads communicating over synchronous channels: one read per write. Combine this with Forth (stack languages) as a vehicle for linear data structures and it should be not too far from a very nice morphism between linear forth + communicating processes. Damn, I need to write this down formally.

The hint is: delimited continuations (= tasks) are the link between data structure cursors (data) and traversal routines (code). Essentially, shift/reset can turn the code into a data structure (= a piece of the return stack). So when you start thinking in terms of: the data structure _is_ the traversal, it can be made to disappear. Essentially: datastructures are future computations.

Entry: Delimited Continuations in C
Date: Fri Jul 31 16:21:35 CEST 2009

This means: implement shift() and reset(). I assume that setjmp can be used for this, or at least to implement the part that captures the C-stack. reset() would be a first setjmp(), while shift() is a second setjmp() followed by an operation that saves the whole stack somewhere, followed by a longjmp() that jumps to the reset point.

Entry: Studying the Factor compiler
Date: Wed Aug 26 18:20:53 CEST 2009

Time to give it a read[1][2].

[1] http://docs.factorcode.org/content/article-compiler.html
[2] http://www.mail-archive.com/factor-talk@lists.sourceforge.net/msg03573.html

Entry: Rewriting in a nutshell
Date: Thu Aug 27 09:36:02 CEST 2009

This is how I understand it now, all in the light of language transformations.

- If there is a unique rule to apply in every case, rule application is a _function_ and you end up with a small-step semantics.

- If there is more than one possible applicable rule, you no longer have a function. However, if your set of rules is _confluent_ (like the lambda calculus) then you can _construct_ an algorithm that _is_ a function by simply picking a reducible term (one might be favourable in some sense, or not, but the rule you pick is otherwise not essential to the final result), i.e. the graph reduction algorithm used to implement Haskell.

- If your rule set is non-confluent but monotone (no loops) you essentially no longer have a unique result, but reduction is still possible. This can still make sense if there is a way to project the results onto something that gives them back their uniqueness, i.e. if you have an onto `meaning' function that maps syntax -> semantics. In that case there might be ``more than one way to do it''.
To turn this back into a solvable problem you need an extra measure to sort out the multiple (syntax) results, i.e. pick the one with the shortest length.

- If the rule set is not monotone (results can grow) you can't do much with it from a reduction p.o.v. I.e. such rule sets are more akin to BNF descriptions of grammars and are more useful to use ``in the other direction''.

Summarized, this is:

- small-step operational semantics
- lazy evaluation
- program optimization
- grammar specification

Entry: Constraint Programming
Date: Mon Aug 31 15:50:06 CEST 2009

Exploration
-----------

In good in-house tradition, I'm going to try to re-invent it before I look up the implementation again in SICP[?] or Steele's Thesis[?].

The idea is the following: constraint networks are multi-way functions built up from primitive constraints. You can do the following:

* compose networks
* assert inputs / receive satisfiability errors
* query outputs / receive un-asserted errors

The tricky part is going to be to stage the control flow into a predictable real-time C-program, but first let's write a dynamic version and see where local algorithms fail. Start with 2 primitive constraints (possibly the only ones I will need):

  a + b + c = 0
  a * b * c = 1

To make constraint satisfaction into a local algorithm, a rule needs to be an active element: when receiving an assertion it needs to:

- propagate it if possible
- store it if no propagation is possible
- raise an error if there is a conflict

In general, a rule is an ordered / named list of nodes, and a governor that implements the behaviour above. Nodes in a rule can be asserted or floating. Let's represent asserted nodes by a number and non-asserted ones by #f. A rule -> node link needs to be bidirectional to propagate values through the network. For this a `slot' structure can be used, referring to a rule and an index (all rules have position-encoded nodes).

Control Flow
------------

- a node is asserted, propagating a signal to its associated rules.
- a rule is asserted:
  - underdetermined: stop
  - determined: propagate
  - overdetermined: error

Remarks
-------

Q: Find out why this can't solve sets of equations. (or can it? what's the relation with triangulation?)

A: One answer is that it's possible to have N inputs and M equations, which will give a propagation when there are N-M inputs defined.

OK. Control flow seems to work. Now, what can we actually do with this?

* Adding linear functionals is a trivial extension. Adding general M x N systems requires some more code-gen magic to turn the equations into directed equations, but doesn't add any significant other difficulties.

* In general, adding a particular N nodes, MO DFL program.

So, I wonder.. What value does this add?

- it abstracts control flow (directionalizing equations + sequencing)

What about the following form:

- set of equalities
- set of inequalities + actions

Abstract Interpretation
-----------------------

Let's try to capture the static part using abstract interpretation. Essentially, turn the current implementation into a staged macro implementation. An interesting point: can you stage `amb'?

An interesting property of directional constraints is that you don't need to make choices before you do tests. In discrete constraint satisfaction problems you do need to do that. Maybe I can make a combination of both? Continuous and discrete constraints, and use a staged `amb' to compile it.

So what you do in the abstract evaluation is work with the _availability_ of parameters.
Then you can serialize the control flow and cast it in stone, leaving the _values_ of the parameters unspecified. This means that DFL probably becomes a special case of constraint programming. Let's call it ``staged constraint programming''. Or ``staged prolog''.

So.. What is the abstract version of a constraint? It's really simple: a constraint propagator. So, in the staged/ae version there are really only 2 problems:

- constraint propagation based on availability, resulting in a sequential program.

- constraint -> function conversion based on the sequencing (constraint directionalization)

This pattern is quite neat: starting with an exotic control flow paradigm, fixing control flow through staging, which then allows binding optimizations (storage) and compile-time evaluation.

Entry: DFL/CPL : Control flow staging
Date: Thu Sep 3 10:45:38 CEST 2009

Context
-------

In the application I am looking at, the guiding principle should be the following: the resulting code should be free of unbounded recursion. This means the structure of the C code can be flattened to a finite size combinatory network. Within that framework, how can the specifications be abstracted in useful ways such that they can be reduced to this form using information available at compile time?

  DFL: data flow language
  CPL: (deterministic) constraint propagation language

Before implementing it, let's look at the usefulness first.

1. For DFL it's not disputed: allowing parallel presentation of operations is an advantage over having to specify the order manually. There are some degrees of freedom in how the DFL -> function/procedure translation is implemented, but this is the classical _inline_ vs. _call_ debate, and can be mostly related to static inlining vs. run-time function calls. Note that in this case, the use of recursion is an _implementation issue_ mostly related to memory use and memory locality for both code and data. DFL specs with code sharing lead to directed acyclic graphs.

2. Adding unspecified directionality (equations instead of directed data flow operators) brings us to constraint propagation based networks: at compile time, constraints specified as equations can be translated into data flow operators, in addition to fixing the order of DFL operator evaluation.

One problem with constraint propagation networks however is that they are _sparse_. I.e. they cannot solve parallel constraints (systems of linear/nonlinear equations). It is probably my bias towards these kinds of constraint specifications that makes me think there is a problem here.. So is there?

In order to augment N x 1 constraints over N variables to M parallel constraints, one needs a static `directionalization algorithm' that turns the equation into a function from the known parameters to the dependent ones (which can then trigger further constraint evaluations).

Open issues
-----------

Is local consistency[1] useful, or are we looking at more global constraints like sets of equations?

[1] http://en.wikipedia.org/wiki/Local_consistency

Entry: DFL -> LCL -> ?
Date: Thu Sep 3 14:42:28 CEST 2009

This is work-in-progress. The application that drives this is a small DSL to express safety constraints. Other than that the specifications are unclear. Anything that's higher level than straight C or some syntactic sugar around it is probably fine. The main problem is to not make it _too_ powerful.

Step 1: a data flow language (DFL)

One step up from C by omitting the sequential order of evaluation.
The code transformation performs abstract evaluation on ready/not-ready values and outputs a sorted list of functions satisfying data dependencies. The Scheme code is here (comments and procedure names might contain confusing terminology though, and some of it is about modularity):

  http://zwizwa.be/darcs/staapl/staapl/machine/dfl.ss

Step 2: local constraint language (LCL)

One step up from DFL by omitting directionality of the operations. For this I have no code yet. The idea is that, again using ready/not-ready values, all sequencing can be done statically like the DFL, but in addition ``directionalizing'' (better word?) the constraints can be done at compile time.

Step 3: global constraints

I think the story then breaks down because of data value dependencies. In some cases the global constraints can be turned into local constraints (i.e. linear equations with M unknowns and N < M equations) but these are special cases. However, it might still be useful to add some constraints that need a search-based approach, as long as the inputs for them are also provided (or they could be slow-changing, like once per day switching on/off a particular input). Discrete inputs could then switch between modes, or could trigger re-compilation of the constraint checker.

Entry: Practical: constraint.ss
Date: Fri Sep 4 11:13:10 CEST 2009

I have a first draft of a staged local propagation constraint language. Now the plan is to clean this up by eliminating scheme phase issues (i.e. rule classes are phase 1) and find a way around multiple outputs.

OK. Phase issues are fixed. Let's leave multiple outputs for later. How to make this more interesting? One of the requirements is definitely going to be inequalities. Maybe a `range' type would be good. I.e. the MAX rule:

  m = MAX(a,b)

when it receives values for `a' and `b' it can compute a value for `m'. however, if it receives a value for `m' and `a', things are more complicated:

  a > m  ->  ERROR
  a = m  ->  b \in [-\infty, a]
  a < m  ->  b = m

for floating point values that are part of measurements, the equality doesn't make much sense.. so ranges aren't really necessary.

Entry: More about rules
Date: Sun Sep 6 08:53:59 CEST 2009

Looking closer at the problem, this isn't half as trivial as I first thought.. Mostly: I can't see how to use MAX or other comparisons in more than one direction. Can I work around this by simply making this a compile time error? Max only works in one direction?

There is one vital assumption that is maybe not so valid: is it useful to fix the direction at run time? Maybe it's simpler to use an event-driven / FRP approach? No, let's stick to the current description: the language is going to need a certain complexity to be useful. It's probably not so that this could serve as a simple code example -- only as a design principle.

I found one more PhD thesis[1] linked from the wikipedia article[2].

[1] http://www.ps.uni-sb.de/Papers/abstracts/tackDiss.html
[2] http://en.wikipedia.org/wiki/Local_consistency

Entry: Other abstract values
Date: Sun Sep 6 09:07:07 CEST 2009

It looks like it's necessary to allow general purpose computations at compile time (i.e. full elimination of rules). This would allow the introduction of `amb'. Note that backtracking will probably need a state-threading approach to the code generation problem: the current assignment-based approach is probably a dead end.

Hmm.. this is definitely not going to be a simple example. I guess it's best to separate out the DFL implementation.
The useful directions seem to be:

- eliminate implicit code threading (pure functional generator)
- learn more about constraint propagation

It would probably also help to read Jacques Carette's paper[1] about monadic generators. Once the generator is purely functional, backtracking can be added.

Instead of a synchronous network approach, it might also be interesting to directionalize equations from the p.o.v. of a single variable to its dependencies.

The concept of finding abstractions that allow staged control flow is apparently a bit broader than I first thought. In my first approximation of understanding, constraint propagation seems to exploit the sparseness of systems to yield local inversions which can then be chained.

OK. Pure functional generators. Two kinds of effects need to be eliminated: value assignment and code `emit'. The latter can simply thread the state, while the former can use a node -> value finite function to be passed around. For backtracking purposes, it seems possible to combine parameters with partial continuations as long as the parameter values are saved/restored upon continuation capture/invocation. The _values_ stored in the parameters however need to be sharable: the current in-place update doesn't work, and needs to be replaced with something purely functional. Let's see..

[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf

Entry: Too much freedom
Date: Mon Sep 7 08:55:44 CEST 2009

OK, I have amb running with parameter swapping, so if non-determinism (search) is needed when generating code it's there. But what to do with it? It looks like I can't go any further without proper specs. So, what is the problem?

- Allow for a number of inequalities to be specified over the data.
- Because they are essentially free, allow for equalities too.
- Define unsatisfiability: what actions are to be taken?

The real question is: why isn't a simple directional data-flow approach good enough? If I have so much trouble trying to understand why one would abstract out directionality in the constraints, it's probably not worth the bother..

The missing link is probably that this really needs an _event_ based approach: the safety monitor needs to

1. be able to tell something isn't right
2. appoint blame to the event that broke the constraints
3. if possible, correct or prevent a setting

What about this: for each collection of inputs that can be given as an event, produce a program that says if this action is valid or not according to a number of constraints. This is actually a lot simpler than propagation: it needs only to associate each collection of inputs to the collection of rules they are connected to, and check each rule. The constraint propagation could then be used for directionalizing the equalities that serve only to express extra relations.

Another point: the wikipedia page talks about propagation as restricting the domains of nodes after eliminating constraints.

Entry: Ranges
Date: Mon Sep 7 10:00:04 CEST 2009

The problem is the domain of the nodes. Currently, they are limited to values, and propagation performs assignment. Let's clean this up first by allowing subsets. First: constraints need to perform propagation themselves. This should not be handled in the driver routine.

EDIT: equations and inequalities: equations are useful for defining intermediates on which the inequalities are specified. When looking from the point of individual get/set operations, the algorithm performs directionalization of the equations + selects the appropriate constraints to check.
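To keep the core mechanism in view while cleaning this up, here is a minimal unstaged sketch of a single propagation rule for the constraint a + b + c = 0 (not the constraint.ss code; nodes are just boxes holding a number or #f), implementing the governor behaviour from before: check when fully determined, solve when exactly one node floats, wait otherwise.

  (define (propagate-sum! . nodes) ;; constraint: sum of node values = 0
    (let ((known    (filter (lambda (n) (number? (unbox n))) nodes))
          (floating (filter (lambda (n) (not (number? (unbox n)))) nodes)))
      (cond ((null? floating) ;; fully determined: consistency check
             (if (zero? (apply + (map unbox nodes)))
                 'ok
                 (error 'propagate-sum! "conflict: sum /= 0")))
            ((null? (cdr floating)) ;; exactly one unknown: solve for it
             (set-box! (car floating) (- (apply + (map unbox known))))
             'propagated)
            (else 'underdetermined))))

  ;; (define a (box 1))  (define b (box 2))  (define c (box #f))
  ;; (propagate-sum! a b c) => propagated, and (unbox c) => -3
  ;; (propagate-sum! a b c) => ok : now fully determined and consistent

The staged version would run the same case analysis at compile time on ready/not-ready tags instead of boxes, emitting the arithmetic as code.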
Entry: A safety monitor
Date: Tue Sep 8 11:08:54 CEST 2009

A language with two constructs is probably good enough:

- inequalities to express safety constraints. when not satisfied, these trigger actions. i.e. "totalpower < maxpower"

- equalities to add extra internal nodes to make the inequalities easier to express

This could then be used to create 2 classes of imperative safety monitor programs:

* system monitor: keeps an eye on a system's output. this is a stream processor: all nodes are updated at the same time, and all constraints are checked (serially, conceptually in parallel).

* operator monitor: for each allowed input event (a tuple of set points) one can construct a check that can accept/reject the setting, or limit it to some extent.

The useful part would indeed be to not just tell an operator `no', but to _adjust_ a request to a valid region. This needs extra knowledge however: how to express an intermediate point (i.e. some set points might _really_ make no sense: this then should not result in somehow erroneous behaviour).

Entry: Prolog and Pattern Matching
Date: Tue Sep 8 09:47:46 CEST 2009

Looks like I missed some important point about Prolog. In my head it's simplified to `amb' + constraints in the form of horn clauses. But then what? As a guide, let's take [1] and [2].

In [2], chapter 22 talks about `amb' (or `choose'). Relatively straightforward. So what is `cut', explained in 22.5? Essentially, eliminating part of a search tree. This is straightforward in depth-first search, and requires marking of the resume stack. Then 22.6 talks about breadth-first search, implemented by replacing the resume stack with a queue.

Chapter 24 talks about prolog as ``rules as a means to construct a tree of implications, which can then be searched using non-determinism.'' Or prolog as a combination of pattern matching, nondeterminism and rules.

A rule has the form `if <body> then <head>'. The basic idea is that if the body matches a fact, its head can be added to the collection of facts. When asked for a fact, we can recursively sift through the facts and heads of rules, essentially picking out rules and finding evidence. The search ends when there is evidence found, or when all rules and facts are exhausted.

So, what about binding? Find all occurrences of variables satisfying a certain property. It looks like what is necessary to implement this in Scheme is to map Scheme's binding mechanism to the rule binding. In the following rule,

  (if (and (tree x) (summer)) then (green x))

which is the binding site, and which is the reference site for `x'? It's not obvious, so maybe the answer is: it depends on whether this is interpreted as a constructor or a destructor, or both. Unification? Indeed. [1][3].

One thing that confuses me is the 2 directions of information flow: to see if a parameterized fact is true, one needs to run the rules backwards, but for every value found, the value propagation runs the other way. So it looks like control flow is going in one direction, while data flow moves the opposite way!

Let's try to organise the control flow first, and see how the data flow can be incorporated in that solution. Given a set of rules, construct functions that are indexed by proposition.

  ;; rule
  (rule (and (tree x) (summer)) (green x))
  (rule #t (summer))
  (rule #t (tree pine))
  (rule #t (green algae))

Suppose `(green x)' nondeterministically binds x. How does this look in expression form? I'm stuck at expressing binding recursively.
Let's try to make it more difficult first:

  (rule (and (tree x) (brown y)) (browntree x y))

...

[1] http://mitpress.mit.edu/sicp/full-text/sicp/book/node92.html
[2] http://www.paulgraham.com/onlisptext.html
[3] http://en.wikipedia.org/wiki/Unification

Entry: Resolution (Logic)
Date: Wed Sep 9 09:05:16 CEST 2009

Reading CTM[1] chapter 9 about relational + logic programming. The basic idea is that you want theorem proving: given a logic sentence, the system needs to find a proof for it. On page 634 it remarks:

- a theorem prover is limited (Goedel's [in]completeness theorems)
- algorithmic complexity: need a predictable operational semantics
- deduction should be constructive

This is ensured by restricting the form of the axioms so that an efficient constructive theorem prover is possible. For Prolog these axioms are Horn clauses, which allow an inference rule called resolution[2]. Additionally, Prolog allows adding hints to map a logic program to a more efficient operational semantics.

[1] http://www.info.ucl.ac.be/~pvr/book.html
[2] http://en.wikipedia.org/wiki/Resolution_%28logic%29

Entry: Relational programming
Date: Wed Sep 9 09:29:28 CEST 2009

I was wondering if it makes sense to allow infinite choice sequences. Probably not, unless they appear at only one place in the tree. For depth-first this should be on top, and for breadth-first at the bottom. Otherwise the infinite branching will prevent some children from ever being reached. Also, allowing `cut' operations helps.

But it does seem useful to at least keep an algorithmic description of choice points without having to resort to explicit lists. Additionally, these choice points could be results from another search problem, generated lazily.. So, what about reformulating choice as taking an enumerator by default, instead of trying to shoe-horn procedural sequences back into explicit lists. Design principles:

- laziness is good
- enumerators (HOFs) are better than lazy lists (data structs)

So, `amb' takes a function. A choice point is then a continuation (a hole) and an enumerator. The continuation can be plugged into the enumerator directly:

  (struct choice (k enum)) -> (enum k)

Now, does this compose? I.e. can the driver just execute the enumerator and nest the backtracking that way?

Hmm.. OK: settled on:

- amb takes an enumerator
- some enumerator <-> list / stream / sequence transformers (enum.ss)
- solutions presented as an enumerator

Next: cut. Marks should be installed at `amb' time, while cut

Entry: unify.ss
Date: Thu Sep 10 16:41:29 CEST 2009

The unification algo seems to be working. Used it in one way (as a pattern matcher) in database.ss.

About prompt tags: it looks like it's necessary to name the tags, because other uses of partial continuations would interfere with the marks used for backtracking. The idea seems to be that for each abstraction built on top of partial continuations, you use a new prompt tag. Then you just need to worry about those tags being properly nested (i.e. nested backtracking).

Entry: Algebra of Programs
Date: Fri Sep 11 12:30:28 CEST 2009

Let's focus on the basic idea: create a collection of combinators that satisfy some mathematical laws. Make it almost trivial: no power in the language, easy manipulation. Play with the manipulation and try to grow some context to understand the current literature about this. The basic one is the interaction between `map' and function composition:

  (f) map (g) map = (f g) map

Now, construct a bunch of meaningful programs that use just this, and then construct all possible rewritings.
Then, try to associate a _cost_ and _feasibility_ to the different versions, i.e. register pressure, cache pressure, ...

[1] entry://../staapl-blog/20090911-110748

Entry: Prolog
Date: Fri Sep 11 12:37:47 CEST 2009

Looks like I really need Prolog, and more generally, some kind of theorem prover / constraint system. It will probably pay to build one and have an idea of the tradeoffs. Let's finish what I have now.

In the previous attack, mapping rule inference to lambda abstractions didn't work, because it requires _unification_, which is a symmetric binding construct, while pattern matching and construction are asymmetric. So, given a rule

  P(X) & Q(X,Y) -> R(Y)

this can be used in a query for R(Y) as follows:

- the query R(Y) leads to a store with one unbound variable Y.
- extend the vocabulary with the variable X
- find a stream of solutions for Q(X,Y)
- filter it with P(X)

Entry: How the query system works (SICP)
Date: Sun Sep 13 14:48:35 CEST 2009

In [1] the stream-of-frames implementation is used to implement the query system.

  In general, the query evaluator uses the following method to apply
  a rule when trying to establish a query pattern in a frame that
  specifies bindings for some of the pattern variables:

  * Unify the query with the conclusion of the rule to form, if
    successful, an extension of the original frame.

  * Relative to the extended frame, evaluate the query formed by the
    body of the rule.

  Notice how similar this is to the method for applying a procedure
  in the eval/apply evaluator for Lisp:

  * Bind the procedure's parameters to its arguments to form a frame
    that extends the original procedure environment.

  * Relative to the extended environment, evaluate the expression
    formed by the body of the procedure.

In my implementation, the stream-of-frames is the result of `solutions', which can then be fed back into `amb' recursively.

It seems that the real problem is `alpha conversion': to map variables in a query to variables in a head/body combo. A simple way to do this is to represent the rules' parameters using functions (higher order abstract syntax, HOAS[2]):

  ``In the domain of logical frameworks, the term higher-order
  abstract syntax is usually used to refer to a specific
  representation that uses the binders of the meta-language to encode
  the binding structure of the object language.''

However, in order to avoid explicit renaming and piggy-back on Scheme's lexical variables, some deconstruction is necessary at compile time. It seems this needs to do the work twice. Use explicit renaming then? (as in SICP). Alternatively, we can give up the direct representation as lists and only use the abstract one where the unification control flow is expanded. Maybe that's a better way: you don't need `eval' or any direct interpreter as long as you have lambda and hygienic macros. In this case it looks like a store needs to be a run-time entity, but the unification match is compile-time.

But let's do renaming first. A working implementation is worth more, it seems.. This is the core of [3]:

(define (unify-rule store pattern rule)
  (bump-rename-count!)
  (let* ((rule (map rename-variables rule))
         (store (add-free-variables store (rule-head rule))))
    (for/fold ((store (unify store pattern (rule-head rule))))
              ((bpat (rule-body rule)))
      (unify-pattern store bpat))))

(define (unify-pattern store pattern)
  (unify-rule store pattern (choice/enum (rules-db))))

(define (solve pattern)
  (solutions
   (query
    (bindings
     (unify-pattern (add-free-variables (empty) pattern)
                    pattern)))))

So indeed, it is quite straightforward. Rules can be partly staged + renames can be avoided, by piggy-backing on a directed pattern matching binding form. Currently I have this implemented as a clause translator: the head match produces either a fail, or a body where the rule's variables are substituted by the corresponding sites in the input pattern. Instead of constructing a list, this could also just construct a recursive query invocation.

[1] http://mitpress.mit.edu/sicp/full-text/sicp/book/node94.html
[2] http://en.wikipedia.org/wiki/Higher-order_abstract_syntax
[3] http://zwizwa.be/darcs/staapl/staapl/machine/prolog.ss

Entry: quotation vs. prefix
Date: Sun Sep 13 17:44:06 CEST 2009

To distinguish variables from symbols, some form of syntax tagging needs to be used. In SICP and On Lisp, the `?' character is used for this. I picked explicit quotation, which would translate to PLT's old match form. This choice is quite arbitrary as it's straightforward to translate, so I'm going to switch back to the SICP/OL style.

EDIT: Funny, apparently in SICP this is also just surface syntax: variables are tagged as (? name) for efficiency.

Entry: Linear equations
Date: Mon Sep 14 09:52:06 CEST 2009

So... This constraint business. The current algorithm uses N x 1 constraints. This needs to be generalized to N x 1 linear constraints, then N x M systems. Let's start with the most general first, i.e. a 3 x 2 system:

  a x + b y + c z = d
  q x + r y + s z = t

In general this can be solved using gaussian elimination[1]. Given a collection of known variables, a square system can be constructed which then needs to be inverted. Since all coefficients are known at compile time, the sequence of computations can be generated in-line to produce a dataflow program, which can then be sequenced. Pivoting[2] can be used to yield optimal conditioning. Full pivoting searches for the element with the largest absolute value in the unprocessed rows.

OK.. since this is all quite straightforwardly computable at run-time, it might be good to do it numerically. Otoh, allowing rational numbers might be interesting too (i.e. for sparse systems). Yes, let's go for the rational numbers. Wait, since there are going to be square roots of 2 and 3 in the constants, maybe also use field extensions? It looks like it might be a good idea to abstract the _coefficient domain_ of the equations using some kind of unit interface (like the functor approach in [3]).

[1] http://en.wikipedia.org/wiki/Gaussian_elimination
[2] http://en.wikipedia.org/wiki/Pivoting
[3] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf

Entry: Abstract number domains
Date: Mon Sep 14 12:29:29 CEST 2009

What: build a staged expression evaluator for abstract number systems. I.e. flatten an expression to ANF + allow for complex numbers, normal numbers, other fields/rings, ...

So.. I've separated out the ring-sig.ss interface, and will provide algorithms in terms of mathematical rings/fields. This seems the cleanest separation.
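To pin down what `flatten to ANF' means here, a minimal sketch (the real ring-sig.ss interface differs; all names below are made up): intermediates are emitted as (name expression) bindings, and the pairwise sum already hints at the associativity games of the next entry.

  (define bindings '()) ;; emitted (name expr) pairs, most recent first
  (define counter 0)
  (define (emit! expr)  ;; name an intermediate result
    (set! counter (+ counter 1))
    (let ((v (string->symbol (format "v~a" counter))))
      (set! bindings (cons (list v expr) bindings))
      v))

  ;; pairwise reduction: halves the number of summands at each level
  (define (sum-pairwise xs)
    (if (null? (cdr xs))
        (car xs)
        (sum-pairwise
         (let pair ((xs xs))
           (cond ((null? xs) '())
                 ((null? (cdr xs)) (list (car xs)))
                 (else (cons (emit! (list '+ (car xs) (cadr xs)))
                             (pair (cddr xs)))))))))

  ;; (sum-pairwise '(a b c d)) => v3
  ;; (reverse bindings) => ((v1 (+ a b)) (v2 (+ c d)) (v3 (+ v1 v2)))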
Entry: Using associativity
Date: Wed Sep 16 10:04:05 CEST 2009

I'd like to use the associativity laws to shorten the critical path using binary subdivision. This is only useful for the staged version. It would probably be best to hide this behind the `ring^' interface since it relies on representation. The naive subdivision won't work: it produces this:

  box> (sum (syntax->list #'(a b c d e f g h)))
  ;; (v8 (+ a b))
  ;; (v9 (+ c d))
  ;; (v10 (+ v8 v9))
  ;; (v11 (+ e f))
  ;; (v12 (+ g h))
  ;; (v13 (+ v11 v12))
  ;; (v14 (+ v10 v13))

which is depth-first. Breadth-first is probably better, but it uses more intermediates. Looks like the trade-offs here are implicit. Anyway: it is a different problem. This could be done in two steps: first make sure the _dependencies_ form a binary tree, then in a second iteration schedule the operations to allow for pipeline delays. The latter can probably be handled by the C compiler. So: compilation is

  EQUATIONS -> DATAFLOW COMBINATORS -> DATAFLOW SCALAR -> SEQUENTIAL

Actually, once you start combining different forms, because of the specialized mul/add for 0 and 1, the tree of operations gets unbalanced. Postponing until the computation is finished and then re-balancing (based on algebraic laws) might be an interesting optimization.

Ok. I've added memoization. This doesn't handle commutative ops though. Ok, this is now also handled.

Entry: Echelon form
Date: Thu Sep 17 11:21:52 CEST 2009

Seems to be working. The only problem is the staging of pivoting, since this has an influence on the data flow.

Ok, what I want is not echelon form, but gauss or gauss-jordan elimination. The inner loop does this:

  . | ....        ./0 | ....
  P | ....   ->   P/1 | ....
  . | ....          0 | ....

There are two parameters: whether to scale the pivot (P/1) and whether to eliminate the top rows to zero (gauss-jordan) or not (gauss).

Ok. moved to gauss-jordan elimination.

Entry: Matrix combinators
Date: Thu Sep 17 14:30:11 CEST 2009

Interesting how not trying to optimize prematurely makes the matrix code look quite simple. Because the matrix structure can be completely eliminated at compile time, the only thing that remains is the data-flow dependencies the algorithms introduce (i.e. pivoting makes choices wrt. dependencies, where the goal is to optimize for better numerical stability).

What I wonder is if it's possible to reconstruct some grid operations from the bulk of the data flow network. I.e. if the higher order operations as I've specified them now can be used to map to some intermediate language that _doesn't_ get rid of the grouping relations (i.e. to recover loops from the flat code). From afar this does look like it's not a good idea, as it creates an artificial search problem. But.. Tagging the nodes with this information might make reconstruction easier..

Entry: Constraints
Date: Thu Sep 17 15:32:23 CEST 2009

Ok, so given a set of linear equations that determine intermediate variables, and a set of inequalities, determine:

- a running check for the autonomous state
- an event-check for parameter updates

So.. what are the advantages over doing it numerically? The algebraic approach supports sparse equations.

Entry: AI - Summary
Date: Thu Sep 17 18:03:37 CEST 2009

I have this feeling there is much more hidden below the surface I just scratched. Abstract interpretation seems to be a really neat trick. It makes some things look really easy.

Entry: Equations
Date: Fri Sep 18 14:19:26 CEST 2009

So, how to use this, starting from a bunch of equations. Essentially, what we want to do is 1.
Entry: Equations
Date: Fri Sep 18 14:19:26 CEST 2009

So, how to use this, starting from a bunch of equations. Essentially,
what we want to do is 1. figure out which are the variables, 2. see if
they make up a linear system. For 2. what is needed is that for every
symbolic multiplication, at least one of the variables reduces to a
constant.

Entry: Type Checking and Abstract Interpretation
Date: Fri Sep 18 14:54:58 CEST 2009

When reading stuff in [2], I missed this[1] comment by Greg
Buchholz[3]:

  ``I'm toying with a interpreter/compiler for a Joy-like language,
  and of course the issue of typing came up. But instead of having
  static type inference or latent type checking, I've been interested
  in executing programs with types instead of values.''

Then moving on to the small-step abstract interpretation[4]. When
dealing with (control) effects, small-step operational semantics are
easier to work with than coarser big-step / denotational semantics,
while the latter allow for syntax-directed techniques (structural
induction[5] over expressions).

[1] http://lambda-the-ultimate.org/node/834#comment-7658
[2] entry://20090716-143500
[3] http://kerneltrap.org/blog/6714
[4] http://okmij.org/ftp/papers/delim-control-logic.pdf
[5] http://en.wikipedia.org/wiki/Structural_induction

Entry: Abstract Interpretation: Lattices
Date: Fri Sep 18 19:09:10 CEST 2009

While intuitively, abstract interpretation for staging (symbolic
computation) is quite straightforward to do, the mathematical
definition relies on order-preserving functions on lattices. A
canonical example of a lattice is the set of subsets of a set,
together with union and intersection (a boolean lattice[1]). I.e.
arithmetic can be approximated over the lattice made up of the subsets
of {+,0,-}.

  {+} add {+} -> {+}
  {-} add {+} -> {+,0,-} = T
   x  mul {0} -> {0}
  ...

So.. The missing links and terminology seem to come from denotational
semantics[2]. As far as I get it now, an abstract semantics is related
to a full denotational semantics in some structure-preserving way.

Attempt: In the following diagram L and L' are lattices, f is a
concrete function, f' an abstract function, a : L->L' is the
abstraction function and c : L'->L is the concretization function. The
a and c are ``semi-inverses'' in that they preserve order: they form a
Galois Connection[5]. The abstraction is sound whenever c acts as a
``semi-homomorphism'': f . c < c . f', respecting the order relation
instead of an equivalence, i.e. the following diagram commutes:

           f
    L ---------> L
    ^            ^
    |            |
    | c          | c
    |            |
           f'
    L' ---------> L'

In other words, soundness means that for any concrete operation

  forall x, f x = y,  if x \in c(x') then y \in c(f' x')

meaning that for all x, if x' is an abstraction of x and f' is an
abstraction of f then f' x' is an abstraction of y.

It's convenient to picture the original semantics L as the powerset of
some set |L (i.e. the natural numbers |N): already containing all
approximations. In this case the abstract representation L' is related
to |L by mapping elements l' to subsets of |L. The order relation in L
represents ``level of abstraction''. I.e. the element + in L' could
map to the element {0,1,2,...} in L.

Now, from the first lecture of Cousot[7], we find: Abstract
interpretation is considering an abstract semantics that is a
_superset_ of the concrete semantics of the program, hence it covers
all possible concrete cases. This leads to the requirements: 1. sound
(cover all cases), 2. precise (avoid false alarms) and 3. simple
(avoid combinatorial explosion).
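Back to the {+,0,-} example: a toy executable version (purely
illustrative). Abstract values are sets of signs, and abstract
operations return the union of all concrete outcomes, falling back to
T when nothing more precise can be said.

  ;; Abstract values: subsets of {pos, zero, neg}, as lists.
  (define T '(pos zero neg))

  (define (sign-add a b)           ; one pair of signs -> set
    (cond ((eq? a 'zero) (list b))
          ((eq? b 'zero) (list a))
          ((eq? a b)     (list a))
          (else T)))               ; pos + neg: can't tell

  (define (sign-mul a b)
    (cond ((or (eq? a 'zero) (eq? b 'zero)) '(zero))
          ((eq? a b) '(pos))
          (else      '(neg))))

  ;; Lift a sign op to sets: union over all combinations.
  (define ((lift2 f) as bs)
    (remove-duplicates
     (append* (for*/list ((a as) (b bs)) (f a b)))))

  (define abs-add (lift2 sign-add))
  (define abs-mul (lift2 sign-mul))

  (abs-add '(pos) '(pos))   ;; => (pos)
  (abs-add '(neg) '(pos))   ;; => (pos zero neg) = T
  (abs-mul T '(zero))       ;; => (zero)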
So, conclusions: what I'm doing at the moment is a combination of
several techniques: it produces a trace (abstract evaluation of the
interpretation step function) + approximates values by values U
variables / expressions. It would be interesting to try to formalize
this. Soundness seems ``obvious'' in my case, so the proof shouldn't
be too difficult (probably trivial once there's a formal model). In
general however, it does seem like a good idea to try to keep the
theory in mind next time I need to do some analysis, and use specific
applications as an example.

[1] http://en.wikipedia.org/wiki/Boolean_lattice
[2] http://en.wikipedia.org/wiki/Denotational_semantics
[3] http://santos.cis.ksu.edu/schmidt/Escuela03/WSSA/talk1p.pdf
[4] http://santos.cis.ksu.edu/schmidt/Escuela03/home.html
[5] http://en.wikipedia.org/wiki/Galois_connection
[6] http://santos.cis.ksu.edu/schmidt/Escuela03/WSSA/talk3p.pdf
[7] http://web.mit.edu/afs/athena.mit.edu/course/16/16.399/www/

Entry: Contracts
Date: Sat Sep 19 10:26:15 CEST 2009

Time to tag some contracts.. I've tried to use typed Scheme, but the
absence of inference makes it tedious. Also, I'm not quite used to the
degree of precision required for typed programming.. This is really
something to do gradually: from the point you know what you're doing.
(I wonder, is there something like a Scheme -> ML translator that lets
you use the ML typechecker and inferencer?) Anyways, I'd like to
explore the route: untyped -> contracts -> typed-scheme

Entry: C extensions using standard syntax.
Date: Sat Sep 19 12:04:36 CEST 2009

Continuation of [1] in staapl/staapl/c. The idea is to embed macros in
C source code as a way to sneak metaprogramming techniques into the
embedded C world. Let's try to build a framework for this. What is
needed?

  - c.plt input wrappers (cpp, mfile)  OK
  - (pretty) printing to C
  - conversion from c.plt data structs <-> code processor structs

c.plt still needs the C preprocessor to perform expansions, so that's
wrapped using cpp.ss (which uses ../tools/mfile.ss).

NEXT: get a grip on c.plt data structures. The AST is here[2].

Ok, I have a first draft of a partial naive pretty printer. Looks like
we're there.

[1] entry://../meta/20090919-112256
[2] http://planet.plt-scheme.org/package-source/dherman/c.plt/3/2/ast.ss

Entry: Join libprim and staapl
Date: Sat Sep 19 17:37:51 CEST 2009

Reason 1: the cplt analysis tools can be used in the libprim build
process.

Entry: VLIW + GP
Date: Sun Sep 20 12:33:13 CEST 2009

What if.. a GP processor performs address calculations for a VLIW?
I.e. concentrate on subdividing the problem into kernel loops that
execute on the VLIW, and buffer management code that runs on the GP,
performing address calculation to feed the VLIW with data.

Entry: Loop TX
Date: Sun Sep 20 16:18:43 CEST 2009

In order to get a better idea of loop transformations, it might be
interesting to look at the ``algebra of loop transformations''. I'd be
surprised if this doesn't exist in such an abstract form.

  - interchange (transpose)
  - splitting/peeling (boundary conditions or segments)
  - fusion/fission
  - unrolling
  - invariant motion (move independent statements outside loop)
  - reversal
  - tiling/blocking
  - skewing
  - vectorization
  - software pipelining
  - unswitching (moving conditionals out)
  - inversion (while -> do/while)

Apparently there is an abstraction called the ``Unimodular
Transformation Framework'' which deals with representing these
operations as matrices. Muchnick 20.4.2.
From this it seems reasonable to limit the possible loops in a
language (combinators) to get a better-behaved algebra of loop
transforms.

[1] http://en.wikipedia.org/wiki/Loop_optimization

Entry: MAP is easy, FOLD is not.
Date: Sun Sep 20 17:54:45 CEST 2009

So, when writing data combinators, MAP can be used to fill the gaps,
while FOLD needs special care. Most commonly the operator that is to
be folded is associative. This allows the fold to be broken up into
independent pieces which can be combined in the end.

Comparing this to bananas, lenses, ... [1] the problem is different.
You want to treat MAP specially because of parallelism. The map fusion
(loop fusion) is like a non-local particle going through your program.
It has many degrees of freedom.

Another difference is that complex anamorphisms are rare in DSP. They
mostly take the form of constant lifting (combine a constant with each
element in a loop map/fold) or simple weight generation (lines &
sines). The catamorphism (fold) however is very important, with the
inner product taking the lead, possibly followed by min/max.

Constant lifting corresponds to moving variables out of loop scope. It
seems that the idea of loop scope is going to be an important one.

This idea of loop fusion being a ``particle'' moving around in a space
of constant energy is stuck in my head. The same can then possibly be
said for fold fusion.

  (map $) . (map @) = map ($ . @)

Something not to ignore is that boundary conditions are important, and
significantly complicate manual code compared to truly symmetric map
and fold. As I mentioned before in [1]: for image processing you want
the data types to be a bit more abstract than recursive types. I.e.
picking an implementation is picking a data type + recursion schemes
that handle pre/loop/post cases.

[1] entry://../compsci/20090911-125525

Entry: Combinations of 2D filter masks and mappable scalar ops.
Date: Sun Sep 20 18:51:45 CEST 2009

Main rationale: boundary conditions significantly complicate
expression of folds over images. Write an algebra of fusable
operations, first ignoring boundary conditions (infinite fold), and
second taking that into account. Parameterizing composition of 2D
filter masks will yield a large class of useful programs, and will
probably give some idea about how to move on to more serious folds.

So... How to express data types? Or can this be avoided? Moving to the
simpler problem of 1D filter masks over the stream s, which consists
of an infinite list of elements s = e*, this is about the operator Z
which delays the stream by one time instance.

Q: what does `Z + 1' mean? Here Z is an operator: s -> s. This makes 1
also an operator: s -> s, and + an operator combinator:
(s->s, s->s) -> ((s,s)->s).

It looks like it is actually this shift operator that makes things so
problematic. The reason is that it ``looks inside'' an iteration over
a stream. What I mean is that, while it is easy to lift + to a stream
operation (s,s) -> s and feed it with two streams s and Zs, it is less
trivial to turn the operation into an operation f : s->s, where f is
obtained from +, 1 and Z. More specifically, the function that
provides arguments from the input stream to the scalar + : (e,e) -> e
somehow needs to retain memory. This ``state maintenance'' seems to be
the central problem related to combining maps + folds and stream shift
operators. In other words, there is a difference between the following
lift types, depending on lift over streams, or lift over s, Zs:

  S-lift : ((e,e) -> e) -> ((s,s) -> s)
  Z-lift : ((e,e) -> e) -> (s -> s)
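To make the ``state maintenance'' idea executable, a sketch of Z-lift
over finite lists (the name `z-lift' and the list representation are
made up for illustration): the previous element is threaded explicitly
as hidden state, which is exactly what a plain map cannot express.

  ;; Z-lift : ((e,e) -> e) -> (s -> s).  The scalar op sees the
  ;; current element and the delayed one; `prev' is the hidden
  ;; state.
  (define (z-lift op)
    (lambda (s)
      (let loop ((prev (car s)) (rest (cdr s)))
        (if (null? rest)
            '()
            (cons (op (car rest) prev)
                  (loop (car rest) (cdr rest)))))))

  ((z-lift max) '(3 1 4 1))  ;; => (3 4 4), each element vs. Zs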
Making these functions explicit and providing laws for them is
probably what needs to be done. The lower one (Z-lift) can be
separated as

  ((s -> s), (s -> s), ((e,e) -> e)) -> (s -> s)

i.e. taking two stream transformers and a binary scalar op, and
feeding the transformed streams into the binary element op. Filling in
1, Z, + gives the filter function f mentioned above. So.. the
interaction between

  map2 : ((e,e)->e, s, s) -> ((s, s) -> s)
  map1 : ((e->e), s) -> (s->s)
  Z    : (s -> s)
  streamZ s = (s -> (s,s))

  map2(+, s, Z(s)) = map1(+, streamZ(s)) : s->s

It's important to distinguish between a stream of tuples and a tuple
of streams. It looks like making the types work is going to bring us
half way there (in the light of Wadler's free theorems[1]).

[1] http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps

Entry: Monads in a concatenative language
Date: Mon Sep 21 12:31:00 CEST 2009

How would you express them? Probably best in terms of unit/map/join,
because the absence of lexical variables makes the do notation of the
bind form impossible. What do monads look like in the map-formulation?

Entry: Constraint Language
Date: Thu Sep 24 11:23:37 CEST 2009

  1. convert to normal form (confluent rewrite system)
  2. collect constraints into propagators (i.e. collect linear
     equations)
  3. perform constraint propagation (find a distribution strategy)

Putting it like this makes it look a lot more structured. Especially
the first one is something I didn't think about. The third one
appeared after reading chapter 12 in CTM. The first one can maybe be
done directly with syntax-rules.

Normalizing the relations is simple: move stuff to the other side
until there is a comparison to 0. The next simplification is
conversion to sum of products. This expands every + nested inside an
*. How can conversion to normal form be formulated as an alternative
representation? I.e.:

  +   concatenate
  -   (map negate) . +
  *   convolve

It's actually quite simple: 1. recursively reduce all multiplications,
2. flatten all additions.

Entry: Stream Combinators : Z
Date: Sun Sep 27 10:42:38 CEST 2009

The problem that needs to be solved for the image processing loop
transforms is a way to move `Z', the single-element delay, inside and
outside a loop.

The general idea: Z (delay) should be a high-level operation. In all
implementations I've found, Z is always explicitly implemented wrt.
the representation of streams / sequences, leading to a complicated
(too much detail) and inflexible (too specific) encoding of knowledge.
DSP code should be expressed in terms of polynomials. The mapping of
this representation to code should be automated, possibly
parameterized.

Essentially, translate a Z on the inputs into a Z of the fold state.
This is probably related to paramorphisms. I'm looking for a _form_
where the operation of moving the effect of Z inside a loop is clear.
Suppose s is an infinite list. In scheme notation I'm looking for the
transformation between:

  (lambda (s)
    (let ((s1 s)
          (s2 (cdr s)))
      (map + s1 s2)))

and

  (lambda (s)
    (let filter ((s1 (car s)) (s (cdr s)))
      (let ((s2 (car s)))
        (cons (+ s1 s2)
              (filter s2 (cdr s))))))

(Note the recursive call threads the current element s2 as the next
delayed value.) The first is a map, while the second is a kind of fold
(it uses threaded state). The right framework to see this is probably
memoization / dynamic programming. The delay operation `cdr' here is
memoized, which in practice means that an expensive memory fetch will
be cached by keeping the variable in register.
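A quick sanity check that the two formulations agree, adapted to
finite lists (the map version truncated to matching lengths, and a
termination case added to the fold; throwaway code):

  (define (f-map s)   ; map over (s, Zs), truncated for finite lists
    (map + (cdr s) (drop-right s 1)))

  (define (f-fold s)  ; same result, with the delay as threaded state
    (let loop ((s1 (car s)) (s (cdr s)))
      (if (null? s)
          '()
          (cons (+ s1 (car s))
                (loop (car s) (cdr s))))))

  (equal? (f-map '(1 2 3 4)) (f-fold '(1 2 3 4)))  ;; => #t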
A paramorphism does something similar: its binary operation takes a
pair: the input and output of the iterated function. From[1]:

A paramorphism for the natural numbers:

  h 0       = b
  h (1 + n) = n @ h n

For lists:

  h Nil            = b
  h (Cons (a, as)) = a @ (as, h as)

Here `@' is the binary function to be iterated. The next step seems to
be to look into Meertens' idea[2] and find a way to express it for
Z-based computations.

However, note that a paramorphism has a truly recursive call tree:
there is a reverse data dependency: the binary operation @ receives
the _result_ of the recursion. What I'm looking for is tail recursion
with all memoized state / accumulators passed in as extra arguments.

Indeed the problem is different: the specification might deal with
more general recursion patterns on recursive data types, but in the
end one needs to construct loop kernels with local state connected to
a data access pattern. Maybe the reason for confusion is the
difference between an abstract definition with low arity functions,
and the ``splatted'' nature of pipelined loop kernels for RISC/VLIW
machines with lots of registers.

Maybe the question can be reformulated as: find a transformation that
memoizes accesses. In [3] elimination of redundant loads is mentioned
as one of the possible optimizations that bring accesses to registers.
If I'm about to move the abstraction level up, this kind of
optimization becomes very important. What are the other patterns?

  - elimination of redundant loads (i.e. offset indexing implemented
    using register shuffling)
  - loop unrolling enables array -> variable translations.
  - data prefetch and proper allocation in the memory hierarchy:
    spatial vs. temporal locality, i.e. FIR input/output vs. FIR
    coefs.

[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125
[2] http://www.springerlink.com/content/h1547h551422462u/
[3] isbn://1558607668

Entry: Stream combinators continued
Date: Tue Sep 29 12:43:22 CEST 2009

So, starting from the primitive delay combinator Z : s -> s, which
takes a stream and maps it to the stream shifted by one element, we
need algebraic laws such as:

  (map fn s (Z s)) = (loop_1_Z fn s)
  (map fn (Z s) s) = (loop_Z_1 fn s)

Where `map' is an element-wise morphism, or fn : e -> e, and `loop'
has threaded state. The objective is to find an abstract way to
characterize the functions `loop_1_Z' and `loop_Z_1' so the rewrite
rules work in both directions. This depends on the representation of
streams (which will ultimately be machine memory arrays).

Let's start with:

  ((map fn) (I s) (Z s)) = ((loop fn I Z) s)
  ((map fn) (Z s) (I s)) = ((loop fn Z I) s)

The objective is to start with a LHS representation in terms of
combination of streams, and turn it into a single iteration over a
number of primitive streams. This is essentially loop merging. Later
we need to give up the functional notation early and use data-flow
variables: each expression contains a number of inputs and outputs.
For now expressions are simpler. The rule above needs a more general
composite transformation rule that can merge two loops.

Essentially what one wants is algebraic rules that relate stream and
function operators. Let's use the following notation:

  s : [e]          [.] stream type constructor
  f : e^n -> e^m   elementary function
  S : s^n -> s^m   stream transformation
  F : f -> S       elementary function transformation

  ((F1 fn) (S1 s)) = ((F2 fn) s)
   +-----+ +----+    +-----+
      S      s          S

Laws are needed that have S1 in terms of an F, like:

  ((F1 f1) ((F2 f2) s)) = ((F3 f1 f2) s)
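The simplest concrete instance of such a law is map fusion, where F3
is just composition; easy to check on finite lists (throwaway sketch):

  ;; (map f) . (map g) = map (f . g) : two traversals fuse into one.
  (define (f x) (* 2 x))
  (define (g x) (+ 1 x))

  (equal? (map f (map g '(1 2 3)))
          (map (compose f g) '(1 2 3)))  ;; => #t, both are (4 6 8)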
What I can gather from this is that:

1. types are important to make this manageable. It might be wise to
   put this in a categorical framework to map out the territory.

2. the rest seems a pretty straightforward derivation of theorems that
   prove equalities.

Let's try this with the important players:

  + *  : e^2 -> e
  -    : e -> e
  I,Z  : s -> s

What's next is a syntax and semantics for `loop' parameterized by the
elementary operators Z^n, n>=0 with I = Z^0.

TODO: build loop on some substrate of streams of elements (i.e. a
circular vector of exact numbers, to make verification of
transformation rules simple.)

Got loop.ss test skeleton. The first thing I notice is that I need an
abstraction for threading delayed values: it's not so simple to do
this with unary fold. Does for/fold allow for multiple values? Yes.

OK, I have the first TX test working:

  (quickcheck
   (lambda (s) (map + (I s) (Z s)))
   (lambda (s) (loop_I_Z + s))
   1 10)

Next is to find a way to generate loop_I_Z from a specification (I Z).
This should generate pre/loop/post code (which is the most annoying
part to do manually).

Another thing: if there is no output dependency (i.e. no IIR-style
feedback), it is possible to run loops with delays backwards. This is
necessary for making operations parallel. Take this as a hint that the
best description probably takes direction simply as a parameter.

The real deal is this: in the end, there is an inner loop that
accesses a mask over the data. Find a way to describe this on a higher
level.

In order to properly represent loops it might be best to adhere to a
representation that uses arrays and indices directly. The problem here
is that references are pure, but of course assignments are not. Let's
move to single-assignment vectors (this way coverage can be tested).
In short: separate abstract formula manipulation from implementation
(single-assignment vectors). Functional vs. dataflow.

An interesting property is that it is difficult to talk about vector
operations without allowing (single) assignment: iteration loops seem
to scream for explicit mention of the storage/send of outputs. Wait..
Maybe a read/write formulation would work better? I.e. delimited
continuations etc..

Q: can these transformations be expressed more easily in terms of
   delimited continuations?

Q: Essentially, memory already behaves as a stream (very clear in DSP:
   using DMA from DRAM to SRAM to feed a processor..). Maybe I should
   reformulate loops as combinations of IO + coef + state + loop?

Q: What about seeing code as a description of a circuit and then
   re-interpreting it as loops (operational semantics?) I.e. a 3-tap
   FIR filter is a description of a network that connects adders and
   multipliers from an input stream to an output stream. Combinators
   essentially duplicate a basic pattern to a global one. Looking at
   it this way (pure functional kernels) it might be simpler to derive
   loops and pre/post code by reducing the pure kernels by combining
   them with a traversal strategy.

That's really the essence, right? Declarative dataflow. What is
declared? The _structure_ of the computational network, _not_ the map
to a serial machine. I.e. instead of thinking about delay lines, think
of the slightly more high-level concept of spans: transform loop
bodies, such that the final loop construction takes the form of a
universal pre/loop/post combination that takes a description of the
span as input.
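For reference, one plausible shape for the generated `loop_I_Z' used
in the quickcheck test above (hypothetical: the point of the exercise
is to generate this, pre/loop/post and all, from the (I Z) spec):

  (define (loop_I_Z fn s)
    ;; pre:  prime the delay register with the first element
    ;; loop: combine the current element with the delayed one,
    ;;       then shift the delay register
    ;; post: nothing needed for this span
    (let loop ((z (car s)) (s (cdr s)) (acc '()))
      (if (null? s)
          (reverse acc)
          (loop (car s) (cdr s)
                (cons (fn (car s) z) acc)))))

  (loop_I_Z + '(1 2 3 4))  ;; => (3 5 7) = (map + (I s) (Z s))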
Some remarks:

  - using a limited set of core operations, issues of associativity
    and commutativity could be left until later (when discovering
    ILP).

  - the remaining problem is the conversion of an input specification
    to a (memoized) data flow program. In order to have things map
    easily to serial C, the memoized form uses `let*' terminating in a
    `values' form that produces the output.

Let's try this. The effect of 'z' needs to be pushed through to an
input variable, to be mined later to determine the signature of the
function.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.68.5409&rep=rep1&type=pdf
[2] http://eprints.kfupm.edu.sa/45250/1/45250.pdf

Entry: Filters cont..
Date: Wed Sep 30 18:18:26 CEST 2009

So, I have a notation that can push a z operator through operations to
end up as a variable operator (all operations work on current values,
and delays are implemented by the driver loop):

  (ttx r/z #'(+ a (z (+ b (z (+ a b))))))
  => (+ a (+ ((z 1) b) (+ ((z 2) a) ((z 2) b))))

Next: do the same for operator specifications (polynomials in z). It's
probably best to separate operators and other things using a type
system. However, an operator is something where z can be an argument
to an operation (the operation is then an operator on an operator).

Entry: Constraint cont.
Date: Mon Oct 5 13:14:46 CEST 2009

Starting from [1]. What I have now is a specific propagation engine
(the idea is to use real numbers / floats, so sets of integers are
currently not yet supported). The normalizing should work too. So what
I need is a specification, and to propagate the problem description to
a solution. Let's start with something trivial first:

  (constraint
   (= (+ (* 3 x) (* 2 y)) 10)  ;; define intermediate nodes
   (< (+ x y) 3)               ;; safety constraints
   )

This should lead to the following behaviour: generate code for the
following functions:

  set_x()
  set_y()

This produces directed equations + relevant checks of constraints.

So, I have a routine now that produces a list of equalities and
inequalities. In the first iteration, the equalities need to be
arranged into a matrix. This means the unknowns need to be discovered.
If the equation is in normal form, we can simply gather them from the
'*' forms. Next was sorting terms according to order.

[1] entry://20090924-112337

Entry: 2-level semantics
Date: Mon Oct 5 15:50:00 CEST 2009

Concrete: whenever an identifier appears as a literal in a syntax-case
expression, it has a compile-time semantics that's _not_ programmable.
There is a subtle difference between this, and providing a syntax
binding for the identifier. The latter allows for lexical scope.
Currently, using syntax-case with identifiers for normalizing
arithmetic expressions is probably good enough.

EDIT: While writing an (algebraic) meta-processing language, some of
the identifiers (operators) have a compile-time semantics (i.e. the
associativity law which allows re-arranging expressions, and has
little meaning at run time, when the program has lost all its
mathematical meaning and is merely a sequence of instructions for a
serial/parallel computer.)

In short, `+' and `*' are _not_ Scheme! They have two identities that
should be distinguished with the utmost care: at compile time they are
_formal_, and are there _only_ to steer formula transformation. At run
time they are functions that operate on values. Once this separation
is clear, it is possible to start being flexible with the idea of
"run time": if some of the run-time reductions/operations can be
performed in a separate transformation stage, further reductions are
possible.
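A tiny executable version of the first normalization step (over raw
s-expressions rather than syntax objects; illustrative only): here `+'
is purely formal, a tag that steers the rewrite and is never applied
as a function.

  ;; Flatten nested additions: (+ a (+ b (+ c d))) -> (+ a b c d).
  (define (sum-terms e)
    (if (and (pair? e) (eq? (car e) '+))
        (append-map sum-terms (cdr e))
        (list e)))

  (define (flatten-sum e)
    (cons '+ (sum-terms e)))

  (flatten-sum '(+ x (+ (* 2 y) (+ z 1))))
  ;; => (+ x (* 2 y) z 1)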
On the other hand, variables might be Scheme (i.e. they should respect
hygiene). However, they are quite type-restricted: the compile time
transformation (formal manipulation) needs to be compatible with the
run-time semantics (model theory).

Entry: Filter language
Date: Thu Oct 8 11:09:27 CEST 2009

So, the goal is this: given a collection of streams and a number of
delay operators, construct a kernel routine. Start with 1 (audio),
move on to 2 (image) and finish with 3 (video). The kernel routine is
_declarative_ : it merely states the relation between neighbouring
pixels. To give an operational semantics, a serialization needs to be
implemented that constructs a nested loop in terms of a physical data
stream.

The problems are thus:

  * specification -> normal form polynomial
  * polynomial -> imperative loop

Parameterizing the choices to be made in the last step will give an
implementation / optimization strategy.

An interesting heuristic for (artificially constructed) school
exercises is to ask: did I use all the axioms/equalities?

It might be a good idea to first focus on transformation rules for
simple algebraic expressions in s-expression form, i.e. generic
associativity. There are two important operations:

  - arbitrary -> right rotated binary tree (rrbt)
  - rrbt <-> flat

This works really well. A large collection of simple tree / graph
operations will probably be the right approach.

NEXT: loop body construction. This involves obtaining delay
information for the variables, and constructing delay assignments and
pre-roll code. This seems quite straightforward. Requires some core
routines (id hash) so it can be done later.

Ok: I have something that generates this:

box> (ttx loop-body (r/z #'(+ q (z a) (z (z b)))))
(begin
  (begin
    (set! b_1 (~ b 1))
    (set! b_2 (~ b 2))
    (set! a_1 (~ a 1)))
  (for ((i (in-range n)))
    (set! b_0 (~ b i))
    (set! a_0 (~ a i))
    (set! q_0 (~ q i))
    (set! (~ result i) (+ q_0 a_1 b_2))
    (set! a_1 a_0)
    (set! b_2 b_1)
    (set! b_1 b_0)))

Now, let's simplify the generator so it becomes more composable. I.e.
the generator (postponed generated code) should become an algebraic
object on which certain transformations can be performed. Can it be
turned into a group with a certain generator/... representation? I.e.
the code above was directly constructed from a dictionary of variables
+ their associated maximal delay/offset. Is it possible to factor this
into separate operations of memoization (dereference) and matching
between adjacent loop iterations?

Maybe it's important to see that LOAD/PROCESS/STORE/PROPAGATE is only
important as a final view. Memoization can be kept out of the picture
for all transformations.

[1] http://docs.plt-scheme.org/reference/dicts.html#%28def._%28%28lib._scheme/dict..ss%29._make-custom-hash%29%29

Entry: Linearization
Date: Thu Oct 8 15:57:18 CEST 2009

It might be interesting to look at linearization / automatic
differentiation in the current implementation of the staapl/algebra
code. I.e. useful for deriving optimization kernels for nonlinear
models. (Also see Carette's paper and Haskell links in [1]).

[1] http://del.icio.us/doelie/autodiff

Entry: Loop TX
Date: Fri Oct 9 17:38:36 CEST 2009

Seems to be centered around the idea of linear transformation of
iteration space, which is an integer lattice. Dependence vectors are
integer lattice elements representing spatial dependencies. The 1D
case is called a ``distance vector''. ``Optimizing Compilers for
Modern Architectures: A Dependence-based Approach''[2] allegedly gives
a good introduction to these techniques.

EDIT: This is an interesting book.
It starts from a lower-level point: iterations with arbitrary
dependencies, and tries to reconstruct parallelism from that based on
dependence. However, from the point of constructing higher order
combinators, the distinction between maps and folds is really
important: transformation laws can be expressed on a higher level.
But, on the implementation level, of course, loops are going to be
ever present.

[1] http://suif.stanford.edu/papers/wolf91b.pdf
[2] isbn://1558602860

Entry: Next
Date: Sat Oct 10 11:26:34 CEST 2009

It's getting complicated. Time for some design decisions.

  * It seems that keeping a loop in reference form is the best
    approach for now: memoization and pre-fill are easy to generate if
    not a bit tedious. Some abstraction will help here.

  * For loop transformations I need to read [1] and reuse the
    representation.

  * With the syntax in place, it should be possible to start making
    and parameterizing source transformation rules using higher level
    constructs. I need a problem to drive this. Simple straight-line
    code isn't so interesting as it seems that problem is solved.

Today seems to be a low-key day, so I'm going to do the bookkeeping
routines: generation of C code from the let* form.

[1] isbn://1558602860

Entry: statements/dataflow vs. expressions
Date: Sat Oct 10 12:41:27 CEST 2009

So, the `let*' SSA translator seems to work. The problem now is to
make the bridge between a syntax for a dataflow language, and C
statements that implement the assignments: simple expressions cannot
do this.

I wonder if it's a good idea to use Oz's dataflow/logic variables
approach as a basic framework, instead of syntactically
differentiating inputs and outputs. The latter also allows
non-directed interpretation (as a constraint network, instead of
directed dataflow). Let's stick to the simpler approach (dataflow) but
postpone the decision until it's necessary. Ultimately this is about
having the specification as directed equations (single assignment /
dataflow) or relations.

Anyways, back to the point: integrate the let* form with a loop body
generator so it generates runnable code. Next: array references. OK.
Looks like everything is in place to make the first simple generators
for 1D streams + Z.

Next: compiler for stream operator expressions -> C array code. What
is needed? 1. simple references, 2. delay memoization. It's best to
start with a simple for(i=0;i<n;i++) style Scheme -> C translation (an
imperative subset containing some statements and C expressions). This
allows testing of generated code directly in Scheme.

Next: make a specification syntax for a (currently sorted) DFL
language with a z operator, and translate it into a Scheme/C
expression.

Entry: spread & cleave
Date: Sat Oct 10 13:42:52 CEST 2009

Or: Factor and dataflow intent instead of stack shuffling[2][3]. It
looks like I've missed a lot of good stuff recently[1][4] (between the
noise, look at Nowak's replies and related posts). Essentially:
`cleave' takes an argument and passes it to a sequence of quotations
(a fan-out), and `spread' will apply a list of functions to elements
on the stack (zipping like an inner product).

Moral of the story: Stack languages are ``too sequential'' (as are
monads), and expressing any kind of parallelism without _state
isolation_ leads to problems. The `cleave' and `spread' combinators
are not parallel because they thread the stack through the iteration.
[1] http://tunes.org/~iepos/joy.html
[2] http://docs.factorcode.org/content/word-cleave,combinators.html
[3] http://docs.factorcode.org/content/word-spread,combinators.html
[4] http://tech.groups.yahoo.com/group/concatenative/message/4283

Entry: Z -> C
Date: Sun Oct 11 09:53:57 CEST 2009

Given a specification of a filter in terms of stream operators, derive
the routine that implements the loop. When done in several steps, this
is quite straightforward:

  1. convert a `z' notation to normal form by pushing the operator
     through the expression / network to end up at stream offsets.
  2. generate loop code in imperative Scheme, converting `~' to `ref'.
  3. convert imperative Scheme to C AST and concrete syntax.

Step 1 produces code like this:

  ((~ v1 0) (+ (~ a 0) (~ a 1)))
  ((~ x 0)  (* (~ v1 0) (~ v1 0)))

representing a dataflow network. (In the case above we could specify
that the stream `v1' won't be observed, meaning it can be implemented
as a scalar.)

Ok, full circle:

box> (ast-emit (stx->c-stmt (dfl/z->scheme #'((v1 (+ a (z a)))
                                              (x (* v1 v1))))))
for ((i = 0); (i < n); (i)++) {
    (v1[(i + 0)] = (a[(i + 0)] + a[(i + 1)]));
    (x[(i + 0)] = (v1[(i + 0)] * v1[(i + 0)]));
}

The memoized + loop shift version also +- works, but it needs more
infrastructure to connect to, to get an idea of the requirements. In
any case, the memo code seems straightforward / tedious so I'm going
to leave that alone for now, and concentrate on the loops themselves.

One thing is quite clear though: building more elaborate compilers as
a sequence of steps is very doable, but once the number of
transformation steps gets larger, it's probably best to switch to a
different representation form (i.e. a typed language), to make the
invariants more explicit. Currently I'm using just Scheme syntax
objects (s-expressions with marked identifiers). I.e. the code
explained here uses the following forms (dfl = dataflow Scheme):

  - dfl with stream variables and `z' operator
  - dfl with `~' operator (stream variable + offset)
  - imperative Scheme expressions with `set!', `for' and `ref'
  - C AST (c.plt)
  - C concrete syntax

So: basic infrastructure is mostly there. Next: this should rest until
after I've done some more reading on dependence-based compilation and
loops.

Entry: Tool decoupling (embedded ML)
Date: Mon Oct 12 14:14:49 CEST 2009

Compiler compilers. Essentially, can Staapl be formulated in such a
way that the compiler itself is generated? If it can be written as an
ML-style data transformer, conversion to a fast and simple
implementation (i.e. OCaml) should be possible.

Rationale: removing PLT Scheme from the dependencies (might be
difficult due to dependence on the module system though) and making
compilation a fast operation.

Entry: PIC24/30/33 addressing modes
Date: Mon Oct 12 16:37:46 CEST 2009

Assembler addressing modes. The problem with the assembler syntax as
specified by Microchip is that it uses a syntax that's not so easily
expressed as s-expressions. However, addressing modes really are just
names mapped to bit representations, so they might be handled as
constants. I'm just not quite sure how this will work with
transformation / optimization.

So, let's just try it: set up the infrastructure (assembler + stub
compiler interface) and fill it in. Next: implement 'drop' as
"MOV [--W14], W0".
The encoding:

  01111 wwww B hhh dddd ggg ssss

  w: offset
  B: byte mode
  h: dst address mode
  d: dst reg
  g: src address mode
  s: src reg

The problem here is that an s-expression syntax would probably need to
provide a bit of sugar, since the instruction format itself is quite
spartan. In the current framework, an instruction is a function with a
number of binary inputs. This needs to change to something more
general, preferably representable as combinators. Also, `MOV' is used
for a lot of different instruction tags.

Addressing modes:

  000  Ws       ;; Register direct
  001  [Ws]     ;; Indirect
  010  [Ws--]   ;; .. post-dec
  011  [Ws++]   ;; .. post-inc
  100  [--Ws]   ;; .. pre-dec
  101  [++Ws]   ;; .. pre-inc
  11x  [Ws+Wb]  ;; Offset / Unused (RESET)

Proposed s-expr syntax:

  Ws
  (* Ws)
  (*-- Ws)
  (*++ Ws)
  (--* Ws)
  (++* Ws)

The troubling element is offset addressing. Instead of representing it
as (+ Wb Ws), it should not refer to the offset register directly, as
this is a global entity.

Entry: Giving up on the assembler?
Date: Mon Oct 12 18:22:30 CEST 2009

Looking at the rest of the assembler syntax for dsPIC, I'm starting to
think that it might be better to:

  * use an external (textual) assembler
  * use some form of textual syntax for pattern matching, i.e.
    something that can still match operands, but uses standard asm
    representation in source code.

Problems:

  * pattern matching then requires a parser and an AST rep for the
    assembly language.

  * ``target values'' : the current system depends on opaque
    expressions that depend on address values. Resolving the latter is
    part of the assembler relaxation process, and ``outsourcing'' them
    requires translation to the assembler expression language. This is
    ok for limited semantics, but requires quite a change wrt. the
    current expressivity. Maybe this isn't really needed?

So, external assemblers are currently not possible in a
straightforward way. Looks like the only simple solution is to use a
matchable s-expression syntax, and extend the idea of assembler
pattern matching. I.e. currently the `patterns' syntax uses Scheme's
`match' syntax for its leaves. This could be extended to allow for
assembler expressions in tree form.

A better integration with vendor syntax might however be a good idea,
if only for verification of the internal assembler against the vendor
implementation... Not too urgent though.

Entry: Staapl in ML?
Date: Mon Oct 12 18:34:24 CEST 2009

The problem is that an untyped approach allows for several ``higher
order'' tricks that are more difficult to do in a strict type system.
I.e. the assembler is implemented as something akin to an algebraic
data type, but it is still possible to also match the opcode
parametrically (which would be a constructor in a straightforward
implementation in ML). I wonder, is this good or bad? It's good that I
can express more, and manually add some constraints at compile time,
but it's bad that some corner cases escape this ad-hoc type system.

Entry: What is an assembly language?
Date: Wed Oct 14 15:50:26 CEST 2009

In Staapl, it has the following function:

  - macro semantics are defined in terms of assembly language
    transformation / generation
  - assembly language is either compiled to binary code, or
    interpreted in other ways (simulator)

It's important to distinguish the two language levels (transformation
and interpretation).

Entry: staapl marketing
Date: Tue Oct 20 19:00:56 CEST 2009

I'd like to capture the nature of Staapl in a single meaningful
phrase, to then try to explain it word by word, and allow for
exceptions.
The `patterns' language is the specification language of a
concatenative(1) code transformation(2) language with local(3)
actions.

(1) Concatenation of `words' (syntax) denotes composition of code
    transformation functions (semantics).

(2) The code being transformed is the machine language of an abstract
    or concrete machine.

(3) Because the code transformations are local, they act as the
    operations of a stack machine. If this locality is _also_
    reflected in the machine language objects the transformers
    manipulate, a 2-stage stack language can be constructed.

The system behaves as a concatenative macro assembler that performs
reductions (computations) in addition to expansions.

Extensions: Giving up locality allows the construction of more general
combinator languages, not necessarily stack-based. This sort of
behaviour can be embedded inside a stack machine.

Entry: PIC18F1220 with direct speaker attachment
Date: Thu Oct 29 11:57:37 CET 2009

It's possible to connect an 8 ohm speaker directly to the PIC output
pins as long as you switch it fast enough. I'm going to use this to
build a bridged circuit for a burglar alarm.

  P1A / B3 - pin 20
  P2B / B2 - pin 19

On the CATkit board these lead to R7 and R8.

The software: I'm copying synth-1220-8.fm to alarm-1220-8.fm. The app
can reuse most of the synth lib. It just requires some different
config data and boot code.

R: I need to document standard practices. Building a library of code
and running a console with access is straightforward. But how to do
the boot process?

So, it's sputtering. Let's make the wiring optional. I've changed it
so that by default it drives the speaker in full-bridge, but switches
it off in the `engine-off' word to avoid DC current.

Entry: Synth: the control layer
Date: Sat Oct 31 09:45:01 CET 2009

A lot of thought went into building the control/virtual-sample 2-task
structure, but apparently not into documenting it, except for
doc/pic18-synth.pdf (from .tex). I'm just using `sync' from the synth
module to provide a time base for top-level control (duration & beep
frequency). The fancy task switching is not used.

FIXME: there is no other task.. How come `yield' still works?

Entry: Haskell on hardware
Date: Fri Jan 15 08:34:32 CET 2010

Read this [1][2]. Is there an alternative? I.e. can the operating
system be eliminated such that Haskell runs straight on hardware, and
the hardware's physical model is somehow represented without resorting
to sequential programming tricks like the IO monad?

It seems there is an opportunity to try this out in Staapl, as the
concepts of application / program / physical interaction all somewhat
blur on a bare metal microcontroller: there is little historical
baggage to carry around (i.e. operating system) except for the actual
design of the machine. Can this last step be eliminated by designing a
machine that's less like a sequential computer, and more like a bunch
of functions and events? What about a graph reducer in hardware? Or
first, a hardware-assisted GC?

[1] http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm/
[2] entry://../compsci/20100115-080242

Entry: Roadmap
Date: Mon Feb 8 14:18:57 CET 2010

After a long time off the project, I'm thinking about the following
goals:

  * Make sure the low-level part is standard. Staapl is essentially a
    macro-assembler based on PLT Scheme language towers. However, it
    does not have a proper interface for interacting with external
    assemblers and binary tools (i.e. binutils).
    Currently the PIC18 assembler is an essential part of the system.
    This should be separated. I still think it's a neat idea to take
    the assembler into the workflow, but most of the work should be
    off-loaded to external tools, as writing an assembler in itself
    isn't very productive. For this, the assembler expression language
    in Staapl needs to become simpler (atm. it's Scheme).

  * Offload high-level components to OCaml/Haskell. See the libprim[1]
    and meta[2] projects.

[1] entry://../libprim
[2] entry://../meta

Entry: Offloading high-level components to OCaml/Haskell
Date: Sun Feb 21 12:13:32 CET 2010

I've been playing with Haskell and (Meta)OCaml a bit lately[1].
Current conclusions are that they indeed seem to be better suited for
building program transformers. Types come in handy when things get too
complex. This suggests that for Staapl it might also be useful to
finally move to Typed Scheme. However, this will probably be postponed
a bit until I'm more comfortable with Haskell & OCaml.

One thing though: for commercial applications I really see the typed
languages as more promising, especially when code modifications are
required. Somehow it seems that when the right abstractions are in
place, the typed languages are more friendly to the beginner. Finding
the abstractions in accordance with the available type magic can be
hard though; type waters go deep.

[1] entry://../meta

Entry: PIC24 assembly
Date: Mon Feb 22 21:25:49 CET 2010

Another approach to get PIC24 support is to start from the (textual)
assembler, write an s-expression language for it and incorporate it
into Staapl in a bottom-up way; i.e. let the structure of the asm
language drive the higher abstractions instead of the other way
around.

Entry: Fixing (limiting) the assembler
Date: Wed Mar 3 19:15:16 CET 2010

This needs a fix in the `tv:' form. This is an RPN form which either
takes names from the lexical environment and lifts them as constants,
or delegates to the `scat:' form. It looks like the modification needs
to be made to replace that `scat:' form by something with restricted
functionality such that assembler expressions can be used.

Roadmap:

  - replace the tv: -> scat: delegation by something more explicit,
    i.e. with an empty namespace
  - add functionality to the new form until all uses are covered.

This might turn up some cross-connections at places.. ( Ha! With my
head in the Haskell learnings lately I forgot that I really like
Scheme macros ;)

EDIT: I've been going through the code and it seems the use of tv: is
really quite limited to arithmetic expressions. And, there is already
a `partial-eval' that can pretty-print code: this is used in the repl.
Atm it seems that this can be left un-checked. Checking can be
introduced quite directly in the `tv:' macro definition, using the
`fn-no-lex' argument to `lex-mapper'.

EDIT2: Maybe it's even simpler to move the partial eval and other code
to abstract evaluation, instead of doing it explicitly.

EDIT3: Or the other way around.. Encode syntax explicitly, but tag
semantics in case evaluation is done in Scheme. The unifying principle
is: we don't really have semantics in Scheme; the assembler should be
abstracted, so it needs an AST rep, not a functional one. Here I miss
Haskell type class polymorphism. But maybe other forms of polymorphism
can be used. Swindle?

Entry: Staapl's substrate
Date: Wed Mar 3 19:26:02 CET 2010

I've been pondering a lot about the code/data issue and the
(operational) semantics of Staapl.
One of the recurring themes is:

  * The programmer only ever sees combinators, not machine code.

  * Combinators are defined as manipulation of VM code, which is a mix
    of real machine code and intermediate representations.

The same problem came back in some Haskell code I wrote for performing
algebraic simplifications. The vague idea is this: you really want to
only ever use combinators, but in order to implement them they need to
have a ``data base''. In Staapl this data base is some intermediate
machine code. Combinators generate and manipulate machine code.

The thorn in my side is: why is the pseudo-machine code (QW,CW) not
separate from the real machine code? Or, is there a benefit to
combining both over a 2-phase approach where the combinators only work
on intermediate (QW,CW) code, and the peephole optimization is done
directly on that instead of by the combinators themselves?

From the perspective of [1] only the low-level optimizations are
relevant for Staapl. Chapter 18 talks about pattern matching for
machine idioms, which is what most of the code in the PIC18 code
generator is about. Chapter 6 talks about code generation from a
low-level intermediate representation (LIR) using an SLR(1) parser.

[1] http://books.google.be/books?id=Pq7pHwG1_OkC&printsec=frontcover&dq=advanced+compiler&source=bl&ots=4W91Krb-tS&sig=H7M_d8VN-MQAiInjpUshpqiwq_0&hl=en&ei=CauOS4KcKpn20gTNqvnuDA&sa=X&oi=book_result&ct=result&resnum=3&ved=0CBIQ6AEwAg

Entry: Correctness and Machine Description
Date: Wed Mar 3 19:50:37 CET 2010

One of the major problems in the development of Staapl was the
correctness of its code generator and peephole optimizations. While
the rules by themselves are easy enough to prove, their large number
and the absence of a coverage test make it possible for mistakes to
seep in.

I've written before about a way to assure the correctness of the code
generation by adding redundant information (semantics) to the machine
language (i.e. a simulator). I wonder if this can be used together
with quickcheck. Since the problem is the generation of the test
cases, automating that part might be an interesting approach.

When trying this I got stuck on the description language for writing
down the semantics. Then I got off on a tangent to describe a dataflow
language. Essentially what is necessary is to represent each
instruction as a transition function, and perform automatic lifting
such that sequences of instructions can be translated to composed
transition functions.

Once a sequence of machine instructions has a _functional_
representation (function + dependencies) it is straightforward to make
test cases and relate this concrete low-level operational semantics to
any higher level operational semantics.

***

The real problem: Staapl is not a language. It is a notation for a
macro assembler, and in its current form it is not well specified.
I've seen this problem before[1]: thinking of the evolution of Staapl
on multiple targets as the evolution of a standard interface. I'm not
so sure this is actually workable. The problems I face in simply
attempting to move to a more complex architecture (PIC24) make me
wonder if it's not better to stick to the low level, or at least have
_multiple_ interfaces in between that can be reused, given
constraints. I.e. following Haskell typeclasses and OCaml modules, it
makes sense to reuse number systems (type classes). It might be more
useful to push Staapl in such a direction: organically abstract more
functionality.
[1] entry://../staapl-blog/20090716-132153

Entry: Assembler expression language
Date: Fri Mar 5 08:32:16 CET 2010

The reason this is so difficult is because it mixes a lot of concepts:

  - binding: all expressions are valid in an environment of labels,
    i.e. it is a reader monad.

  - the expressions themselves should be compilable to external
    assembler expression trees, or embedded in Scheme.

  - the syntax of the expression language can be concatenative or
    nested expressions. Maybe it's best to stick to the latter,
    because they have "value" semantics, not stack semantics.

  - this can be unified by allowing a lifting procedure that lifts
    relevant Scheme procedures to procedures over an extended abstract
    domain. I.e. + can mean scat/+ or scat-abstract/+ which delegates
    to scat/+ in case of literals.

So essentially:

  A  an assembly object is a collection of _undefined_ labels (an
     environment) and a collection of expressions in terms of those
     labels.

  M  machine code is a collection of _defined_ labels containing
     binary code objects.

To be able to use internal and external assemblers, the function
A -> M needs to be abstracted.

Entry: Assembler refactoring: practical
Date: Sat Mar 6 08:54:35 CET 2010

Get rid of the `source' component in target-value, and use abstract
interpretation instead. Try once to see the dependencies, then
reconsider if it doesn't work.

Reconsider: I do not want to lose the names of constants. Currently
these go through:

  (define-syntax-rule (constants (name value) ...)
    (begin
      (define name (target-value-delay value 'name)) ...
      (compositions (macro) macro: (name ',name) ...)))

Which binds the name both in the Scheme namespace to a target-value
and in the macro/ namespace as a constant function that refers to the
Scheme-bound name.

It looks like there is no way around it: the current implementation
needs to maintain an explicit source rep as we need to export symbols.
Maybe the functional rep can then be discarded?

I forgot: how do the labels fit into the picture? The key routine is
the following: target-value objects will trigger their interpreter,
while target-word structs return an address if it is defined, or abort
otherwise.

  ;; Undefined words will abort. This is internal: used only to
  ;; recursively evaluate target-value references.
  (define (target-value-eval expr)
    (cond
     ((target-value? expr) ((target-value-thunk expr)))
     ((target-word? expr) (or (target-word-address expr)
                              (target-value-abort)))
     (else expr)))

To make this easier to understand, I want to change names:

  target-value -> target-asmexpr
  target-word  -> target-node

Maybe the latter is not necessary, but the former seems to be: it's
essential to capture the idea that these are expressions - not fully
evaluated. No - it's too interwoven, also the doc. I don't think it's
that complicated once you look from the perspective of the assembly
result, not the specification:

  target-word  = a machine address
  target-value = a literal operand (can be an address or a numeric
                 constant)

Because the target-word objects are not yet instantiated before
assembly relaxation is performed, the target-value objects need to be
thunks, parameterized by the value of the target-word objects. I think
it really needs to stay the same.

Maybe the only thing to do is to abstract the composition mechanism
and representation of the target-value code components. I.e. turn them
into s-expressions instead of concatenative words, as they represent
values, not stacks or stack ops.
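The mechanism in miniature (a sketch: the mutable box is my stand-in
for a target-word, and the error stands in for target-value-abort): a
target value is a delayed expression over label addresses, and
evaluating it only succeeds once relaxation has assigned them.

  ;; A "label": mutable cell, #f until the assembler assigns it.
  (define label-loop (box #f))

  ;; A "target value": an expression over labels, delayed as a thunk.
  (define jump-offset
    (lambda ()
      (let ((a (unbox label-loop)))
        (if a
            (- a #x100)   ; expression in terms of the address
            (error "abort: need another relaxation pass")))))

  ;; (jump-offset)              ;; first pass: aborts
  (set-box! label-loop #x104)   ;; relaxation assigns the address
  (jump-offset)                 ;; => 4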
So.. is this a detail or not?

PRO: it's instructive to be able to see symbolic asm code in RPN form
like this:

  box> (code> 123 TOSL +)
  [qw (123 TOSL +)]

CON:

  - essentially, these chunks represent values, not stacks or stack
    ops
  - an extra translation from RPN -> expression form is necessary

What about this: keep the code form to serve as human-readable only,
then modify the function form such that the functions come out of an
asmexpr namespace. This then allows one to encode the assembler syntax
into the operations. I.e. change 'tv:' to something that is
parameterized by the asmexpr compiler. So I'm at this now:

  ;; The target-value compiler uses scat: to construct assembler
  ;; expressions. (FIXME: this should later be full parameterization
  ;; of which assembler to use).
  (define-syntax-rule (tv: . code)
    (make-target-value-compiler scat: . code))

Is that enough? No, it needs to be parameterized differently.

Is the whole target-value / target-word business necessary when using
an external asm? In some sense yes, as it's necessary to read back the
binary data. (Maybe that alone can be quite a problem?) I.e. the
binary code storage is tightly coupled to the target-word struct,
which matches it up with assembler opcodes. Really, this can't be
handled by external tools.

My conclusion is this:

  - Staapl is a macro assembler. The 'assembler' (symbolic -> binary
    translation and relaxation) part is a deeply integrated component
    of the system. I.e. the Staapl assembler is more powerful than
    external monolithic tools.

  - To support external tools, use tool behaviour snarfing.

Todo: write an assembler snarfer (in Haskell?).

Entry: Recovering static structure of Scheme programs
Date: Sat Mar 6 09:20:38 CET 2010

Aside from the name bindings, there is of course little static
structure in Staapl. I'm having trouble re-figuring out the connection
between target values and target words. Hint: follow name bindings,
and possibly change them temporarily to find out dependencies quickly.

Entry: The PIC24 assembler
Date: Sat Mar 6 11:09:12 CET 2010

What assembler can be used to start from? The Microchip tools can run
under wine. Maybe it's best to use those? Alternatively, use the
dspic30 toolchain [1][2]. Maybe it's best to stick with the windows
releases. Other toolchains will probably have similar windows-only
components.

Entry: MPLAB and wine
Date: Sat Mar 6 11:45:58 CET 2010

The MPLAB version I have apparently doesn't run on 64 bit linux. The
error message:

  winevdm: unable to exec 'D:\tom\.wine\drive_c\MPLAB\MPASM.EXE':
  DOS memory range unavailable

Goes away after this:

  sudo sysctl -w vm.mmap_min_addr=0

Now the message (on 64bit) is:

  wine: Cannot start DOS application
  "D:\\tom\\.wine\\drive_c\\MPLAB\\MPASM.EXE"
  because vm86 mode is not supported on this platform.

According to [3]:

> This happens because your CPU is still in 64-bit mode. Any Intel/AMD
> processor cannot use vm86 once the cpu is in 64-bit mode. The only
> way to run 16-bit applications on a 64-bit OS is emulation software
> such as DOSBox.

Ok, full emulation then.. Hmm.. I also tried the pic30 chain at [4]. I
did find some compiled debs that work on 32bit[5] but the compilation
of [4] requires deps I don't have. Too much hassle.
[1] http://www.baycom.org/~tom/dspic/
[2] http://iridia.ulb.ac.be/~e-puck/wiki/tiki-index.php?page=Cross+compiling+for+dsPic
[3] http://osdir.com/ml/wine-users/2009-07/msg01058.html
[4] http://sourceforge.net/apps/mediawiki/piklab/index.php?title=Compilation_of_pic30_version_3.01
[5] http://iridia.ulb.ac.be/~e-puck/wiki/tiki-index.php?page=Cross+compiling+for+dsPic

Entry: dsPIC -> ARM
Date: Sat Mar 6 12:53:18 CET 2010

I think it's probably better to forget about dsPIC for now, and work
only on ARM (thumb) code generation. There are essentially two
problems:

  - move from flat -> nested syntax for argument modifiers (i.e.
    addressing mode)

Oops. I ran out of steam. Most of these developments seem to be dead
ends or require a huge amount of effort I can't spend atm.

Entry: GNU Binutils
Date: Sat Mar 6 14:57:23 CET 2010

It looks like the real problem is that I'm underestimating the
arbitrariness of assembler syntax. This arbitrariness needs to be
encoded somewhere.. Maybe it would be instructive to see how binutils
is implemented? Some intro here[1]:

  opcodes/ contains the opcodes library. This has information on how
  to assemble and disassemble instructions.

  cpu/ contains source files for a utility called CGEN. This is a tool
  that can be used to automatically generate target-specific source
  files for the opcodes library, as well as for the SIM simulator used
  by GDB.

I'm looking at the Microchip binutils extension from
mplabalc30v3_01_A.tar.gz, in acme/opcodes/pic30-opc.c

Much of the necessary information is in

  const struct pic30_opcode pic30_opcodes

Looks like that can be snarfed just fine.

[1] http://www.linuxforu.com/teach-me/binutils-porting-guide-to-a-new-target-architecture/

Entry: Graham-Glanville method
Date: Sat Mar 6 18:59:51 CET 2010

(See Muchnick: Chapter 6) The basic idea is to relate trees and
instructions in a grammar, and perform bottom-up parsing with certain
disambiguation rules. Then parsing emits instructions while reducing
the tree.

Entry: PIC30 tools
Date: Sun Mar 7 14:53:50 CET 2010

  tar xf pic30-deb-templates-3.01.tar.bz2

  cd pic30-3.01/pic30-binutils-3.01/upstream ; wget http://ww1.microchip.com/downloads/en/DeviceDoc/mplabalc30v3_01_A.tar.gz
  cd pic30-3.01/pic30-binutils-3.01 ; dpkg-buildpackage -b
  cd pic30-3.01 ; dpkg -i pic30-binutils*.deb

  cd pic30-3.01/pic30-gcc-3.01/upstream/ ; wget http://ww1.microchip.com/downloads/en/DeviceDoc/mplabc30v3_01_A.tgz
  cd pic30-3.01/pic30-gcc-3.01/ ; dpkg-buildpackage -b

The support files come from the MPLAB distribution: see [1].

  http://ww1.microchip.com/downloads/en/DeviceDoc/MPLAB_C30_v2_05-Full.exe

Under windows, install with serial: MTI030340303. Copy the directories
/include, /lib, /support, and /bin/c30_device.info to
pic30-3.01/pic30-support-3.01/upstream/ :

  tom@wurzon /Program Files/Microchip/MPLAB C30 $ tar zcf pic30-support-3.01.tgz include lib support bin/c30_device.info

Now copy to linux and untar in pic30-support-3.01/upstream
(mv bin/c30_device.info .)

debian note:
  - gcc-3.3 and sysutils are in etch
  - dos2unix is now fromdos (apt-get install tofrodos)

[1] http://www.opencircuits.com/DsPIC30F_5011_Development_Board
[2] http://sourceforge.net/apps/mediawiki/piklab/index.php?title=Compilation_of_pic30_version_3.01

Entry: Hands-on
Date: Sun Mar 7 15:24:53 CET 2010

Let's forget about the dsPIC assembler until after using the
binutils/gcc toolchain to get something to work. I'd like to re-focus
on actually using Staapl to build things instead of perpetually
redesigning the core.
The problem I got stuck on last time was building a network of PIC devices. Let's make this priority one.

Last time the problem was that I don't have an I2C network yet. This is the ultimate goal. In order to get there, I need to build a hub app. The hub app relays serial PC communication to anything else. The hub needs to run at 40 MHz to be able to get a decent data rate.

It's been a while. In the summer I got side-tracked by the dataflow language ideas. This is where I got last time:

  entry://20090711-151122  better to use a single code image (homogeneous network)
  entry://20090627-144638  serial daisy-chaining works
  entry://20090625-131214  standard ``zwizwa connector''
  entry://20090606-125114  ...

The circuit that's on my desk is a 452@40 connected to a 2620@54 (using a 13.5 MHz XTAL). Daisy chaining worked by connecting the TX of the 452 to the RX of the 2620. In Staapl this worked like this:

  tom@zni:~/staapl/app$ make ttlmono-2620-54.dict
  tom@zni:~/staapl/app$ mzscheme ttlmono-2620-54.dict

That didn't work.

  tom@zni:~/staapl/app$ make 452-40.live

That gave a programming error. After disconnecting the 2620 it did program successfully. Looping the RX/TX chain on the slave connector also worked.

  scan
  Found 1 target(s). OK

If I recall, this has something to do with the reset of the 2nd PIC. Or the baud rate? Hmm.. and then out of the blue it works.

  scan
  Found 2 target(s). OK

Using `target!' it's possible to switch targets. This does get messed up easily. I have no idea how to reset it properly. This seems to help:

  pk2cmd -W1 -R -PPIC18F452

Good. Cleaned up the target addressing so it is a bit more robust:

- target-count now sends a nop token to 255 without checking
- target-receive+id/b checks to see if a message made a round-trip without being answered
- target! performs target-count to make sure the id is valid before setting it.

Entry: Dependency analysis
Date: Tue Mar 9 11:22:10 CET 2010

Modeling a CPU. Let's stick to the 2-phase model:

- static: dependencies (connectivity: input influence & output spill)
- dynamic: boolean functions

The idea is to express composition:

- static: this is type inference: a sequence of instructions has certain static information associated with it that can be composed at compile time as concrete info (i.e. register usage).
- dynamic info can be functionally modeled.

It might be better to do this in a typed language then?

Entry: Composing partial state maps
Date: Sat Mar 13 09:04:52 CET 2010

I need to get this going.. Let's look at the assembler typing/sim stuff, either in staged Scheme or in Haskell. The first point is the datasheet [1]. Let's use the 12F675 since the instruction set is simpler than the 18F series.

What are the high-level problems?

* Functional dependencies. This is easy:

    ADDLW k
    input constraint: 0 <= k <= 255
    function:         (W) <- (W) + k
    status function:  C, DC, Z

  This needs a low-level logic language to express the semantics. I.e. at the bottom this should be logic gates, but certain compositions should be accelerated by "simulator macros". The status functions need to be made explicit, so the function type is k -> W -> (W,C,DC,Z). (A concrete sketch follows below.)

* Composition / dependency analysis. The interesting problem is how to take descriptions that map partial state to partial state, and lift them to complete state maps so they can be composed and then possibly re-embedded in a minimal representation for dependency or "clobber" analysis.

What are the low-level problems?

1. The basic level is logic gates: the bottom line of semantics should be as simple as possible.

2. Solve composition of logic gates by dependency analysis.

3. Solve the mapping (simulation) problem, i.e. implement certain networks by operations present in an implemented target, which could be a high-level language, a real machine, an FPGA circuit (build accelerated compiler verifiers on FPGA!), ...
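To make the ADDLW example concrete, here is a minimal Racket sketch (all names hypothetical, nothing from the Staapl tree): an instruction is a state transformer over a partial state, with its read/write sets carried alongside as static information, so that sequencing composes both the dynamic and the static part.

  #lang racket

  ;; A partial state maps register/flag names to values.
  (define (state-ref st reg)     (cdr (assq reg st)))
  (define (state-set st reg val) (cons (cons reg val) st))

  ;; An instruction: static read/write sets + dynamic state map.
  (struct ins (reads writes fun))

  ;; ADDLW k : reads W, writes W and the status flags (DC elided here).
  (define (addlw k)
    (ins '(W) '(W C DC Z)
         (lambda (st)
           (let* ((sum (+ (state-ref st 'W) k))
                  (w   (bitwise-and sum #xFF))
                  (st  (state-set st 'W w))
                  (st  (state-set st 'C (if (> sum #xFF) 1 0))))
             (state-set st 'Z (if (zero? w) 1 0))))))

  ;; Sequencing: compose the dynamic maps; lift the static info, where
  ;; reads of b that are produced by a become internal nodes.
  (define (seq a b)
    (ins (remove-duplicates
          (append (ins-reads a)
                  (filter (lambda (r) (not (memq r (ins-writes a))))
                          (ins-reads b))))
         (remove-duplicates (append (ins-writes a) (ins-writes b)))
         (compose (ins-fun b) (ins-fun a))))

  ;; ((ins-fun (seq (addlw 1) (addlw #xFF))) '((W . 0)))
  ;;   => state with W = 0, Z = 1, C = 1
  ;; (ins-reads (seq (addlw 1) (addlw 2)))  => (W)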
What is the essential idea? I'm dealing with information on at least 2 levels: network connectivity (meta-lang) and network function (object-lang). I.e. staging is essential: there are going to be computations on the meta-lang (figuring out dependencies, lifting dependencies to provide composition). How this is implemented doesn't really matter much (Scheme, Haskell, MetaOCaml).

* Scheme: highest flexibility (hackability due to more operational nature: just do it) and very simple staging (macros & modules).
* MetaOCaml: typed staging, might be useful for figuring out the static structure of the program itself better.
* Haskell: very flexible type system, but type-level computations are somewhat complex.

Functions and resources.

1. FUNCTIONS: object language: functional dependencies (AND, OR, NOT, compositions) between nodes.
2. RESOURCES: meta language / type system: _physical_ instantiations of functional networks connecting shared nodes.

Tagless interpreters [2] and embedding of staging in non-staged languages [3] are going to be essential components to provide insights.

Hardware languages are about instantiation of macros. Is it possible to use the ideas behind Ziggurat [4] to provide higher-level semantics of compositions? I.e. it's simple to pool together a huge number of gates into an abstraction, and simulate it by simulating the instance directly. What is interesting though is to abstract it on the meta-level (semantics!), not just the human-modular level. What I mean is: composition of modules in hardware description languages is about abstraction for the engineer: the engineer has a (fuzzy) model of how something works and can use this to "prove" correctness of compositions. Can this be made more formal? Can we use the specification (simplification) as a type of a hardware module? I.e. how to relate low-level properties/semantics to high-level ones [3].

The real problem seems to be management of state (resource). I.e. an I2C interface isn't a function, it's a state machine. Anyways.. There's something to learn here.

[1] http://ww1.microchip.com/downloads/en/DeviceDoc/80125H.pdf
[2] http://okmij.org/ftp/tagless-final/APLAS.pdf
[3] http://okmij.org/ftp/Computation/staging/metafx.pdf
[4] http://lambda-the-ultimate.org/node/3179

Entry: Haskell vs. Scheme
Date: Sun Mar 14 21:32:58 CET 2010

Idea: pattern matching is not composable in Haskell/OCaml: patterns are syntactic elements that cannot be abstracted over. In Scheme this is easy.

  (patterns-class (macro)
    ;;----------------------------------------
    (word        opcode)
    ;;----------------------------------------
    ((1+          incf)
     (1-          decf)
     (rot<>c      rrcf)
     (rot<<       rlncf)
     (rot>>       rrncf)
     (swap-nibble swapf))
    ;;----------------------------------------
    (([movf f 0 0] word) ([opcode f 0 0]))
    ((word)              ([opcode WREG 0 0])))

Here the identifiers `1+', `1-', ... are defined by instantiating the two rules at the bottom, filling in `word' and `opcode' respectively.

Entry: Functional representation of stages
Date: Sun Mar 14 23:00:40 CET 2010

I've identified the language levels [1] in Staapl. I was wondering how to represent these as higher-order functions, and whether that is useful. I see 2 ways to do this:
A. Directly relating functions in the 3 levels:

  > type Mem a   = [a]                 -- Machine state, parameterized by representation
  > type Asm a   = (Mem a) -> (Mem a)  -- Machine code, represents state transitions
  > type Macro a = (Asm a) -> (Asm a)  -- Macro language = machine code transformers

The problem is that this does not include pattern matching rules (intensional analysis). Code is an opaque type. Pattern matching seems to be an essential component to encode the actual work.

B. So, what about including the interpretation steps, i.e. have a mix of code and data intermediate? See the figure in [1], which looks like

   data           int
     |      stx ====> fun
     |       :
     | comp  :
     V       :
   data'     V
            stx'

where the morphism called "interpretation" is included explicitly.

  > -- Machine state, parameterized by representation
  > type Mem a = [a]
  >
  > -- Semantics of AsmStx and MacroStx
  > type AsmFun a   = (Mem a)    -> (Mem a)
  > type MacroFun a = (AsmStx a) -> (AsmStx a)
  >
  > -- Machine and Forth syntax: concrete (structured) data.
  > data MacroStx a = ...
  > data AsmStx a = ...
  >
  > iAsm   :: AsmStx a   -> AsmFun a
  > iMacro :: MacroStx a -> MacroFun a

So is interpretation a practical issue (in Haskell, OCaml, Scheme, ... you need the syntax representation to be able to manipulate it) or is it of deeper significance? Ultimately this should be related to mathematical logic, where formal statements and formal rewrite rules are manipulated on the meta-level.

[1] entry://../staapl-blog/20100314-192109

Entry: About compilation
Date: Sun Mar 14 23:07:46 CET 2010

It really is ultimately about proving that a walk towards an optimum given by one measure doesn't change another measure. I have this neat little bag of symbols here that after a complex computation (interpretation) gives a result. What I want to do is to replace that neat little bag with an even neater little bag (according to some measure) such that the result after interpretation isn't influenced. This is a constrained optimization problem.

  CONSTRAINT:   semantics preserving
  OPTIMIZATION: minimize some other property.

Suppose my neat packet is X, suppose my correct semantics is expressed by the equation S(X) = 0, and suppose my neatness (i.e. code size, execution speed, power consumption) maps X into an ordered space, P : X -> (O, <), such that we can compare X1 and X2 based on the order introduced by the property P.

The deal is: in a transformation machine that preserves the semantics S(X), a lot of the internal structure of S is ``ghosted''. I think the place to look is the peephole optimizer of [1].

[1] http://compcert.inria.fr/doc/index.html

Entry: Compilation, Interpretation and Staging
Date: Tue Mar 16 15:13:43 CET 2010

I'm trying to build an intuition for the following diagram

   data           int
     |      stx ====> fun
     |       :
     | comp  :
     V       :
   data'     V
            stx'

which represents the types:

  int  :: stx -> (data -> data')
  comp :: stx -> stx'

An interpreter maps syntax to function, while a compiler maps syntax to syntax. For a state machine representation, data = data'. The difference between staging and compilation is then quite clear:

- staging uses the range (data') of a target function domain as the input of a following interpretation step
- multi-pass compilation is straightforward function composition

The big idea is that compilation is just computation (stx -> stx'). However, the connotation of compilation is usually that the semantics of the syntax is preserved in some way. I.e. every intermediate syntax will be related to some semantics (function domain) in such a way that those function domains can be related.
            int
     stx ====> fun
      :         ^
  comp:         ;
      :         ; proj
      V   int'  ;
     stx' ====> fun'

I.e. for a definitional interpreter int, we know that comp is correct if

  int = proj . int' . comp

where int' is the target interpreter and proj maps the target semantics (i.e. machine simulator) into the original semantics.

Question: what happens if proj points in the other way?

Entry: Compiler Testing
Date: Wed Mar 17 11:24:37 CET 2010

What is needed is a test relative to a reference implementation, and a sufficient argument that the test coverage is broad enough. This requires:

- a _simple_ reference implementation (VM + compiler)
- simulators for target architectures + a test suite.

The idea is that it is easy to write these components (errors can't hide), but it is error-prone to write an optimizing compiler.

Entry: Test Coverage
Date: Wed Mar 17 11:42:39 CET 2010

As an alternative to simulation, it might be possible to create a test suite that runs on the target. Essentially, the only reason to write a simulator is to ``augment'' it with behaviour that makes inspection easier.

This leads to the question of coverage. Can the compiler be instrumented with a log that records the rewrite rules that are being applied?

Entry: Multi-pass : applicative functor?
Date: Wed Mar 17 11:59:44 CET 2010

Staapl has 2 passes, one that applies a list of functions to a list of assembly instructions, and one that applies the same function to the whole list. What about making these operations explicit?

Entry: Proving rules
Date: Wed Mar 17 12:03:01 CET 2010

Is it really so that a correctness proof is too difficult? Some of the rules are really quite trivial, i.e.:

  (((movlw a) (exit) pseudo) ((retlw a)))

The good thing about proof is that all rules can be treated individually. For testing, I'm not so sure about that.

  (([qw a ] [qw b] -)   ([qw (tv: a b -)]))
  (([addlw a] [qw b] -) ([addlw (tv: a b -)]))
  (([qw a] -)           ([addlw (tv: a -1 *)]))

Entry: Adding static semantics to macros
Date: Sat Mar 20 13:46:29 CET 2010

I'm trying to get an idea of Ziggurat [1] from the JFP paper (no public link). Before trying to explain that approach, I'm going to walk around in ignorance for a bit, and see what I can write down.

A macro is a compiler:

  code -> code'

A typed macro is a compiler together with a (static) interpreter:

  code -> fun,  where fun = data -> data'

The idea is that before (or interleaved with) compilation, some interpretation is performed on the code to see whether it has certain properties, without executing it completely. I.e. one performs an abstract interpretation.

( Note that the basic unit of interpretation doesn't need to be the same for code and data. I.e. type checking/inference can span multiple functions in a module, while functions are typically isolated in behaviour. )

Summary: the main idea is that there is _both_ compilation and interpretation going on.

Ok, now the paper.

* The basic idea is delegation of behaviour. In terms of syntax objects this is semantics: how to interpret. A syntax object can provide its own meaning, or delegate to the syntax it expands into.

* The "lazy" part is there to break a possible circularity. Delegation depends on expansion (macro use time) and is not knowable at macro definition time.

[1] http://lambda-the-ultimate.org/node/3179

Entry: Lambda vs. Patterns
Date: Mon Apr 12 10:10:29 EDT 2010

Dominikus' patterns: variables are "freed" after patterns are executed. What makes this different from the LC?
I.e. variables are still used to provide random access to values inside data structures (to encode permutation combinators), but there is no concept of closure or environment.

Apples and oranges. But what is the link? How to make the correspondence: application/abstraction vs. quotation/dequotation? Confusing stuff without a proper substrate.

Entry: removed log-stx in asm-template-tx
Date: Sun May 16 15:26:56 CEST 2010

I don't remember how the template logging works, but I've commented out the reference to log-stx in the asm-template-tx macro in coma/pattern-tx.ss

Entry: Problems with Staapl
Date: Tue May 18 15:01:57 CEST 2010

I'm using it for something practical after 6 months of leaving it alone. What is annoying?

- PIC chip config is brutal (binary only).
- Basic PIC18 library configuration too: there is no mechanism for defaults + there is a bunch of "include" files that are badly organized.
- There is no standard approach for debugging up to the point of getting the serial console working. Make a checklist or something (i.e. measuring "#0xF0 transmit" with a frequency counter).
- There is no automated procedure for porting Microchip include files to .f

Entry: 4550 fatman
Date: Tue May 18 15:29:43 CEST 2010

Got it working after soldering on a 100n decoupling cap and switching to 19200 baud. Now it works up to 230400 baud.

Entry: The Staapl Killer App: static data structures -> code
Date: Thu Aug 19 23:12:41 CEST 2010

Reactive framework.

1. Compile to RAM structure
2. Compile to Flash + RAM structure
3. Combine static data structure + its "interpreter" into in-line code.

Number 3 might be interesting to work out for an I/LE reactive network. It has a fair amount of links and a relatively small run-time state. The data structure links can be replaced with code branches, and the run-time state can be used to make those branches conditional.

Entry: Targets
Date: Tue Aug 24 21:55:11 CEST 2010

There's a bit too much freedom in having so many targets to pick from. One thing I wonder about is whether the PIC was actually a good choice. It is really different from standard RISC. ARM cores are getting really cheap. Programming them in C isn't a big deal, as I've recently learned; it feels like "normal" programming. So why am I doing the whole tool stack myself? Maybe I should keep these questions away from the project and worry about them in the libprim/meta projects.

* Staapl is about the PIC18. I don't really have time to make it work on different architectures (tools + libraries). Maybe one day LLVM, but that's it.

* Staapl is about Scheme and Forth. To make Staapl more interesting community-wise, it might be a good idea to get the standard Forth interpreter going: automatic "unrolling" bootstrapping.

About the dsPIC: this is an interesting architecture from an application side, but probably better for the other metaprogramming project. In any case, I should not attempt to do anything before writing an app in its machine language.. Then the road will become clear pretty fast.

Entry: Staapl heritage: colorForth and Machine Forth
Date: Sat Sep 4 19:54:46 CEST 2010

It might be interesting to track down the Staapl heritage. Chuck Moore's and Jeff Fox's web sites are probably key. Also Brad Rodriguez's Moving Forth series [2].

[1] http://www.complang.tuwien.ac.at/anton/euroforth/ef99/thomas99a.pdf
[2] http://www.bradrodriguez.com/papers/

Entry: Why does Forth lead to small code size?
Date: Sat Sep 4 20:30:24 CEST 2010

1. You're forced to _factor_. This exposes reusable code and "exponential leverage".
2. You're forced to _order_ variable accesses. This reduces addressing overhead (if you can avoid stack shuffling).

Both are non-trivial at first, but learnable, and soon become second nature.

Entry: I need a fun project
Date: Wed Sep 8 01:52:10 CEST 2010

Working, working, serious stuff.. I need some play. What about a bootstrappable (standard?) stand-alone Forth? Last time I worked on this I got stuck at some cross-stage binding issues, which are solvable using let-syntax [1].

[1] entry://20090722-123240

Entry: The good thing about microcontrollers ..
Date: Sun Sep 19 20:29:28 CEST 2010

.. is that RAM is pristine. Apart from having to deal with hardware interfaces, there are very few limitations on how to use memory. ( Forget for a moment the possibility to reserve _part_ of the RAM and Flash for code with standard calling conventions and a non-moving flat memory layout. ) The point is that building a more useful graph memory on top of the RAM/Flash combo is doable when you can design the whole system.

Entry: Things to fix
Date: Sun Oct 3 08:38:46 CEST 2010

* Modules work fine, but what about parser macros? Currently the way these are made modular is a bit of a hack. Can this be done differently, or is it not a real problem? The problem is that units and macros don't mix well. It is possible to define macros in terms of unit identifiers, but then these are treated somewhat specially.

Entry: One chip, one model for verification
Date: Tue Oct 5 21:15:52 CEST 2010

An advantage of using only one target chip (one that's simple!) is to be able to write a verifiable semantics more easily.

Entry: Continuous evaluation
Date: Wed Oct 6 19:46:20 CEST 2010

Tired, but different frame of mind..

- Staapl Forth is low-level but easy to build on: each application is a DSL.
- For verification, some formal semantics is necessary, even if only for black-box testing.

Entry: Picking up again
Date: Mon Nov 15 11:52:19 EST 2010

I'd like to pick it up again, probably restructuring and documenting the code. The goal is to work towards a USB driver for the PIC18, with a middle point that implements a proper debugging interface and emacs bridge. Meaning: the compiler seems to be working fine, now make the interaction system a bit less messy.

First, what's with this module business? A "staapl pic18/serial" line is equivalent to "(require (planet zwizwa/staapl/pic18/serial))". This defines a couple of macros like "macro/async.>tx" and target words like "target/async.>tx". The macros are code generation/transformation functions, while the target words are compiled target code. One of the most important features of modules is to make sure all names are bound at compile time.

I.e. I just converted an old usb.f file to __usb.ss, including the top line:

  #lang planet zwizwa/staapl/pic18 \ -*- forth -*-

This then allows compilation with "mzc __usb.ss", giving an error in my case:

  mzc ~/staapl/staapl/pic18/__usb.ss
  /home/tom/staapl/staapl/pic18/__usb.ss:23:6:
  compile: unbound identifier in module in: macro/UIE

Entry: More flexible name binding
Date: Mon Nov 15 12:21:14 EST 2010

What is getting clearer now is that I need both modules and units as abstraction mechanisms. Example: machine constant names are an interface. They are provided _globally_, so they need to be parameterized somehow. Modules won't work there.

This is important. It needs some serious thought. In deeply embedded software projects, compile-time parameterization is important (i.e. see eCos). This means code has holes.
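For reference, a minimal sketch of what such a hole looks like in Racket's unit system (signature and unit names invented for illustration): the chip-specific constants form a signature, library code imports it, and a chip-specific unit fills the hole at link time.

  #lang racket
  (require racket/unit)

  ;; The hole: an interface of machine constants.
  (define-signature chip^ (fosc baud))

  ;; Library code written against the hole.
  (define-unit serial@
    (import chip^)
    (export)
    ;; Some derived configuration value (formula illustrative only),
    ;; computed once the hole is filled.
    (printf "brg divisor = ~a\n" (quotient fosc (* 16 baud))))

  ;; One concrete chip plugs the hole.
  (define-unit p18f2620@
    (import)
    (export chip^)
    (define fosc 40000000)
    (define baud 230400))

  (define-compound-unit/infer app@
    (import) (export) (link p18f2620@ serial@))

  (invoke-unit app@) ;; => brg divisor = 10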
So, looks like I found a project to constructively procrastinate on the USB driver: work out the module system.

Entry: The module system
Date: Mon Nov 15 18:51:57 EST 2010

- The current "word set" approach is good. Build on that.
- Prefix parsing macros are bad because they hurt composition and interfere with the Racket unit system. Can they be removed or somehow be made harmless? I.e. turn them into surface syntax only?
- Can we attach types to the unit interfaces?

Entry: Partial evaluation
Date: Mon Nov 15 18:55:07 EST 2010

Is the current greedy approach actually smart, or should reduction be defined in a different way? Find a good explanation of the two alternatives.

The good part is that partial evaluation is expressed as simple and straightforward evaluation on "code stacks". This is good, as long as the language comprising the code stacks is simple. Currently it also includes machine asm, so it's screaming for a semantics that can be used to verify the partial evaluation rules.

Entry: Linker scripts: eliminating `load'
Date: Tue Nov 16 08:13:21 EST 2010

That's really the basic idea. You write parameterized modules in terms of `import'. The boring part is going to be finding a way to formulate this. I.e. how to solve the unit/module grouping. Do we allow more than one unit in a module?

Entry: Control flow graph
Date: Tue Nov 16 08:24:22 EST 2010

The current compilation state is a bit ad-hoc. Can we get something more elegant? I.e. like [1].

[1] http://www.cs.tufts.edu/~nr/pubs/zipcfg-abstract.html

Entry: Summary of possible directions
Date: Tue Nov 16 08:27:28 EST 2010

In order of importance:

1. Fix the module system: proper units and eliminate "load". The USB driver can be the pull for this.
2. Keep the "eager macros" partial evaluation strategy, but augment it with a semantics of the low-level machine language used.
3. Build the compiler on top of a more abstract control flow graph. Currently the way compilation state is maintained feels a bit raw..
4. Build a theory for the I :: m -> (t -> t) towering in Haskell.

Entry: The I :: m -> (t -> t) towering
Date: Tue Nov 16 09:59:24 EST 2010

The main idea is that the structure of m ``roughly'' corresponds to the structure of t. Can we call it structural towering? I.e. if m is a concatenation of elements and so is t, then multiple layers of towering ``act as one''. I.e. it is then simple to add processing steps like optimization.

* In Staapl the 't' isn't really target code: it contains pseudo code, a control flow graph and a "macro return stack" hack used to implement local exit in macros. More specifically, the interpretation is I :: m -> (t' -> t') where t' is an extension of t. Then compilable macros are those that eventually project down to (t -> t) without losing structure.

* Generalized: if a simple arrangement of m can be translated to a particular function composition structure, we're still in business.

Entry: Little annoyances
Date: Fri Nov 19 08:13:21 EST 2010

- can't hard-reset chip on console

Entry: Units
Date: Fri Nov 19 08:16:04 EST 2010

Starting with something tangible: ttlmono-2620-54.fm. It has the following load statements that are to be eliminated:

  load p18f2620.f        \ chip macros
  load monitor-serial.f  \ boot block + serial monitor code

The place to start is probably to model monitor-serial.f as a parameterized unit. That file consists of:

  load monitor-serial-core.f
  load monitor-serial-warm.f

The first one doesn't have any unbound names, so it can be replaced by a module directly.
Nope, it needs:

- init-chip
- fosc
- init-serial
- baud

What does this mean? The monitor APPLICATION needs chip-specific code for serial and whole-chip init. That sounds reasonable.

Next: turn monitor-serial-core.ss into a unit. This needs components:

- define new unit signatures
- import / export
- link

So it introduces some red tape. Let's stick to s-expr syntax for the signatures and link files. Add these units:

  pic18-osc-sig.ss
  pic18-serial-sig.ss

Let's just keep these inside pic18/sig.ss for now:

  (define-macro-set pic18-chip^   (fosc init-chip))
  (define-macro-set pic18-serial^ (baud init-serial))

And let's just stick with one interface:

  ;; Chip-specific code.
  (define-macro-set pic18-chip^
    (fosc        ;; oscillator Hz
     init-chip   ;; chip-specific init
     baud        ;; hard-coded monitor baud rate
     init-serial ;; chip-specific serial port init
     ))

Next problem: the .ss files use

  #lang scheme/unit

How to expose these to Staapl? I've added pic18-unit/lang.ss based on scheme/unit instead of scheme/base, but I run into trouble that indicates I don't know what I'm doing.

To investigate: how does the lang/reader.ss mechanism work again? It expands to a "module" form, but the scheme/unit/lang/reader.ss might do something else.

Entry: The Forth -> module parsing is too complicated
Date: Sat Nov 20 08:21:12 EST 2010

Reason: the `expand' re-structuring after a require statement. Maybe it's simpler to expand from Forth (concrete) syntax straight to Scheme module syntax instead of going through forth-begin. Or: is the split between lexing and parsing really necessary? It even complicates standardizing Forth. I think it's important to keep in mind that the basic form should be an s-expression that integrates well with the rest of Racket.

What about this:

* Write a non-extensible flat Forth lexer that compiles to (module ...) form.
* Is it possible to do this in a way that is extensible? I.e. can we have a read-time Forth running?
* Can the same code be used on-target?

So the point really seems to be to write a stand-alone Forth and then stub it out to plug in the compiler/macro part. Can we start out with something like eForth and move on from there?

Entry: Really unrolling Forth
Date: Sat Nov 20 08:56:52 EST 2010

Come on, it can't be that difficult. The point:

- Internally, the Forth is unrolled and defined in terms of macros with phase separation.
- A reflective front-end should generate such a structure.

You really need a reflective Forth to implement this! There is no way around it. It probably needs to be meta-circular too. Then bootstrap it and optimize for a particular target. The problem is the meta-circular interpreter: a semantics for Forth :)

Entry: Just get rid of the damn syntax.
Date: Sat Nov 20 09:39:57 EST 2010

So, this makes me wonder. Why am I so attached to this Forth syntax? Maybe the project should be finished (implement units properly for the low-level library) without touching any Forth syntax? It's probably much more useful to have a proper s-expression based syntax first.

Entry: Summary
Date: Sat Nov 20 16:13:05 EST 2010

- Fix units -> define a proper s-expression based module format for macros and target code. Then units are "automatically" fixed on top of the scheme/unit language. This involves refactoring the Forth parser. Soothing words: the current Forth syntax isn't standard anyway, and is maybe more of a roadblock than a nice feature.

- Write a new parser on top of the s-expression format, one that's more standard-Forth-like.
Entry: Future of Staapl
Date: Tue Dec 28 14:51:17 EST 2010

Time to get started, as I need it for a project. I think I'm going to ditch the Forth syntax and move to something more compositional. Roadmap:

- Find a way to marry code with modules. Currently only macros have modules. Can this be done in a way that meshes better with the Scheme system?

Entry: s-expression only: splitting forth parser and dictionary compiler
Date: Thu Dec 30 11:29:22 EST 2010

coma/macro-forth.ss currently has both the forth parsing and dictionary compilation parts. Luckily those are already separate. Let's just put them in a separate module.

Using `forth-dictionary-log' to see what's actually passed in. I had to fix a bug here: the dynamic parameter needs to be a function.

The question is: why does the expanded dictionary have `forth-parse' calls? This happens after require statements. I have the feeling that this feedback loop is what makes some things behave badly. Basically, we don't compile to a flat dictionary structure, but compile to something that has a recursively defined dictionary structure. It's a bit of a mess...

It's probably best to start somewhere else. The ingredients are:

  code-register-postponed!
  wrap-macro
  wrap-word
  wrap-variable

This is exactly what is passed to `define-forth-parser' in the pic18.ss module. The first one is defined in the code.ss module. The latter 3 are defined in the comp/compiler-unit.ss module. The word "postponed" just means "postponed to run time", i.e. compiled code.

So, how to use that interface directly? I.e. let's define some variables and some words.

  (wrap-word name loc macro)

  box> (wrap-word 'foo #f (macro: 1 +))
  # #state->state #state->state

The values are: label (contains code), label compiler (i.e. for call or address as data), and the code generator. So this doesn't yet compile anything. That's what `code-register-postponed!' is about. I.e. see the `forth-word' macro in the macro-forth.ss module. That one obtains the 3 values as label, wrapper, inline, and will perform registration. This is where we can tap in. Maybe `forth-word' should be renamed, or at least moved to a different location, i.e. into the compiler-unit.ss module. ( Hmm.. Spaghetti code. Or too much parameterized code. )

Why is the `compile' parameter used in forth-word set to `rpn-lambda'? It is even explained in the docs, but I don't get it. Wait, it is `macro:' : see `forth-begin/init'. Indeed, just a parameter.

Next problem: why doesn't this work?

  ;; similar to macro-forth.ss: forth-word
  (define-syntax-rule (instantiate-code name word ...)
    (begin
      (define-values (label wrapper inline)
        (wrap-word 'name
                   #f ;; source location
                   (macro: word ...))) ;; compiler
      (ns (target) (define name label))
      (ns (macro) (define name wrapper))
      ;; (ns (inline) (define name inline)) ;; not necessary
      (code-register-postponed! inline)))

  box> (instantiate-code foo 1 +)
  box> (print-target-word target/foo)
  foo:

It probably just needs to be compiled. I've added some code to `print-target-word' to deal with non-compiled code. Something calls compile somewhere.. It's `compile!', defined in the pic18.ss module. Ok, that seems to work fine.

Observation: the convoluted code seems to come mostly from the Forth syntax, which has parsing state that doesn't mesh too well with the way PLT modules work. The rest seems to be fine.
The state involves:

- forth / macro switching
- recursive "require" expansion

Entry: Instantiation and control flow
Date: Thu Dec 30 13:37:21 EST 2010

The idea in the Forth syntax is to give full control to the programmer regarding control flow. I.e. the default is for words to fall through. This base-level control is an important feature: you want to be able to build abstractions on top of control flow.

How to make this explicit? Probably we just need two code instantiation forms: one that behaves as the macro form (with implicit exit) and one that has fallthrough code. These are currently set as:

  (words      (name . code) ...) ;; individual words
  (words-flat (name . code) ...) ;; fallthrough words

The order is not specified in the individual words. This makes fallthrough explicit and macro <-> word substitution simpler. I don't see a simple way to write this on top of the `compositions' macro, as there are two namespaces involved. So this seems to be it.

The same goes for variables. Question: should variables be flat (in sequence) by default, or should we keep them re-arrangeable? The usual caveat applies: if flat, it's no longer manageable, i.e. no fancy compiler optimizations.

Entry: Next
Date: Thu Dec 30 14:51:13 EST 2010

Write some code! The words/variables approach seems to work fine. This should fix the composition problem, with the only thing missing being the toplevel stubs: compile to .hex and .dict etc..

Roadmap:

- Convert one of the Forth language modules to s-expressions.

Converted serial.ss, which was trivial. The tests seem to pass also. The next thing is to write code, and maybe figure out how to get highlighting for the scheme forms; that would be nice.

Entry: Retargeting at Schemers
Date: Thu Dec 30 20:01:45 EST 2010

A side effect of ditching the Forth syntax is to get rid of the false hope that this would bring in Forth programmers. ( Ha! ) So I'm back to Scheme, or Racket if you want.. One of the reasons I put in the Forth layer was to be able to hide the Scheme side. What was I thinking? While Forth syntax is quite productive for writing low-level apps, trying to hide PLT Scheme features behind an explicitly coded Forth frontend isn't very productive.

Entry: Object-oriented API
Date: Fri Dec 31 23:26:39 EST 2010

Now, instead of a command API, we need a base-line OO API and build the command line API on top of that. Command lines are nice, but APIs are nicer, as they allow easier "metaprogramming".

Entry: Bootloader
Date: Fri Feb 4 13:07:20 EST 2011

* To make the framework easier to use, it seems to be a good idea to switch back to the bootloader approach. When working with multiple PIC chips in very simple circuits, the PIC programmer is too much of a hassle, and the ICD connector adds a significant board overhead.

* Multiple bus support is really needed. I need a bootloader on I2C for sure.

Entry: 3V input on 5V circuit.
Date: Fri Feb 4 13:12:17 EST 2011

I need to sniff a 3V SPI bus. I only have 5V boards. What's the fastest way to interface? For now it's only 3V -> 5V, but the near future might need the reverse too.

* Build a 3V board.
* Some inverter chip tricks?

It seems it's going to be far easier to just stick to 3V3 for the sniffer. PIC 3V3 pins are 5V tolerant. See figure 26-3 in the 18LF2620 data sheet. At 3V3 the max speed is about 20 MHz. It should be straightforward to modify the clock settings on the 2620 to do this.

I tried with a 3V3 cable but it didn't work. The voltage measured 4.3V, so maybe I messed up the voltage regulator in the cable? Check here [2].
The logic levels are 3.3V but the power is 5V. Both 3V3 cables are the same. Why is that? Ok, going from 4.7 -> 5.0 when using a powered hub.

[1] entry://../electronics/20110204-131618
[2] http://www.ftdichip.com/Support/Documents/DataSheets/Cables/DS_TTL-232R_CABLES.pdf

Entry: Back to the Stack
Date: Mon Feb 21 09:17:37 EST 2011

* The "lexical" Staapl spin-off has come full circle [1]: I should switch to a _stack_ architecture for describing directed acyclic graphs.

STACKS & FANOUT: What needs to be done is to make this mesh with the rest of Staapl. The key choice is to separate the internal representation that supports higher-order functions from the "user interface" that allows the use of lexical variables on top of this. The stack representation has explicit "dup". Recovery of sharing information is not possible in abstract-interpretation based Haskell implementations; it requires a CPS-style approach.

The stack interface is already threaded. To make this explicit, the sawtooth algorithm can be used as an application pull.

[1] entry://../meta/20110126-092935

Entry: New PIC arch
Date: Mon Feb 21 09:39:35 EST 2011

Arrived [1], together with some 3V3 regulators.

[1] entry://../electronics/20110205-163902

Entry: Bootloader
Date: Thu Feb 24 20:51:31 EST 2011

Requirements:

- Basic command set = standard.
- Front end (serial / SPI / I2C / USB / ICD) configurable.
- Possible to protect memory, but not mandatory.
- Access to reset.
- Command/OO interface

Entry: Applications
Date: Thu Feb 24 21:04:26 EST 2011

Another thing that might be interesting is to introduce "applications". An application is a binary with an entry point. Even better might be objects. The basic idea is to be able to reload code without a full chip erase. (Again, balancing on the line between whole-program compilation and linkable objects.) I wonder.. Is there a standard object format to use for the pic18?

Entry: PIC18 debug tools
Date: Thu Feb 24 21:06:32 EST 2011

If you want to use the Microchip tools on Windows, it's all good of course. Not if you want to get creative.. There are no specs for PIC18 debug mode, and the only programmable programmer (PicKit2) is probably being discontinued soon. Working console-only is doable, giving up on debug features. However, one annoying point is the lack of access to reset.

A nice project would be to write alternative firmware for the PK2. It has the right connection, and is readily available for $30 as a clone [3]. The only real hurdle seems to be USB, but that's something I need to figure out anyway some time..

Let's summarize:

* Get USB to work, using USBPicStamp or PK2
* Make the console run over the ICD2 port (build a serial -> ICD adapter)
* Get programming specs
* Get debug specs

Some reverse-engineered debug info should arrive here [1] soon. The PK2 schematic is here [2]. The prog spec is available from the Microchip part website, i.e. for the 18F1xK50 [4].

[1] http://jaromir.xf.cz/hdeb/hdeb.html
[2] http://www.modtronix.com/products/prog/pickit2/pickit2%20datasheet.pdf
[3] http://www.sure-electronics.net/mcu,display/DB-DP004_1_b.jpg
[4] http://ww1.microchip.com/downloads/en/DeviceDoc/41342E.pdf

Entry: Programming?
Date: Fri Feb 25 23:52:17 EST 2011

The problem isn't programming, it's debugging. This may sound obvious to people that do electronics for a living, but it's not to someone with a "clean" programmer background. I keep being surprised at how easy it is to lose things to the darkness. To get into a situation where you can't see what's happening.
Entry: PIC debugging: data over serial connector
Date: Tue Mar 1 23:45:19 EST 2011

Currently I have a programmer + serial console per PIC chip. This is too complicated. I want something simpler. How to use the ICD to send console data? Using the PICkit2 it is possible to send and receive data. I was on a roll trying to make this work before, but I got lost somewhere. What happened?

Entry: RTS / CTS as reset?
Date: Tue Mar 1 23:58:53 EST 2011

Can one of these pins (I always forget which is which) be used as a reset pin on a standard serial connection?

[1] http://www.easysw.com/~mike/serial/serial.html#5_1_2

Entry: picstamp.fm -> picstamp.ss : removing `load' and using units
Date: Mon Mar 7 14:11:53 EST 2011

Basic idea: the PIC18 support code is parameterized by fosc and baud. These need to be defined in the application unit. The best approach seems to be to look at the toplevel app module as a linker script that patches together different components, and provides some "configuration modules".

Entry: PIC18F constants
Date: Mon Mar 7 19:11:45 EST 2011

This probably needs a "shared constants" approach, or else all PIC18-specific modules need to be written as units. Alternatively, a pic18 signature can be created that is enough to support the basic library. Then the constant files would implement 2 interfaces:

  pic18-shared
  pic18-device

The snarfer can then distinguish between these 2.

EDIT: but is it really a good idea to distinguish between the 2?

Entry: Tension between modules and units
Date: Mon Mar 7 19:18:27 EST 2011

* Modules are definitely easier to use. The directed nature of dependencies makes it easy to hide stuff.

* Units are more flexible, and necessary when there are different implementations of interfaces.

Now, is it possible for modules to depend on some "machine" module that provides all the identifiers, and have this resolved later in a top-level linking phase?

EDIT: I.e. where A -> B means "A depends on B":

  A -> B -> C -> M

All the arrows could be implemented by `require', if it weren't for the holes in M. I don't think it's possible to automatically translate the module chain to a unit chain when one inserts holes in M. It seems that the only way to do that is to use dynamic parameters, and that's not what I want. I had that before and it's too error-prone. I really want static bindings, no side-effecting behaviour change.

Entry: Roadmap: constant dependencies
Date: Tue Mar 8 16:08:31 EST 2011

1. Remove pic18-const.
2. Make an interface for the minimally needed constants used in the compiler and possibly library code.
3. Make it so that there is no longer a top pic18.ss interface, but a unit that still requires linkage with a chip-specific module.
4. Turn everything into a unit, and splice off the parser. -> Make this work for macros.

wow.. does this open a can of worms or what!

Entry: Compiler not in module?
Date: Tue Mar 8 16:46:18 EST 2011

Maybe I have it all backwards. What if the compiler is something that is fed a configuration (an app link script), and this configuration is a Racket module? The thing is this: the compilation is really just another stage, so why is it not abstracted as such? Currently the target words definition needs some state in the main module. How did this work again?

Ok. I get it. Each project has a list of non-instantiated macros that are then tied together later.

Problem: `words', `variables', ... are macros defined in terms of signatures, so they need to be part of the signatures.
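For orientation, a minimal sketch (all names hypothetical) of the mechanism this relies on: Racket's `define-signature' accepts `define-syntaxes' elements, and such a macro can expand into references to value components of the same signature, so the macro travels with the interface it depends on.

  ;; A signature carrying both a value and a macro that uses it.
  (define-signature label^
    (wrap-word! ;; ordinary value component
     (define-syntaxes (words)
       (syntax-rules ()
         ((_ (name code ...) ...)
          (begin (wrap-word! 'name '(code ...)) ...))))))

  ;; Any unit importing label^ gets `words' along with `wrap-word!':
  (define-unit app@
    (import label^)
    (export)
    (words (foo 1 +)
           (bar 2 *)))

Linking app@ against some unit that exports label^ then proceeds as usual.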
Move label-unit.ss to sig.ss.

How to have a signature depend on another signature, i.e. such that the macros defined in the signature can refer to identifiers from another signature? HMM.. that doesn't work so well. This really needs combining. I need a different approach, as this is just a shot in the dark..

Q: Why do I end up with macros depending on units in the first place? It seems that a "not fully specified compiler" doesn't really agree well with separate compilation. From that perspective it at least makes sense that I'm not able to express what I want. In other words: definition of target words only makes sense once the compiler is fully specified.

Entry: Macros in terms of multiple signatures
Date: Tue Mar 8 18:18:17 EST 2011

I'd like to have a simpler explanation of why macros in signatures can't depend on other signatures, making signatures depend on each other. Why is this "flattening" necessary? Actually, it's not so hard.. Signatures are just collections of names, and don't contain any dependency information. Instead, dependencies are expressed _in terms of_ signatures.

The solution would be to define a new signature that has all the identifiers the macros depend on, and have a dummy unit that translates a bunch of signatures into the one needed by the macros.

Ok, I seem to be on the right way, but at some point one of the macros doesn't expand properly after generating an expression with multiple invocations of another macro:

  (begin (m ...) ...)

Nope... it expands just fine. It's about one of the identifiers: "access from an uncertified context to unexported variable from module: label:exit". Binding it lexically before handing it to the "macro:" expander seems to make that error go away. This lexical binding introduced another error: the `define' was no longer top-level. Ok, WORKS!

Entry: Compiler fixes
Date: Tue Mar 8 20:59:53 EST 2011

Next: code.ss is no more. The compiler needs to instantiate the unit.

EDIT: the whole live section has a dependency on code.ss. What is the real problem? Dependency on machine macros. Actually, there is no import in code-registry-unit.ss, so maybe it should go back to being a module? Done.

Entry: Next
Date: Thu Mar 10 10:47:00 EST 2011

Make pic18.ss the generic linked device, but provide a mechanism to choose a proper chip-specific const module.

Entry: Name change? `macro' is misleading
Date: Fri Mar 11 10:06:13 EST 2011

How hard would it be to change `macro' to `gen' or something similar, to distinguish it from scheme macros? Can it be called `code'? Maybe that's too general. Let's just keep it like it is. The change is cross-cutting and there doesn't seem to be a single right pick.

Entry: Next: org-begin org-end
Date: Fri Mar 11 10:17:59 EST 2011

It seems that this is best pushed somewhere else, i.e. next to the `wrap-code' functions. Looks like there's plenty of room for simplification once the stack-oriented label juggling is pushed deeper. Especially the org code is a bit of a hack. It probably also just works using dummy names. Maybe `wrap-word' and `wrap-variable' should have an optional org argument?

Entry: Introducing names
Date: Sat Mar 12 09:04:30 EST 2011

I find myself mindlessly shuffling things to make the symbol introduction work for a macro based on define-values/invoke-unit. The idea is that I want a single form that links a user-specified unit into the whole of the pic18 compiler, and exports all the signature words.
This _requires_ some non-hygienic functionality, so it is up to the writer of the macro to ensure that it actually makes sense. In my case, it's providing the proper `require' statements. The error I made was to use relative paths.

The trick seems to be to find out where `define-values/invoke-unit' decides to put the identifiers. The manual [1] says they are introduced in the context of the `define-values/invoke-unit' form.

[1] http://download.plt-scheme.org/doc/html/reference/invokingunits.html

Entry: Hmm.. still getting weird unknown signature errors
Date: Sat Mar 12 12:32:46 EST 2011

It's too complex. It looks like there are some things I don't understand about units, because I keep running into compiler errors I do not understand + they don't seem to be stable either. Sometimes code compiles fine, then it does not.. Probably depending on compiled/* code caches..

It looks like there is only one way to do this right, as I don't really know what I'm doing: use units for everything and perform the linking step explicitly in a top-level module. Once that works, try to abstract the linking step into a macro or something.

Basic problem: non-hygienic macros don't compose very well. If at all possible, stick to all-hygienic. The main offender here is the Forth macros. I can't possibly put all of those in a signature (can I?). However, if I can, the composition problem would be solved completely..

Entry: Moving forth parsing words to signatures.
Date: Sat Mar 12 14:12:39 EST 2011

I don't see an immediate starting point to do this incrementally. Maybe that's actually good; maybe the real solution is to throw it all away and start over? Man, this is hard.. Let's start at the beginning.

A significant roadblock is the non-composability of define-signature. Can that be fixed transparently? I think it really needs a rewrite, starting bottom-up on top of label^, and taking into account all the problems that are associated with the flatness. Starting from scratch also allows preprocessing macros to be represented differently. They should not be syntax, because they do not behave as Scheme macros.

The basic idea is that the gizmos that implement the separate behaviour of the syntax-juggling code are themselves implemented as signature-specified _values_, to avoid the problem of having to define them as macros. Can this be done in a straightforward way? I.e. there should be just another stage. The essential element seems to be to "split" the identifiers into two classes: those that are part of the transformer stage and those that are part of the code stage.

Entry: The new `forth-begin'
Date: Sat Mar 12 15:47:24 EST 2011

Forth code is a straight line, and it generates Scheme expressions that look like:

  (variables a b c)
  (words-flat (foo 123)
              (bar foo 1 +))
  (macros-flat (baz foo bar + ))

Essentially, a module s-expression is built one atom at a time. The default behaviour is to simply append at the end of the last expression, i.e. in the above it would be at .

Problems: there is no "macros-flat" form. Once there is, it should be quite straightforward to use this approach. The expression could be kept in inverted representation as long as recursive expansion is not necessary, i.e. a Scheme form that introduces names needs to be properly inserted before the Forth parsing goes on.

The basic misunderstanding in the previous approach was the assumption that it is possible to take a whole file and build a single huge s-expression, which is then expanded at once.
Because of the introduction of names this isn't possible without recursive expansion. Let's build that in from the start.

I forgot: the prefix parser thing is actually quite deep.. The call to `syntax-local-value' is in rpn/parse.ss. So it doesn't look all that bad over there.. It's the middle part that's rotten.

Entry: Focus
Date: Tue Mar 15 13:34:43 EDT 2011

Bottom line: there are many things to fix. Mostly the Forth parser and the compiler are too "stateful" and might need a change, or at least some thought on why things are as they are. So it looks like I need to focus more if I want to work on real-life projects instead of full-time tinkering on Staapl. I'm also not in terrific physical shape, so maybe this is not a good time to do the creative magic required to overhaul the core. One step at a time.

Currently, there are two goals that are somewhat intertwined if they need to be done right: get the bare-bones app to work using s-expressions, and clean up the Forth macro interface on top of units.

TODO:

* Allow `words' and `words-flat' to support raw addresses instead of labels. This might need a change in the word-wrap code.

* Figure out how to extend signature syntax so it's possible to move code between signatures and plain modules.

The latter seems most isolated, so let's start there.

Entry: Extending `define-signature'
Date: Tue Mar 15 13:41:15 EDT 2011

Apparently the identifier is not in a separate namespace, so beware when including this in ordinary modules:

  (define-signature-form (define-syntax-rule stx)
    (syntax-case stx ()
      ((_ (name . pat) expr)
       (list #'(define-syntaxes (name)
                 (syntax-rules ()
                   ((_ . pat) expr)))))))

[1] http://pre.plt-scheme.org/docs/html/reference/define-sig-form.html

Entry: org
Date: Tue Mar 15 14:57:47 EDT 2011

The way org is implemented using org-begin and org-end is a bit weird, as it can span several definitions (i.e. one consecutive chain with several entry points).

Entry: Piggy-back signature macros
Date: Thu Mar 17 00:23:57 EDT 2011

I have a `prefix-parser' macro defined in rpn.ss and I want to lift this to a signature form. How to do that?

1. Import the original forms with an id prefix.

Entry: Prefix parsers part of signatures
Date: Thu Mar 17 00:34:44 EDT 2011

Ha, damn it, it works! After adding some glue and a dummy unit to export the signature, this is the sig def and an expansion test:

  (require "../rpn/rpn-signature-forms.ss")
  (define-signature prefix-test^
    ((prefix-parsers (macro) ((p3) (+ + +)))))

  ;; In pic18.ss context:
  box> (syntax->datum (expand #'(macro: p3)))
  (#%app make-word
         (lambda (p)
           (let-values (((p) (#%app (#%top . macro/+) p)))
             (let-values (((p) (#%app (#%top . macro/+) p)))
               (let-values (((p) (#%app (#%top . macro/+) p)))
                 p)))))

Nope, that's not yet correct. The `+' is not visible in the signature and needs to be made part of the signature. I moved to this:

  (define-signature prefix-test^
    (macro/plus
     (prefix-parsers (macro) ((plus3) (plus plus plus)))))

And a trivial plug in the unit def:

  (import stack^) ;; for macro/+
  (export prefix-test^)
  (define (macro/plus s) (macro/+ s))

This isn't quite right. I mean, it works, but it's clumsy. Names like `plus' leak into the namespace. It would be better if this didn't need to add aliases. Can `import' or `open' be used in the signature? What about this: use two interfaces, one that lists the deps and another that extends this sig with macros. What I really want is a simple way to bundle things, to make unions of interfaces.
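For the record, a small sketch of the two-interface idea using Racket's (single-inheritance) `extends' clause, with invented names. It bundles the dependency with the macro, but still doesn't give general unions:

  #lang racket
  (require racket/unit)

  ;; Base signature: the identifiers the macros need to see.
  (define-signature prefix-deps^ (plus))

  ;; Extension: adds a macro that refers to the inherited `plus'.
  (define-signature prefix-test^ extends prefix-deps^
    ((define-syntaxes (plus3)
       (syntax-rules ()
         ((_ x) (plus (plus (plus x))))))))

  ;; A unit exporting prefix-test^ only has to provide the values.
  (define-unit plus@
    (import)
    (export prefix-test^)
    (define (plus x) (+ x 1)))

  ;; An importing unit gets `plus' and the `plus3' macro together.
  (define-unit use@
    (import prefix-test^)
    (export)
    (printf "~a\n" (plus3 0))) ;; => 3

  (define-compound-unit/infer go@
    (import) (export) (link plus@ use@))
  (invoke-unit go@)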
Entry: Ask racket list
Date: Thu Mar 17 01:45:38 EDT 2011

Is it possible to define macros as part of signatures, but allow the macros to see different signatures? Is it possible to create unions of signatures, a la multiple inheritance?

Entry: The (ns ...) hack
Date: Thu Mar 17 01:54:59 EDT 2011

Trouble is, this doesn't work so well because it inspects forms. Actually, my trouble is that defining identifiers sometimes needs to be abstracted, such as in signatures.

Entry: Assembler broken?
Date: Thu Mar 17 21:04:48 EDT 2011

  mzscheme -p zwizwa/staapl/staaplc -- -c /dev/ttyUPS picstamp.fm
  patterns: Mismatch at: (((list-rest (list (? (ns (op ?) qw)) b)
                                      (list (? (ns (op ?) qw)) a)
                                      rest)
                          (macro/append-reverse
                            (begin (list (op: qw (tv: a b /)))) rest)))
  make: *** [picstamp.dict] Error 1

That makes no sense. What is the question mark about? The error is raised in staapl/coma/pattern-tx.ss. How to add an original source location to the message?

Ok, I was able to trace it to the macro `/' and its definition in pic18-macro-unit.ss:

  (patterns-class (macro)
    ;;------------
    (word)
    ;;------------
    ((pow) (>>>) (<<<) (/) (*))
    ;;---------------------------------------------------------------
    (([qw a ] [qw b] word) ([qw (tv: a b word)])))

So what happens is clear: the `/' macro is invoked without constants on the compilation stack. Why is that? Can we also get at the call site of the macro? Looks like I need to get at the backtrace, or define what a backtrace means for a Staapl compiler.

Entry: Code structure
Date: Thu Mar 17 21:24:21 EDT 2011

So.. looks like I'm being forced into cleaning up some other code layers too. First, a summary of recent fixes:

1. Names and units: this seems to be mostly solved, apart from some cosmetics that have to do with bundling signatures, and pushing this style to all the library code.

2. Allow s-expression-only definition of target code modules. Seems conceptually ok, but because of missing library code from point 1 this doesn't work yet.

3. Salvage the Forth prefix parser. I wasn't going to do this, but it seems it's at least relatively straightforward to mix units and macros, if a bit clumsy.

TODO:

4. Make the run-time target code generation layers more transparent and as much as possible implemented with proper identifier management.

Entry: Backtraces / continuation marks
Date: Fri Mar 18 11:59:29 EDT 2011

From [1]: "The continuation marks included in the exception are effectively a stack trace, and you can convert them into locations."

[1] http://www.mail-archive.com/users@racket-lang.org/msg00132.html

Entry: Weird bug
Date: Sat Mar 19 10:56:51 EDT 2011

Figuring out the backtraces seems like a bit too much work. I used a tracing approach, tagging the macro invocations with a `printf'. Traced it down to:

  UEP0 >block

where

  : >block 16 / ;

My guess is that UEP0 is undefined. No, it's something more insidious. In ramblock.ss, the following code is compiled as if `>block' is a target word, not a macro:

  macro
  : >block 16 / ;
  forth

Changing the `:' to an explicit `:macro' solves the problem. Ok, I moved the code that was in coma/macro-forth-tx.ss back to a `begin-for-syntax' form and the problem goes away. I don't understand.. Something to do with local state maybe? Bottom line: that code has to change. Too obscure.

Entry: A new macro-forth ?
Date: Sat Mar 19 11:57:26 EDT 2011

In coma/macro-forth.ss, `forth-compile-dictionary' needs to be replaced by a more direct mechanism. Recap: why does rpn-parse have this first argument?
Entry: Uncertified syntax
Date: Sat Mar 19 14:23:40 EDT 2011

Simply put, I don't know where to begin to understand why I get these syntax certificate errors [1]. However, my guess is that it's the `ns' form that is doing something wrong. Let's see what happens if I try to remove it.

From [2]: "Certification should work automatically unless you're using `local-expand' and re-arranging the result." And I'm not calling local-expand.

Removing `ns' doesn't seem to be an option. It's too deep.

[1] http://docs.racket-lang.org/guide/stx-certs.html
[2] http://lists.racket-lang.org/users/archive/2007-March/016859.html

Entry: What a mess..
Date: Sat Mar 19 15:01:19 EDT 2011

Actually, it is working. Don't forget that! But frankly, I don't understand why. The code is patched with workarounds. I can't really remove it; a lot of functionality depends on it. What to do?? Maybe it's best to gently move towards rewriting the library code into s-expression / unit style, and _WHEN IT'S WORKING_ rethink the Forth part. I would not be surprised if in the meantime something really simple popped up that allowed me to scrap all that complexity. It really can't be that hard. It's just that I don't have an overview and have dug myself too deep into Racket macro internals I don't need.

Entry: Syntax certificates
Date: Sun Mar 20 10:25:09 EDT 2011

In Staapl, I keep running into errors like these:

  compile: access from an uncertified context to unexported syntax from
  module: "/home/tom/pub/darcs/brood-5/staapl/pic18.ss" at: label:org-begin
  in: label:org-begin.261

I've read a bit in the manual [1] and I think I sort of understand the idea, but I can't figure out what is causing this. So, that's today's task: understand syntax certs. There are 3 candidates:

- The name prefixing used in ns.tx uses `datum->syntax'. According to the manual this does not transfer certificates.
- The `ns' macro itself. However, that really just re-arranges its input.
- The `rpn-parse' macro, which takes apart syntax and puts it back together again.

Another experiment. What I noticed is that if I put the reference to `label:org-begin' in a `macro:' form by itself, i.e.

  (macro: .. ,label:org-begin ..)

there is no problem. So I changed the defining form to:

  (define-syntax-rule (word-defs wrap name raw-macro)
    (begin
      (define-values (label wrapper codegen)
        (wrap 'name
              #f ;; source location
              raw-macro))
      (word-define (target) name label)
      (word-define (macro) name wrapper)
      (label:append! codegen)))

and then used the following:

  (define-syntax-rule (words-org-flat (address rpn-code ...) ...)
    (begin
      (define org-begin label:org-begin)
      (define org-end label:org-end)
      (begin
        (let ((raw-macro (macro: 'address ,label:org-begin
                                 rpn-code ... ,org-end)))
          (word-defs label:wrap-word #f raw-macro))
        ...)))

which works. However, when the `raw-macro' is substituted into the code, it doesn't work.

Ok, so I've re-arranged the code such that the code generator is bound to a variable. That seems to work and has the added benefit of being a bit more readable.

  (define-syntax-rule (words-flat (name . rpn-code) ...)
    (begin
      (begin
        (define codegen (macro: . rpn-code))
        (word-defs label:wrap-word name codegen))
      ...))

  (define-syntax-rule (words (name rpn-code ...) ...)
    (begin
      (begin
        (define codegen (macro: rpn-code ... ,label:exit))
        (word-defs label:wrap-word name codegen))
      ...))

  (define-syntax-rule (words-org-flat (address rpn-code ...) ...)
    (begin
      (begin
        (define codegen
          (macro: 'address
                  ,label:org-begin ;; Switch to a new chain,
                  rpn-code ...
Entry: Is `syntax->list' the culprit?
Date: Sun Mar 20 12:54:43 EDT 2011

From [1]: "Calling syntax->list loses the outermost certificate,
which is the "safest" place to have one.  IOW, syntax->list causes
the error when one of its immediate subexpressions is an introduced,
unexported identifier."  Following up to [2]:

  (define-for-syntax (recertifiable-transform transform stx)
    (let ([new-stx (transform stx)]
          [inspector (current-code-inspector)])
      (define (recertify s)
        (syntax-recertify s new-stx inspector #f))
      (values new-stx recertify)))

The non-osdir thread: [3].  Following up on the idea that the problem
might be `syntax->list', I tried the following to replace the
invocation in `rpn-syntax-rules', which seems to work:

  (define (syntax->rlist stx)
    (let* ((cci (current-code-inspector))
           (recert (lambda (stx-new)
                     (syntax-recertify stx-new stx cci #f))))
      (map recert (syntax->list stx))))

Next to that, there is another problem with `datum->syntax'.  Looks
like the reason I couldn't get anywhere before is that it was wrong
in at least two places.  Another place that caused trouble was
`make-rpn-expand-transformer'.  Here the certificate needs to come
from the result of `begin-stx-thunk'.  Another one (damn!) is
mf:alloc.  Tried for a bit to see what's going on but I'm running out
of steam.  This is horrible.

[1] http://osdir.com/ml/plt-scheme/2010-03/msg00245.html
[2] http://osdir.com/ml/plt-scheme/2010-03/msg00247.html
[3] http://lists.racket-lang.org/users/archive/2010-March/038586.html

Entry: More..
Date: Sun Mar 20 15:10:12 EDT 2011

One step closer: verified that the error comes from using
`syntax-local-value' on the `macro/:macro' identifier.  So it's
really literally so: that particular access is not allowed.  I'm
assuming this is because unit syntax is somewhat special.  Let's
explore the other route again.

Entry: Avoiding `syntax-local-value'
Date: Sun Mar 20 15:28:08 EDT 2011

I'm using a mechanism I don't fully understand
(`syntax-local-value' and its interaction with syntax certificates
and units) in the heart of the Staapl code (the `rpn-parse' macro).
As engineering practices go, that's probably the worst one can do...
It might be wiser to do this in two steps: if any plugin behaviour is
necessary, put it in a different layer.  Another argument:
`rpn-parse' is supposed to generate a single form, not a recursive
expansion.

Entry: Enough
Date: Sun Mar 20 18:39:00 EDT 2011

I'm sick of it.  Let's rip it all out and build it back up.

Entry: Crisis
Date: Sun Mar 20 18:43:51 EDT 2011

Rewrite is not an option.  I probably don't have the energy to get it
back into shape.  If there's a rewrite, it will have to be in
Haskell, in order to be forced to get the types right.
Entry: Debugging certificates.
Date: Sun Mar 20 18:45:34 EDT 2011

Is there a way to debug this to see what's going on?

Entry: Same problem as before, now with working signatures.
Date: Sun Mar 20 19:17:18 EDT 2011

[1] entry://20110319-105651

Entry: picstamp.fm back online
Date: Sun Mar 20 19:54:12 EDT 2011

Saved the syntax!  Now it's time to kill it off..  The implementation
is far too crummy.  Code upload doesn't seem to work though.  It
might be a racket thing too.  Nope, it was `target-byte-address'
undefined.  I'm not correctly handling some error in live.ss:

  error #(struct:exn:fail target-word-not-found: OK #)

Entry: Safely tucked away?
Date: Sun Mar 20 20:49:09 EDT 2011

So it's time to get some real work done.  Next: target instantiation
macro.

Entry: PIC18 debug mode
Date: Mon Mar 21 00:09:19 EDT 2011

See here[1] for a mirror of Jaromir's notes.  From what I gather, the
most useful bit seems to be that debug mode can be entered from a
host signal without any kind of interrupt support on the target.  The
rest is software, and effectively using this already assumes there is
a protocol over RB6 and RB7.  If debug mode is triggered by a 1->0
transition on RB6 (PGC), we can use this to attach a serial port.
This means a standard serial start bit can jump to debug.  Anyway, it
seems best to use an external clock as Jaromir suggests, to be
independent of the clock speed of the PIC.  It seems simplest to use
the current size-prefixed command/reply protocol.  Target can use PGD
to signal ready state, followed by a clock-out by the host.  Otoh,
I2C is 2-wire.  Maybe that's what I should stick to?

[1] entry://../electronics/20110320-225422

Entry: NEXT
Date: Thu Mar 24 20:29:10 EDT 2011

- compound units -> full s-exp only kernel
- USB interface

Entry: Compound units
Date: Thu Mar 24 20:31:11 EDT 2011

I guess what I'm looking for next is a compound unit, or a partially
linked unit.  Does this need the explicit linking?  The unit
interface seems to have mostly two ways of linking: manual or
automatic.  From what I understand, the automatic way works if there
is no duplication of interfaces.  What about compound-unit/infer?
I.e. from the guide[2]:

  > (define-compound-unit/infer toy-store+factory@
      (import)
      (export toy-factory^ toy-store^)
      (link store-specific-factory@ toy-store@))

But that doesn't solve the problem of consolidating interfaces to
make them simpler to reference.

[1] http://docs.racket-lang.org/reference/compoundunits.html?q=bsl
[2] http://docs.racket-lang.org/guide/Linking_Units.html

Entry: Unit unions
Date: Thu Mar 24 20:59:55 EDT 2011

Let's just build an abstraction for it.  I don't think that compound
signatures are possible, except for the limited single inheritance.
So I got something working:

  (define-syntax (define-dictionary stx)
    (syntax-case stx ()
      ((_ name (sig^ ...))
       #`(define-syntax name #'(sig^ ...)))))

  (begin-for-syntax
   (define (re-syntax context stx)
     (datum->syntax context (syntax->datum stx))))

  (define-syntax (define/invoke-dictionary stx)
    (syntax-case stx ()
      ((_ dict^^ (unit@ ...))
       (let ((sigs (re-syntax stx (syntax-local-value #'dict^^))))
         #`(begin
             (define-compound-unit/infer combined@
               (import)
               (export #,@sigs)
               (link unit@ ...))
             (define-values/invoke-unit combined@
               (import)
               (export #,@sigs)))))))

And then:

  (define-dictionary pic18^^
    (stack^ stack-extra^ memory-extra^ .. ))

This works, but only because the symbols are non-hygienic.  I can't
seem to keep the signatures themselves hidden, and export the
definitions.
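For reference, invoking such a dictionary looks something like this;
the unit names are made up for the example, the two forms are the
ones defined above:

  ;; Hypothetical usage of the forms above: link the units that
  ;; implement the signatures collected in pic18^^ and splice their
  ;; definitions into the current module.
  (define/invoke-dictionary pic18^^
    (stack@ stack-extra@ memory-extra@))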
Entry: Why is there no unit collection?
Date: Thu Mar 24 23:30:39 EDT 2011

I think I'm not using the right collection.  Probably, I should be
using modules and parameters.  Units make it harder to work with
macros, and they don't agglomerate as easily as modules do.  My
problem however is that I really do have components with macros in
them.

Entry: Use the source luke
Date: Fri Mar 25 09:39:43 EDT 2011

I really should look more at the racket source code:

  ./launcher/launcher.rkt:7:
  (define-values/invoke-unit/infer launcher@)
  (provide-signature-elements launcher^)

From a quick look it seems this is quite common.  However, there
doesn't seem to be an equivalent of the `provide-all-out' form.

Entry: Include files
Date: Fri Mar 25 15:03:08 EDT 2011

For the device-specific constants it might be best to use inheritance
here: one unit provides the constants necessary for the library code,
another interface provides chip-specific constants.  Trouble here is
of course that the Microchip header parser needs to distinguish
these.

Entry: Protocol changes
Date: Sun Mar 27 11:12:55 EDT 2011

Making the USB polling work made me think of a missing feature in the
current monitor code: it's not possible to read/write the terminal
from the target side.  This isn't so hard to do.  It just requires
some coroutine-style symmetric communication.  Maybe it's best to
implement that first, and only then move on to the polled PK2
channel.

Before doing anything: I have both monitor .ss files and .f files.
What's up here?  I don't think the .ss ones actually work, as I don't
see them included in any projects.

The change can be minimal.  Only the jsr command executes arbitrary
code.  The host side that waits for ack there needs to implement the
coroutine mechanism.

Man, I've been out of it for a while.  This is already working!  The
`console-log' function handles the receive message.  Currently the
only thing it does is print messages to the console, but this could
be extended into a coroutine call.  The problem with the current one
is how to express "continue" to the target.  It seems best to fully
implement continuations, as any other approach is going to be an
ad-hoc hack from if-then-else hell that wants to be a full
continuation implementation.  Let's leave this for later.  The
current implementation at least allows for some form of ping-pong.

Entry: Running the test suite
Date: Sun Mar 27 11:19:50 EDT 2011

The .hex comparison tests seem to pass, except for the picstamp code
which I've been changing.

Entry: Reading the ICSP proto on PIC
Date: Sun Mar 27 12:45:36 EDT 2011

Trouble is that by default the PK2 sends very narrow pulses: 160ns,
which is 2 cycles at 12 MIPS.  This is no problem for hardware, but
for software it's a bit of a stretch.  Can this be slowed down on the
PK2?  Otherwise it might be necessary to use the KBI2 feature, which
is not universal.  RB7 = PGC = KBI2 for the "new core" 18F:

  18F2550 18F1220 18F2620 18F24J10

but i.e. not for the "old core":

  18F252

For the very new core, which has a different data sheet pin diagram,
they seem to be listed as IOC, which I assume is the same:

  18F46K22

RBIF (INTCON 0) changes when a PORTB pin changes.  The trouble is
going to be that we need to detect only clock edges, not a data edge.
We can't check the data itself because we're not going to be fast
enough..  Triggering on both data and clock should be possible if the
delay is set high enough.  The PK2 is quite predictable: data is set
and clock is toggled as fast as possible, so a small delay after any
data or clock change should be sufficient.  However, it might be
necessary to go to SPI or I2C using the extra AUX line, since this
seems to be a bit too much fiddling.  Wait: it is possible to change
the pulse width by changing the speed using SET_ICSP_SPEED.
Entry: Forth syntax
Date: Sun Mar 27 14:06:29 EDT 2011

I'm not going to ditch it.  It's a nice syntax for actual
programming.  A nice UI.

Entry: Clock sync
Date: Tue Mar 29 00:03:56 EDT 2011

"begin cond? until" is no longer compiled to bit test + jump
backwards, but to a jump forwards.  Strange.  EDIT: I found this:

  ;; Conditional skip optimisation for 'then'.
  ;; FIXME: not used since we can't mutate then (defined in control.ss)
  ;; (([btfsp p f b a] [bra l1] ,ins [label l2] swapbra)
  ;;  (if (eq? l1 l2)
  ;;      `([btfsp ,(flip p) ,f ,b ,a] ,ins)
  ;;      (error 'then-opti-error)))
  ;; ((swapbra) ())

I tried to add it but it doesn't seem to trigger.  Maybe something
changed in the way `label' is handled?  Hmmm...  `label:' cuts off a
part of the code so the optimization is not performed.  I don't see a
straightforward way to solve this.

  begin cond? until
  sym dum>m label: cond? until
  sym dum>m label: cond? not while repeat
  sym dum>m label: cond? not sym dup >m jw/false end: m-swap again then

The trouble is the `end:' caused by `if'.  Is it really necessary?  I
suppose this is to prevent optimizations from wiping out the branch
in some conditions; a sane default.  So the wait loop really needs a
different approach:

  begin cond? jw/false end:

Maybe "until" is the natural primitive, not jw/false?  Interesting!
This is what works:

  until = ( m> jw/false end: )

Entry: The infinite todo list
Date: Tue Mar 29 12:07:22 EDT 2011

* Combine ICSP proto with monitor code.
* Try the ICD interrupt.
* Get USB to work.
* Fix disassembler (maybe use original assembly code annotation?)
* Add command line completion for the readline interface.
* Re-integrate with snot, or use Geiser[1].
* Find out why compilation is so slow, i.e. refactor module structure
  to a finer grain.
* Clean up source code layout.

[1] http://www.nongnu.org/geiser/geiser_4.html

Entry: Packet-based monitor code
Date: Tue Mar 29 22:38:45 EDT 2011

I'm not sure if that's going to work directly with the current
implementation.  Maybe best to move the current serial port approach
to a packet API.  Approach: in/b and friends seem to be only called
in `target-recieve+id/b' and `target-count'.  The latter is an
artifact of the daisy-chain serial and needs to move to a different
level.  The former already hints at a message-based approach.
Basically, we can buffer up to the point where we read.  This means
the byte-oriented "printf" style can be maintained, as it works quite
well.  Looks like all commands expect a reply, except for reset.

Entry: Target-side monitor code
Date: Wed Mar 30 21:56:10 EDT 2011

The basic protocol with sync is verified.  Now the monitor needs to
be adapted to work with a packet-oriented approach.  The most
important issue is that the last byte in a transaction is allowed to
detach the receiver for a while.  Maybe reception should drop the
postamble?  The current interpreter code already has some support for
headers and acks.  Roadmap (a framing sketch follows the list):

- All commands need to end in ack.
- Move the serial daisy-chaining code somewhere else.
- Add addressing to the ICSP protocol?
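A minimal sketch of host-side size-prefixed framing on top of the
byte-oriented layer.  `in/b' is mentioned above; `out/b' and the two
packet functions are assumed names, not the actual API:

  ;; Size-prefixed framing over byte-level I/O (assumed primitives
  ;; in/b and out/b).  A packet is a length byte followed by payload.
  (define (send-packet bytes)
    (out/b (length bytes))      ;; size prefix
    (for-each out/b bytes))     ;; payload

  (define (recv-packet)
    (let ((n (in/b)))           ;; size prefix
      (for/list ((i (in-range n)))
        (in/b))))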
Entry: .f files and early binding
Date: Thu Mar 31 00:21:45 EDT 2011

You know, I went through all this work to be able to use units and
not have to rely on context-sensitive "include", but I have to say
that in the light of incremental upload and late binding, that
approach is quite valid.  I mean, it's sane: you can load the same
code multiple times with different bindings without fear of changing
anything in the core.  However.  Big caveat.  You can't redefine
things in Forth code implemented in racket modules, because it uses a
different binding mechanism.  It would be great to be able to unify
those two approaches, i.e. using some form of lexical nesting in the
racket modules to get the same shadowing behaviour.

It looks like it's only n@f+ and n@a+ that need an additional
wait-ack appended.

Entry: pk2 ICSP proto monitor
Date: Thu Mar 31 01:46:51 EDT 2011

Simple receive/transmit works, but the monitor itself doesn't want to
do much.  I assume there is not enough time in between bytes for the
interpreter code to run.  Let's measure.  13us..  That's quite a bit
at 0.1us instruction cycle.  I had put the period at 10us.  Hmm..
can't redefine words as macros?  Ok, that's not the problem.  It was
a missing definition.  So the 1 command (push) works.

Entry: It's working, now make it faster
Date: Thu Mar 31 03:33:08 EDT 2011

It's not particularly fast because of the large delays.  Let's see if
we can up the clock speed back to 3us.  That doesn't do anything.
I've added the syncs in-channel, and it seems to work except for 13
(stackptr), which replies with 0 for a while and then sends:

  icsp-recv: h:#t a:#f b:(0 244 15)
  icsp-recv: h:#t a:#f b:(0 8 0)

Which is this on the line (LSB first):

  00 00101111 11110000

That's indeed a 10 sync followed by a #xFF address and a #x00 size
byte.  Shifting it should be no problem.  The (0 8 0) is:

  00 00010000 00000000

which is a sync bit followed by 2 zeros.  This is because the device
is in receive mode, and it will clock in address + size, and ignore a
0 size message.  Ok, receive sync seems to work.  However, sending
out the command really needs a proper sync, as otherwise the target
gets messed up.  It seems there are significant delays so we assume
the target is always there.  When the send sync is on, the receive
sync doesn't seem to be invoked.  Maybe there's plenty of pause?
Next option: see if it can be solved with a loop in a PK2 script
using IF_EQ_GOTO.

Entry: Looking at "kb"
Date: Thu Mar 31 16:04:33 EDT 2011

The "kb" display is quite slow.  I'm looking at it at the scope, and
while there is still a 3ms delay in the sync due to more than one
packet being used, there is an extended period where the host is
polling but there is no reply, about 50 ms.  That's 500k
instructions.  That doesn't seem right.  Each chkblk loop is only 64
bytes.  Wait..  The loop is 5 instructions.  That's 6 clocks per
iteration, including the 2 clocks for the branch.

  0218 0009 [tblrd*+]
  021A 50F5 [movf 245 0 0]
  021C 14ED [andwf .L111 0 0]
  021E 06E7 [decf 231 1 0]
  0220 E1FA [bpz 1 .L116]

Actually, that's already 38.4 ms for just the loop (64k bytes x 6
cycles x 0.1 us), so that indeed makes sense.  So why is the display
so slow?  On the serial line it's much faster.  (really?)  Typical
delays:

      3ms      2ms               4ms                3ms
  sync <-> send <-> receive_header <-> receive_body <-> sync

The actual byte transfers are almost not noticeable, as they have 3us
clocks and are in the order of 100us total length.  This is far from
ideal!  I don't see a simple way to fix this, so for now this will
have to do.  Maybe lower the clock speed too, as that doesn't seem to
have much influence either.  Yep.  Switched to 100us and can't see
any difference, except that there is a bit less idle bus time.

I'm not sure how to fix this..  Sending larger messages and using
fewer handshakes will help, but it seems to be an inherent problem
with the pk2, as there is no pipelining to hide the 1ms usb bus
clock.  Hmm..  That's not really true.  In one direction it pipelines
just fine.  Sending 2 x 26 byte transfers spaces them 1ms apart with
only 200us wasted space.  So it does burst well..  The problem then
is the ping-pong handshake.  So how to fix?  The only annoying part
is bulk read and write.  These can probably be optimized by using
larger packets.  Ok, so roadmap for faster pk2:

- Use RAM buffering for program upload, send one page at a time.
- Make checkblock work on 1k blocks.
Entry: Console usability issues
Date: Fri Apr 1 20:51:29 EDT 2011

Let's do some cleanup.  I want to be able to issue host-only commands
without target comm, i.e. set voltages, do chip erase etc..  What
this needs is a shorter timeout for ack.  100ms should be enough to
see whether the target is listening or not.  This is the OK word.

Entry: Sync issues
Date: Fri Apr 1 21:38:53 EDT 2011

It looks as if it's not properly syncing on startup.  The lucky thing
is that the NOP is a 16-bit zero string, so that's probably why it
recovers eventually.  Adding 50 retries for pk2-poll is a proper
workaround.  OK and BUSY work properly.
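Roughly what such a retry wrapper looks like; `pk2-poll' is assumed
here to return #f on a missed sync, which is a guess, and the wrapper
name is made up:

  ;; Sketch: retry pk2-poll up to 50 times before giving up.
  (define (pk2-poll/retry (tries 50))
    (let loop ((n tries))
      (cond ((zero? n)  (error 'pk2-poll "no sync after ~a tries" tries))
            ((pk2-poll) => values)   ;; sync ok: pass result through
            (else       (loop (- n 1))))))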
Entry: Stat
Date: Fri Apr 1 21:58:06 EDT 2011

The "stat" word is now available for PK2.

Entry: Scheme prompt
Date: Fri Apr 1 21:59:29 EDT 2011

Might be best to add a prompt for the "scheme" word to make sure we
stay in the scheme interpreter when there is an error.  Apparently
exceptions go straight through the prompts (different tag?), so I'm
using a catch-all exception handler.

Entry: Powering target from PK2
Date: Fri Apr 1 22:30:26 EDT 2011

It's possible, but needs some massaging.  I'll need to recover the
exact sequence, but the basic idea is to:

* Check if there is target power, and ONLY switch on the PK2-provided
  line if there is none!
* Set the reset/program line correctly.

Need to make sure that the device is properly configured, so we can
get the right operating voltage from the database.  Actually, the
default works for a 5V PIC, without target voltage checking.  The
current `target-on' and `target-off' functions work fine from reset.

Entry: Faster program
Date: Sat Apr 2 01:17:14 EDT 2011

The current protocol slows it down because there are several sync
transfers.  All should go in a single 8 byte burst.  However, I do
not want to create an extra instruction for this.  This should be
something like: JSR ...  The change (not necessary for this, but
useful for other extensions) is to allow a jsr without ack, so
arbitrary comm commands can be added.  This could have the 'fast-prog
command in the dictionary.  If it's there it can be used, otherwise
the standard approach can be taken.  Ok, works.

Entry: Can't rewrite Staapl
Date: Sat Apr 2 01:35:53 EDT 2011

Maybe the compiler, but definitely not the interaction system.  There
is too much knowledge and too many workarounds (``AI'') encoded in
it.

Entry: Faster "kb"
Date: Sat Apr 2 03:15:25 EDT 2011

This can probably use the same strategy as fast program: do more work
per packet, and bundle commands that do not cause pauses into a
single command.

Entry: Debugger
Date: Sat Apr 2 03:18:59 EDT 2011

Making the debugger functionality work should now be straightforward.
This should also be kept interactive:

- Halt target
- Set debug bit
- Program debug vector

[1] entry://../electronics/20110320-225422

Entry: Low-level conditionals
Date: Sat Apr 2 09:28:42 EDT 2011

Does `if' actually work with normal ints?  Trouble is that I really
rarely need it, as it's simpler to process condition bits in a
different way.  ( Or: I know it's inefficient and will work around it
using plain `if'. )

Entry: The disassembler
Date: Sat Apr 2 10:06:21 EDT 2011

The next broken part is the disassembler.  I'm thinking it might
actually be better to query the host code instead.  Looks like that
data is no longer available due to code-clear!.  Nope, it is...  I'm
not sure why the kernel words are not available.  It's always more
useful to have the original assembly code available.  In that case,
the disassembler doesn't need to translate labels.  Maybe it's more
useful if it just uses numbers?  I've added word addresses, but
that's really no solution, as "see" uses byte addresses.  Removed.
This needs to be fixed properly.

Entry: Next
Date: Sat Apr 2 14:13:59 EDT 2011

Basically, the PK2 connection is working quite well.  Some unrelated
cosmetic issues I ran into while hacking:

- dasm doesn't really work
- chain cutoff problem

Next: debugger or USB?

Entry: Chain cutoff problem?
Date: Sat Apr 2 14:16:36 EDT 2011

The result of "sea" chains seems strange if there are a couple of
definitions in a row without fallthrough.  No chain cutoff after a
jump?

  : bar food food ;
  : bar2 bar bar ;
  .OK
  sea bar
  bar:
      02C8 DFFB [jsr 0 food]
      02CA D7FA [jsr 1 food]
  bar2:
      02CC DFFD [jsr 0 bar]
      02CE D7FC [jsr 1 bar]
  OK

Entry: USB
Date: Sat Apr 2 16:18:46 EDT 2011

The main problem for USB is handling a lot of struct data, i.e. the
endpoint registers used to control the USB hardware.  Let's assume
the descriptors are just flash constants.  Just trying to load the
usb code gives undefined words for:

  *EP0-OUT* *EP0-IN*

Where do they come from?  What about this one?  Nope, it has some
missing code.

  Tue Jun  2 09:35:33 EDT 2009  tom@zwizwa.be
    * usb driver needs more abstraction

This one does have the EP0 macro definitions:

  Mon Jun  1 02:56:40 EDT 2009  tom@zwizwa.be
    * cleanup macro

  \ *WORD* means the current object context has changed: all literal
  \ addresses below #x60 are relative indexes.

  : EPn-OUT 3 <<< ;
  : EPn-IN  EPn-OUT 4 + ;

  \ ( reladdr -- ) set object to buffer descriptor in bank 4
  : *BD* al ! 4 ah ! ;
  : *EP0-OUT* 0 EPn-OUT *BD* ;
  : *EP0-IN*  0 EPn-IN  *BD* ;
  forth

Basically, each endpoint has a count, a buffer and a control
register.  I was using a-relative addressing.  Should this be
maintained?  The main problem is getting this abstraction right.  In
C it would be trivial, but in Forth we need to be a bit clever.
Brodie's idea is to try to avoid structures: use code/commands
instead.  Essentially what "more" does is to use global variables to
store a current state, and have words operate on the current state.
If combined with save/restore on a stack this is essentially dynamic
binding.  The previous approach did exactly that, but using the index
register that's normally intended to hold the stack frame.  So it
isn't much else than filling buffers, either with fresh data or from
constants, and sending acks, so let's build a proper abstraction for
that.

Entry: Real problem: no structs
Date: Sun Apr 3 23:34:50 EDT 2011

Essentially: no local, late-bound namespaces.  They can be emulated,
but this requires global names.  Is that a problem?
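The "current object" idiom from the USB entry above is really just
dynamic binding, which in Scheme is a parameter.  A sketch of the
host-side analogy, with all names invented for illustration:

  ;; *BD* corresponds to setting the parameter; words like STAT/CNT
  ;; become field reads relative to the current object.
  (define current-bd (make-parameter #f))   ;; plays the role of the a reg
  (define (bd-field n)                      ;; a-relative field address
    (+ (current-bd) n))
  (define (with-bd addr thunk)              ;; save/restore = parameterize
    (parameterize ((current-bd addr))
      (thunk)))

  ;; e.g. (with-bd ep0-out-addr (lambda () (bd-field 1)))  ;; CNT address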
Entry: PIC18 extended instruction set
Date: Wed Apr 6 21:11:32 EDT 2011

It's a pain in the ass because it forces you to choose.  Damn
premature optimization!  For C it's a no-brainer.  For Staapl PIC18
Forth however, I don't know whether it's a good idea, because it
effectively splits the code into two different platforms.  The
"current pointer" is too useful not to use.  So maybe from an
organizational point of view it's best to not use it at all; then all
PIC18 targets can use the same code.  So, to move the USB driver
forward: I need a fast implementation of indirect data access, or a
different approach for accessing the endpoint registers.

Entry: USB endpoint register access
Date: Wed Apr 6 21:25:06 EDT 2011

The simplest approach seems to be to create a word for setting the a
register to the correct endpoint window, and have an ep-command word
that sets the size and flags.  The buffers only need to be set once
at startup.  The good thing is that the status and count are right
next to each other:

  STAT CNT ADRL ADRH

So this is really just a!! followed by !a+ !a+ for the basic count &
status access.  Probably need to write CNT first though, since STAT
can cause an action: setting UOWN transfers the buffer to the USB
hardware, which causes a transmit on an IN endpoint.

Entry: a!! bug
Date: Wed Apr 6 21:40:54 EDT 2011

The a!! macro had lo/hi swapped.  This only happened for literals.

  hunk ./staapl/pic18/pic18-macro-unit.ss 246
  - (([qw lo] [qw hi] a!!) ([_lfsr 2 hi lo]))
  + (([qw lo] [qw hi] a!!) ([_lfsr 2 lo hi]))

Entry: Streams are cool
Date: Wed Apr 6 22:32:53 EDT 2011

In my day-job embedded work I'm moving from data structures to
streams a lot.  The main reason is to avoid intermediate lists.  My
favourite abstraction is for-each with early abort.  In C this needs
to be combined with the dual stream interface (open / rewind / next /
eof), because for-each can't be easily inverted (i.e. using partial
continuations).

Anyways.  Maybe this is also true for Forth, but then one level
deeper: treat structs as streams, because indirect addressing
(structs) is a bit awkward to use: no support for separate
namespaces.  Doing so opens up the door to channel-based
multiprogramming, as a streamed structure read or write can easily be
replaced by a channel connecting producer and consumer.  Essentially
this is the same transformation as going from named to nameless
arguments: use position instead of names to encode meaning.

I think a big idea is hidden here.  Streams and state machines...
Actually it's not such a big deal: when memory is scarce, you
implement functionality as communicating state machines.  Streams and
their associated parsers and printers are state machines, or state
machines embedded in push-down automata (state machine + stack) or
2-stack/tape machines.
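In Scheme the early-abort variant is just an escape continuation
wrapped around for-each; a minimal sketch (the name and the #t/#f
convention are mine):

  ;; for-each that stops as soon as f returns #f; returns #t if the
  ;; whole list was traversed, #f on early abort.
  (define (for-each/abort f lst)
    (call/cc
     (lambda (abort)
       (for-each (lambda (x)
                   (unless (f x) (abort #f)))
                 lst)
       #t)))

  ;; e.g. stop printing at the first odd element:
  ;; (for-each/abort (lambda (x) (and (even? x) (begin (display x) #t)))
  ;;                 '(2 4 5 6))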
Entry: A clean usb.f
Date: Thu Apr 7 00:24:04 EDT 2011

So I copied the initial (layer 0?) stuff from _usb.f.  I'm using the
following syntax for initializing the buffer descriptors.  The first
word sets the a register, and the rest just stores incremental bytes:

  IN0:
      #x08 !+     \ clear UOWN, MCU can write
      64 !+       \ buffer size
      buf-IN0 !+  \ addrl
      buf-page !+ \ addrh

What I miss is "emit".  It doesn't work properly.  I can't do much
without basic debug output, so fixing that is the next task.

Entry: Fixing emit
Date: Thu Apr 7 00:26:25 EDT 2011

Let's use different addresses to distinguish between return and call.
I.e. sending to 0xFF causes the host command to terminate, while
sending to 0xFE causes a command to be executed.  Problem.  With:

  : bar 65 emit 65 emit ;

  bar
  icsp:send h:#t b:(0 3 3 214 2)
  icsp-recv: h:#t a:#f b:(1 255 1)
  icsp-recv: h:#f a:#t b:(65)
  icsp-recv-message: (255 1 65)
  Aicsp:send h:#t b:(0 0)
  icsp-recv: h:#t a:#f b:(1 144 0)
  icsp-recv-message: (144 0)
  icsp:send h:#f b:(0 0)
  icsp-recv: h:#t a:#f b:(2 0 0)
  icsp-recv-message: resync: 10
  icsp-recv-message: resync: 1
  icsp-recv-message: (0 0)
  OK

There seems to be a collision when both sides are writing on the bus.
Saved scope capture as ~/staapl/NewFile1.wfm.  How to view the
waveform?  What I see is:

* host sending 0 3 3 x x
* target sending 255 1 65, host sending 2 pulses, where target
  replies ack 1,0
* host sending out 16 pulses, target replies 0 (receives!)
* host and target both send data

The trouble is at x: the target sees the pulses and performs an ack,
but the host doesn't seem to see the ack?  Every poll then clocks the
device while it's in a 2 byte read.  When that is finished it
replies.  Very strange.  Let's look at the transmit part.  It misses
the sync bit.  Why?  Does it actually ack when it receives a zero
packet?  It doesn't, so I changed that.  This has no effect though.
It doesn't see that pulse...  Damn.  Why?  BUG in PK2?  FOUND IT: in
pk2-in, an icsp-ack was sent while the internal message-oriented code
in icsp.ss already performs the handshake.

Entry: Resync
Date: Thu Apr 7 14:37:19 EDT 2011

Now ts doesn't work any more.  All commands that receive data from
the host stopped working after removing the icsp-ack call from
pk2-in.  Problem was that "emit" and "reply" are not the same thing.
"emit" needs a separate ack to keep the request-response cadence
going.  The last problem seems to be the "kb" word.  This was also
still calling emit.  Fixed.  It's a lot faster now that the sync
issues are resolved.  Fast is good!

Tss... now I get this:

  racket pk2-picstamp.dict
  Connecting to PICkit2.
  datfile: /usr/local/bin/PK2DeviceFile.dat
  iProduct: PICkit 2 Microcontroller Programmer
  command-made-roundtrip: 0 ()

I think I found it: the target has just set the ack bit and we miss
it because we clock out the next..  Something fishy going on with the
handshake.  At times I see a third clock pulse.

Entry: Better debug tools
Date: Thu Apr 7 20:38:40 EDT 2011

1. Point-and-shoot a word: find its definition if it appears in the
   console.
2. Get at backtraces.

Entry: Test-oriented programming
Date: Fri Apr 8 19:33:29 EDT 2011

So, how to make things more testable?  I'm currently hacking away at
the usb driver, lamenting the absence of breakpoints and data
inspection.  Might be time to implement those?  At first, since
updates are so cheap, simply modifying the code is quite doable.
What I need is a word that will abort execution and drop me in the
interpreter.  This isn't so hard: reset the return stack and call the
interpreter.

Entry: ICSP on 8MHz
Date: Sat Apr 9 18:19:35 EDT 2011

Doesn't seem to work.  Is this important?  Yes.  I tried it down to a
200us clock, which is ridiculously slow, so there has to be another
problem.  I can see similar issues at 15us.  Some sync issue.  It
takes the chip 25us to respond to the handshake pulse, so anything
larger than that should be fine.  Let's take period 60us.  Problem
is: both write to the bus, so it looks like they continuously get off
cycle.  The thing is, that's quite a long time.  Is it actually
running at 2 MIPS?  It's just sitting there in a tight loop of 3
cycles.  It takes 5 cycles to get from detection of the positive
clock level to setting the output.  At 0.5 us cycle time (2 MIPS)
that should be 2.5 us, not tenfold that.  Something's not right.
Section 2.4 in the data sheet: at startup, the output of INTOSC is
set at 1MHz.  That makes more sense!
Entry: New proto board
Date: Sun Apr 10 00:49:26 EDT 2011

Wired up the 2550 to ICD and USB.  The latter was quite
straightforward.  No external components required: just hook up GND,
D-, D+.

Entry: TODO PK2
Date: Sun Apr 10 00:50:34 EDT 2011

- Proper reset instead of power cycle
- Check target voltage before switching ON
- Programming
- Debugging

Entry: Debug output
Date: Sun Apr 10 11:53:59 EDT 2011

Sending messages from Flash is trivial.  How to encode it in the
source?  Does backtick still work?  No.  This needs a different
prefix parser.  I do have "fstring:" which is based on f->.  Since
this is just for debugging, it would be nice to have something more
general, something that's part of the parser.  Alternatively,
conditions could be stored on the host too.  What I really need is
tracing info: this way a single word "trace" could be inserted at a
particular point to see where execution is going.  Ok, what works:

- 0xFF : normal console logging and ack for empty message.
- 0xFE : hex dump

What I want: a trace command that allows the host to execute code in
the sync loop, i.e. to query the target.  OK.  The sync is a bit
patched together, but at least it works:

  (define (trace-hook addr)
    (printf "trace: ~x\n" addr)
    (abd 0))

  : foo trace trace trace ;
  .OK
  foo
  trace: 312
  000 F8 12 A4 ED 05 24 53 02
  008 D3 80 5A 20 C5 C2 14 0C
  trace: 314
  000 F8 12 A4 ED 05 24 53 02
  008 D3 80 5A 20 C5 C2 14 0C
  trace: 17e
  000 F8 12 A4 ED 05 24 53 02
  008 D3 80 5A 20 C5 C2 14 0C
  OK

This should enable any kind of program instrumentation at trace
points.
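The host-side dispatch implied by the address encoding above would
look roughly like this; the handler names other than `trace-hook' are
made up, only the byte values come from the entry:

  ;; Dispatch on the address byte of a message coming from the
  ;; target.  #xFF and #xFE are the encodings listed above; anything
  ;; else is treated as a trace point.
  (define (handle-target-message addr payload)
    (case addr
      ((#xFF) (console-log payload))   ;; console logging / empty ack
      ((#xFE) (hex-dump payload))      ;; hex dump
      (else   (trace-hook addr))))     ;; host code in the sync loop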
Entry: It needs to be faster
Date: Tue Apr 12 14:56:59 PDT 2011

It needs to be faster.  It takes almost 3 minutes to compile all the
code dependencies for a single .fm image.  After that it's reasonable
(just compiling forth code).  Why is it so slow, and how to make it
faster?  The thing is that once it is running, once the compiler is
compiled, it shouldn't be changed any more.  The inconvenience is
just an artefact of me constantly changing things that require a full
recompile.  Maybe getting rid of some of the bundling modules would
make it work better: have finer-grained module boundaries.

Entry: Forth is not C
Date: Mon May 23 15:36:07 CEST 2011

C isn't all that bad.  Its main virtue is that it is quite readable
and has a stable status quo.  That stifles innovation in programming
method, but from a business perspective it brings predictability.
Forth (and the idea behind Forth) is definitely more powerful than C.
It's a bit like the story of Lisp: great power comes with great
responsibility.  In practice it seems that toning down programmers by
_limiting_ expressiveness isn't always a bad idea.

( I'm still really a hacker - a problem solver, not a manager.
However, recently I've had the pleasure of working with a very good
project manager, and I'm starting to see some things that were not
part of my world before.  All of them have to do with making money
through preventing loss. )

Entry: The Forth community
Date: Mon May 23 15:52:17 CEST 2011

As I mentioned elsewhere here, I don't like the fundamentalism of the
Forth community[1], AND the embedded community, when it comes to
language innovation.  The thread does mention that Dear Chuck is OK,
with which I agree completely ;)

[1] http://news.ycombinator.com/item?id=2574204

Entry: Forth as an interface
Date: Mon May 23 16:07:11 CEST 2011

The good thing is that it has manual fanin/fanout specs, so the
compiler doesn't need to do that (very hard!) re-arranging part.
Maybe Forth as a frontend really isn't such a bad idea.  This is
essentially Factor[1].

[1] http://factorcode.org/

Entry: Metalanguages are functional
Date: Sat Jul 16 14:35:41 CEST 2011

What I thought to be the big insight in Staapl is that the
metalanguage (macro language) of an imperative Forth language can be
made purely functional.  What I did not understand at that time is
that this is also part of the idea of using monads in Haskell[1]:

  While programs may describe impure effects and actions outside
  Haskell, they can still be combined and processed ("compiled")
  purely, inside Haskell, creating a pure Haskell value - a
  [computation description] that describes an impure calculation.

What is similar between monads and Staapl's associativity, which
allows compile time and run time to be arbitrarily separated, is that
in both cases the awareness that we're dealing with a meta-level
"disappears a bit".  Of course, Staapl's macro language is
dynamically typed, so there are no static type signatures to go by.
Maybe it would be a nice exercise to make all that hidden structure
more explicit.

[1] http://www.haskell.org/haskellwiki/Monad

Entry: Running synth with PK2
Date: Tue Sep 27 20:08:14 EDT 2011

Problem: probably need to disable interrupts during PK2 bitbang.  The
PIC waits in the following routine:

  : icsp-sync \ -- : sync on rising clock edge
      begin icsp-clock low?  until
      begin icsp-clock high? until ;

Is it enough to do this?

  : icsp-sync \ -- : sync on rising clock edge
      sti
      begin icsp-clock low?  until
      begin icsp-clock high? until
      cli ;

Entry: PORTB / ICSP comm?
Date: Thu Oct 6 15:46:08 EDT 2011

Is this the culprit?

  : init-out
      TRISB 2 low
      TRISB 3 low ;

Nope.  ICSP pins are RB7-RB5.  It seems to really be the interrupts.
Whenever I switch on one of the timer interrupts, the comm gets
messed up.  It probably misses pulses.  I seem to recall that the
ICSP hardware's pulse size can't be changed, but I believe it does
keep the data stable after the pulse.  Is there a way to let the
hardware detect the pulse?

  PORTB 6

There is the interrupt-on-change mechanism.  The question is whether
it is worth spending time on this.  Let's just briefly look at the
IOC mechanism.

Entry: PORTB IOC
Date: Thu Oct 6 16:01:14 EDT 2011

A change on RB7-RB4 sets RBIF (= INTCON:0).  To ack we need to read
PORTB to end the mismatch condition, and clear RBIF.

  : rbif-ack PORTB @ drop INTCON RBIF low ;
  : rbif INTCON @ 1 and ;
  : rbif-test rbif-ack rbif ;

Tried this, no success:

  \ This doesn't actually wait for a clock pulse, but for a change on
  \ *ANY* of RB7-RB4.  This is a hack to work around failure to detect
  \ short pulses due to interrupts in the busy loop.  Doesn't seem to work.
  : icsp-sync.hack \ .hack
      begin icsp-clock low? until
      PORTB @ drop INTCON RBIF low    \ ack RBIF
      begin INTCON RBIF high? until   \ wait for change on RB7-RB4
      ;

Entry: Console working with app that uses interrupts.
Date: Thu Oct 6 16:39:40 EDT 2011

Test case: synth.  Seems like there are a couple of routes:

- Just use the console to set up some stuff and call `main'.  This
  works fine, but the console is dead of course.
- Make some kind of CTRL-C command to stop the app.
- Run the console from interrupt, at least the byte-input part.  This
  might be a bit of work.
- Use the ICSP debugger stuff[1].

[1] entry://../electronics/20110320-225422
Entry: Problem: Data Structures
Date: Mon Oct 31 23:50:55 EDT 2011

- data structures vs. "protocol oriented" programming.

I was thinking that maybe Staapl should be about minimizing code
size.  A single-purpose language.  Currently the lack of lexical
variables makes working with data structures quite a challenge,
i.e. the USB driver horror.  I'm not sure yet if this is really just
a namespace issue.  It's funny how this arises only with externally
specified protocols.  When I write my own stuff I get away with
representing things as code and actions.

Entry: Preparing for release
Date: Sat Nov 5 09:49:53 EDT 2011

Need to fix:

- libusb                      DONE
- hex printing                DONE
- proper reset                DONE, 'cold' still works
- reliable ping               +- DONE.. pk2 seems to get stuck sometimes
- reliable pk2 reset          ???
- record definitions to file  DONE
- documentation

Entry: hex printing
Date: Sat Nov 5 13:18:19 EDT 2011

So emit can do strings, but it's probably best to also allow a hex
printing mode, to avoid having to waste much time on the PIC for
trace logging.  Actually, it's already there.  I changed the encoding
so row/plain is just one bit: FC FD.  Maybe this should be changed
however, to use one extra byte and use only the #xFF address for host
calls, instead of encoding host calls in the host address space.

Entry: Reliable ping
Date: Sat Nov 5 13:52:51 EDT 2011

When the target is sending stuff, i.e. in a print loop, it will look
as if it is responding to pings.  This means the pings are too
simple.  It still gets stuck sometimes:

  foo
  01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 10 11
  12 13 14 15 16 17 C-c C-c
  Command "foo" interrupted.
  Trying cold restart...
  target-off
  target-on
  recv-header: malformed header: (0)

but this seems to be due to a stack underflow on the device, using:

  : foo 1 + dup dump foo ;

With a correct def it doesn't behave like that:

  : foo 0 begin 1 + dup dump again ;

I do get this error still:

  bad-reply-id: 0 ()

That latter one seems to be recoverable by another retry.  This one
however seems to indicate that the pk2 is stuck:

  icsp-recv: pk2 read: expected 3 bytes, got 1:
  icsp-recv: b:2 h:#t a:#f -> (0)

Even the stat command doesn't give good results:

  stat
  (status
   (0 "Vdd GND") (1 "Vdd") (0 "Vpp GND") (0 "Vpp")
   (0 "VddError (Vdd < Vfault)") (0 "VppError (Vpp < Vfault)")
   (0 "Button Pressed") (0 "Reset since READ_STATUS")
   (0 "UART Mode") (0 "ICD transfer timeout/Bus Error")
   (0 "Script abort - upload full") (0 "Script abort - download empty")
   (0 "RUN_SCRIPT on empty script") (0 "Script buffer overflow")
   (0 "Download buffer overflow"))
  (voltages (0.000152587890625 "Vdd") (3.425 "Vpp"))
  subbytes: ending index 65 out of range [1, 64] for byte-string:
  #"@\254\0@\4\0 \376\200\3\1\0\304\37\200 \0\0\361\a$\b\0@\374\1\n\2\0\210?`A\0\0\361\a0\b\0@\374\1\r\2\0\20\177\200\203\0\0\3...

Maybe it would be good to find out how to reliably reset the pk2?

Entry: Record definitions from interactive session
Date: Sun Nov 6 09:07:23 EST 2011

It seems really hard to save the state due to the possibility of
macros.  So what about just logging the input to a file for each
successful compilation / macro def?  This requires a patch to the
interactive parser in live/commands.ss.  So I've added an eval log,
which seems most appropriate as it contains validated syntax.
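Something like the following is all that's needed on the logging
side; the parameter and function names are made up, the actual patch
lives in live/commands.ss:

  ;; Append each successfully evaluated console line to a log file,
  ;; so an interactive session can be replayed later.
  (define eval-log-file (make-parameter "session.log"))
  (define (log-eval! line)
    (with-output-to-file (eval-log-file)
      (lambda () (displayln line))
      #:exists 'append))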
Entry: PK2 slave?
Date: Sun Nov 6 10:56:35 EST 2011

I miss a non-interfering console log.  It would be nice if it were
possible to put the pk2 into slave clock mode, such that the pic
doesn't have any restrictions on how fast it can send the data.
Currently it seems to be problematic to do a bit-banged interface
with a slave clock on the pic when there is an ISR active, i.e. as
with the synth.  What about this:

- pic master clock, only 1-directional send.
- pk2 can interrupt pic by raising a particular line.

Looks like I2C and SPI are only supported with the PK2 set as master.
Without a gigantic hack this is probably not going to work..  What
about making an ICSP interrupt handler?  I looked at this before, but
it doesn't seem to be so simple because of the change-detect pins.
However, the on-chip debugger unit uses this approach.  Maybe the
right approach is just to implement that debugger interrupt[1].  The
real solution is a protocol that has no timing issues, meaning each
event needs to be acknowledged.  The simplest protocol I can think of
is here[3].  Can the PK2 do that?

[1] entry://../electronics/
[2] http://jaromir.xf.cz/hdeb/bdm/bdm.html
[3] entry://../electronics/20111106-131545

Entry: Synth: chaotic oscillator
Date: Sun Nov 6 10:59:39 EST 2011

I thought I had some kind of fake[1] chaotic oscillator running..
How did it work?  Looks like I lost some code somewhere..  Let's see
if I can reconstruct it with the current design.  This needs 3
features:

- main periodic oscillator
- a "resonance" which is the timeout from start -> end
- a random pulse that triggers the reso

This seems to conflict with the reso mixer, as the noise osc is OSC1
and not OSC0:

  OSC0 xor (OSC1 and OSC2)

This fixes OSC1 and OSC2 to be gate and reso, and OSC0 to be chaos.
Maybe the mixer should be changed to support this again?  It would
also be good to dig up the old code, since there have been quite some
archive changes over the years..

[1] entry://20071117-022751

Entry: Problem with synth + pk2 solved.
Date: Sun Nov 6 14:34:07 EST 2011

There were actually 2 problems: interrupts + PORTB config.  Culprit
was this:

  : init-ports-digital
      #x61 TRISB or!      \ analog 4 is RB0 + RB5-RB7 are digital in (RB7 = icd tx)
      INTCON2 RBPU low ;  \ enable weak pullups for port B (switches)

The following dump routine now works:

  : _dump_ cli dup dump sti ;

Entry: Documentation
Date: Mon Nov 7 11:01:04 EST 2011

The `pic18>' macro in the demo is broken, meaning that the
documentation won't generate properly.

Entry: pic18/demo
Date: Mon Nov 7 11:11:00 EST 2011

Needed to fix some code in pic18/demo that's used in the
documentation, after the unit refactoring early this year.

Entry: state machines
Date: Mon Nov 7 16:44:03 EST 2011

I'm thinking about what to do next with the synth.  The main problem
is that I don't have a good way to deal with state machines and
building dataflow networks, or at least event sending..  State
machines can be well represented using python-style coroutines, which
are essentially threads with only one cell, meaning that yield can
only be called at the entry level.  It would be nice to have some
global compiler support for doing jump table allocation for this
mechanism, since RAM is so expensive and most state machines can do
well with 256 states.  The problem then is how to find "islands of
control" that can use disjunct control points?  It might be best to
do this with an extra context, as we also need to enforce that yield
is not used in library routines.  Some notes (a toy model of the
yield mechanism follows the list):

- Islands of control: collect words that are separated by tail calls
  only.  Doing this automatically is possible, but would require it
  to be done on a low level.

- Explicit declaration might be better, something like: begin-sm
  ... end-sm.  This way the context can be stored in a threaded
  compiler state, and we can enforce that all calls inside such a
  section are actually tail calls.

- Python-style generators/coroutines do not allow calling `yield'
  from a nested context, which is exactly the point here, since we
  want to save memory and resource allocation hassle.  However, in
  Staapl, nesting that doesn't use any data state can still be done
  using macros, which would expand into multiple control points.

- It needs an extra compiler state.  It might be better to first make
  compiler states composable, or provide a way to attach some state
  for later extension.
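A toy Scheme model of what the compiled mechanism would amount to:
the whole "thread state" is one small index into a jump table of
control points, which is what makes it cheap in RAM.  All names here
are mine:

  ;; One-cell coroutine: yield stores a resume-point id, resume
  ;; dispatches through the jump table.
  (define state 0)
  (define (yield k) (set! state k))            ;; store resume point
  (define table
    (vector (lambda () (display "on ")  (yield 1))
            (lambda () (display "off ") (yield 0))))
  (define (resume) ((vector-ref table state)))

  ;; (resume) (resume) (resume)  => prints "on off on "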
Entry: The Staapl compiler is an Arrow
Date: Mon Nov 7 18:15:43 EST 2011

In spirit that is, not in interface.  It behaves as one, and it would
be great to be able to implement it as one.  Arrows in Haskell use
binary tuples for plumbing.  What I need is a mechanism that can
identify components by name.  This would kill modularity somewhat
(i.e. you can't arbitrarily nest the same types), but would make
things a lot easier to work with.  The other idea is that instead of
using names, as in what I'm looking for, or arbitrarily nested binary
tuples, as in the Haskell Arrow class, the states could be nested as
stacks also.  This might give a more appropriate "stacks of stacks"
approach.  Anyway, let's first summarize what there is at this time:

- Compilation stack: code compilation and partial evaluation
- Control stack: local control words: if, then, begin, again
- Macro stack: implements ';' in macro nesting.
- Chain stack: collects basic blocks.

The first 2 are called '2stack' and are defined in ...  The other 2
are defined in staapl/coma/state.ss as a struct derived from 2stack
with two extra components.  Let's see if I can add a coroutine stack
in there without too much trouble.  Maybe just adding a hash is
already more than enough to keep the extension in a module.

I've added an 'ext' field to the 'compiler' struct in comp/state.ss
and adjusted the matchers and constructors appropriately.  Seems to
work just fine.  Now how to access?  I see I did my best to go
creative with the `state-update' macro.  This uses the
`mu-lambda-struct' from vm.ss, a shorthand notation for machine
register updates, where assignment and matching are expressed by
`->'.  Maybe I should make a list called "goldmine of weird ideas" ;)

I'm having deja-vu now from the time when I wrote this.  Conclusion:
the state mechanism is already extensible through the struct
derivation method:

  stack -> 2stack -> compiler

These support automatic lifting / polymorphism, i.e. you can pass a
derived struct to a basic op and it will update only the derived
field.  So yes, it should be possible to use the extension mechanism
that's already there.  However, it would be difficult to do this
using the current modular mechanism, in that the state constructor
would need to be extended in a way that seems to interfere with the
modularity constraints..  So let's implement it as a hash table with
local names, i.e. not symbols but something that can be globally
unique.
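The "globally unique local name" part is easy to get with uninterned
symbols.  A sketch of what the tag accessors could look like; the
`state-tag-*' names match the ones used in the SCR code below, but
the implementation here is a guess, and `compiler-ext' /
`compiler-update-ext' are assumed struct accessors:

  ;; Sketch: compiler state extension as an immutable hash keyed by
  ;; uninterned symbols.  An uninterned symbol is eq?-unique, so two
  ;; modules using the same printed name cannot collide.
  (define (state-tag-label name)
    (string->uninterned-symbol (symbol->string name)))

  (define (state-tag-ref state tag)
    (hash-ref (compiler-ext state) tag #f))

  ;; Curried, like the other state transformers: returns state -> state.
  (define (state-tag-set tag val)
    (lambda (state)
      (compiler-update-ext state
        (hash-set (compiler-ext state) tag val))))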
Entry: General cleanup
Date: Tue Nov 8 13:12:58 EST 2011

- Error messages are horrible.  Some examples:
  - duplicate definition -> where's the first one?
  - non-null-compilation-stack: (#) -> say where the context is
    opened, i.e. the location of the non-matched "for"
- .f files are not in the dependencies for compilation of a .fm
  module.
- ! and @ only work for macros.  How to fix that?

Entry: Some sounds
Date: Tue Nov 8 13:47:44 EST 2011

Can play with this:

  : 2execute \ lo hi --
      push TOSH ! TOSL ! ;
  : play
      init-board engine-on 2execute engine-off ;

  \ DEMO
  3 ' z2 play
  30 ' z1 play
  20 ' z0 play
  ' wioew play
  ' rrxmod play
  1 ' nzwioew play
  ' iiuu play
  ' woe play
  3 ' iiuwoe play
  128 ' pattern-sequencer play

Entry: Where's the other code?
Date: Tue Nov 8 14:41:46 EST 2011

I had some other code with synth patches and a sequencer.  Where did
it go?  I found them in the brood-4 archive.  Looks like I just
forgot some files when porting from 4 to 5 last time.  So I copied
some files over.  Looks like demo.f is what was used for the Piksel
performance, and it is more advanced than what is in synth-control.f.
I also found the bassdrum / hihat stuff in sounds.f.  And indeed,
after a couple of minor changes, it still works!

  128 ' pattern-sequencer play

Entry: PK2 sync
Date: Wed Nov 9 00:47:15 EST 2011

I wonder if it would be possible to change the sync part to a
level-triggered interrupt, maybe using the same protocol as is used
in the ICD, which is to cause a 1->0 transition on the RB6 pin
(clock).  This means that we need to keep the bit high in idle, and
lower it to signal an interrupt.  I think this even works with the
current protocol:

  pulse 0:  target writes 1 to ack, otherwise line pulls low.
  pulse 1:  target releases bus
  pulse 2+: host writes packet, then waits for reply

So instead of expecting a reception after the first pulse, the host
will poll until it sees a 0->1 transition from the target, which
acknowledges the interrupt.  From then on, the fast protocol can be
switched on.

Entry: Serializing thread state
Date: Wed Nov 9 10:40:38 EST 2011

I'm thinking about the shallow coroutine (SCR) approach to encoding
code that's formulated using recursion into a compact state-machine
representation.  Currently there are 2 ways to go about this:

- Implement 1-SCR using state ID allocation, jump table generation
- On top of this, implement n-SCR by control flow analysis.

Maybe the flow analysis really isn't necessary.  As long as the
"recursion" is factored out as macros, there is no real issue, since
this can just use linear allocation of control point IDs.  The code
turns out to be quite small and straightforward.  It is amazing how
such a minimal change can have such a big impact.

Entry: Wow
Date: Wed Nov 9 15:08:55 EST 2011

I'm losing it.. going too creative with inventing new syntax...  What
I want to do is:

- make `make-target-label' available to Scheme code, to make it
  easier to construct control flow abstractions in Scheme using
  lexical scope, instead of using m stack juggling.

- make some syntax for local binding of the compiler state variable,
  to make extended compiler state available as lexical bindings.  The
  reason is that the following is too verbose:

    (compositions (macro) macro:
      (word ,(lambda (state) ((macro: ......) state))))

- singleton coroutine or multiple instances?  (where to bind the
  state var?)

  ;; The SCR state is accessible through a compiler state extension tag.
  (define scr-tag (state-tag-label 'scr))

  ;; STATE REP
  (define-struct scr-context (var yield))
  (define-struct scr (ctx labels))

  ;; ACCESSORS
  (define (ref s)     (state-tag-ref s scr-tag))
  (define (ctx s)     (scr-ctx (ref s)))
  (define (yield s)   (scr-context-yield (ctx s)))
  (define (var s)     (scr-context-var (ctx s)))
  (define (labels s)  (scr-labels (ref s)))
  (define (next-id s) (length (labels s)))

  ;; MUTATORS, all curried to produce state transformers.

  ;; Add label to SCR jump table state.
  (define (set v) (state-tag-set scr-tag v))
  (define (update label)
    (lambda (s)
      (match (ref s)
        ((struct scr (ctx labels))
         ((set (make-scr ctx (cons label labels))) s)))))
  (define (init-state var)
    (set (make-scr (make-scr-context var (make-target-label)) '())))

  ;; Convenience: bring state in lexical context of macro, and apply
  ;; macro to state.
  (define-syntax-rule (let/state state macro)
    (lambda (state) (macro state)))

  ;; Bound to macro:  This has a colon as ad-hoc separator to
  ;; indicate that state is a binder.
  (define-syntax macs
    (syntax-rules (:)
      ((_ state : . words)
       (let/state state (macro: . words)))))
Entry: Shallow Coroutine State
Date: Thu Nov 10 08:22:03 EST 2011

So the question is: where to store the address of the state variable?
If it is not a singleton coroutine, it can't be in the code itself,
only in some wrapper word, or passed explicitly to the SCR
entry/resume point.  What about storing it on the r stack?

Entry: Conditionally compiling target words
Date: Thu Nov 10 08:34:53 EST 2011

Macros are free.  Nobody cares if they're there, except for the
namespace pollution.  Words are not, so how to solve the problem of
conditionally compiling some target code that supports a macro
collection?  Looks like I already need this functionality for '!'.

Entry: Python coroutines
Date: Thu Nov 10 23:00:35 EST 2011

A generator suspend is a return, but without a decref of the stack
frame, so it includes arguments and locals.

[1] http://mail.python.org/pipermail/python-dev/1999-July/000467.html

Entry: Fix documentation
Date: Sat Nov 12 12:08:50 EST 2011

Find out how to build it outside of racket.  This generates
file:///home/tom/staapl/staapl/scribblings/staapl.html :

  cd ~/staapl/staapl/scribblings
  scribble staapl.scrbl

Things to fix:

  > (macro +)
  (word # '((((qw a) (qw b) +) ((qw (tv: a b +))))
            (((addlw a) (qw b) +) ((addlw (tv: a b +))))
            (((qw a) +) ((addlw a)))
            (((save) (movf a 0 0) +) ((addwf a 0 0)))
            ((+) ((addwf POSTDEC0 0 0)))))

Not such a big deal, but I don't see why this isn't just
#state->state.

  > (print-code (macro: add))
  reference to undefined identifier: POSTDEC0

POSTDEC0 is not exported by demo.ss.  It is in the pic18/sig.ss
signature module, signature pic18-const-id^.

  > (target-value->number (tv: 1 2 +))
  reference to undefined identifier: tv:

I moved the definition from pic18-macro-unit.ss to target-scat.ss.

  (define stx1 #'(rpn-lambda
                  (macro-push 1)
                  (macro-push 2)
                  (scat-apply (macro +))))

  > (pretty-expand stx1 expand-once)
  (rpn-lambda (macro-push 1) (macro-push 2) (scat-apply (macro +)))

  > (pretty-expand stx1)
  eval:53:0: macro-push: (macro-push 1) did not match pattern
  (macro-push val p sub) in: (macro-push 1)

This was a missing provide of "rpn.ss" ids in "demo.ss".  Looks like
it's done.

Entry: Documenting the new Forth -> Sexpr compiler
Date: Sat Nov 12 12:38:03 EST 2011

Let's first do this based on the macro stepper.  Hmm..  Looks like I
need to look at the source to see how this works.

Entry: Macros don't support quote
Date: Sat Nov 12 13:17:54 EST 2011

This doesn't work:

  (let ((p ....))
    (forth-begin path ,p))

The problem is in macro-forth-sig.ss.  I have a phase problem.  How
to get the string "pic18", represented by a value, injected into the
body of the macro?

  (define-syntax forth-begin
    (lambda (stx)
      (syntax-case stx ()
        ((_ . code)
         #`(begin
             (mf:forth-begin
              path #,(build-path (home) "pic18") ;; library path
              . code)
             (mf:compile!))))))

The word after 'path' is always parsed as a literal.  What I need is
a form like:

  "asdfasdf" path!

Hmm.. then even still.. it's a weird kind of phase mixing!  Find out
what the real problem is here.  It looks like this really needs to be
inserted as a compile-time entity, not a run-time entity.  That's why
it can't be done in the signature definition: macros in a sig def can
only depend on identifiers in the signature.  The underlying reason
for this needing to be a compile-time entity is that the file search
path representation is also a compile-time entity, so the let form at
the top of this post is meaningless: the binding doesn't exist at
expansion time.  I fixed the issue by moving the '(library "pic18")
form to pic18/lang.ss, and added a stub to insert that term in the
live/command.ss interpreter by means of `forth-begin-prefix', which
can insert the same.  Now forth-begin is generic.
Entry: Next?
Date: Sat Nov 12 15:21:21 EST 2011

Done for now:

- broken doc fixed
- removed some dead code after the "forth as unit" refactoring
- removed pic18 reference in the generic forth macro parser

Next?

- pulling in word defs when a macro is used
- ! and @
- SCR

Entry: Pulling in word defs
Date: Sat Nov 12 18:12:25 EST 2011

There is currently no dead code elimination for globally defined
words.  Maybe this should be done differently.  The current ad-hoc
way of building control flow graphs is a bit of a hack..  It also
doesn't include jump tables.  Maybe the (lack of) language semantics
is just a bit too low level.  So, how can I implement this without
getting into muddy waters?  One way is to build an extended
compilation state that records a reference to the word (present /
not), and adds a definition to another word chain when it doesn't
find a definition, just like org-begin and org-end do.

Entry: Property based testing
Date: Sat Nov 12 18:41:36 EST 2011

How to make property based testing work for the peephole optimization
rewriting?  A sketch of the idea is below.
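The obvious property is semantics preservation on random
straight-line programs.  A sketch, where `run' (a stack-machine
evaluator) and `opt' (the peephole pass) are hypothetical stand-ins
for the real entry points; a real version would also have to handle
stack underflow in the generated programs:

  ;; Generate random straight-line stack programs and check that the
  ;; optimizer doesn't change their meaning.
  (define ops '(1 2 3 dup drop +))

  (define (random-program n)
    (for/list ((i (in-range n)))
      (list-ref ops (random (length ops)))))

  (define (check-peephole (size 20))
    (let ((p (random-program size)))
      (unless (equal? (run p) (run (opt p)))   ;; run, opt: hypothetical
        (error 'check-peephole "counterexample: ~s" p))))

  ;; e.g. (for ((i (in-range 1000))) (check-peephole))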
- PIC
- AVR
- MSP430: only MSP430F2013

Looks like MSP430 is not really worth it: only one "sugar me" DIP
package..

Entry: State machines & register allocation
Date: Wed Jan 11 10:59:17 EST 2012

I've been playing with Haskell a bit lately, writing a state machine
compiler (essentially SSA-form code without recursion: the original
form is only letrec-style mutual tail recursion).  This makes me think
about a certain disconnect between the approach used in Staapl
(stacks = local variables) and the state machine approach that's so
useful for embedded systems.

There isn't really a good way in Staapl to make this kind of thing
work well, due to the lack of structure namespaces, i.e. separating
the 'instance' and 'member' concepts.

All the talk of protocol-oriented programming does seem to have
something underlying it (minimise state in data processing by focusing
on streams), but in practice data structures tend to be
buffer-oriented, not stream-oriented, probably due to a bias of the C
programming language.  Stream-oriented work requires protocol design.
If the protocol is fixed, the minimal memory usage is limited by
(ad-hoc) protocol decisions and random data dependencies.

Concretely: I'd like to solve the USB protocol parser problem in a
smart way, but this really requires either buffering and structure
(namespace) support, or a botched stream-oriented approach that tries
to work around the limitations of the protocol.

I was thinking that *if* the intention is to compile down to a bunch
of global variables, then it's possible to just use a modified kind of
macro.  The Staapl macros have lambda, and I bet it isn't so hard to
turn this into global memory references in an instantiation phase:

- function abstraction: name binding only: associate each argument
  with a "var @" macro.
- function calls: pop arguments from the stack and store them into the
  associated variables.  This requires knowledge of arity, but that
  could be recorded at definition time.

This requires a "lazy argument" macro, where each argument is
interpreted as an inline function instead of a literal value.  A macro
like

  : foo-macro a b c | ... ;

would be instantiated to a function foo-code by binding its arguments
to variables:

  : foo-code [ v1 @ ] [ v2 @ ] [ v3 @ ] foo-macro

and each call to 'foo' would be implemented as a macro or function

  : foo v3 ! v2 ! v1 ! foo-code ;

The purpose of this approach is to isolate naming (just lambda
arguments) from storage allocation.  This technique is probably useful
in general to allow "flattening" of stack allocation.

[1] entry://../meta

Entry: Next?
Date: Fri May 4 18:38:18 EDT 2012

Been a while.  It's starting to itch again.  Looking at those USB
connectors in my rack here..  Maybe it's time for another pass.  Get
the USB driver going and make a synth controller.  Why in that order?
The USB would seriously simplify interface issues, and is long
overdue.  I wonder what Staapl could have been were it not for the
failure to have a working USB interface.

However, the reasons for that impasse are quite deep, seeded in doubt
about "the right approach".  I still like the idea of an untyped macro
language, but currently it lacks something that is completely trivial
in C: hierarchical data structure namespaces.

There is still the idea of "protocol oriented programming" or the
"minimal complexity stream parser approach", whatever name it should
bear, but unfortunately that doesn't work so well with existing
protocols like USB, which are based on random access to flat memory
buffers, an approach that is quite biased towards C.

So, to summarize:
- formalize "minimal complexity stream parser protocol design"
- implement hierarchical namespaces
- get that damn USB driver to work

Reading the previous post, it seems that at least the idea of
protocol-oriented programming is a bit stable.  Maybe time to figure
out if it actually makes sense practically.  Theoretically at least I
see a whole bunch of complexity disappear if protocols are designed
better, or even automatically.

And the solution for adding namespaces had crossed my mind too:
there's lambda to introduce local names.  Much more is not needed;
nesting those will do just fine (a sketch of the flattening below).
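A sketch of that flattening, nothing Staapl-specific: a "namespace" is
just a name prefix, and nesting bottoms out in byte sizes and absolute
addresses.  The layout and the base address here are invented for the
example:

  (define (flatten-fields base prefix fields)
    (let loop ((fs fields) (off 0) (acc '()))
      (if (null? fs)
          (values off (reverse acc))
          (match (car fs)
            ((list name (? number? size))      ; leaf: size in bytes
             (loop (cdr fs) (+ off size)
                   (cons (cons (format "~a~a" prefix name) (+ base off))
                         acc)))
            ((list name (? list? sub))         ; nested scope: recurse
             (let-values (((sz entries)
                           (flatten-fields (+ base off)
                                           (format "~a~a." prefix name)
                                           sub)))
               (loop (cdr fs) (+ off sz)
                     (append (reverse entries) acc))))))))

  (define-values (total-size env)
    (flatten-fields #x060 ""
      '((setup ((request-type 1) (request 1)
                (value 2) (index 2) (length 2)))
        (state 1))))

  env
  ;; => (("setup.request-type" . 96) ("setup.request" . 97)
  ;;     ("setup.value" . 98) ("setup.index" . 100)
  ;;     ("setup.length" . 102) ("state" . 104))

In Staapl each pair would then become a "addr @ / addr !" macro pair;
the lambda binder provides the local names, this provides the storage.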
Entry: Forth is compression
Date: Sun May 6 08:47:13 EDT 2012

- compression of code size and temporary data storage (short-lived
  values) through implicit operand access.
- compression takes effort: there's no free lunch.  Writing Forth
  takes more time because it forces "good structure".  Writing "bad
  Forth" is very obvious: code size explodes.
- this idea can be taken far: stream-oriented programming: state
  machines and small tasks.

Entry: Stream-oriented machine
Date: Mon May 14 17:28:48 EDT 2012

I found a use case for it, so let's set up the basic architecture.
Instructions:

  LIT ( -- n )          copy byte from instruction stream to stack
  CPY ( n -- ... )      copy n bytes from instruction stream to stack
  EXC ( .. addr n -- )  execute C ABI
  LOK ( n -- addr n )   lookup addr/nb32bitargs

  LIT 1  LIT 4 CPY 1 2 3 4  LIT 1 LOK

Then, it would be nice to be able to define a shortcut like this:

  DOFUN ....

This way it's possible to get the forth machine through the company
management, making it do something useful first, and then extending it
with all kinds of Forth goodies ;)

Can this use the trick of loading the return stack with opcodes?  If
opcodes are both primitives and calls, this might work.

Entry: Executing a word by pushing its instructions on the return stack
Date: Mon May 14 18:13:28 EDT 2012

Why do I need it?  This way I don't need to implement code threading,
just the loading of literals on a stack.  What does this require?  A
union type that represents both primitives and sequences.  Probably,
sequences need to be abstracted.

No, what this changes is explicit "exit", where "exit" re-loads the
return stack.  Basically I have this idea in my head that the return
stack is really just the continuation, which represents an infinite
list of instructions (that might be non-deterministic, i.e. it
branches).  The return stack is then a "cache" for the head of this
list.

So, instead of using an instruction pointer, what actually happens is
that the last *PRIMITIVE* instruction on the RS will re-populate the
RS: code can only be executed from the RS, but can reside anywhere,
abstracted by the particular code representation.  That's it!
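That fits in a few lines of Scheme.  A toy model, names invented: a
word is a list of instructions, "calling" a word splices its body onto
the front of the RS, and primitives are procedures on the data stack.
There is no instruction pointer and no explicit exit; the end of a
spliced body is simply the rest of the RS:

  (define (run rs ds dict)
    (if (null? rs)
        ds
        (let ((i  (car rs))
              (rs (cdr rs)))
          (cond
            ((procedure? i) (run rs (i ds) dict))          ; primitive
            ((symbol? i)    (run (append (hash-ref dict i) rs)
                                 ds dict))                 ; call = refill RS
            (else           (run rs (cons i ds) dict)))))) ; literal

  (define dict
    (hash 'dup    (list (lambda (ds) (cons (car ds) ds)))
          '*      (list (lambda (ds) (cons (* (car ds) (cadr ds))
                                           (cddr ds))))
          'square '(dup *)))

  (run '(5 square) '() dict)  ; => '(25)

Note that tail calls come out right for free: a call in tail position
just replaces the (empty) remainder of the RS.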
Entry: Input stream vs parameter stack
Date: Mon May 14 19:52:05 EDT 2012

There's always this tension between prefix commands and postfix
commands.  The pain is that moving from prefix (input) to postfix
(parameter stack) reverses the order.  I.e.

  PRE:   call n fn a1 ... an
  POST1: a1 ... an fn n call
  POST2: an ... a1 fn n call

The order of a1 ... an doesn't matter so much in the last call, so it
can be assumed that the whole command is fully reversed.

So what is the problem I'm solving, really?  Why is this always such a
problem in Forth?  Are parsing words really that essential?  I mean,
there is the interplay between RS and DS, but it seems there's a
similar thing going on between the input stream and DS.  Is Forth
really a 3-stack machine, or a 2-stack, 2-stream machine - console
input and threaded code stream, which are also *very* similar?

Entry: Losing >r
Date: Tue May 15 21:38:17 EDT 2012

Loading instructions on the return stack causes words like >r and r>
to no longer work.  So is this a good idea then?

Entry: Notes about starting up
Date: Thu May 17 12:45:14 EDT 2012

Notes:
- Don't connect more than one PK2.  Later: allow multiple.
- Overall, it boots just fine: connect PK2 and run "make xxx.live"

Entry: Datastructures (named offsets / addresses)
Date: Thu May 17 12:50:00 EDT 2012

For getting USB to work, the first problem is handling datastructures.
The idea that came out of previous notes is to solve the namespace
problem by using positional pattern matching: bind macros to fields.
I.e. like in Haskell:

  dataGet1 (DataStructure d1 d2 d3) = d1

Note that accessors in Haskell also don't use hierarchical namespaces.
Everything is solved with modules.

Underlying idea: the power of Staapl is in the macro system, which is
essentially Scheme.  The question is then: how to represent it?  This
might need references to datastructures on the macro level, which are
eventually flattened to raw memory accesses.  Basically, this is a
wrapper around:

  0 hard-coded addresses (static memory)
  1 indirect objects

On the PIC18 the indirect case would be implemented using the "a"
register.  The question is then how to avoid contention with
operations that use "a" directly.

This is a bunch of loose ideas that doesn't quite fit together clearly
yet.  Let's go back to the basic app.

Entry: USB stuff
Date: Thu May 17 13:13:50 EDT 2012

Using M-x staapl-usb from tools.el:

  load usb.f
  include "/home/tom/pub/darcs/brood-5/staapl/pic18/usb.f"
  include "/home/tom/pub/darcs/brood-5/staapl/pic18/debug.f"
  .........................................................................OK

I believe that's the last effort.

  init-usb OK

  $ sudo tail -f /var/log/syslog
  May 17 13:14:49 zoo kernel: [ 9866.365107] usb 4-1.4: new full speed USB device using ehci_hcd and address 7
  May 17 13:14:50 zoo kernel: [ 9866.441358] usb 4-1.4: device descriptor read/64, error -32
  May 17 13:14:50 zoo kernel: [ 9866.616742] usb 4-1.4: device descriptor read/64, error -32
  May 17 13:14:50 zoo kernel: [ 9866.793137] usb 4-1.4: new full speed USB device using ehci_hcd and address 8
  May 17 13:14:50 zoo kernel: [ 9866.869508] usb 4-1.4: device descriptor read/64, error -32
  May 17 13:14:50 zoo kernel: [ 9867.049510] usb 4-1.4: device descriptor read/64, error -32
  May 17 13:14:50 zoo kernel: [ 9867.225507] usb 4-1.4: new full speed USB device using ehci_hcd and address 9
  May 17 13:14:51 zoo kernel: [ 9867.632247] usb 4-1.4: device not accepting address 9, error -32
  May 17 13:14:51 zoo kernel: [ 9867.705418] usb 4-1.4: new full speed USB device using ehci_hcd and address 10
  May 17 13:14:51 zoo ntpd[2365]: adjusting local clock by -6.269105s
  May 17 13:14:51 zoo kernel: [ 9868.112332] usb 4-1.4: device not accepting address 10, error -32
  May 17 13:14:51 zoo kernel: [ 9868.112856] hub 4-1:1.0: unable to enumerate USB device on port 4

I get -71 (EPROTO) on the old Dell machine and -32 (EPIPE) on the new
amd64 host.  I'm starting to worry that it doesn't work on USB 1.1.
Entry: maybe different approach better
Date: Thu May 17 17:52:14 EDT 2012

  broebel:~# mount -t debugfs none_debugs /sys/kernel/debug
  broebel:~# modprobe usbmon
  broebel:~# cat /sys/kernel/debug/usb/usbmon/1u > /tmp/usb1.log

[1] http://www.makestuff.eu/wordpress/?p=2537
[2] http://www.mjmwired.net/kernel/Documentation/usb/usbmon.txt

Entry: Analyzing USB traffic
Date: Fri May 18 11:41:50 EDT 2012

As usual, the effort should go to *effective* debug tools to just see
what's going on.  Then fixes are trivial.  The problem is observation.

Next:
- Event log on PIC, maybe in flash?
- Interpret USB debugging [1]

[1] http://www.mjmwired.net/kernel/Documentation/usb/usbmon.txt

Entry: Buffered console log
Date: Fri May 18 11:58:00 EDT 2012

What about this:
- Make emit buffered
- Before each handshake, dump out the emit buffer first.

Having 3 kinds of replies might be a bit too much: emit, hexdump,
non-formatted hexdump.  Also, it doesn't work everywhere... let's
clean it up a bit.  I've commented it out for now..

Main problems encountered today:
- the protocol is very ad-hoc and probably needs a central place of
  documentation
- I don't have a simple way of using fifos

Entry: USB debugging
Date: Fri May 18 14:12:17 EDT 2012

Taking only a single URB.  From the usbmon docs, the fields are:

  URB tag     e.g. c40c7200

  T           Timestamp

  Event Type  This refers to the format of the event, not the URB
              type.  Available types are: S - submission,
              C - callback, E - submission error.

  ADDR        "Address" word (formerly a "pipe").  It consists of four
              fields, separated by colons: URB type and direction, Bus
              number, Device address, Endpoint number.  Type and
              direction are encoded with two bytes in the following
              manner:
                Ci Co   Control input and output
                Zi Zo   Isochronous input and output
                Ii Io   Interrupt input and output
                Bi Bo   Bulk input and output
              Bus number, Device address, and Endpoint are decimal
              numbers, but they may have leading zeros, for the sake
              of human readers.

  S           Status word.  This is either a letter, or several
              numbers separated by colons: URB status, interval,
              start frame, and error count.

  SETUP       Setup packet, if present, consists of 5 words: one each
              for bmRequestType, bRequest, wValue, wIndex, wLength, as
              specified by the USB Specification 2.0.  These words are
              safe to decode if the Setup Tag was 's'.  Otherwise, the
              setup packet was present but not captured, and the
              fields contain filler.

  URB      TIME       T TD:B:DEV:E S  RT RQ VAL  INDX LEN
  ----------------------------------------------------------------
  c40c7200 2500976461 S Ci:1:001:0 s  a3 00 0000 0001 0004 4 <
  c40c7200 2500976494 C Ci:1:001:0 0  4 = 01010100

First line is the input request.

  RT a3 = 1010 0011   dir=dev->host, type=class, recp=other
  RQ 00 = GET_STATUS

bmRequestType:

  D7     Data Phase Transfer Direction
          0 = Host to Device
          1 = Device to Host
  D6..5  Type
          0 = Standard
          1 = Class
          2 = Vendor
          3 = Reserved
  D4..0  Recipient
          0 = Device
          1 = Interface
          2 = Endpoint
          3 = Other
          4..31 = Reserved

Second line.  Hmm... doesn't correspond to [1]:

  RT   = 1000 0000b
  RQ   = GET_STATUS (0x00)
  VAL  = Zero
  INDX = Zero
  LEN  = Two

I need a working starting point...  I moved to wireshark: it does some
parsing, which makes things more clear.
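A quick console helper for reading the RT column in traces like the
one below; this is just the bmRequestType table above transcribed into
Racket:

  (define (decode-bmRequestType b)
    (list (if (bitwise-bit-set? b 7) 'device->host 'host->device)
          (vector-ref #(standard class vendor reserved)
                      (bitwise-and (arithmetic-shift b -5) 3))
          (case (bitwise-and b #x1f)
            ((0) 'device) ((1) 'interface) ((2) 'endpoint) ((3) 'other)
            (else 'reserved))))

  (decode-bmRequestType #xa3) ; => '(device->host class other)
  (decode-bmRequestType #x80) ; => '(device->host standard device)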
URB TIME T TD:B:DEV:E S RT RQ VAL INDX LEN comment -------------------------------------------------------------------------------- c40c7200 2500976461 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < GET_STATUS c40c7200 2500976494 C Ci:1:001:0 0 4 = 01010100 c40c7200 2500976511 S Co:1:001:0 s 23 01 0010 0001 0000 0 c40c7200 2500976522 C Co:1:001:0 0 0 c40c7200 2500976533 S Ci:1:001:0 s a3 00 0000 0002 0004 4 < c40c7200 2500976543 C Ci:1:001:0 0 4 = 00010000 c40c7200 2501080613 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501080638 C Ci:1:001:0 0 4 = 01010000 c40c7200 2501080689 S Co:1:001:0 s 23 03 0004 0001 0000 0 c40c7200 2501080703 C Co:1:001:0 0 0 c40c7200 2501136463 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501136503 C Ci:1:001:0 0 4 = 03010000 c40c7200 2501192458 S Co:1:001:0 s 23 01 0014 0001 0000 0 c40c7200 2501192480 C Co:1:001:0 0 0 c40c7200 2501192538 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501213484 C Ci:1:000:0 -75 0 c40c7200 2501213573 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501217471 C Ci:1:000:0 -71 0 c40c7200 2501217550 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501221470 C Ci:1:000:0 -71 0 c40c7200 2501221552 S Co:1:001:0 s 23 03 0004 0001 0000 0 c40c7200 2501221569 C Co:1:001:0 0 0 c40c7200 2501276456 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501276498 C Ci:1:001:0 0 4 = 03010000 c40c7200 2501332450 S Co:1:001:0 s 23 01 0014 0001 0000 0 c40c7200 2501332472 C Co:1:001:0 0 0 c40c7200 2501436471 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501440425 C Ci:1:000:0 -71 0 c40c7200 2501440517 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501443429 C Ci:1:000:0 -71 0 c40c7200 2501443502 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501447427 C Ci:1:000:0 -71 0 c40c7200 2501447501 S Co:1:001:0 s 23 03 0004 0001 0000 0 c40c7200 2501447517 C Co:1:001:0 0 0 c40c7200 2501500454 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501500496 C Ci:1:001:0 0 4 = 03010000 c40c7200 2501556451 S Co:1:001:0 s 23 01 0014 0001 0000 0 c40c7200 2501556473 C Co:1:001:0 0 0 c40c7200 2501660468 S Co:1:001:0 s 23 01 0001 0001 0000 0 c40c7200 2501660497 C Co:1:001:0 0 0 c40c7200 2501660561 S Co:1:001:0 s 23 03 0004 0001 0000 0 c40c7200 2501660573 C Co:1:001:0 0 0 c40c7200 2501716479 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501716528 C Ci:1:001:0 0 4 = 03010000 c40c7200 2501772449 S Co:1:001:0 s 23 01 0014 0001 0000 0 c40c7200 2501772470 C Co:1:001:0 0 0 c40c7200 2501772530 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501776436 C Ci:1:000:0 -71 0 c40c7200 2501777147 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501779365 C Ci:1:000:0 -71 0 c40c7200 2501779520 S Ci:1:000:0 s 80 06 0100 0000 0040 64 < c40c7200 2501783371 C Ci:1:000:0 -71 0 c40c7200 2501783526 S Co:1:001:0 s 23 03 0004 0001 0000 0 c40c7200 2501783548 C Co:1:001:0 0 0 c40c7200 2501836458 S Ci:1:001:0 s a3 00 0000 0001 0004 4 < c40c7200 2501836500 C Ci:1:001:0 0 4 = 03010000 [1] http://www.beyondlogic.org/usbnutshell/usb6.shtml Entry: USB debugging with wireshark Date: Fri May 18 15:16:34 EDT 2012 First thing I see are messages to 1d6b:0001 Linux Foundation 1.1 root hub What is this? What's that "Linux Foundation" business? Looking into drivers/usb/core/hcd.c it seems that the root hubs are emulated. I don't find any direct explanation but it seems that this is to share code between the host controller and hubs, by exposing a HC as a hub. Note: For a USB device, all traffic on the wire is directed to one device. 
For a USB bus (which is what wireshark sees) there are multiple
devices; the URB distinguishes between them.  I have my device
connected as the only device on the test machine, but there is still
the hub traffic to ignore on the bus.

How to set a wireshark usb[1] filter?

  usb.urb_id == 0xc40c7800

Looks like the first couple of transfers are handled by the PIC
hardware.  I don't recognize them, and it seems neither does
wireshark, as they are not parsed ("Application Data").  The first
properly parsed message is a GET DESCRIPTOR device.  The ones before
that have bmRequestType

  23  h->d, class, other
  a3  d->h, class, other

First non-parsed bytes in those packets: 0,1,0,0,3,0,1.  This should
be the bRequest field.  From the names in [2] these seem to be
"physical requests".  It doesn't seem that they actually make it to
the firmware.  The first bmRequestType:bRequest I see on the device is
80:06, which is the DEVICE request GET DESCRIPTOR.

So it seems:
- physical requests can be ignored
- next: reply to GET DESCRIPTOR
- next: fix logging issues (either RAM buffer or TTL serial port)

[1] http://www.wireshark.org/docs/dfref/u/usb.html
[2] http://www.compsys1.com/support/usb/pic_code/HIDCLASS.ASM

Entry: Old USB code
Date: Fri May 18 16:30:17 EDT 2012

The last reference of what I was doing with USB is [1].  This
triggered a bunch of problems: emit -> PK2 stuff that eventually ended
in me getting bored/disgusted/...

I see, the old code is in _usb.f

AHA.  Now I remember.  The old USB code used FSR2-relative addressing
(the a reg), which is a problem for my Forth library because it's
essentially a different machine to manage.  That's why I switched to a
"stream" approach.

Next:
- send dummy device reply + check on sniffer
- get the "struct compiler" back online

[1] entry://20110407-002404

Entry: Speed vs abstraction: current object?
Date: Fri May 18 16:46:31 EDT 2012

While this stream-oriented approach does seem to work a little bit,
I'm not sure that using the 'a' register for this is a good idea.  Or
'a' should be something like a current-object pointer, which means
this only works for highly coupled code.

Entry: Can the usbmon be trusted?
Date: Fri May 18 19:31:01 EDT 2012

According to the USB dump there is a reply to the device request, but
I'm not sending anything from the uC.  I just get "Malformed Packet:
USB" and 24 bytes.  Maybe it doesn't get past the host controller?

The time difference (- 0.246596 0.238753) is 7.8ms.  Something isn't
right..  Let's go back to the raw capture node.

Entry: Today's trouble
Date: Fri May 18 19:57:50 EDT 2012

- It doesn't seem that usbmon/wireshark can be trusted to say what
  actually goes over the wire.  Either I'm not getting it, or wrong
  data doesn't make it to the PC.
- PK2 is unstable: it can get stuck, requiring a reset.  Maybe I
  should just accept standard debug tools, otherwise I'm never going
  to get anything done.
- Overall it feels too complex.  I'm tempted to start over.

Entry: usbmon
Date: Fri May 18 20:10:06 EDT 2012

Let's just work with the raw usbmon stuff:

  cfbc4500 2426908079 S Ci:1:000:0 s 80 06 0100 0000 0040 64 <
  cfbc4500 2426915987 C Ci:1:000:0 -75 0
  cfbc4500 2426916076 S Ci:1:000:0 s 80 06 0100 0000 0040 64 <
  cfbc4500 2426918960 C Ci:1:000:0 -71 0

So there is no trace of any reply packets.  Seems wireshark's display
was bogus indeed.
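Since it's back to reading raw usbmon lines, a throwaway parser for
the text format documented above helps; only the address word is split
out, the rest stays a string:

  (define (parse-usbmon line)
    (match (regexp-match
            #px"^(\\S+)\\s+(\\d+)\\s+([SCE])\\s+([CZIB])([io]):(\\d+):(\\d+):(\\d+)\\s+(.*)$"
            line)
      ((list _ tag ts ev ty dir bus dev ep rest)
       `((tag   . ,tag)
         (time  . ,(string->number ts))
         (event . ,ev)
         (type  . ,ty)
         (dir   . ,dir)
         (bus   . ,(string->number bus))
         (dev   . ,(string->number dev))
         (ep    . ,(string->number ep))
         (rest  . ,rest)))
      (_ #f)))

  (parse-usbmon "cfbc4500 2426908079 S Ci:1:000:0 s 80 06 0100 0000 0040 64 <")
  ;; => ((tag . "cfbc4500") (time . 2426908079) (event . "S")
  ;;     (type . "C") (dir . "i") (bus . 1) (dev . 0) (ep . 0)
  ;;     (rest . "s 80 06 0100 0000 0040 64 <"))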
  75 == EOVERFLOW
  71 == EPROTO

I found this in ohci.h:

  /* map OHCI TD status codes (CC) to errno values */
  static const int cc_to_error [16] = {
          /* No  Error  */ 0,
          /* CRC Error  */ -EILSEQ,
          /* Bit Stuff  */ -EPROTO,
          /* Data Togg  */ -EILSEQ,
          /* Stall      */ -EPIPE,
          /* DevNotResp */ -ETIME,
          /* PIDCheck   */ -EPROTO,
          /* UnExpPID   */ -EPROTO,
          /* DataOver   */ -EOVERFLOW,
          /* DataUnder  */ -EREMOTEIO,
          /* (for hw)   */ -EIO,
          /* (for hw)   */ -EIO,
          /* BufferOver */ -ECOMM,
          /* BuffUnder  */ -ENOSR,
          /* (for HCD)  */ -EALREADY,
          /* (for HCD)  */ -EALREADY
  };

But the controller is UHCI.  The other plausible mention is in
hub.c:2720.  In uhci-q.c:763 I find:

  static int uhci_map_status(int status, int dir_out)
  {
          if (!status)
                  return 0;
          if (status & TD_CTRL_BITSTUFF)          /* Bitstuff error */
                  return -EPROTO;
          if (status & TD_CTRL_CRCTIMEO) {        /* CRC/Timeout */
                  if (dir_out)
                          return -EPROTO;
                  else
                          return -EILSEQ;
          }
          if (status & TD_CTRL_BABBLE)            /* Babble */
                  return -EOVERFLOW;
          if (status & TD_CTRL_DBUFERR)           /* Buffer error */
                  return -ENOSR;
          if (status & TD_CTRL_STALLED)           /* Stalled */
                  return -EPIPE;
          return 0;
  }

So this is at least starting to make sense a bit:

  1st: EOVERFLOW from TD_CTRL_BABBLE
  2nd: EPROTO from TD_CTRL_CRCTIMEO, probably a timeout.

Also seen: EPIPE from TD_CTRL_STALLED.

Some UHCI docs[1]: "When a device transmits on the USB for a time
greater than its assigned Max Length, it is said to be babbling."

[1] ftp://download.intel.com/technology/usb/uhci11d.pdf

Entry: Simple software?
Date: Fri May 18 21:06:11 EDT 2012

It's a bit of a hopeless situation to try to keep things simple when
connecting with existing software/hardware interfaces.

Entry: Getting things done
Date: Fri May 18 21:08:09 EDT 2012

So it's clear now: staapl is an experimental toy.  I don't think I can
ever climb the mountain of compatibility with C.  Staapl probably only
makes sense on either a PIC architecture, or on a real stack machine,
i.e. in an FPGA.

Trouble is that this is a work of passion, of slightly crazy ideas,
and currently there's a bit too much headwind to make this a fun
project..  So what to do?

Entry: USB on scope
Date: Sat May 19 00:21:15 EDT 2012

Looking on the scope, I see only one of the lines (D-) move between
0-3V; the other (D+) stays at 0.  Here[1] it is mentioned that there
is single-ended signalling, but only for initial conditions.  Doesn't
look normal.  However, if something that low-level is wrong, why does
reception work?  The PIC does get bytes in just fine.  Ordering new
chips so I can see if a fresh chip has the same behaviour.  Could also
be output config.

EDIT: Looks like I was just seeing the single-ended signalling.
Setting a 1->0 trigger on D+ does show some 12Mbps symmetric waveforms
after a while.

[1] http://www.beyondlogic.org/usbnutshell/usb2.shtml#Electrical

Entry: Next
Date: Sat May 19 01:36:55 EDT 2012

Electrical is OK, so let's continue looking in the linux source.  Or
maybe, let's try low speed first.
-> didn't work

  drivers/usb/core/message.c:132  usb_control_msg()
                            :44   usb_start_wait_urb()

Entry: kernel with USB debug messages
Date: Sat May 19 12:11:40 EST 2012

  May 19 11:53:16 broebel kernel: [ 925.937217] usb usb1: usb resume
  May 19 11:53:16 broebel kernel: [ 925.937246] usb usb1: wakeup_rh
  May 19 11:53:17 broebel kernel: [ 925.976143] hub 1-0:1.0: hub_resume
  May 19 11:53:17 broebel kernel: [ 925.976205] uhci_hcd 0000:00:07.2: port 1 portsc 0093,00
  May 19 11:53:17 broebel kernel: [ 925.976240] hub 1-0:1.0: port 1: status 0101 change 0001
  May 19 11:53:17 broebel kernel: [ 926.080205] hub 1-0:1.0: state 7 ports 2 chg 0002 evt 0000
  May 19 11:53:17 broebel kernel: [ 926.080269] hub 1-0:1.0: port 1, status 0101, change 0000, 12 Mb/s
  May 19 11:53:17 broebel kernel: [ 926.192164] usb 1-1: new full speed USB device using uhci_hcd and address 6
  May 19 11:53:17 broebel kernel: [ 926.221152] usb 1-1: uhci_result_common: failed with status 440000

  440000 == (1<<22) | (1<<18) == TD_CTRL_STALLED | TD_CTRL_CRCTIMEO

Conclusion?  I don't think the device sends anything.  Next steps:
- read the datasheet
- read example code

  broebel:/net/kers/home/tom/linux/linux-2.6-2.6.32/drivers/usb/core# rmmod ehci_hcd uhci_hcd usbmon usbcore ; insmod ./usbcore.ko ; modprobe usbmon ; modprobe uhci_hcd

Entry: DATA 0/1
Date: Sat May 19 13:32:47 EDT 2012

I removed the data toggle, so maybe that's a problem? [1]

[1] http://wiki.osdev.org/Universal_Serial_Bus#Data_Toggle_Synchronization

Entry: ASM USB example code
Date: Sat May 19 13:50:53 EDT 2012

It does some things to the BD0O registers I don't really understand..
Let's analyse ProcessSetupToken() / SendDescriptorPacket():

- Copy the 8 byte setup packet to a separate buffer.
- Reset both BD0I and BD0O
- Reset PKTDIS
- .... fill buffer IN0
- Toggle+transfer BD0I

The strange thing is that I get a reset, get_dev_descr, reset,
set_addr.  So one would think that the 80 06 did work..

  R 80 06 00 01 FF
  R 00 05 12 FF
    00 05 12 FF
    00 05 12 FF

Man I'm so confused..  What I need is some documentation that explains
all this as a trace: what exactly happens on the wire for an entire
enumeration process?  What I'd like to see is a successful device
descriptor transaction.  I'm thinking that maybe the data phase is not
correct.  Focus on this: what is the DATAx for a reply to a SETUP
packet with a device request?

[1] http://pe.ece.olin.edu/ece/projects.html

Entry: SET_ADDRESS: why does lab1.c wait for IN?
Date: Sat May 19 16:43:53 EDT 2012

lab1.c replies with a 0-size DATA1 to SET_ADDRESS, and it doesn't
immediately set the address.  That only gets used in ProcessInToken().
Why is there an IN token after SET_ADDRESS?

Answer: there is always an IN transaction after a SETUP transaction.
ProcessInToken() merely acts as a notification that the 0-size IN
transaction sent in response to the SET_ADDRESS is done.  Only after
that is it safe to set the address.  Basically, TRNIF -> IN is just an
acknowledgement of the end of an IN transaction initiated by the PIC.
Like this:

- PIC firmware prepares an IN buffer (an empty one, serving as the
  status phase of a SET_ADDRESS control transfer)
- PIC HW waits for the IN token sent by the host
- PIC HW sends out the DATA token to the host
- PIC HW receives ACK from the host
- This completes the transaction in hardware, so TRNIF is set, and the
  PID of the last transaction is set to "IN" in USTAT.
- Only after the whole SET_ADDRESS transaction is done can the device
  address be changed.
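The gist of that sequence in pseudo-Racket; set-uaddr! and send-in-0
are stand-ins for the real SFR access words, only the 0x05 bRequest
code comes from the spec:

  (define SET_ADDRESS #x05)                ; bRequest code, USB spec ch. 9

  (define (set-uaddr! a) (printf "UADDR <- ~a\n" a))           ; stand-in
  (define (send-in-0)    (printf "queue empty IN (status)\n")) ; stand-in

  (define pending-address #f)

  (define (on-setup bRequest wValue)
    (when (= bRequest SET_ADDRESS)
      (set! pending-address wValue)        ; don't touch UADDR yet
      (send-in-0)))                        ; reply with 0-size DATA1

  (define (on-in-done)                     ; TRNIF fired, USTAT says IN
    (when pending-address
      (set-uaddr! pending-address)         ; transaction is over; safe now
      (set! pending-address #f)))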
The reason this is the only case handled in IN events is that we don't
care about the other cases: there is nothing more to be done after the
transaction is over.

Entry: Status USB
Date: Sat May 19 18:06:10 EDT 2012

I still don't know what's going on.  It seems the first get_device
request doesn't make it over the wire.  I'm making a lot of changes
for small things that seem to be wrong (not writing things in the
right order, etc.), but there doesn't seem to be any improvement in
the log messages.  I'm still flying blind.

( removed the hub messages )

  May 19 18:50:44 broebel kernel: [25973.576247] usb 1-1: new full speed USB device using uhci_hcd and address 46
  May 19 18:50:44 broebel kernel: [25973.576282] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.627168] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.627285] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -84
  May 19 18:50:44 broebel kernel: [25973.627318] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.630139] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.630215] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -71
  May 19 18:50:44 broebel kernel: [25973.630246] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.634123] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.634221] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -71
  May 19 18:50:44 broebel kernel: [25973.744252] usb 1-1: device descriptor read/64, error -71
  May 19 18:50:44 broebel kernel: [25973.848155] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.883124] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.883241] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -84
  May 19 18:50:44 broebel kernel: [25973.883274] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.886095] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.886168] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -71
  May 19 18:50:44 broebel kernel: [25973.886199] drivers/usb/core/message.c: usb_control_msg: cca1e000 6 256 0
  May 19 18:50:44 broebel kernel: [25973.890081] usb 1-1: uhci_result_common: failed with status 440000
  May 19 18:50:44 broebel kernel: [25973.890197] drivers/usb/core/message.c: usb_control_msg: cca1e000 -> -71
  May 19 18:50:45 broebel kernel: [25974.000292] usb 1-1: device descriptor read/64, error -71

Entry: I have no clue
Date: Sat May 19 19:06:54 EDT 2012

Looks like I'm stuck.
- I don't see what's on the wire
- Data doesn't seem to get past the UHCI controller

It seems as if something doesn't get sent.  Next: more reading?  The
PIC datasheet and lab1.c.  This is 99% about attitude and stamina ;)
Quite a challenge.

Entry: Increase visibility
Date: Sat May 19 20:34:26 EDT 2012

I'm getting quite sick of it, so let's just try to increase visibility
and interactivity.  The main problem is that I don't see what's going
on, and what I do see I don't trust.  So:

1. Take PK2 out of the loop.  Move back to the serial port console.
   EDIT: This needs some fixing.  Something broke after the PK2 stuff?

2. Patch the linux kernel to fail on the first error.  This way it
   will be more clear what's happening.
Entry: Serial console
Date: Sat May 19 21:18:34 EDT 2012

  (console 'uart "/dev/ttyUSB1" 230400)

Looks like that is broken.

Entry: Frustrated
Date: Sat May 19 21:45:34 EDT 2012

I'm running into too many problems.  All of them are debuggable, but
I'm re-entering the debugging cycle quite deeply now.  Everywhere I
look there's an insurmountable heap of cruft.  Time to start from
scratch?  Something simple.

Entry: Future of Staapl
Date: Sat May 19 23:05:28 EDT 2012

It's becoming quite clear:
- Staapl is (not yet) for practical things.  It needs to grow.
- I'm going to continue on the USB as an exercise in debugging.

I periodically get quite sick of this.  The reason is that I'm
thinking goal-oriented instead of process-oriented.  The main
conclusion is that Staapl is still a hack, not really ready for
building "real" (non-exploratory) software with external constraints.
The USB driver turns out to be a good exercise in debugging skills,
but it's going to take a LONG time to finish, so I'm no longer setting
any "product" goals.  It will take as long as it takes; it's a
process.  I'm decoupling it from any practical tools (effect pedal),
which will be built in C.

Entry: Things to fix
Date: Sun May 20 11:11:15 EDT 2012

- Make the USB driver debuggable (interactive?)
- PK2 "interrupt" isn't stable.  Does this also pose a problem if the
  interrupt isn't used?
- Serial console is broken on 18F2550
- Fix error messages and code navigation.
- Documentation?  A bit of a chicken-and-egg problem: without docs,
  nobody touches it, and until I'm getting some real feedback, docs
  will be bad or outdated.  At this point it doesn't seem like a good
  use of time.

Entry: Solipsism
Date: Sun May 20 11:02:05 EDT 2012

Another point that has been bubbling up in the atmosphere of despair
surrounding the USB driver is its integration into "society".  Up to
this point, apart from some pats on the back, nobody cares.  It all
feels a little solipsistic, which would be fine if it were on the same
level as "solving a crossword puzzle".  However, that's not what I
want.  At least, it should be useful in some way, either as a project
in and of itself, or as a means to an end.  I will continue working on
it as long as it piques my interest, but without any particular
"product" in mind.

EDIT: I really have more fun when I ignore purpose..

Entry: Debuggable USB driver
Date: Sun May 20 11:15:08 EDT 2012

The idea is to make it run just once: make one service attempt, then
stop.  What about adding debug strings?  More generally: send "string
tokens" to the host.

Entry: Printing racket backtrace
Date: Sun May 20 11:51:34 EDT 2012

[1] https://groups.google.com/group/plt-scheme/browse_thread/thread/231bb68fbc8093eb
[2] https://groups.google.com/group/racket-users/tree/browse_frm/month/2010-06/090316a2a81df3a4?rnum=91&_done=%2Fgroup%2Fracket-users%2Fbrowse_frm%2Fmonth%2F2010-06%3F

Entry: Trouble
Date: Sun May 20 19:15:43 EDT 2012

  match: no matching clause for #

Goes away after "make clean".

Entry: Basically a whole day of stuff
Date: Sun May 20 19:20:05 EDT 2012

From today:
- using an interactive approach: run code, don't reset PIC or PK2.
- quit / abort / warm / continue
- patching the linux driver with debug messages and eliminating retries

Entry: Debugging
Date: Tue May 29 13:29:02 EDT 2012

So nothing is happening on the wire.  My guess is that it's software
or configuration, and not electrical.
But to eliminate that, it's probably easy to:
- Try another 18F2550 chip
- Try the PICstamp board
- run some third party code [1]

[1] http://www.sparetimelabs.com/usbcdcacm/index.html

Entry: Resetting PK2
Date: Tue May 29 13:43:47 EDT 2012

PK2 gets into a state that can only be cleared by a hard USB
unplug/replug cycle.

Error in Scheme is:

  Error opening console pickit2: procedure application: expected procedure, given: #(struct:exn:fail "error: no-pickit2-found" #) (no arguments)
  Process staapl-usb exited abnormally with code 1

Error using pk2cmd is:

  No PICkit 2 found.
  make: *** [pk2-2550-48.flash] Error 10

Though the device is still there:

  tom@zoo:~/$ lsusb
  ...
  Bus 001 Device 049: ID 04d8:fc92 Microchip Technology, Inc.

  tom@zoo:~/$ lsusb -v -s 1:49
  Bus 001 Device 049: ID 04d8:fc92 Microchip Technology, Inc.
  Device Descriptor:
    bLength                18
    bDescriptorType         1
    bcdUSB               2.00
    bDeviceClass            2 Communications
    bDeviceSubClass         0
    bDeviceProtocol         0
    bMaxPacketSize0         8
    idVendor           0x04d8 Microchip Technology, Inc.
    idProduct          0xfc92
    bcdDevice            1.00
    iManufacturer           1
    iProduct                2
    iSerial                 0
    bNumConfigurations      1
    Configuration Descriptor:
      bLength                 9
      bDescriptorType         2
      wTotalLength           67
      bNumInterfaces          2
      bConfigurationValue     1
      iConfiguration          0
      bmAttributes         0x80 (Bus Powered)
      MaxPower              200mA
      Interface Descriptor:
        bLength                 9
        bDescriptorType         4
        bInterfaceNumber        0
        bAlternateSetting       0
        bNumEndpoints           1
        bInterfaceClass         2 Communications
        bInterfaceSubClass      2 Abstract (modem)
        bInterfaceProtocol      1 AT-commands (v.25ter)
        iInterface              0
        CDC Header:
          bcdCDC               1.10
        CDC ACM:
          bmCapabilities       0x02 line coding and serial state
        CDC Union:
          bMasterInterface        0
          bSlaveInterface         1
        CDC Call Management:
          bmCapabilities       0x00
          bDataInterface          1
        Endpoint Descriptor:
          bLength                 7
          bDescriptorType         5
          bEndpointAddress     0x82 EP 2 IN
          bmAttributes            3
            Transfer Type            Interrupt
            Synch Type               None
            Usage Type               Data
          wMaxPacketSize     0x0008 1x 8 bytes
          bInterval               2
      Interface Descriptor:
        bLength                 9
        bDescriptorType         4
        bInterfaceNumber        1
        bAlternateSetting       0
        bNumEndpoints           2
        bInterfaceClass        10 CDC Data
        bInterfaceSubClass      0 Unused
        bInterfaceProtocol      0
        iInterface              0
        Endpoint Descriptor:
          bLength                 7
          bDescriptorType         5
          bEndpointAddress     0x03 EP 3 OUT
          bmAttributes            2
            Transfer Type            Bulk
            Synch Type               None
            Usage Type               Data
          wMaxPacketSize     0x0040 1x 64 bytes
          bInterval               0
        Endpoint Descriptor:
          bLength                 7
          bDescriptorType         5
          bEndpointAddress     0x83 EP 3 IN
          bmAttributes            2
            Transfer Type            Bulk
            Synch Type               None
            Usage Type               Data
          wMaxPacketSize     0x0040 1x 64 bytes
          bInterval               0
  can't get device qualifier: Operation not permitted
  can't get debug descriptor: Operation not permitted
  cannot read device status, Operation not permitted (1)

Using this code[1]: usbreset.c

  /* headers for open(), ioctl(), close(), USBDEVFS_RESET */
  #include <stdio.h>
  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/usbdevice_fs.h>

  int main(int argc, char **argv)
  {
      const char *filename = argv[1];
      int fd = open(filename, O_WRONLY);
      ioctl(fd, USBDEVFS_RESET, 0);
      close(fd);
      return 0;
  }

  ./usbreset /dev/bus/usb/001/049

I still get trouble:

  May 29 13:59:08 zoo kernel: [1049324.765287] usb 1-2.1.2.1: reset full speed USB device using ehci_hcd and address 49
  May 29 13:59:08 zoo kernel: [1049324.863631] cdc_acm 1-2.1.2.1:1.0: This device cannot do calls on its own. It is not a modem.
  May 29 13:59:08 zoo kernel: [1049324.863674] cdc_acm 1-2.1.2.1:1.0: ttyACM0: USB ACM device

It shows up as a serial port.
After unplug, replug gives:

  May 29 14:03:54 zoo kernel: [1049610.664984] usb 1-2.1.2.4: new full speed USB device using ehci_hcd and address 50
  May 29 14:03:54 zoo kernel: [1049610.767214] usb 1-2.1.2.4: New USB device found, idVendor=04d8, idProduct=0033
  May 29 14:03:54 zoo kernel: [1049610.767224] usb 1-2.1.2.4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
  May 29 14:03:54 zoo kernel: [1049610.767232] usb 1-2.1.2.4: Product: PICkit 2 Microcontroller Programmer
  May 29 14:03:54 zoo kernel: [1049610.767239] usb 1-2.1.2.4: Manufacturer: Microchip Technology Inc.
  May 29 14:03:54 zoo kernel: [1049610.767244] usb 1-2.1.2.4: SerialNumber:

At which point it works again.  So it seems the firmware gets stuck; a
USB reset does not reset the PK2 firmware.  Looks like this bug is a
dead end.  The only thing I can think of is to rewrite the firmware.

[1] http://www.roman10.net/how-to-reset-usb-device-in-linux/

Entry: What to do with PK2
Date: Tue May 29 14:21:57 EDT 2012

Since there are some issues with the firmware, it doesn't seem to be a
good idea to stick with the original firmware.  I'm not sure if it's
useful to write new firmware for the PK2.  I suppose it is possible,
as copyright is only on the original Microchip code, and the schematic
is not patented.

Entry: Next
Date: Wed May 30 09:03:23 EDT 2012

- add power led to board
- add debug led?  (i.e. on when waiting in interpreter)
- Try another 18F2550 chip
- Try the PICstamp board
- run some other USB code [1]

I tried to run this[1] in a hurry yesterday but nothing happened.

[1] http://www.sparetimelabs.com/usbcdcacm/index.html

Entry: Fixing serial
Date: Wed May 30 09:53:30 EDT 2012

I need to do something else to get me going this morning.  Let's try
to fix the board + serial.  I noticed a stupid soldering error last
week.

EDIT: Looks like this is a bit broken..  That sucks, as this
pre-interpreter stuff is hard to debug.

First problem: TX wasn't connected.  The old 18F452 board has a serial
daisy chain that needs to be jumpered for a single connection.

Second problem: running the code below gives me 9600 baud (9803 as
measured by OLS).  Weird..  I would assume that when it's off, it's
way off, not at 2% of something that looks standard.

  : serial-test
      230400 40 init-serial
      0 begin dup async.>tx 1 + again

Got it.  The problem is that this probably never worked after the
interpreter changed to the new "handshake" interface necessary for
PK2.

Entry: PicStamp
Date: Wed May 30 17:03:36 EDT 2012

On Johannes' board it seems to get a bit further.  This is the device
descriptor it receives on the amd64 host:

  12 01 00 02 00 00 00 40 D8 04 FF 01 01 00 01 02 00 01

which looks OK.  On the 32-bit USB 1.1 host it doesn't go without
errors, but the bytes at least go over the wire in an IN transaction.

Let's try with a different chip.  Same results.  After the SETUP token
I get SYNC errors.  So it's probably electrical.

Entry: Electrical problem?
Date: Wed May 30 17:39:07 EDT 2012

Reading the 18F2550 datasheet, here's something I missed:

  17.2.2.8 Internal Regulator
  The PIC18FX455/X550 devices have a built-in 3.3V regulator to
  provide power to the internal transceiver and provide a source for
  the internal/external pull-ups.  An external 220 nF (±20%)
  capacitor is required for stability.

I found it looking up "VREGEN" from inspecting the configuration bits
in piklab.

EDIT: Adding the 220nF did the trick.

Another one for later: the data in the USB Status register is valid
only when the TRNIF interrupt flag is asserted.

Entry: Next
Date: Sun Jun 3 11:30:25 EDT 2012

So the problem is pierced.  What next?
Probably I should focus on a proper logging facility, i.e.
fixing/simplifying the interpreter.  It's not necessary to store
strings on the device.  Simply defining "error conditions with
arguments" should be enough; the strings can live on the host.  The
goal remains the same: absolute minimum use of resources on the
target.  The point is not to make a stand-alone forth, so offload all
debug if possible.

It seems simplest to do this using a proper RPC mechanism.  Define a
way to call host words from the target, where the empty reply is
"return", as it was before.  Trouble is that once this is introduced
on the target, it creates a coupling between host and app, i.e. the
app can no longer run stand-alone.  So there needs to be a way to turn
this off.

Entry: RPC
Date: Sun Jun 3 11:54:35 EDT 2012

Host RPC calls.  We already have several: emit, hexdump1, hexdump2,
return.  How to make this simpler?

- Protocol?  Same as before: each app has a number of RPC calls
  defined in its host dictionary, encoded just like interpreter
  commands are encoded.
- Detach: at some point it needs to be possible for the target to call
  local code, or to raise an exception when host words are called in
  standalone mode.

The simplest way to "vector" these words is to ignore them.  It is
known how many arguments are being sent, so simply replacing the
current interpreter I/O with some simulation should create a proper
stub.

Entry: Fixing RPC
Date: Sun Jun 3 14:35:34 EDT 2012

If RPC is to be handled correctly, it needs to be handled in all parts
of the code, meaning:

  host sends RPC request
  -> target answers (0xFF)
  -> target sends RPC request (0xFE)

The "return" and "call" cases need to be clearly distinguished.  Maybe
it's time to change this a bit such that the addressing is clear.

EDIT: fixed.  Seems to work ok.

Entry: Disassembler: use "sea" instead of "see"
Date: Sun Jun 3 21:31:28 EDT 2012

I forgot how this worked: it seems indeed better to just store
assembly for viewing instead of trying to reconstruct it from raw
disassembly.  See [1].

[1] entry://20110402-100621

Entry: Symbolic target words
Date: Sun Jun 3 21:49:49 EDT 2012

Instead of using a bunch of codes, it's probably simplest to call host
code using symbolic names, using the same approach as the
(non-parsing) words run from the command line.  What this needs is the
following interface:

  >h h>          move words to host stack
  #xFE [ ... ]   execute target word

Maybe all RPC code should drop into the interpreter?

EDIT: done, looks better.  Next: where is the live: state (stack)
stored?  grep for "state:stack" in staapl/staapl/live:

  ./rpn-target.ss:85:   (void ((target: code ...) (state:stack))))

So it looks like I need to open up this one:

  (define-syntax-rule (target> code ...)
    (void ((target: code ...) (state:stack))))

Is it target: or live: ?  reflection.ss `run' delegates to
`forth-command', defined in rpn-target.ss in terms of `target:', to
interpret commands.  So do we want the target to have access to that,
or only to the `live:' set?

The namespace used by 'live:' is defined by `live-interpret' in
rpn-live.ss; it will take words from the (scat) namespace but delegate
to scheme using `scat-wrap-dynamic'.

EDIT: Actually, this should probably be `target-interpret' instead of
`live-interpret'.  That works, but it creates loops:
`target-interpret' gives preference to target words, so this needs to
be broken.  I don't see how to fix it.  `kb' is a 1cmd: in the
(target) dictionary, which is implemented as a prefix parser:
"kb" -> "t> kb", where 'kb' is then interpreted in (target), which
delegates to the target word.
Maybe this needs a more systematic approach.  Instead of delegating to
host commands from .f code, make it so that all host commands are
accessible on the target.

Entry: Flash literals
Date: Tue Jun 5 08:42:16 EDT 2012

Need to get literal strings back.  Some options:

- store in a separate segment + provide a word which loads the pointer
  as a literal
- inline code: store as a Pascal string + use the return address to
  locate it + skip
  -> 2 versions: skip / drop.  I think I had drop before.

See string.f:

  : f->  \ --
      TOSL @ fl ! TOSH @ fh ! pop ;

  : foo f-> 3 , 65 , 65 , 65 ,
  : .foo foo @f+ for @f+ emit next ;

There seems to be symbolic quotation using '`', but I forgot how to
then compile this to a byte sequence using ','.  It's 'string,', but
something's broken.  Yep: '->byte-list' didn't support symbols.

Entry: Host commands accessible on target
Date: Tue Jun 5 10:32:53 EDT 2012

Maybe it's time to make a little diagram of all the different name
spaces and prefix parser tricks..  It's getting quite complicated.

1. Allow for "` cmd" syntax.
2. Fix the delegation loop.

(live)
------
This is scat running on the host, with some words to interact with the
target machine, i.e. >t t> texex/b.  For convenience, it has late
binding for all symbols, with automatic lifting of Scheme words it
finds in the name space.  Note that scat itself is early bound.

(target)
--------
This is the interpreter for the command-line dialect of the target
language.  It consists of 2 layers:

* prefix macros that translate to live: code (see commands.ss)
* name space delegation to target, then (live) (see rpn-target.ss)

Entry: Prefix commands
Date: Tue Jun 5 11:42:11 EDT 2012

  (prefix-parsers-wrapped (target) 1cmd: (kb))
  (prefix-parsers/meta (target) live: ((1cmd: w) (t> w)))

Somehow (1cmd: kb) resolves 'kb' using 'interpret-target', while what
I expect is for this to be in live.  The trick here is in the /meta
stuff, which defines prefix parsers in terms of some other
interpreter, while the 'kb' parser itself is defined just as a
(target) namespace prefix parser.

  ;; Like 'prefix-parsers', but translate code using a different
  ;; compiler and splice it in.
  (define-syntax-rule (prefix-parsers/meta ns lang: (pat code) ...)
    (begin
      ;; Evaluating the pattern to check that the names are actually
      ;; defined doesn't work, because it includes pattern names as
      ;; well..
      ;; (begin (lang: . code) ...)  ;; test-eval it
      (prefix-parsers ns (pat (,(lang: . code))) ...)))

So, knowing this, how to break the loop?  It has to be solved in the
(target) namespace, because that's where the prefix parsers are, which
means it needs to be solved in `target-interpret'.  The solution seems
simple: call `target-interpret' directly, and add an optional
argument.

Hmm... doesn't seem to work very well.  The interactions are too
complex.  Maybe it's better not to define these words as prefix
macros.  That seems to be the real problem.  These could be just
words..

Almost there.  This works:

  > (target/kb (state:stack))

But when it's typed on the console, the words are not found.  Maybe it
just needs to look in the target/ dictionary first?  Another problem:
why are the target words not defined with the target/ prefix?

Ok, I see.  Looks like we just need a different namespace.  What about
sim?  It's probably better to get rid of (target) for any console code
and use it just for target words.  That's for later cleanup; for now
let's just introduce an extra namespace.  Let's just call it (host).
Entry: Simplification
Date: Tue Jun 5 13:33:05 EDT 2012

With target->host RPC working, there is really no need for a lot of
"push" commands implemented on the target.  I.e. can the 3 dumps +
trace be eliminated?  There's only one detail: those commands support
streaming of data, which might be handy.  Let's just put all 3
versions of dump in one command, or maybe better, dump to a host-stack
byte list?

What about keeping the scat stack that's used in the (target) language
active?  This way the target could do: dump packet + exec command.
Hmmm.. that doesn't work due to recursive evaluation: it would require
the stack to be threaded through the recursion.  I think this is too
big a change, so let's keep the stack local and use an explicit rpc
stack.

EDIT: Flash and RAM dump are not necessary on the target.  They can be
implemented on the host.

Entry: Next
Date: Tue Jun 5 15:43:25 EDT 2012

* Add a single prefix parser macro word to execute a host command,
  e.g.

    h: px

  such that these don't need to be duplicated in the namespace.

* Add some support for byte streams.  This probably needs a 2-byte
  "execute" or a 1-byte indirection, or say a table of file
  descriptors.  Maybe some more general indirection handling is in
  order, also for "pulled in" code.  1-byte indirection is handy,
  since it's the data unit.

Entry: Dynamic binding
Date: Tue Jun 5 18:44:57 EDT 2012

While I don't really like dynamic binding much, in forth it can
sometimes be quite handy, especially for streams it seems.  The
overhead of local (stack) variables is too high.  In Chuck Moore's
stack machines there is also a current pointer register.  For Staapl
on PIC there are already a & f, so it might be good to abstract this a
bit.  However, moving bytes around is about the only thing we are ever
going to do a lot of, so let's just define i> and >o and use dynamic
binding to point them to the current input and current output, with
values saved on the RS.  What about this:

  iopen-   connects something to i> and performs init
  iclose   calls close method + pops previous input

Nope, open/close should be different from `parameterize':

  i-begin  pushes old i> vector to RS and installs a new one
  i-end    pops old i> vector

So it seems best to do this on top of a vector abstraction.  There
really aren't so many tokens to manage, so sticking to bytes seems the
best approach.  The token table can be defined in the monitor, or
generated.  Previous code used 128 tokens (even ones not used), all
stored in the first ram bank.

  : do-arrow 0 a!! TOSL @ !a+ TOSH @ !a+ pop ;

Actually there seems to be a simpler way to do this, which is to place
the indirection at the hardcoded level: reserve flash space for
vectors.  In the config .fm it could be defined where to put it.

I'm wondering if it's possible to do some kind of smart ' (tick)
operator: automatically generate tokens whenever a tick is
encountered.  TRAP.  This is a trap.  Just define a global execute
function with a route macro:

  : e0 0 ; : e1 1 ; : e2 2 ;
  forth
  : execute _e0 . _e1 . _e2 ;

Hmmm..  Trouble really is that global and incremental compilation are
really different.  Unless we use a linked list approach.  TRAP.  It's
a trap!  What I need is a way to make some global "gather"
compilations work.  It should be no big deal in the all-at-once module
compiler, but doing this incrementally is problematic, unless there is
a way to make the gathering operations updateable.  I think the state
machine stuff works here too.  Also, the code instantiation should be
able to use it.  Anyway, this is a deep hole.
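What i-begin / i-end mean, modeled in Scheme with an explicit save
list standing in for the RS; only the names come from the notes above,
the rest is a sketch:

  (define current-i> (lambda () (error "no input vector")))
  (define i>-saved '())                    ; stand-in for the return stack

  (define (i-begin new)                    ; push old vector, install new
    (set! i>-saved (cons current-i> i>-saved))
    (set! current-i> new))

  (define (i-end)                          ; pop previous vector
    (set! current-i> (car i>-saved))
    (set! i>-saved (cdr i>-saved)))

  (define (i>) (current-i>))               ; read one item

  ;; e.g. temporarily read from a counter:
  (i-begin (let ((n 0)) (lambda () (set! n (add1 n)) n)))
  (list (i>) (i>))  ; => '(1 2)
  (i-end)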
Entry: Doing this stuff in GDB
Date: Tue Jun 5 22:33:44 EDT 2012

What is so helpful about this is to start from the idea of "everything
on the target" and then start offloading things to the host.  The big
idea here is that the target can "call host code".  For GDB, this
would mean:

- set a breakpoint in a generic gdb_call() method.  Use a single
  method, to only need a single breakpoint.
- disambiguate this call by using a scratch buffer in RAM that takes
  the parameters of an RPC call.
- when the breakpoint fires, have GDB transfer this RPC param buffer
  to a handler, possibly an external program.
- allow this handler program to call back into the program, and/or
  change the program's state.

Now, to simplify things, can the GDB part be limited to transferring
raw data packets between the target and some host process?  What I
really want is just a bi-directional message pipe.  Anything else can
go on top of that.

So basically, gdb_call() would give a command buffer to GDB, and
receive a reply.  This method would be called whenever there is a
target request or event, and called periodically to receive incoming
calls.  An asymmetric RPC mechanism can carry calls in the other
direction, by embedding it into two calls: poll(), reply().  The main
feature is really that the protocol is synchronous, so both sides are
always in a well-defined state.

So, what about this simplified version:

- breakpoint at gdb_call
- read target RAM buffer, save to req.bin (pipe?)
- read from reply.bin (pipe?) into target RAM buffer
- continue

Once the pingpong channel is organized, all the rest is software
protocol.

  gdb -> ext : dump binary value
  ext -> gdb : source

Hmm, doesn't look like it..  gdb doesn't want to block on sourcing
from a pipe.  This probably needs a shell command to keep it
synchronous:

  dump binary value req.bin gdb_req
  shell do_rpc
  source reply.gdb

Entry: gdb_req
Date: Wed Jun 6 00:00:56 EDT 2012

Simple hack to make an RPC channel from an (embedded ARM) target
program to an external linux application using GDB / JTAG:

  target -> app : req.bin    binary request
  app -> target : reply.gdb  gdb command reply

This is useful to write a test system with minimal modification of the
target program.  The test system can be structured as if it were
running on the resource-limited embedded target, while in effect it is
implemented by host code with arbitrary complexity.  The gdb_call()
below is a hook point to insert remote calls.  If the test system is
detached, gdb_call() is a NOP, and the application can perform its
normal behaviour.

##### gdb_req.c
#include <stdio.h>

unsigned int gdb_req = 0;
void gdb_call() { }

int main(void) {
    for(;;) {
        gdb_req++;
        printf("OUT gdb_req = %d\n", gdb_req);
        gdb_call();
        printf("IN  gdb_req = %d\n", gdb_req);
    }
    return 0;
}

##### gdb_req.sh
# Dummy operation: replace with program that handles request.
hd req.bin
echo 'set gdb_req = 123' >reply.gdb

##### gdb_req.gdb
define service
  dump binary value req.bin gdb_req
  shell ./gdb_req.sh
  source reply.gdb
  continue
end
define start_service
  break gdb_call
  tbreak main
  run
  while 1
    service
  end
end
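The gdb_req.sh stub could just as well be a Racket script.  A sketch
of the host side; the file names follow the gdb script above, and the
reply content is as arbitrary as the shell version's:

  (define (do-rpc)
    (define req (file->bytes "req.bin"))   ; raw dump of gdb_req
    (with-output-to-file "reply.gdb" #:exists 'truncate
      (lambda ()
        ;; reply: bump the low byte of the dumped counter
        (printf "set gdb_req = ~a\n" (add1 (bytes-ref req 0))))))

From here the "handler with arbitrary complexity" is just Scheme code
between file->bytes and the printf.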
Entry: Inlining symbolic constants
Date: Wed Jun 6 08:32:04 EDT 2012

There's a way to do macros:

  : foo [ 1 2 3 ] i

But I'm not sure if functions will work.  I have been playing with
this before, but I don't remember.  There's also this:

  : foo make-label [ 1 ] compile-macro 2 ;

So it should be enough to dup it and compile a call using cw.
ccm = compile and call macro:

  : foo [ 1 ] make-label dup >m swap compile-macro m> cw ;

Or factored out as a macro:

  : ccm make-label dup >m swap compile-macro m> cw ;
  forth
  : foo [ 1 ] ccm ;

I've added it to the compiler as compile-call-macro.  This allows:

  : _kb [ ` kb fbin; ] compile-call-macro fcmd ;

Some refactoring is necessary: compile-macro compiles an exit, which
we don't want here, and it might be simpler to make macro wrapping and
composition explicit.  Works.

Entry: quote/exec
Date: Wed Jun 6 10:10:52 EDT 2012

RPC can be simplified more:

- Only one word for sending a packet, parameterized by i> for input
- The same code sends quote/exec requests

Next: use vector.ss

EDIT: done, works fine.

Entry: Speed vs. code size
Date: Wed Jun 6 12:00:50 EDT 2012

Maybe it's time to make this more explicit: I have a strong irrational
tendency to want to write fast code.  In practice, this means a lot of
macros to eliminate indirect references.  While it's possible to do
this, it's very inconvenient and doesn't usually help much with code
size.  I guess it's always possible to optimize for speed when it's
actually necessary.

Entry: do-arrow (vector.ss)
Date: Wed Jun 6 12:09:29 EDT 2012

The vector abstraction I used was:

  2variable token
  : set-token token -> ;
  : run-token token invoke ;

It seems to work just fine for now.  It uses 2 bytes of RAM to store
the address (which is abstract) but only 1-byte tokens (an index into
a table of addresses), which is convenient for use.  For now this
seems good enough.  Storing the addresses in Flash is possible but
requires more elaborate bookkeeping.  I used it to create the words:

  stdin i>

This solves 99% of all data transfer problems in an abstract way.

Entry: Next
Date: Wed Jun 6 14:54:55 EDT 2012

Probably the USB driver.  Got some new tools now.

Entry: Problem with host commands
Date: Wed Jun 6 15:04:23 EDT 2012

The target shouldn't execute host commands, but live (scat/scheme)
commands that operate on the host stack and access the target stack
explicitly.  Implemented as such, there is no recursion.  The
difference is that host commands take arguments from the target param
stack.  So what should it be?  The question is: implicit or explicit?
It's a bit confusing.  Explicit is better for target->host code, but
then I need to find a good way to make all this explicit in the
documentation.  I.e. the console is "magic"; all the rest (scat host
machine and Forth target machine) is straightforward.

Entry: Pull commands for dumps
Date: Wed Jun 6 16:04:32 EDT 2012

Added _ad and _fd scheme functions that can be called from the target
to perform memory dumps without pushing.  I'm not sure yet how these
will be useful, but at least it's possible to gather more data than
fits in a single packet, since the host is smarter about chunking.
Another option would be to push a data generator that can be called by
host code to continuously stream data.

Entry: GDB stuff
Date: Thu Jun 7 19:19:10 EDT 2012

With a little more effort this can be made to talk to the gdbserver
directly, doing it in-image, avoiding the funky shit.

Entry: memcpy
Date: Sat Jun 9 10:24:50 EDT 2012

This is difficult in the current implementation because one of the
indirect addressing registers is used as the rs.  So either I do it
slowly with memory pointers and the stdio stuff, or I write a special
routine.  For USB it might not be necessary: avoid memcpy altogether
and work in-place.
For Flash there is no problem: initializers can go in Flash and will
probably use less code.

Entry: RPC context save
Date: Sun Jun 10 08:48:25 EDT 2012

RPC calls should preserve the following state:

- 3 stacks: xs, ds, rs
- 2 registers: a, f

However, the interpreter acts as if it owns these registers, so before
doing anything destructive a/f need to be saved. It's easy to do this
on the target, but why not do it on the host? The trouble is that
indirect addressing uses the a register, so this doesn't look possible
without register fetch support. Maybe it's simplest to reuse the
stackptr word to also dump out other state.

Then, what about restoring? Store also needs the a reg, so restoring
the a reg is not possible without target support. Looks like this
needs 2 words: save and restore.

There's one slot left, as I see 2x reply0. I guess slot 6 is not used.
Also, I'm not sure if jsr is still useful, so slot 7 might be reused
too. It seems better to turn the basic >t t> support words into
multi-byte words. OK, done. I'm not sure that was really necessary,
but it's good to have.

Thinking a bit more: it's probably good to keep the interpreter
minimal, and only implement support for other features where they are
actually needed. I.e. when not running a recursive interpreter,
save/restore is not necessary. So let's put it in debug.f

EDIT: reply/1 isn't necessary. Only used for stackptr and checkblk.
stackptr is necessary for ts, so it might go in debug.f also.

EDIT: the interpreter now only does memory transfer + execute (with
and without ack). Decoupled other functionality from the host i/o.

Entry: Correct for .. next
Date: Sun Jun 10 10:14:36 EDT 2012

For .. next isn't implemented correctly when count = 0; it will loop
256 times. How to do that better? What about 1+ followed by a jump to
the "then" clause?

Entry: Next
Date: Sun Jun 10 13:27:40 EDT 2012

- check proper save/restore of a/f registers on host RPC: OK
- continue usb

Entry: USB get descriptor + set address OK
Date: Sun Jun 10 15:53:04 EDT 2012

However, the irony is that the debugging I have now slows it down too
much. So what now? Time for the auto-compressed log message generator!

Anyways.. it should probably work OK as long as there is only a single
PK2 pingpong going on. How many are there now? It's a couple:

- quoted symbol to print
- pb
- ts
- stack@
- a!
- dump

Probably it's best to make a do-it-all log command that dumps out
address, symbol and datastack. Is there a simple way to do
address -> symbol translation? Maybe it's best to just use trace
actually..

EDIT: nope, also too much. I got the USB hardware wire trace so that's
probably enough for now.

Entry: USB next
Date: Sun Jun 10 17:46:42 EDT 2012

Let's just do all transactions one by one. On the wire:

- addr:0 GetDescriptor 80 06 00 01 00 00 40 00
- addr:0 SetAddr x     00 05 48 00 00 00 00 00
- addr:x GetDescriptor 80 06 00 01 00 00 12 00

So the second time it asks only for 12 bytes. I don't see a reply to
that request, so it looks like the address is not OK.

Entry: USB cont
Date: Tue Jun 12 08:15:49 EDT 2012

So on the wire things go wrong after SetAddr. The device does seem to
receive the address:

address @ . 77 OK

But UADDR isn't correct:

UADDR @ . 112 OK

That's actually a bug in tethered.ss, because the correct value 4D is
visible on the dump:

#xF6 abd
F60 00 00 00 00 00 00 00 00
F68 08 00 00 9F 04 20 4D 14

So UADDR is set, but the device doesn't reply. Maybe it should only be
set once? I see the value of address changes.. Data corruption? Might
be the vector table.
Or something more subtle. Do variable addresses clash when defined in
modules? Doesn't seem so..

OK, problem is a missing "UIR TRNIF low". But this probably means
there is no transaction.IN handler for the SETUP packets? I'm
confused..

Next error: for the initial addr==0 GetDescriptor there is
SETUP,IN,OUT, but for the subsequent GetDescriptor to the new address
the OUT phase is missing, and DataCenter (Beagle USB sniffer software)
doesn't see it as a GetDescriptor transaction. Questions: if there is
supposed to be an OUT, why isn't this visible in the sniffer?

Aha, the IN phase in SETUP is DATA0, not DATA1 as in the successful
transaction. Correct should be:

Get Device Descriptor:
  SETUP DATA0
  IN    DATA1
  OUT   DATA1

Set Address:
  SETUP DATA0
  IN    DATA1

Looks like I'm just toggling from the Set Address IN reply, but I
should reset.

EDIT: Refactored a bit, now I get different errors. Stall on OUT phase
of first GetDescriptor call.

0 OUT/DATA0 \ make room for next SETUP request on EP0

Entry: Boolean functions
Date: Tue Jun 12 11:08:44 EDT 2012

1,a -> ~a
01 -> 1
(a xor b) or b

Entry: Things to change
Date: Tue Jun 12 11:50:14 EDT 2012

- cache the PIC18 target code generation in the Racket compilation
  phase; it is very slow. I.e. once the module code is fixed, there is
  no reason to run the macros more than once: the result will be the
  same, so just store the target code in the module and wrap this in a
  Racket compilation phase.

- decouple the PK2 driver from the interpreter image, i.e. provide it
  over TCP. This could even be written in C to allow it to run on a
  larger uC.

Entry: USB linear code
Date: Tue Jun 12 12:04:40 EDT 2012

Instead of writing this as a state machine, I wonder if it's not
easier to use linear code.

Entry: More USB debugging
Date: Tue Jun 12 13:39:20 EDT 2012

The next SETUP txn it receives is:

80 06 00 06 00 00 0A 00

This is a GetDescriptor request for descriptor type 6 (probably the
USB 2.0 device_qualifier), which is not defined here. Weird.. Is this
some kind of standards-compliance test? It doesn't happen on the other
PC... Might be some id-dependent thing in the Linux kernel.. Let's
just continue on the other host. Next request is get config:

80 06 00 02 00 00 09 00

I think it's time to revive the request struct compiler.

Entry: Struct compiler
Date: Tue Jun 12 14:47:04 EDT 2012

This needs a way to create target words in a module. I did this
before, where is it? The syntax is:

(words ...)

Example in serial.ss

Old usb.ss code uses "route/e" "_x>" "xskip" which probably need to be
revived.

Testing: Device descriptor works. String descriptors go over the wire
but are not properly encoded. There's something wrong with config.

Ok, this was max packet size. -> Need to truncate the packet.

Entry: USB cont
Date: Tue Jun 12 18:56:59 EDT 2012

Enum working up to Set Configuration. Next problem is the descriptors
themselves:

[2276531.480629] usb 1-2.1.2.2: new full speed USB device using ehci_hcd and address 37
[2276531.573072] usb 1-2.1.2.2: config 1 interface 0 altsetting 0 has an invalid endpoint with address 0x80, skipping
[2276531.574313] usb 1-2.1.2.2: New USB device found, idVendor=04d8, idProduct=0001
[2276531.574321] usb 1-2.1.2.2: New USB device strings: Mfr=4, Product=3, SerialNumber=2
[2276531.574328] usb 1-2.1.2.2: Product: USB Hack
[2276531.574333] usb 1-2.1.2.2: Manufacturer: Microchip Technology, Inc.
[2276531.574339] usb 1-2.1.2.2: SerialNumber: 0.0
[2276531.575129] usbhid 1-2.1.2.2:1.0: couldn't find an input interrupt endpoint
[2276535.883246] usb 1-2.1.2.2: USB disconnect, address 37

Next is to pick an interface and stick to it.
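About that "invalid endpoint with address 0x80" message above: in an
endpoint descriptor, bit 7 of bEndpointAddress is the direction
(1 = IN) and bits 3:0 are the endpoint number, so 0x80 decodes as "IN
endpoint 0". Endpoint 0 is the control endpoint and can't appear in an
interface, hence the kernel skips it. A little Racket helper to decode
these (my own, not part of Staapl):

(define (decode-ep addr)
  ;; bEndpointAddress: bit 7 = direction (1 = IN), bits 3:0 = number
  (list (if (bitwise-bit-set? addr 7) 'in 'out)
        (bitwise-and addr #x0F)))

(decode-ep #x80) ; => '(in 0), the one the kernel rejects
(decode-ep #x81) ; => '(in 1), a normal bulk IN endpoint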
I'm tempted to do something really simple for Staapl: just wrap the
monitor protocol in 2 vendor-specific requests:

SET_DATA (push)
GET_DATA (pull)

and write a C program that takes data on stdio so this can be combined
with socat. What are the alternatives?

- vendor-specific / really simple
- CDC
- FTDI

Entry: USB rpc stuff
Date: Wed Jun 13 13:54:35 EDT 2012

Caught trying to make a (simple) RPC mechanism for remote USB calls.
One that works for PK2 and can be reused for some ad-hoc Staapl packet
protocol.

I'm running into some weird problem with memset() / memcpy().
Wait... this is silly. 0xad is probably \255 octal?

Entry: Linux USB serial
Date: Wed Jun 13 21:00:39 EDT 2012

What about emulating the simplest driver? Some candidates:

linux-2.6-2.6.32/drivers/usb/serial$ ls -lS *.c |grep -v mod
-rw-r--r-- 1 tom tom 2069 Dec 2 2009 hp4x.c
-rw-r--r-- 1 tom tom 2025 Dec 2 2009 siemens_mpi.c
-rw-r--r-- 1 tom tom 1521 Dec 2 2009 funsoft.c

See also [1].

Reading Documentation/usb/usb-serial.txt in the Linux source gives:

  If your device is not one of the above listed devices, compatible
  with the above models, you can try out the "generic" interface. This
  interface does not provide any type of control messages sent to the
  device, and does not support any kind of device flow control. All
  that is required of your device is that it has at least one bulk in
  endpoint, or one bulk out endpoint.

  To enable the generic driver to recognize your device, build the
  driver as a module and load it by the following invocation:

    insmod usbserial vendor=0x#### product=0x####

  where the #### is replaced with the hex representation of your
  device's vendor id and product id.

[1] http://comments.gmane.org/gmane.linux.usb.general/34211

Entry: USB Bulk
Date: Thu Jun 14 17:00:45 EDT 2012

So the Linux USB generic serial driver seems to be enough to get
going. It uses bulk transfers. I can see OUT and IN transactions on
the wire, but I don't know what I'm supposed to see. First thing to do
is to enable the endpoint buffers.

Entry: IN transaction
Date: Thu Jun 14 19:06:24 EDT 2012

When exactly is TRNIF set for the IN transaction? Figure 17-9
indicates that it is after a transaction is complete. So, to send data
on an IN endpoint, update the BD and transfer it to the USB
transceiver. Whenever it receives an IN token it sends out the data,
and after receiving an ACK from the host it sets TRNIF.

With IN implemented I see stuff on the wire, but the serial port side
doesn't do anything. Probably needs OUT too. So.. now I see a bunch of
data on both endpoints, but still nothing on the serial side.
Overlooking something..

Oops.. going a bit too hard?
[17504.255316] ------------[ cut here ]------------
[17504.255334] WARNING: at drivers/usb/serial/usb-serial.c:410 serial_unthrottle+0x53/0x72 [usbserial]()
[17504.255342] Hardware name: Aspire M3400
[17504.255346] Modules linked in: ftdi_sio usbserial binfmt_misc ppdev lp sco bnep l2cap crc16 bluetooth rfkill vmnet parport_pc parport vmblock vsock vmci vmmon autofs4 powernow_k8 cpufreq_stats cpufreq_powersave cpufreq_conservative cpufreq_userspace cpufreq_ondemand freq_table nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc act_police sch_ingress cls_u32 sch_sfq sch_cbq 8021q ipt_MASQUERADE xt_tcpudp xt_state iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables bridge stp fuse radeon ttm drm_kms_helper drm i2c_algo_bit tun kvm_amd kvm dm_mirror dm_region_hash dm_log sbp2 ieee1394 loop usbhid hid snd_hda_codec_atihdmi snd_ice1712 snd_ice17xx_ak4xxx snd_ak4xxx_adda snd_cs8427 snd_ac97_codec snd_hda_codec_realtek snd_seq_dummy ac97_bus snd_i2c snd_mpu401_uart snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_hda_intel snd_hda_codec psmouse snd_pcm_oss snd_mixer_oss snd_seq snd_pcm i2c_piix4 shpchp snd_timer snd_seq_device snd i2c_core soundcore serio_raw snd_page_alloc button processor pci_hotplug evdev wmi ext3 jbd mbcache dm_mod sr_mod cdrom sd_mod usb_storage ohci_hcd ahci r8169 libata thermal mii thermal_sys scsi_mod ehci_hcd [last unloaded: usbserial]
[17504.255566] Pid: 7475, comm: cat Tainted: G W 2.6.33.7-rt29 #1
[17504.255571] Call Trace:
[17504.255586] [] ? warn_slowpath_common+0x76/0x8d
[17504.255597] [] ? serial_unthrottle+0x53/0x72 [usbserial]
[17504.255608] [] ? tty_unthrottle+0x39/0x45
[17504.255616] [] ? n_tty_flush_buffer+0xe/0x67
[17504.255625] [] ? tty_ldisc_flush+0x27/0x3c
[17504.255634] [] ? tty_port_close_start+0x13b/0x163
[17504.255643] [] ? tty_port_close+0x11/0x41
[17504.255651] [] ? tty_release+0x23c/0x578
[17504.255661] [] ? handle_mm_fault+0x3cb/0x79a
[17504.255669] [] ? rt_spin_lock+0x29/0x6d
[17504.255678] [] ? __fput+0x10e/0x1e2
[17504.255686] [] ? filp_close+0x5f/0x6a
[17504.255693] [] ? sys_close+0xa2/0xdb
[17504.255701] [] ? system_call_fastpath+0x16/0x1b
[17504.255708] ---[ end trace b863518ac3707459 ]---

EDIT: This is probably because 64 bytes is not a short packet, and the
host keeps polling, so the transfer never stops.

Maybe it's the CRC? Doesn't look like it. From what I see in the PIC
DS this is handled by the transceiver. Maybe it needs a stall packet?
From [1]:

  IN: When the host is ready to receive bulk data it issues an IN
  Token. If the function receives the IN token with an error, it
  ignores the packet. If the token was received correctly, the
  function can either reply with a DATA packet containing the bulk
  data to be sent, a STALL packet indicating the endpoint has had an
  error, or a NAK packet indicating to the host that the endpoint is
  working but temporarily has no data to send.

Searching for NAK in the PIC DS doesn't turn up an explicit mechanism.
Maybe just don't send stuff?

Ok, this goes a little better. OUT now seems to work: when I cat a
file to /dev/ttyUSB it goes over the wire in its entirety. However,
the IN stuff is still weird. In response to IN, the host sends an OUT
transaction. Maybe there's a handshake? Or is the IN data interpreted
in some way? Wait, this could just be TTY stuff in response to the
codes sent by the device. Let's just send characters. It's probably
because echo is on. And the fact I don't see anything is probably
because of line buffering.
Adding CR/LF to the string gives this:

broebel:/home/tom# cat /dev/ttyUSB0
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]

Looks like it's working!

[1] http://www.beyondlogic.org/usbnutshell/usb4.shtml#Bulk

Entry: Serial on windows?
Date: Fri Jun 15 09:38:23 EDT 2012

Now, pick one that also works on windows[1].

[1] http://www.linuxjournal.com/article/6573

Entry: Next
Date: Fri Jun 15 09:58:32 EDT 2012

- Make PK2 more usable: use the RPC stuff / fix the sync bugs.
- Speed up compilation: generate target code only once by moving it to
  the transformer phase.
- USB: MIDI / generic interfaces.

Entry: Cleanup staaplc
Date: Fri Jun 15 11:07:11 EDT 2012

It doesn't look like these lines are really necessary:

;(target-words-check! labels)
;(code-pointers-set! pointers)

As long as we have the correct .fm corresponding to the compiled code,
everything should be there. It's not necessary, and a bit confusing,
to redefine words. If I recall, the reason this is there is to allow
the shell to communicate with "binary only" images, by just saving the
addresses. I no longer think this is necessary, or a good idea.

Entry: Cache target code compilation
Date: Fri Jun 15 11:48:54 EDT 2012

Trouble here is that it needs to have a syntax representation. Is that
possible with arbitrary (circular) data structures? Doesn't look like
it:

#lang scheme
(require (for-syntax scheme))
(define-syntax (foo stx)
  (let ((val (make-vector 1)))
    (vector-set! val 0 val)
    #`'#,val))
(define bar (foo))

datum->syntax: cannot create syntax from cyclic datum: #0=#(#0#)

So... before this can work, there first needs to be a non-circular
representation of target words, where names are used to introduce
circularity. The only place where this happens in code is in an
assignment statement that links a word to its code. Can this be broken
open?

grep -nrI . -e set-target-word
./comp/postprocess.ss:68: (set-target-word-next! w w+)
./comp/postprocess.ss:71: (set-target-word-code! word code)
./comp/postprocess.ss:91: (set-target-word-code! word
./asm/assembler.ss:134: (lambda (w) (set-target-word-address! w #f))
./asm/assembler.ss:167: (set-target-word-address! word addr)
./asm/assembler.ss:231: (set-target-word-bin! word inss)

I'm not sure this is going to be an easy change. It might be better to
live with the situation and first make PK2 more robust.

Entry: PK2 robustness
Date: Fri Jun 15 12:28:33 EDT 2012

It looks like the main problem is the device getting out of sync.
Stuck PK2 firmware is not so much of a problem once some safeguards
are installed. Let's investigate. Run the current usb test app a
couple of times, using CTRL-C to interrupt, and power-cycle the
target. The error message after a while is:

OK test
C-c C-c
Command "test" interrupted.
Trying cold restart... target-off target-on
icsp-recv: pk2 read: expected 3 bytes, got 1:
icsp-recv: b:2 h:#t a:#f -> (0)

So it gets out of sync and after that never recovers. How to resync?
Will try resync on this:

(unless (= expect-size real-size)
  (error 'icsp-recv "pk2 read: expected ~a bytes, got ~a:\n~a\n"
         expect-size real-size (log-msg reply)))

Looking on the logic analyzer, there is some transfer going on, but
it's not clear who is writing. This probably needs a look on the scope
to see if the read/write phase got messed up.. There is a collision
during one bit, so something is out of sync. What I don't understand
is why a power reset doesn't solve it. Hard to see what's going on,
but I think it's safe to assume the problem is on the PK2 side. The
target seems to be fine.
Solution for now:

(define (reconnect)
  (pk2-close)
  (sleep 1)
  (connect))

The PK2 RESET command doesn't work if the PK2 is stuck. It disconnects
the USB device while it is opened with libusb, which seems to cause
trouble..

Entry: Opti stuff
Date: Fri Jun 15 16:08:48 EDT 2012

Might need to revert. The building blocks seem to work but the driver
doesn't respond to the first SETUP packet.

EDIT: fixed. Problem was a different interface for OUT/DATA0.

\ PIC18 specific (since USB is also PIC specific): use rot<<, indirect
\ addressing using a register, and assumption that >r and r> can span
\ across procedures.

\ ep -- bd
: IN rot<< 1 + ;
: OUT rot<< ;

\ Toggle mask for stat+ : bit 7 for and, bit 6 for xor.
: DATA0 #x00 ; \ 0 and 0 xor
: DATA+ #xC0 ; \ 1 and 1 xor
: DATA1 #x40 ; \ 0 and 1 xor

macro
\ Just needed once, so use macros
: bdaddr \ n -- lo hi
    rot<< rot<< 4 ;
\ Update status register
: stat+ \ togglemask ustat -- ustat+
    over rot>> and xor \ Apply togglemask to DATAx bit
    #x40 and           \ Keep only DATAx, rest was filled by USB
    #x88 or ;          \ UOWN set to USB, DTSEN=1, KEN=0, INCDIS=0, BSTALL=0
forth

\ Prepare buffer descriptor to send to USB transceiver with updated
\ DATAx and buffer size. Destroys `a' reg.
: >usb \ n bd x --
    a>r >r
    bdaddr ah ! al !
    r> @a+   \ STAT @
    stat+ >r
    !a-      \ CNT !
    r> !a    \ r> STAT !
    r>a ;

: OUT/DATA0 OUT DATA0 >usb ;
: IN/DATA0  IN  DATA0 >usb ;
: IN/DATA1  IN  DATA1 >usb ;
: IN/DATA+  IN  DATA+ >usb ;

Entry: Next
Date: Fri Jun 15 18:47:40 EDT 2012

Time for reflection, maybe? USB is working. Yeah!

Is it worth writing alternative firmware for PK2? Probably not. At
least not yet. Adding a programmer to the Staapl PK2 driver might be
nice, but is not really necessary either..

There's another window that opens up now: PC USB <-> stuff ;) For the
synth project, the killer app is an exponential D/A converter with
digital (time) feedback. Maybe it's time to start doing that?

Entry: USB midi
Date: Fri Jun 15 19:02:01 EDT 2012

I have a serial port, which is trivial to add to Pd. Could already run
MIDI over it. Ha, I have a sniffer now, I don't need to read docs!
Endpoints send 64 byte packets containing MIDI messages, with a single
byte prefix, and padded with zero. From page 16 in [1] this byte is
cable number + code index. Each message is 32 bit, (I guess) with up
to 16 messages per 64 byte IN/OUT transaction. This is the
configuration info sent by Bus 003 Device 020: ID 09e8:0076 AKAI
Professional M.I. Corp.

00000000 09 02 65 00 02 01 00 a0 32 09 04 00 00 00 01 01 |..e.....2.......|
00000010 00 00 09 24 01 00 01 09 00 01 01 09 04 01 00 02 |...$............|
00000020 01 03 00 00 07 24 01 00 01 41 00 06 24 02 01 01 |.....$...A..$...|
00000030 00 06 24 02 02 02 00 09 24 03 01 03 01 02 01 00 |..$.....$.......|
00000040 09 24 03 02 04 01 01 01 00 09 05 01 02 40 00 00 |.$...........@..|
00000050 00 00 05 25 01 01 01 09 05 81 02 40 00 00 00 00 |...%.......@....|
00000060 05 25 01 01 03 |.%...|
00000065

[1] http://www.usb.org/developers/devclass_docs/midi10.pdf

Entry: USB Serial: raw mode by default?
Date: Fri Jun 15 20:53:44 EDT 2012

Is there a way to disable all tty processing, especially echo, before
any application has a chance to send data to the device?

Entry: Too Concrete
Date: Fri Jun 15 22:43:41 EDT 2012

For memory operations, current Staapl isn't abstract enough. Juggling
the a register and working with bytes / words is a pain. I wonder if
the r register can be freed up a bit. Can we make it so that 'r' is
not guaranteed to be available in interrupts?
This would allow the FSR1 register to be used as a pointer reg for
memcopy. Interrupts could have their own stacks..

: >IN1
    a>r
    13 4 a!! a>        \ get buffer count
    top 6 high? if     \ check if full
       drop IN1-flush IN1-wait 0
    then               \ check if full
    dup >r 128 + 5 a!! \ get next byte address
    >a                 \ store byte
    r> 1 +             \ increment counter
    13 4 a!! >a        \ store it
    ;

Writing code like this is really too much work, isn't it? I'm having
fun with this, but sometimes it's a royal pain to express something,
especially dealing with memory. Mostly because I don't want to make
"slow" abstractions; I'm still programming a register machine... Stuff
like this really can't beat C.. It will just never get done. Maybe I
should make a real Forth to try out some other ideas?

In the code above I'm thinking: it really can't be that hard to write
a byte to a buffer in memory and increment a counter. But I think it
is on PIC...

Some opti: the counter could just contain the al register. Before
sending out, mask out the bits that are not used.

Version two

Entry: PIC and memory buffers (banked access)
Date: Sat Jun 16 07:38:40 EDT 2012

Re: last post.. Trouble is of course that 1. PIC is a pain to use and
2. abstraction solves everything. (and 3. I have to accept these
things ;)

However, thinking in assembly, it seems that what is needed is a way
to use banked addressing. The pointers themselves are in a fixed
location, so it's not necessary to use the 'a' register here.. Let's
define the words @b, !b, b@, b! that perform the banked access and
access to the banking registers. DONE.

Entry: Next
Date: Sun Jun 17 14:42:08 EDT 2012

- >IN1 using banked access for pointer

Make the abstraction such that there is no huge setup penalty for
transferring multiple bytes. I.e. just use >a but add an open/close
abstraction that loads/stores the CNT reg.

Entry: PIC indirect addressing
Date: Sun Jun 17 14:59:42 EDT 2012

In general this is a pain in the rear, except when it's possible to
use the FSR registers to keep loop state. With a bit of tinkering this
doesn't seem to be a big problem. It makes a lot of sense now why the
ColorForth chips have 2 pointer registers: one source, one dest. It
makes loops over memory a lot more efficient than juggling pointers.
It's really just stdin/stdout.

Anyway, here's the new IN1 update using b and a.

\ When filling up the buffer, CNT has AL. Strip off the bits when
\ sending it out. We can just use the >a and a> words to access the
\ buffer.

\ Since the location of the buffer is known, these are implemented as
\ macros to make the other code a bit more readable, and to have a
\ more efficient implementation. Indirect addressing is inefficient
\ since we're already using all 3 pointer registers.

macro
: IN>BD OUT>BD 4 + ;        \ EP -- BD
: OUT>BD 8 * ;              \ EP -- BD
: CNT 1 + ;                 \ BD -- BD.CNT
: IN1/CNT 1 IN>BD CNT ;     \ -- BD.IN1.CNT
: bd@ >m bd-page b! m> @b ; \ addr -- value (fetch in BD page)
: bd! >m bd-page b! m> !b ; \ value addr -- (store in BD page)
forth

\ These serve as "open/close" for the IN1 buffer.
: a/IN1-begin IN1/CNT bd@ al ! buf-page ah ! ;
: a/IN1-end al @ IN1/CNT bd! ;

\ Single byte access, saving a.
: >IN1 a>r a/IN1-begin >a a/IN1-end r>a ; \ byte --

Entry: Scheduling
Date: Tue Jun 19 12:08:55 EDT 2012

Next problem is scheduling, since there are now two tasks; i.e. how to
handle USB consumer/producer control flow? Maybe it's time to start
using interrupts? The alternative is to do the polling in the event
loop. Maybe that's going to be easier.
Entry: Cheapest USB PIC18F
Date: Tue Jun 19 23:43:30 EDT 2012

Currently that's the PIC18F13K50[1] at $1.32 volume price, compared to
$3.44 for my default PIC18F2550. Mouser pricing for the 13K50 is[2]:

1:   $2.39
10:  $1.91
25:  $1.75
100: $1.58

Bummer: reading the Flash programming manual[3] it looks like D+/D-
are multiplexed with PGC/PGD, which means the PK2 approach can't be
used. It probably needs a bootloader to be used with Staapl. Talking
about bootloaders, it might be simplest to go that route now that USB
is working.

[1] http://www.microchip.com/wwwproducts/Devices.aspx?dDocName=en533925
[2] http://www.mouser.com/Search/ProductDetail.aspx?qs=hH%252bOa0VZEiAcEtBytpgHsA%3D%3D
[3] http://www.microchip.com/wwwproducts/Devices.aspx?dDocName=en533925

Entry: GDB stuff
Date: Wed Jun 20 15:30:39 EDT 2012

See previous post[1]. How to connect this to a server application?
Let's make a small C app that handles requests by waiting for a single
write on a named pipe, and writing back a reply on another.

The thing is.. it's a lot simpler to just exec GDB from the C app, and
use the --annotate=3 protocol that's also used in emacs.

EDIT: Tried the exec GDB approach for a closed project. Works well!
The main benefit is that the test system and target system share the
same language + code base while running on different hosts.

[1] entry://20120606-000056

Entry: Busy
Date: Wed Jul 4 13:16:25 EDT 2012

Couple of weeks off the project due to work and holidays. Next thing
to do is to make a strategy for having 2 tasks: USB driver and main
app. Some options:

- state machine polling loop
- blocking tasks
- ISR + main task

The usefulness of these depends on the application. ISR+main seems to
be the simplest approach. Trouble on PIC18 is that tasks need to share
the return stack; other than that, task switching is fast. The USB
driver can easily be written as a state machine, so it can run in a
polling loop or from an ISR. Let's go for slow ISR for USB, then move
from there.

Entry: Cheapest PIC18F
Date: Tue Jul 31 17:23:08 EDT 2012

For an electronics project I want the cheapest possible 18F chip that
can be used with Staapl. I'm thinking to go for a 2-chip solution: one
programmer/hub with USB and one or more slave chips.

At volume pricing, the 18F13K22 is currently the cheapest at $1.16
[1]. PDIP low volume is $2.30. The 18F1220 volume price is $1.96, with
low volume PDIP at $2.44. So for low volume it doesn't make much
difference, though the 13K22 is faster (16MHz intosc, up to 16 MIPS at
64MHz).

[1] http://www.microchip.com/wwwproducts/Devices.aspx?dDocName=en538201

Entry: Synth Club
Date: Tue Jul 31 20:27:25 EDT 2012

1. Working stuff
   - 18F1220 synth: cheap + boards available.
     Problem: this is programming. How to simplify?
   - Mixer feedback
2. Breadboard experiments
   - Inverter feedback

Entry: Starting again
Date: Mon Oct 15 10:39:51 EDT 2012

Question now is: what works and what doesn't work for PK2? I'm quite
annoyed by this, to the point that I want to be done with the whole
bazaar. It's too much effort to work around this all the time..

Connecting to PICkit2.
datfile: /usr/local/bin/PK2DeviceFile.dat
iProduct: PICkit 2 Microcontroller Programmer
Console startup failed:
#(struct:exn:fail bad-reply: id:0 msg:() #)
Continuing with REPL anyway: Press ctrl-D to quit.
OK

After that it works.

Entry: Debugger
Date: Mon Oct 15 11:21:13 EDT 2012

The thing is that the PK2 is only necessary for kernel programming.
After that it's just a digital serial system, so I wonder if it might
be possible to wire-or onto the ICD lines.
Entry: Next
Date: Mon Oct 15 15:15:52 EDT 2012

1. I can't do anything without proper debug tools, so that problem
needs to be solved first. Since the final product is going to have
USB, it seems best to go with a bootloader and debug console built in.
Focusing on a single chip would make things easier too. It's nice to
play around with the smaller ones, but the 18F2550 family will do just
fine for now. With a bootloader there is the PK2 to program it and
debug the bootloader. Remaining debugging could be done over USB, i.e.
"inside the OS".

2. Once a proper USB device is working, this can also be used as a
debugger/programmer.

NEXT: Get the Staapl protocol working on a virtual serial port.

Entry: Standard bootloader
Date: Mon Oct 15 15:21:58 EDT 2012

It would be best to re-use a standard bootloader. Is there one?
Doesn't look like it, so the best bet I have is to boot it up as a
serial port and use the Staapl protocol.

Entry: Staapl protocol on virtual serial port.
Date: Mon Oct 15 15:30:55 EDT 2012

Problem: buffering. It's probably best to stick to the ping-pong
protocol. EP0 has IN and OUT buffers, which are separate, so there
should be no problem simulating single-byte rx/tx words.

Entry: Next: console on USB SERIAL
Date: Sat Oct 20 14:55:14 EDT 2012

The problem is flow control. The monitor is written in terms of
blocking read, so some inversion is necessary. The big question is: do
we want tasks? It's easy enough on PIC18 as long as the hardware stack
is deep enough, but it's quite a pain if it is not. One more reason to
switch to a different arch. The other approach is to just use
interrupts. This requires some thinking, as I do need the
high-priority interrupt for the audio stuff. Let's do this then.
0x18 is the LPIV (low-priority interrupt vector).

Section 17.5 "USB Interrupts". Low priority interrupts:

PIR2 USBIF : USB interrupt flag
IPR2 USBIP : USB interrupt priority
PIE2 USBIE : UIE : Propagate USB interrupts to microcontroller

From Figure 9-1, this is the configuration that enables USBIF to
interrupt the CPU through the low-priority 0018h vector:

USBIF = 1
USBIE = 1
USBIP = 0
GIEL/PEIE = 1 : peripheral interrupt enable
GIE/GIEH = 1  : global interrupt enable
IPEN = 1

Also, not to forget: from UIR -> USBIF there is UIE.

UIE = 7F

Entry: What to save in ISR?
Date: Sun Oct 21 10:16:39 EDT 2012

The main question is: is it safe to use the stacks? I think so,
because none of the ASM instructions leave the stacks in an
inconsistent state. So the recipe is:

- dup is MOVWF, which does not affect status flags, so WREG can be
  saved first.
- STATUS can then be copied into WREG. The above two are just:
  STATUS @.

To restore, care needs to be taken that the drop doesn't mess up the
flags after restoring. So STATUS ! can't be used. There is an "nfdrop"
that uses MOVFF to not affect flags. This should work:

dup STATUS ! nfdrop

- save STATUS before doing any dup, since dup affects flags.

Entry: Simpler, not more complex.
Date: Sun Oct 21 10:36:30 EDT 2012

I'm thinking about how to do this bootloader thing, but really, that
is currently not the issue. Stick to the kernel + interaction
approach, where:

- PK2 or another Microchip programmer is used to upload the kernel.
- Interaction is over the USB serial port.

Later, if necessary, the kernel can be programmed only once, and all
updates can be done over USB interaction (i.e. arduino-style). It's
probably not a good idea to start adding all kinds of hooks to try to
predict usage at this point: interrupts and USB descriptors etc..
Entry: Fundamental linking question
Date: Sun Oct 21 10:39:11 EDT 2012

I had started to refactor things to use Racket modules, but in
practice, just leaving .f files with undefined names seems to be a lot
simpler. Is there a way to modify the compiler to manage dependencies
better when the .f files are changed? Currently that's completely
ignored. Maybe this can use an "include" directive or so.

Entry: Serial port echo
Date: Sun Oct 21 11:24:37 EDT 2012

So, service-usb is run from interrupt. Next is to make an echo app in
"userspace". To avoid double buffering, this approach can be used: if
IN1 is empty and owned by the uC, it's possible to send out a packet
by locking the usb interrupt, filling the buffer, and sending it out.
Otherwise, just wait until it is.

Trying a bit, I can get it to work without interrupts, but once I
start playing with the interrupt enable/disable it goes wrong. Is
there a simpler way to synchronize? The thing is this: if we're not
currently sending, the IN1 buffer is owned by the uC. The ISR will not
touch it until the flags are set, so can that be used?

EDIT: Might actually be that it never leaves the ISR because some flag
is not acknowledged. Maybe something else is triggering the interrupt?

Entry: Status LED
Date: Sun Oct 21 11:55:14 EDT 2012

I'm reminded of a simple fact: if your debugger doesn't work, you need
a status LED to figure out what's going on. And currently, the PK2
stuff is playing up again.

Entry: Tools trouble
Date: Sun Oct 21 13:26:46 EDT 2012

Hmm... This sucks. There are a bunch of things that are not really
working as they should, so maybe a new approach is necessary.
Problems:

- PK2 is not reliable
- Bootloaders are cumbersome to use (what if they get overwritten)
- Serial console doesn't have a reset

Instead of making a radical change and ending up at another problem,
is there a way to do this with minimal changes? Would a proper reset
for the serial console be enough?

Entry: What's working?
Date: Sun Oct 21 15:11:17 EDT 2012

WORKING:
1. "test" on pk2-2550-48.fm + just PK2 connected, after 2nd try after
   full unplug.
2. ctrl-C + reload .live + "test"
3. ...
4. ...
5. "testi", all the same
6. ctrl-C + reload .live + "testi"
7. ...

NOT WORKING: The same, but using the serial console. Disconnecting the
power might help. Nope... something weird is going on. Maybe some
ports are disabled or something? Weird.. Let's cut the power from the
USB-TTL, see what happens then.

Trouble was that "PIR2 USBIF low" was needed.

Entry: Byte read/write
Date: Sun Oct 21 18:47:48 EDT 2012

Byte write:
- Busy loop until UOWN=0 (if UOWN=1 a transaction is in progress)
- Save a byte to the buffer, update the pointer
- If the buffer is full, send to USB

Flush:
- 1 IN/DATA0

Implementation: CNT can be used to store the LSB of the byte pointer.
I did this impl before. Where did it go? Yep: a/IN1-begin a/IN1-end
IN1-flush. A toy model of this buffering logic is sketched below,
after the next two entries.

Entry: Test primitives
Date: Sun Oct 21 20:26:17 EDT 2012

I need a proper "equals" operation that properly drops the top
element. It seems best to use the carry flag for conditions, since it
survives "drop".

: = =>c c? ;

Entry: Reproducibly stable startup for USB code
Date: Mon Oct 22 13:32:21 EDT 2012

I don't know what's going wrong, but the problem seems to be that it
doesn't want to start up properly right after flashing with PK2. This
works:

- make .flash / .live
- exit staapl
- power cycle both PK2 and the USB (1-2 seconds)
- make .live
- testi

Could be a hub issue. Might want to plug it in directly..
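Here's the toy Racket model of the byte writer referenced above. It's
a sketch only: usb-submit! is a hypothetical stand-in for the UOWN
busy-wait and the buffer descriptor update, which the model ignores.

(define buf (make-bytes 64))
(define cnt 0)

;; Stand-in for handing the filled buffer to the USB engine.
(define (usb-submit! bs)
  (printf "IN packet, ~a bytes\n" (bytes-length bs)))

(define (in1-flush)
  (when (> cnt 0)
    (usb-submit! (subbytes buf 0 cnt))
    (set! cnt 0)))

;; Byte write: store, bump the pointer, flush when full.
(define (>in1 b)
  (bytes-set! buf cnt b)
  (set! cnt (+ cnt 1))
  (when (= cnt 64) (in1-flush)))

;; e.g. (for ([b (in-range 70)]) (>in1 b)) emits one 64-byte packet
;; and leaves 6 bytes buffered until the next in1-flush.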
Entry: Loopback working
Date: Mon Oct 22 13:37:15 EDT 2012

Not particularly fast though. Saturates at 62.3kB/s; the uC is running
at 12 MIPS. Since each byte is processed separately by >IN1 and OUT1>,
this gives 192 instructions per byte. Actually, it quite consistently
gives 62.5kB/s, which is 64000 / 1024, or exactly one 64-byte buffer
per millisecond (USB frame tick).

Entry: Indirection for >INx OUTx> ?
Date: Mon Oct 22 13:54:25 EDT 2012

Would be nice, but maybe isn't necessary.

Entry: Automatic banked access
Date: Mon Oct 22 13:55:12 EDT 2012

The bank select instruction gets in the way of optimization. It's
probably best to make bank-select words for all operations that use
direct memory references, and just inject the instruction in a
preprocessing step based on the actual value of the address. This
could use the following map:

000-07F access RAM
080-0FF device registers
100-F7F bank-select accessed
F80-FFF stack ram (from 080-0FF)

For straight-line code the bank select instruction could then also be
omitted. (A Racket sketch of this classification follows below, after
the next entry.)

Entry: Next: board powered from USB
Date: Mon Oct 22 16:46:26 EDT 2012

- Power board from USB and disconnect PK2/FTDI.
- Drive the interpreter

Power from USB works with serial loopback. However, I need to unplug
it for about 5-10 seconds before replug. Maybe it needs brown-out
reset? Hmm... it worked for a bit, now not at all. Ok, it does work
when I wait 5 seconds and plug it into a different hub, facing down.
Maybe a solder issue, or a voltage issue. Maybe I should try with an
LF part?

It's very picky about the port it goes into. Plugging it directly into
the tower doesn't work either.. Only the powered hub. I don't think
it's the voltage. Very unpredictable behavior. Probably some soldering
issue. It all works fine as long as it's powered by the PK2. It seems
really just to crash. Maybe a pause at bootup would work?

I checked on the scope. Don't see much on the outside except for the
3.3V VUSB signal. It stays on for about a second, then discharges to
about 1V over a little over 2 seconds. This cycle repeats
indefinitely. I don't see anything happening to the supply voltage or
the reset. The device descriptor is received OK. Very strange. This
might be related to the thing not working with just a serial console
attached..

EDIT: Trying again using interactive PK2: it works for a bit, then
disconnects again. Maybe it's time to switch back to uart and see what
is actually going on.

EDIT: Trying some more things. Added a startup delay:
0 for 0 for 0 for next next next, which seems to work.

EDIT: When I connect just the ground from PK2, it seems to run without
problems. Disconnecting the ground then keeps it running. Nope, this
is not consistent. I just switch the hub's switch (unplug/replug) and
it starts OK. What can this be? Touching the solder joints makes it
crash. If I don't send any data it lives longer. Running from PK2 with
USB power disconnected works fine. I can only make it crash by
touching the oscillator connections. Can this be caused by a floating
input pin?

More weirdness. PK2 is now supplying 3V. I noticed this when trying to
close the USB 5V jumper while running. After programming, PK2 gives
4.3V. Running with that, after a while it gives up. Stopping PK2, then
starting again runs it at 3V. First time no proper comm (bad-reply);
2nd time it works. Could it be that the regulator is enabled, but the
signalling voltage is not 3.3V?

EDIT: It is. So everything seems to work at 2.7 and 3.6V, but 4.3V
doesn't work. Maybe the power is just too unstable?
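The Racket sketch of the address classification promised above,
assuming the map from the "Automatic banked access" entry (the symbol
names are mine, not Staapl's):

(define (addressing addr)
  (cond ((<= #x000 addr #x07F) 'access-ram)  ; no bank select needed
        ((<= #x080 addr #x0FF) 'device-reg)
        ((<= #xF80 addr #xFFF) 'stack-ram)
        (else 'banked)))                     ; inject a bank select here

(addressing #x020) ; => 'access-ram
(addressing #x234) ; => 'banked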
Entry: 18F2550 voltage
Date: Fri Oct 26 08:58:57 EDT 2012

From the datasheet:

  1.2 Other Special Features
  Like all Microchip PIC18 devices, members of the
  PIC18F2455/2550/4455/4550 family are available as both standard and
  low-voltage devices. Standard devices with Enhanced Flash memory,
  designated with an "F" in the part number (such as PIC18F2550),
  accommodate an operating VDD range of 4.2V to 5.5V. Low-voltage
  parts, designated by "LF" (such as PIC18LF2550), function over an
  extended VDD range of 2.0V to 5.5V.

This makes no sense. It's an F part (not LF) that runs fine on lower
voltage but not on higher voltage.. WTF? Also, later in the DS the
operating voltage is mentioned to be a larger range:

  28.1 DC Characteristics: 2.0-5.5V

Let's replace the chip and see what it does.

EDIT: Maybe this is really an LF device that's marketed as an F
device. I assume they are the same die but get sorted into 2 range
bins. Some discussion here[1]. People indeed think they are the same
chips passing some extra testing, verified by some F parts working at
LF voltages.

[1] http://www.electro-tech-online.com/microcontrollers/130612-difference-between-pic-lf-pic-f.html

Entry: Next
Date: Fri Oct 26 09:21:16 EDT 2012

1. Work on firmware: on 3V it seems to work, so let's just continue.
2. Try different hardware:
   - USBPicStamp
   - Build new board for verification using 18LF4550
   - Swap in new 18(L)F2550 (waiting for sample order)

Entry: FLUNK presentation
Date: Mon May 6 11:06:23 EDT 2013

What is Staapl?

- "assisted" macro assembler
- low-level code modeling tool
- experiment: does it make sense to write "abstract low-level code"?

Some basic ideas:

- FACTOR: uC code is often code-size constrained. A stack language
  (Forth) or stack machine model can help here. Why? Replacing global
  registers (RISC machine) with a 2nd stack can decouple code,
  introducing more opportunity for code reuse.

- GENERATE: uC code is often very specialized and "hand optimized".
  This means there is _implicit_ structure in the code that is no
  longer visible in the low-level assembly (or C) code. It makes sense
  to generate it from a higher level description, to make this
  otherwise hidden structure explicit. I.e. represent a model and a
  specializer as opposed to just the specialized code.

- FUNCTIONAL: at the meta level, a stack language is easily
  represented as a pure functional language:
  syntactic concatenation -> composition of code generators.

How does it work? Take the Forth snippet

1 2 +

which loads two numbers on the parameter stack and performs the "+"
operation. The result of this is a parameter stack loaded with the
number 3. When compiling this to machine code, one would typically see
the instruction sequence:

push 1
push 2
call +

Note there is a direct correspondence between a Forth code sequence
and a machine code sequence. The trick is then to interpret the
recently generated machine code as a _data stack_. I.e. after
compiling "1 2", the compiler sees the following generated code
segment:

push 1
push 2

When it encounters "+", instead of compiling the call to "+", it
removes the two _instructions_ from the compilation buffer, performs
the computation at compile time, and produces the result:

push 3

In general, this is called _partial evaluation_. This particular
structure is also called a _peephole optimizer_, in that it only looks
at the most recent code.
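To make the "1 2 +" example concrete, a minimal Racket sketch of "+"
as a function on the instruction stack. The instruction representation
(a list, most recent instruction first) is hypothetical, not Staapl's
actual data types:

(require racket/match)

;; The compilation buffer doubles as a compile-time data stack.
(define (compile-+ code)
  (match code
    ;; Two literal pushes on top: fold them at compile time.
    [(list `(push ,b) `(push ,a) rest ...)
     (cons `(push ,(+ a b)) rest)]
    ;; Otherwise emit a real call to the runtime word.
    [_ (cons '(call +) code)]))

(compile-+ '((push 2) (push 1))) ; => '((push 3))
(compile-+ '((call f)))          ; => '((call +) (call f))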
Once this mechanism is in place, it is possible to generalize it to
"virtual instructions", meaning to compile code that cannot be
compiled directly to machine code, but will act as parameters passed
to other code generators. This introduces _composition_ of code
generators.

And that's it. Everything else fits in this picture. It allows the
same syntactic representation for compile time and run time code. It's
not that different from the macros vs. functions idea in Scheme, only
that this works on stacks of code and not data flow graphs (= lambda
expressions).

So what's the point? It allows mixing machine mapping (i.e. PIC18
macros) and high-level code generators, all in one representation,
giving access to the _real_ machine. (As opposed to performing the
partial evaluation directly on the input syntax. This is related to
Joy's syntactic vs. semantic isomorphism.)

Example in Staapl:
- conditionals use uC flags instead of the data stack.

Entry: Why work on the stack level as opposed to the syntax level?
Date: Mon May 6 12:16:44 EDT 2013

From [1]:

  In Joy, the meaning function is a homomorphism from the syntactic
  monoid onto the semantic monoid. That is, the syntactic relation of
  concatenation of symbols maps directly onto the semantic relation of
  composition of functions. It is a homomorphism instead of an
  isomorphism because it is onto but not one-to-one, that is, some
  sequences of symbols have the same meaning (e.g. "dup +" and "2 *")
  but no symbol has more than one meaning.

How is this relevant for Staapl? The idea is that it is easier to work
with the semantic representation (functions) than the syntactic
representation. I.e. instead of using term rewriting as the
computation engine, one uses function composition, which in practice
is implemented as directed rewriting, i.e. pattern matching. The
advantage here is being able to encode machine-specific idioms (i.e.
machine instructions) alongside high-level constructs.

[1] http://en.wikipedia.org/wiki/Joy_%28programming_language%29
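The "dup +" / "2 *" example is easy to check in the semantic model,
where a word is a function from stack to stack and concatenation is
composition. A toy model, not Staapl's Scat implementation:

;; Words as functions on a list; top of stack = car.
(define (dup s) (cons (car s) s))
(define (add s) (cons (+ (car s) (cadr s)) (cddr s)))
(define (mul s) (cons (* (car s) (cadr s)) (cddr s)))
(define (lit n) (lambda (s) (cons n s)))

;; Concatenation of symbols = composition of functions.
(define (concat . ws) (apply compose1 (reverse ws)))

((concat dup add)     '(21)) ; => '(42)
((concat (lit 2) mul) '(21)) ; => '(42)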