This is the BROOD dev log. The older the entries, the less accurate they probably are with respect to the current design and implementation. If you have comments, post them to the brood list: http://lists.goto10.org/cgi-bin/mailman/listinfo/brood-list

Entry: compiled / interpreted
Date: Thu Feb 23 17:33:21 EST 2006

After briefly using a direct interpreter I've opted for

: run compile execute ;

Code is now always compiled, where

: compile translate asm ;

with 'asm' being a scheme eval in the forth module. (There's still the word 'scheme' which evals in the current module.) This approach is way simpler than direct interpretation, and it allows words to be represented as scheme procedures. Also, it was a piece of cake to get true tail calls this way, while in the interpreted version it was more difficult. (Fri Apr 14 21:53:31 CEST 2006: it does work now, but compilation is still the preferred method.)

(In case you're wondering, I'm trying to split up packet forth into 2 layers: one closer to scheme, with GC, and one closer to forth, without too many dynamic memory tricks.)

So basically, I'm re-inventing Joy, Factor, ... I found a lot of concatenative languages. See http://del.icio.us/doelie/concatenative

Entry: true forth->c compilation
Date: Thu Feb 23 17:37:41 EST 2006

I'm thinking of display lists from OpenGL. If there's no conditional code and no loops, there's little challenge in compiling code: you can just flatten the call tree to a long list of primitives. Including loops and conditional code, it is still relatively straightforward.

Now, to do this properly, it might be best to keep using a register language (C functions) for primitives. These can easily be compiled to an 'unrolled stack' C function. This makes it easiest for the C compiler to do its data flow analysis for optimization. Maybe it's best to just use a new variable for each assignment. All stack manipulations then just operate on the current stack of variables.

: x 2 add ;
: y 3 mul ;
: z 1 x dup y sub ;

// z
d0 = 1            // 1    (d0)
d1 = 2            // 2    (d0 d1)
d2 = add(d1, d0)  // add  (d2)
                  // dup  (d2 d2)
d3 = 3            // 3    (d2 d2 d3)
d4 = mul(d3, d2)  // mul  (d2 d4)
d5 = sub(d2, d4)  // sub  (d5)

Then the next steps would be to include
* conditional code
* loops
Maybe it's easiest to just not support unrolling for those.

Entry: the chicken was first!
Date: Fri Feb 24 11:54:43 EST 2006

I found this today: http://savannah.nongnu.org/projects/chicken/ So, not only is my idea non-original, the whole chicken-egg meme is already taken! So the project is now called 'brood', with chicken and egg being its components.

Entry: cross compilation
Date: Fri Feb 24 14:08:45 EST 2006

So what's the most important difference between a cross-compiler/monitor/interpreter and a metacompiling one? The language and the meta-language are not the same.

Was adding some features in egg. The interpreter only accepts numbers and symbols, so lists can have a separate meaning; I made them scheme code for now.

(define (compile-other thing) (primitive-eval thing))

This is the same as [ ] words in forth. So, instead of

: x [ 1 2 + 3 + ]L * ;

I can do

: x (compile (+ 1 2 3)) * ;

where scheme is the meta language, or

: x (1 2 3 + + compile) * ;

where chicken is the meta language. Of course, the 'compile' part can be made implicit.

Entry: all eggs
Date: Fri Feb 24 17:12:34 EST 2006

Why not use all eggs? Instead of making the communication between chicken and egg asymmetric, why not let them all connect to each other, with one meta-programming the other?
Again, the point of the whole exercise is twofold:
(A) write a high level (functional) concatenative language on top of scheme
(B) use this to metaprogram small forth eggs

Point (A) is the next step after PacketForth, while (B) is the next step after BadNop. The whole idea is that some things are better done without a garbage collector, and other things are better done with one. (There's this note though on http://home.pipeline.com/~hbaker1/ForthStack.html about linear languages not needing a GC.)

About the first egg: this is an in-image shared library. Dangerous, but very flexible. Eventually it should be an out-of-image tethered forth, to prevent the monitor (chicken) from crashing. All eggs should eventually be on the same footing: microcontrollers and audio/video engines. So let's have a look at
* writing a small egg interpreter in CPS
* having a minimal interface through stdin/stdout

Entry: immediate words
Date: Thu Mar 2 15:58:06 EST 2006

What about this syntax on a microcontroller forth:

(dup) (1) ifte

or are we completely stuck with this:

if dup else 1 then

The problem with allowing these kinds of constructs is that they almost require execute, which is not always cheap and not always possible, say on 8bit machines.

OK, this is just a prelude to badnop. How to port it? Basically, disentangle optimization and core. Then port the core, followed by optimization.

Entry: interpret/compile
Date: Sun Mar 5 14:11:25 CST 2006

During the coding of the parser rule compilation step, I ran into another explicit interpreter. Which means it can probably be mapped to a lambda directly. This is a pattern which seems to occur a lot.

Entry: deep compilation
Date: Mon Mar 6 10:44:15 CST 2006

Yesterday I differentiated between >s-scheme and >r-scheme. The first converts only a top level expression to scheme; the second recurses, and converts all lists containing symbols to lambdas. The reason for this is mainly speed. (I'm trying to find a way to compile to C using the chicken scheme->C compiler. As a side note, guile seems to be reasonably fast: slow startup time but fast run time.) (Actually, I discovered lazy evaluation here :)

Entry: SSA
Date: Mon Mar 13 18:45:11 EST 2006

I'd like to know some more about GCC. And it's time to dive into my optimizing compiler book (Advanced Compiler Design and Implementation, First Edition, by Steven Muchnick). I don't think I'm ready yet to write a frontend, but it might be interesting to use a subset of C which is close to the intermediate language(s) used for optimization.

http://gcc.gnu.org/wiki/GIMPLE
http://gcc.gnu.org/wiki/SSA

Especially this one: almost all tree optimizer passes work on SSA GIMPLE, while some work on normal low level GIMPLE. SSA is static single assignment.

To make forth run really fast, you need a forth chip. C compilers and most target chips are drenched in register transfer lingo, so it seems the only place where forth can have some advantage is on really simple chips, like the PIC architecture, and at some higher level (above the inner loop), with inner loops expanded to fill the risc pipeline.

Entry: Functional programming and state machines (monads)
Date: Tue Mar 14 15:08:46 EST 2006

I'm finally starting to get the functional programming thingy. In order of appearance, these are the things that pulled me over:
1. writing cat in a purely functional way
2. writing a parser state machine in a functional way
3. nondeterministic programming + fp seem to be made for each other
4.
a pure fp oo model: http://okmij.org/ftp/Scheme/pure-oo-system.scm

Now it's time for monads. Basically, I'd like to combine the clean functional semantics of cat with a target forth state machine, for compilation and interpretation. I'd like to see what this monad stuff is all about. From http://en.wikipedia.org/wiki/Monads_in_functional_programming it seems the trick is in "Since the bind operator is non-associative, this constrains the sequence of execution." I don't really get it yet..

Maybe I should concentrate on an intermediate representation for compiled forth code, one which is easy to optimize on. Badnop works nicely because it has an 'undo' function. This works well for linear (one-pass) things, but for multipass it is not really suited. Also, the way it is implemented in Badnop (modify the tail of the allot stack) is easily mapped to a functional approach without mutation. It's not all that spectacular, especially not in the concatenative/stack view: sequential execution is nothing more than function composition, which is associative, but not commutative.

Entry: next gen forth
Date: Wed Mar 15 02:08:17 EST 2006

Not being humble or anything, but I was thinking about the need for parsing words and compile/interpret semantics. For tethered dev, there's no real need for interpret semantics: code needs to be compiled before it is run. This should be there for packet forth too. There's too much confusion about parsing words and compile/interpret: vs. [i] etc..

In colorForth, there is no such thing as interpret semantics. Well.. you can type numbers and execute words, but you can't execute macros. (Well, you probably can, but they will just compile stuff, not run it.) Since the same tactic (compile, then run) is used in rscheme, I don't see why I shouldn't simplify forth and packet forth to the same paradigm, at least for the readline console.

Entry: dynamic binding
Date: Wed Mar 15 02:13:26 EST 2006

I realize this is a pitfall, but dynamic binding can be damn handy. Maybe what I'm looking for is fluid-let, a way to set a 'context' and restore it afterwards. I guess that's why this http://www.gnu.org/software/mit-scheme/documentation/mit-scheme-ref/Dynamic-Binding.html is called 'dynamic binding' :)

Entry: concatenative scheme
Date: Wed Mar 15 11:45:48 EST 2006

Damn, my mind really likes this compositional (dataflow) way of thinking. Probably it's just a remnant of forth induced imperative programming style, or DSP box thinking... I keep wanting to do things like

(define (composition state args)
  (apply (lambda (state args) .. (list state args)) ; receiver
         (sender (state args))))                    ; sender

which actually is 'named forth' or 'braided, sorted dataflow', by which I mean forth with names and without stack shuffling. For example:

(define (dosomething state args)
  (define (fun1 state args) ...)
  (define (fun2 state args) ...)
  (define (fun3 state args) ...)
  (apply fun3 (apply fun2 (apply fun1 (list state args)))))

Maybe I should have a look at GOOPS and CLOS. What about PD in lisp? Something like:

((generator))
((do-left) (do-right))
((join-it))

I really need syntax for the above, because (apply (lambda ...) (preproc ...)) is really unreadable. So I went on to create:

(define-macro (flow initial-args sequence)
  (reverse-accumulate
   (lambda (fn exp) `(apply ,fn ,exp))
   (cons (car sequence) initial-args)
   (cdr sequence)))

(reverse-accumulate is a left fold; see the sketch at the end of this entry.) Now what about this:

(define-box name (inA inB inC) (outA outB outC))
(lambda-box (inA inB inC) (outA outB outC))

This works well, but there's still name duplication.
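As promised, a self-contained sketch of the 'flow' construct above. reverse-accumulate is my own reconstruction (a plain left fold), not brood's actual code, and the stage functions are made-up examples:

;; Reconstruction: a left fold over the remaining stages. Each step
;; wraps the accumulated expression in (apply <fn> <exp>).
(define (reverse-accumulate fn init lst)
  (if (null? lst)
      init
      (reverse-accumulate fn (fn (car lst) init) (cdr lst))))

;; With the flow macro above, (flow (1 2) (add-args dup-state))
;; expands to (apply dup-state (add-args 1 2)): the first stage is
;; called directly on the initial args, and every later stage
;; receives the (state args) list of the previous one via apply.
(define (add-args state args) (list (+ state args) args))
(define (dup-state state args) (list state state))
;; (flow (1 2) (add-args dup-state)) -> (3 3)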
I guess that duplication is the price to pay for flexible 'braiding' of data. So what about fanout? Basically that can be solved with lexical scoping, or by putting 2 processors in one box. As long as ins and outs are the same, we have plain named let / letrec / let* syntax.

Entry: parallel forth
Date: Wed Mar 15 15:48:12 EST 2006

What about looking at a compiler as a parallel forth with 2 data stacks? One is the computation stack, the other the state stack or dictionary. Writing badnop.scm the way it is now really feels like I want to write it concatenatively: I'm stumbling over names.

Entry: closures -- more stacks!
Date: Thu Mar 16 15:31:16 EST 2006

PF deja-vu. So the deal is, you really want to have 'temp stacks' when writing in CAT. I've gone to great pains trying not to use a single mutating operation in badnop.cat, and I realize it is really hard to do. Sometimes you just need more space. So we go the same route as PF, from stack-only words to words operating on stack + VM, which is just a bunch of other stacks (RS and DICT).

In the badnop compiler, it would be terribly handy to be able to use input (source code) and output (dictionary) as 'global objects', so there is no need for so much juggling. Note that this is not absolutely necessary, but it seriously contributes to conciseness by avoiding much stack juggling. The most convenient way to do that is, instead of having a data stack (1 2 3), to have the first element contain other state information, like ((a b c) 1 2 3). Here (a b c) can basically be anything, written in the functional OO style mentioned above. The thing to keep in mind there is performance: every 'muting' operation will cons a lot. Note that even when trying to avoid mutation, this will have the same amount of 'global effect' as using global variables. So the thing I am really looking for is closures in CAT.

http://www.haskell.org/tmrwiki/HaskellSchoolOfExpression

  Any book on Haskell must be appraised for its explanation of monads in
  general and IO specifically. Monads are a purely functional way to
  elegantly carry state across several computations (rather than passing
  state explicitly as a parameter to each function). They are a common
  stumbling block in learning Haskell, though in my opinion, their
  difficulty is over-hyped.

So the case is closed atm.. My if .. then code is starting to look ugly, because I can't use smart object words like comma without having the dictionary object on the stack.

Entry: Environment
Date: Thu Mar 16 19:03:24 EST 2006

I'm going to try to include an environment in the stack machine. This will give an extra cons for each operation, but it seems to be the only workable, and for now understandable, way to tackle the problem. The environment is just another stack, basically.

Entry: The point
Date: Fri Mar 17 12:14:31 EST 2006

Every once in a while I have to restate my objectives, because this thing is so rich in cool new concepts and other distractions that I don't remember what I'm doing. I'm a simple guy. I like arithmetic. I like numbers. And I like to build things with them. The forth approach is good for this, because it is terribly concise to build huge things from small blocks. There is of course merit in other languages/paradigms, but for doing what I like to do (My Dear Aunt Sally), forth is simply the way.

I think there's only one important thing remaining to be solved: how to add objects to a forth in an imperative and a functional way?
As I explained above, sometimes you really need to do a lot at once (hubbing), which means you need a specialized language -- or at least specialized local words and context -- to express what you want to do. This seems to be solved by the monad concept as used in Haskell (which I still need to study a bit more), and by fluid-let (dynamic binding) in scheme.

Entry: Local functions
Date: Sun Mar 19 14:57:20 EST 2006

Was thinking about cat.. (emit-scheme '(dup swap drop)) gives

(lambda stack
  (apply (find (quote drop))
         (apply (find (quote swap))
                (apply (find (quote dup)) stack))))

'find' could be replaced by wrapping a let around the lambda, or any other word. The only problem here is that it needs to be done in a single function. Like:

( in> bla bla bla >out ((>out ....) (in> ....)) with ) :

which would give something like:

(lambda stack
  (let ((find ...))
    (apply (find (quote >out))
           (apply (find (bla)) ...))))

This might be exactly what I'm looking for. The external stack effect ( in out -- in out ) could be locally managed by the functions in> and out>. Or something like:

((word dup swap drop comma) (word comma) with-i/o)

OK, what is this about? Dynamic and static scoping. A construct like the above is definitely easier to do with dynamic scoping. Using a contextual 'in' and 'out' mechanism reminds me of inheritance: macros are subclasses of some class which has these methods defined. There seem to be two ways of solving it:
1. always passing around the current environment, which is kind of a deep change to the interpreter: words map (env . stack) -> (env . stack) instead of stack -> stack.
2. adding support for 'functional objects', with macros implemented as subclasses.

Then there is the practical question: do I really need static scoping? All non-quoted symbols are interpreted as functions in CAT, so static scoping would really only be a namespace trick. What about an interpreter modification trick:
1. if a word is found in the context on top of the stack, execute it as an object mapper: (env . stack) -> (env . stack)
2. if a word is not found in the context, execute it ducked.

Entry: object calls
Date: Sun Mar 19 19:10:26 EST 2006

OK. I got mlit/mcall and >o-scheme to implement method calls, but I'm not really sure if that solves enough of my problem. The interesting thing about scheme is the ability to implement lots of things as closures, instead of things operating on data. The functional OO (classless prototype) system mentioned above is like that. So I wonder whether it's not easier to implement objects as closures.. Anyway, now I got things like:

(x) :symbols                     # x is a symbol
1 2                              # arguments
x ((+) duck) >code push () push  # build dictionary
(x) send                         # send message
-> 3 ((x . #))

The first real problem to solve is to implement a stack this way. A stack would have methods:
- stack  # return the whole stack
- push   # push element
- pop    # pop element
- top    # return top element
- set    # set new stack

So I got two ways of doing things now: compiling a subprogram which searches in a dictionary first, and a closure based object system. Neither of those really solves my basic problem: 'with-input' dynamic scope.

Entry: State data
Date: Mon Mar 20 12:34:43 EST 2006

Currently, the data passed from call to call is just a stack. This is not enough for the level of convenience (more global state) I'd like to have. The first thing that's wrong is the fact that the data is not abstract. Being able to decide to pass more context around boils down to extending the data type.
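A quick scheme sketch of what 'abstract data' means here, assuming the state is a (context . stack) pair hidden behind accessors. The names are mine, for illustration only:

;; Hide the state representation behind accessors, so a
;; (context . stack) pair can later replace a bare stack without
;; touching any word definitions.
(define (make-state context stack) (cons context stack))
(define (state-context state) (car state))
(define (state-stack state) (cdr state))

;; lift a stack -> stack word into a state -> state word
(define (lift-stack-word fn)
  (lambda (state)
    (make-state (state-context state)
                (fn (state-stack state)))))

;; a primitive written purely against the stack...
(define my-dup (lift-stack-word (lambda (stack) (cons (car stack) stack))))

;; ...keeps working when the state grows extra context:
;; (state-stack (my-dup (make-state '((verbose . #t)) '(1 2 3))))
;; -> (1 1 2 3)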
(While still not 100% clear to me, I think this is the idea of monads, where a type can contain extra data about a computation or state.) When the stack is abstract, it can be implemented as just a stack, or as a stack + extra context.

CALL CONVENTION: all words accept a context, and return a context. This should be abstract: it can be just a stack, or a stack together with bindings. We call the context 'state'. It might be wise to keep the stack words as they are, since that gives extra information about what they actually modify, and to solve the calling convention in the defining macros. The only thing that needs to be changed is local change of control flow in primitives.scm

Entry: State data 2
Date: Mon Mar 20 18:15:51 EST 2006

It seems to be easiest to solve everything in the definition macros, pretending all primitives are just stack ops, wrapping context constructors around the stack effect, and lexically overriding call/prim/lit. This means the latter 3 are best taken to be functions. So, how to transform a (lambda stack stack) into (lambda (context . stack) (cons context . stack)), with the possibility of doing calls and taking continuations etc.? Wrapping a stack processor is trivial, but something that does

(call 'word stack) -> (call 'word (cons context . stack))

is not: such a call already returns the new context, so wrapping the whole stack op in a (cons context (exp ...)) is not an option. We need an explicit 'return'. Seriously, the only way this is ever going to work without major hacks is by treating the stack as something abstract. This teaches again another lesson I already knew but ignored anyway: never use cxr/cons to define a data structure ALL OVER THE PLACE. Write localized access methods, or macros, and use those. It's paid off before..

Entry: anaphoric macros
Date: Tue Mar 21 17:02:01 EST 2006

The prototypical example, adapted from Paul Graham's On Lisp, p. 189:

(define-macro (aif condition . rest)
  `(let ((it ,condition))
     (if it ,@rest)))

So you can do things like:

(aif big-long-calculation (foo it))

I think I can use this to get a minilanguage going for defining primitives, without explicitly referring to the stack or vm.

(prim: execute (locals (word) (prim word)))
(prim: duxecute (locals (stuff word) (prim word) (push stuff)))

Entry: badnop OO compiler
Date: Mon Mar 27 08:54:48 CEST 2006

Doing the OO thing: this is not a core feature. It is important to keep the core machine stateless, except for definitions. When objects are necessary, or local context, or local accumulation, it is possible to run another interpreter on top. This does raise the question about code representation. Maybe it's not really necessary to do this compilation to lambda thing. Some remarks:
1. I see no real reason to have dynamic binding for the core words.
2. There is a true need for dynamic binding for the methods and messages where local context is used: context change == object mutation, and to do this functionally, all those things become context maps, so nothing is really fixed.
3. To work in 'pure forth style' I really only need one context object.

Why is this so confusing?

Entry: math of joy
Date: Tue Mar 28 11:53:17 CEST 2006

http://www.latrobe.edu.au/philosophy/phimvt/joy/j02maf.html

Abstract: Joy is a functional programming language which is not based on the application of functions to arguments but on the composition of functions. This paper describes the theoretical basis of the language.
The denotation of Joy programs maps a syntactic monoid of program concatenation to a semantic monoid of function composition. Instead of lambda abstraction Joy uses program quotation, and higher order functions can be simulated by first order functions which dequote quoted programs. Constants are functions.

Is this true for cat? Sort of.

More about mathematical foundations. The semantics of Joy is a morphism between a syntactic monoid (concatenation of symbols) and a semantic monoid (composition of functions). A monoid is like a group, except that the operation is not necessarily invertible: there is a unit element, and the operation is closed and associative. These are the basic traits of a category. Actually, a monoid is essentially a category with one object. http://en.wikipedia.org/wiki/Monoid

Entry: depressed but ready
Date: Fri Mar 31 09:26:02 CEST 2006

Damn. Programming is hard. I start to see the way to go with pf, badnop and the whole collection of brood VMs, but there's so much I need to tackle to actually get there. Let's see what the real problems are:
* I/O in CAT (badnop compiler VS. state <-> late binding / dynamic scope)
* I/O in PF (pre-emptible parser + select based mainloop)

Entry: dynamic scope again
Date: Mon Apr 3 09:49:54 CEST 2006

I am on the verge of just using set! to implement i/o for the compiler. Yes, I need to remind myself why I can't do that: to be able to do backtracking on input and output. That's a very practical reason. So there's not much left. The whole thing requires a separate explicit interpreter. So why not make the whole language like that? More forth like, but without true assignments (all previous states are accessible if you observe them..)

Entry: pith
Date: Fri Apr 14 10:12:55 CEST 2006

The pure functional state machine seems to be working. The dictionary is passed along. There is complete emulation of @ and !, without using assignments. This enables efficient forking of the whole VM. Since token replacement always puts the token on the top, shared constant code will not be copied.

Entry: interpretation
Date: Fri Apr 14 10:16:21 CEST 2006

Since the above pith mechanism really requires dynamic binding, it might be easier to just use plain old interpretation, instead of wrapping everything in lambda expressions. This should in principle be implementable by replacing 'compile' and 'execute'. Actually, it's simpler: just replacing the function 'prim' did the trick. This does change the meaning of prim from 'execute primitive' to 'execute code segment', meaning execute and run have the same semantics here. I wanted to make it load faster, but hat is just as slow as pith. Maybe it's just the guile bytecode compiler? So, in short: hat has an explicit interpreter. This moves the 'interpret' (code/data) distinction and the high level code / primitive distinction (apply/run) to run time. I'm not sure if it's actually so useful, if it doesn't speed things up.

Entry: templates
Date: Fri Apr 14 14:18:52 CEST 2006

Been thinking.. In lisp I use quite a lot of templates. And in the end, (lisp) functions are really templates for transforming arguments into something else.. One pain of forth metaprogramming by explicit list construction is the absence of templates. It might be interesting to add a primitive to do this kind of stuff, to ease metaprogramming CAT. Such a primitive would take a template list together with the values to substitute into it. This works perfectly. A very powerful construct for metaprogramming. Of course, this is just lambda abstraction in a language which doesn't name parameters.
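In scheme terms this is roughly what quasiquote does. A sketch of what such a template primitive could look like, with placeholder symbols standing in for the substituted values -- the syntax here is my own illustration, not the actual CAT construct:

;; Walk the template, replacing placeholder symbols by their values.
(define (instantiate template bindings)
  (cond
   ((pair? template)
    (cons (instantiate (car template) bindings)
          (instantiate (cdr template) bindings)))
   ((assq template bindings) => cdr)
   (else template)))

;; build a piece of forth code from a template:
(instantiate '(dup ?n + ?reg !)
             '((?n . 42) (?reg . counter)))
;; -> (dup 42 + counter !)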
Entry: poke/badnop and linear lisp
Date: Fri Apr 14 18:18:17 EST 2006

The next thing to tackle is to write poke as a target for badnop, and to start working on the skeleton of the lower level packet forth replacement. How to proceed? Badnop is a native code compiler. It requires 3 primitive instructions: jump/call/prim. Other instructions can be incorporated in the form of macros. What is what in badnop/poke? VM instructions are either:
* C code (prim)
* execution tokens

Primitives are macros that compile code addresses. Their midlevel representation could be (prim "c_function").

Entry: literature
Date: Sat Apr 15 01:59:18 CEST 2006

From http://www.latrobe.edu.au/philosophy/phimvt/joy/j00rat.html

  "However, Raoult and Sethi (1983) propose a purely functional version
  for a language of unary stack functions. The stack has an ever-present
  extra element which is conceptually on top of the stack, a state
  consisting of variable-value pairs. Most activity of the stack occurs
  below the state; only the purely functional fetch X and assign Y
  perform a read or write from the state, and they perform a push or a
  pop on the remainder of the stack. The authors also propose uses of
  continuations for such a language."

That's PITH.

J.-C. Raoult and R. Sethi. Properties of a notation for combining functions. J. Assoc. for Computing Machinery, 30:595, 1983.

Entry: badnop and poke
Date: Sat Apr 15 10:50:55 CEST 2006

So.. The thing to get going here is the tethering, and then work from the ground up. We need basic support for loading eggs. This works now. The other thing is to clarify the target system. Basically, there are 2:
(1) C -> C functions
(2) C functions -> forth

The second system is a basic non-optimizing simple forth. It's done, save for the trivial assembler that translates lit, jump, call, prim into binary. The first system is much more interesting, because it will largely consist of macros. The PIC target will be similar: peephole optimization for most direct functions, with a trivial assembler. So maybe it's best to go this route:
- write the assembler for poke, translating badnop's list rep -> binary, finishing (2)
- write a new target for assembling the primitives

Entry: prototype programming
Date: Sun Apr 16 00:55:30 CEST 2006

Since I have dictionaries and dynamic binding and everything, prototype based programming is easily introduced. An object is just a dictionary (association list). Sending a message could have a simple syntax like

(obj selector) @

since I'm not using deref. Defining a method could go like

(obj selector code code code) method

Making clones is then just

obj (obj2 obj3 ...) set

One problem though: there's no self pointer.

Entry: quines and generators
Date: Sun Apr 16 11:24:56 CEST 2006

Been trying to write quines in pf. The same is doable in CAT, which is the same as Joy. The simplest quine (to 'run') is

((dup cons) dup cons)

Now, Manfred has a page on using this kind of structure for lazy lists, or generators. I was trying that today to implement the simplest of such examples: a counter. http://www.latrobe.edu.au/philosophy/phimvt/joy/j05cmp.html

Writing a counter functionally, the code must do two things:
- return the value
- define a new counter bound to the new state

Modifying the quine above to copy and split off a payload on each iteration gives something like this:

((0 swap dup cons) 0 swap dup cons)

Now the 'dup' just needs to be replaced by something which updates the top copy -- the one that will be quoted.
Let's change this to (x swap dup cons):

(x swap dup cons) :
((0 x) 0 x)

To make updates, change this to (update 1 +):

(x over update swap dup cons cons) :
(0 (x) x)

Putting some more code in a word gives:

(y (over) duck run swap dup cons cons) :
(x (update) y) :
(0 (x) x)

So you can create a new generator like this:

(y (over) duck run swap dup cons cons) :
(generator (y) cons dup cons cons) :
0 (1 +) generator

Entry: interpret vs eval
Date: Wed Apr 19 18:39:07 CEST 2006

To come back once more to interpret/compile: on the one hand it is nice to translate everything to scheme, but it does require runtime eval. If I ever want to switch to another implementation (e.g. chicken), it might be better to drop that compilation step and work fully interpreted. Especially since I'm dealing with late binding. In any case, this is purely an implementation matter for now; probably best to postpone solving the problem until it actually poses itself. A practical advantage is more introspection: tricks like the ones above are possible when code is just data and can be changed at runtime. Serialization also becomes simpler.

Entry: old school 1200/2400 FSK
Date: Wed Apr 19 19:05:03 CEST 2006

I'm in need of a protocol to send power & data to chips, for very simple control-only data. The 14bit synth does something like that, but it's not very stable or standard or simple, due to its 3-phase nature. I've been thinking about Frequency Shift Keying, the protocol used for old audio tape data storage, but then the raw MSX version. It's sexy and old school and doesn't sound too bad either :)

http://www.terra.es/personal7/miguel.colom/msx/codif_en.htm

Entry: code transformation
Date: Wed Apr 19 19:47:28 CEST 2006

It might be interesting to add a feature to:
1. read a source file, with comments, into a tree structure
2. modify the source file
3. write it out again

This would facilitate refactoring tremendously, and would probably allow me to rewrite badnop code into cat code, since I do have a forth parser.

Entry: badnop - getting organized
Date: Fri Apr 21 12:36:11 CEST 2006

Best to disentangle all namespaces:
* constant: constants
* target: target code flash addresses
* variable: target data ram addresses
* monitor: target boot monitor
* macro: target language macros
* assembler: target assembler

Entry: compiler/assembler structure
Date: Sat Apr 22 14:02:20 CEST 2006

Badnop is currently structured as:

       P             B
 forth --> assembler <-> binary

The first map is a projection from forth code to assembler code (both represented as lists). The second is a bijection between symbolic assembler and binary machine code. Some remarks about this:

* B makes the assembler/disassembler reversible. This is a good idea, since we can import binary code.
* The assembler can be in polish notation, instead of RPN as in old badnop. This is a good thing, since the symbolic assembly notation can be made the same as (or very close to) the standard assembly language notation.
* P contains peephole optimizations. It is worth finding out whether we can write these optimizations on the forth level, to get to something like:

       P                   B             B
 forth --> optimized forth <-> assembler <-> binary

Probably this is overkill, since the assembler can be translated into some pseudoforth, which will assemble back to the assembler code. So there's no need to really get rid of the assembler. But it should be possible to do a lot more dirty tricks by writing the forth optimizer on the forth level instead of as straight-line peephole optimization.
The functions themselves are implemented as stream processors, since there's no direct 1-1 map from opcodes to words (some words are 2 instructions), and definitely not from forth to assembler.

Entry: compilation & speedup
Date: Mon Apr 24 13:24:41 CEST 2006

CAT is really, really slow. So it might be a good idea to add some form of early binding: for example, compile a single word to a primitive (using the forth->lambda translators), but keep some reserved words dynamic. These can be specified in a list.. So it would be sort of a manual compile operation. And the reader (parser) is terribly slow too. I don't really understand why.

Entry: watching videos
Date: Wed Apr 26 00:49:14 CEST 2006

Ralph William Gosper, Jr, through Guy Steele, Jr: "What is a data structure? A data structure is a very stupid programming language."

Entry: assembler labels
Date: Thu Apr 27 20:06:21 CEST 2006

Right now I have a bijective translation between machine code and assembler code using absolute addresses. I wonder whether I should make the disassembly use relative, absolute or symbolic addresses. For sure, symbolic addresses are the most interesting. Address resolution is a fairly low level thing, and really belongs at the machine code level, not the assembly level. This would enable editing of machine code (grabbed from C code for example), which would be a very good thing. Anyway, not a trivial problem.. Badnop, the assembler on steroids :)

cat> 0xFBAD disassemble>asm
cat> .asm
nopf 0BAD
cat>

So I'm in easter egg mode:

cat> 0xFBAD disassemble>asm
cat> .asm
badnop
cat>

har har har gasp har har har

Entry: better
Date: Thu Apr 27 20:52:53 CEST 2006

Maybe it wasn't such a bad idea after all to port badnop to CAT. I'm starting to like cat, though there are some problems I need to talk about later.. It's harder than I thought, mainly because old badnop has a lot of knowledge encoded in the form of optimized forth hacks, which is hard to read. I'm moving away from that. Anyways, I start to see new patterns that are interesting to explore. Symbolic assembler snippets: dissect an image, chunk it into functions, transform to symbolic representation and re-assemble. In short: better data structures and more 1-1 maps, which make life a lot easier and the code a whole lot more malleable.

I really like CAT too. The way it's possible to just 'set code' makes programming a lot of fun. No red tape, just set behaviour. This way of programming is so different from what I'm used to. I'm curious about how to optimize it though. But that's something for later..

There's one big problem though: I really need name spaces. Currently I can work forth-style, but I already ran into several occasions where I forgot names, and accidentally redefined an already used term. This is not the same as forth, where these kinds of problems only arise for interface words, due to the early binding. I'm already doing a lot of namespace stuff with the bindings for the assembler, disassembler and compiler, but I probably need to make it more explicit. Funny how I walk the path of

stack -> (dict . stack) -> (namespace . (dict . stack))

Next to name spaces (or using name spaces), a mechanism to connect processes should be installed.

Entry: non-programmables
Date: Fri Apr 28 05:02:56 CEST 2006

The question is really: what about the non-self-programmable chips? Should I support them, or should I adapt the WISP to program them? They are fairly easy to program.
I could forget about the plan of writing programmers for the ones with self-programmable eeproms, since a bootloader is really sufficient for those.

Entry: getting it to work
Date: Sun Apr 30 21:15:39 CEST 2006

I got CAT running in cygwin, and the MPLAB ICD2 seems to cooperate. Everything is ready for the final leap: compiling the monitor. A lot of things need to be fixed before I can do that:
* forth file loading
* inline assembler
* macro definition
* assembler address resolution
* symbolic assembler labels
* 2constant
* if/then
* for/next
* begin until
* long jumps

Entry: quoting and syntax
Date: Mon May 1 12:39:53 CEST 2006

COMPILE TIME vs RUN TIME

CAT has no compile mode. It is completely dynamic. This makes life a lot simpler. BADNOP (target forth) has no interpret mode. This doesn't make sense anyway, since it's not its own meta language. This is a very important choice, central to the design of the whole system. Since the two languages are not the same, they can be optimized for what they actually need to do.

[Edited. It turns out that the target _does_ have an interpret mode: it was too easy to add the word 'trun', which takes a list and interprets each atom. Numbers are loaded on the stack, and symbols are resolved to addresses and executed. However, this is not a special mode of the monitor. In addition to ordinary runtime compilation, there is again a 'scratch' command, which takes a list of forth code, _compiles_ it, uploads it and then runs it. These are implemented as just words operating on lists in cat. However, the distinction between macros and nested words is a real one which cannot be ignored by the programmer.]

The compiler doesn't need to be fast, but it needs to be very expressive. CAT is an (almost pure) functional lazy language. The term 'lazy' is not used entirely properly here, since CAT is not based on lambda abstractions and applications (eval/apply). What I mean is that the binding of symbols and lists of symbols to executable code happens as late as possible.

The target language needs to be fast and resource friendly. The relation to machine code should be very transparent. BADNOP's target forths are therefore real forths: they bind as early as possible, and avoid high level constructs which depend on higher level data structures.

The interplay of the two systems is what makes this system exceptional. You are bound to run into problems that are hard to express in the target language because of its static nature. However, in such cases it is usually not so hard to manually divide the solution in two parts: one that can be solved at compile time, and the remainder that needs to be done at run time. The compile time operation usually amounts to code and data structure generation, or higher order processes like code generator generators. In general, this decision cannot be made by a compiler. Rephrased: it is quite hard to make a general purpose compiler that is as good as a human at solving very constrained resource allocation problems. But for some things special purpose compilers can be generated, and that's exactly what CAT is for. Having a flexible dynamic language at hand to express these ideas in is a big plus. Using CAT you can do it either in CAT or Scheme or anything else that can use lisp style data structures.

SYNTAX

The target forth is compile only, which brings it closer to the semantics of an assembler, where ';' just means return and ':' just defines a code label. This is close to the semantics of colorForth.
CAT has no compile mode, so it is a trivial syntax choice to plug CAT into the target forth's [ ... ]

Now, there is one huge point of possible confusion: the macros implementing the target forth are all implemented in CAT. The target forth's syntax is chosen to be as close as possible to traditional forth. The big difference between CAT and forth is the quoting mechanism.

In CAT, *all* symbols encountered in the input stream are interpreted as functions. The only quoting mechanism is the list (a b c). Lists are treated as data, but can be interpreted as code using 'run' and derivatives. This is the basic machinery for function abstraction and delayed evaluation. So, in order to change the default semantics of a symbol, the only option is to shield it from interpretation by placing it in a list, and to explicitly operate on that list.

In BADNOP Forth, *most* symbols are compiled to function calls. Some have a different meaning. There are no lists. The main quoting mechanism to change the default compilation semantics is 'immediate parsing words'. These are functions that run at compile time, and can modify the meaning of the words following in the data stream.

An example to illustrate both quoting mechanisms. Declaring a constant with a value computed at compile time could be done in two ways:

[ compute-constant ]L constant MAGIC
[ compute-constant (MAGIC) constant ]

In the first approach, 'constant' is a Forth parsing word, altering the meaning of 'MAGIC', while in the second approach, 'constant' is a CAT word which uses a list containing the symbol 'MAGIC'. Basically, this is where CAT & BADNOP meet. The basic syntactic choices to be made are all centered around which quoting method to use. When should I respect the standard forth approach as much as possible? When should I deviate from the standard? For some things, CAT can not be hidden. So the choice is really about hiding the dynamics of CAT in the forth source. At one extreme, deviating from the standard too much by bleeding CAT through to the level of the target language's syntax makes things really confusing. Compare these:

(1 high) (0 low) ifte
if 1 high else 1 low then

By clearly defining 'if', 'else' and 'then' as compilation macros, the second line has very clear semantics: at compile time it will be transformed into a set of conditional jumps with the remaining code compiled. In the first line, the code can be dynamic. It might be that the quoted stuff is not available at compile time yet; we really don't know. Whether the quoted code can be bound early depends on what's actually done with it. This requires a smart compiler, which is a bit more than I can do at the moment.

The example above shows why I really need forth's explicit early binding semantics in the target language. Because of the importance of target resource efficiency, binding should be under the control of the programmer, and default binding should be immediate. So the obvious road to follow is to stay as close to forth as possible. Which road to follow in the 'constant' example is not always very clear, but I guess it will become clearer in the future when special cases arise.

Entry: faster CAT
Date: Tue May 2 15:38:54 CEST 2006

* VM: store immediate context (dict, stack, continuation) in a vector
* compiled code can be stored in vectors as well.. the question is whether the symbols referred to ever change binding..

Let's first try manual binding.
Given a word, compile it to a scheme function, resolving all occurrences of symbols down to their respective primitives, except for a given set of late binding words. But until I have a proper profiler there's really no point.. Just running a trace shows that read-file is already pretty slow. Maybe writing CAT in POKE will be a nice exercise after all. Explicit early vs late binding trade-offs.

Entry: asm deque notation
Date: Wed May 3 12:46:51 CEST 2006

A note about notation: I use >x and x> as the input/output operations for both queues and stacks. This is NOT the same as the notation used for deques (see cat/library.cat), where >x and x< access the tail stack. >x and x> are both standard forth notation for stacks, and I like it for general input/output operations, like queues, for its visual qualities.

BUT for the assembler, I make an exception: the buffer is not a queue, but an input-limited deque: it is not possible to write to the reading end (assembler input), but it is possible to read from the tail (disassembler or compiler output). This represents the 'undo compilation' operation, or a way to share the assembly buffer by using it as a stack. The notation becomes:

>asm   write to the end of the asm buffer
asm>   read from the start of the asm buffer
asm<   read from the end of the asm buffer

This is implemented as a reversed deque. The most common operations are >asm and asm>, which are standard notation, plus the exception asm<. Deques are represented as a (HEAD TAIL) pair. <deque and >deque will read and write the head, and deque> and deque< will read and write the tail. There's still one degree of freedom left, which is what you call head or tail..

Entry: machine code decoder
Date: Wed May 3 15:08:39 CEST 2006

Currently the disassembler is not very efficient. Instead of incremental search, I could take advantage of the fact that for risc architectures, the decoder is a tree. Pseudo ops need to be solved somewhere else then. A decoder tree for the pic18f instruction set. The structure of a tree is

tree == (maskbits (result tree) ...) / opcode-list

where an opcode-list enumerates opcodes starting from the last result, counting up by 1.

The printing word is now called 'print'. Dot is reserved for pair notation ( . ). Added a shortcut 'p' for print, like the unix 'dc' command.

Entry: debugging by printf
Date: Sun May 7 18:28:35 CEST 2006

Or really, printing traces. Still the best way :) The easiest way to do so in CAT is to add hooks to certain points in the code. Currently I have hooks for all module inputs: compiler, assembler, disassembler, image buffer. Still need to find some proper way to give feedback about _where_ an error occurs.

Entry: lazy evaluation
Date: Mon May 8 00:31:12 CEST 2006

Watching lecture 1b of http://swiss.csail.mit.edu/classes/6.001/abelson-sussman-lectures/ and I think I'm starting to get it. Lisp macros are really nothing but a way of working around strict evaluation, i.e. to have local non-strictness.

Entry: interactive monitor
Date: Mon May 8 17:01:17 CEST 2006

So, the monitor is working. The resulting code is exactly the same as the original, so I think I did a good job porting. The next question is how to manage state. I'm probably going to extend the monitor with a set of compiled macros for basic interaction. Time for a bit of reorganization, because now I'm at a point where complexity can explode.
The whole application should have convenience maximized for this kind of development:

DIY:
* create a bootloader for a certain architecture (chip + dev interface)
* burn it to the chip using a 3rd party tool

PREFAB:
* buy a pic chip (usbpicstamp) with a built-in badnop bootloader

From that point on, it should be possible to add new code, without having to erase anything, one thing building on top of the other. Also, it should be possible to save the forth image, so one can continue working with all the data and meta data defined. Some questions/remarks:
* Why do I have token-map and labels?
* assembly and disassembly don't need to use the same interface (here)
* assembly buffer is transient?

[Edit. All of those are fixed. Incremental code upload is working now, and saving is implemented, together with version control using darcs.]

Entry: linear vs. functional
Date: Tue May 9 19:12:04 CEST 2006

Did I actually learn something re-implementing badnop?

(a) Flat lists as dictionaries are REALLY slow
(b) Higher level functions are very convenient to work with
(c) Ticketing: save context, try, backtrack

I already knew (a). I recall doing some profiling on old badnop, and I found out more than 99% of the time was spent traversing the symbol table. In order to speed up CAT/Badnop that's the place to start looking, since everything is symbolic.

I am amazed by (b). It's really another world when you don't have to care about binding times, and can juggle code on the fly, taking it apart and assembling it again. This works really, really well.

Although I haven't used (c) to its full extent, since I'm only using the linear (stack based) construct, I can already see it is a blessing. I suppose proper continuations would make it even easier in some spots, but I'm not sure how yet. One place would be to automate 'undo' for literal and other operations, which means that at the point where a literal is to be compiled, it is either compiled, or passed to another function. This I need to think about more. Using (c) requires functional data structures though. So, for the purpose of writing an assembler/compiler:
-> symbol tables should be faster (hashes?)
-> data buffers (stacks/queues) should remain functional

Entry: backtracking
Date: Tue May 9 20:14:38 CEST 2006

There are a lot of beautiful possibilities for simplifying the code using non-deterministic programming. Examples:
* for..next: use a dup / drop .. drop around
* literal compilation: either compile, or pass to another macro

The interesting thing here is that most choice points can be resolved rather soon in the compilation. Instead of using pure continuations, only the data state could be saved, as in the sketch below.

Entry: pseudo state
Date: Tue May 9 20:16:13 CEST 2006

I have 'functional' state, which enables functional 'undo/fork' tricks, but for some constructs state can get in the way. Let's see what I use it for:
* name/value bindings
* having 'current' things, like the current state of the compilation target
* caching results of compilation, assembly, disassembly...

The last ones are 'current' things too, but they serve only the implementation and are not very useful from the user's point of view, so their extent should be kept minimal.

[Edited. I think this is mostly fixed. The only kept state is the state associated with the target machine: dictionary, macros and compilation point. The image could even be removed, since it's present in the target anyway.]
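A sketch of the choice point idea mentioned above: because all state is a pure value, 'saving the context' is free -- just keep the old binding around. The word names here are illustrative, not brood's actual API.

;; A step maps state -> new state, or #f to signal failure. On
;; failure, the untouched old state simply goes to the alternative.
(define (try-else step alternative state)
  (or (step state) (alternative state)))

;; Toy example in the spirit of the literal-compilation choice:
;; fold a pending literal into an add instruction if there is one,
;; otherwise emit a plain call.
(define (fold-literal-add state)      ; state is the asm output list
  (and (pair? state) (number? (car state))
       (cons (list 'addlw (car state)) (cdr state))))
(define (call-add state) (cons '(call add) state))

;; (try-else fold-literal-add call-add '(5 (movlw 1)))
;; -> ((addlw 5) (movlw 1))
;; (try-else fold-literal-add call-add '((movlw 1)))
;; -> ((call add) (movlw 1))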
Entry: pattern matching
Date: Wed May 10 00:23:36 CEST 2006

I was wondering whether writing the optimizer as a pattern matcher isn't a better idea, since the rules are really a lot easier to express that way, and probably easier to compile too. What I mean is this: I have a lot of comments like

# (dup drop)    == ()  |  (x drop) == (x drop)
# (drop save)   == ()  |  (x save) -> (x save)
# (save drop)   == ()
# (save unsave) == ()
# (fn ;)        == (asm (jump bra fn))
# (number ;)    == (asm (retlw number))
# (x ;)         == (x asm (return))
# (0x00 reg !)  == (asm (clrf reg 0))
# (0xff reg !)  == (asm (setf reg 0))
# (x reg !)     == (x asm (sta reg) save)
# (dup x reg !) == (x asm (sta reg))
# (number +)    == (asm (addlw number))

This should really be code, not comments. But it's probably not so easy to do without giving up the serial execution semantics of the compiler, as used by the for..next and begin..until macros.

(a) Can it be done to write only a part of the optimizations in this form?
(b) Can this be generalized to a compiler for a proper concatenative language which uses early binding?

I've been musing before about constructs like '(a b c) if', but decided not to go for it because of late binding problems. However, having a syntax like that and using it for late binding are independent things.. The thing I want to get at is how to write an optimizer based on just rewrite rules. This can probably be dragged quite far when the language is really concatenative. Forth is not, completely. What about compiling forth from right to left, since most optimizations really do 'eat their arguments'? What I really should do is to not compile to assembly language, but to something which looks more like forth, but where each word is one instruction. The nasty thing is, of course, how to handle multiple argument instructions. Food for thought..

Entry: cleanups
Date: Wed May 10 01:08:37 CEST 2006

* getting rid of the labels/tokens distinction?
* reloading the host application no longer overwrites data state
* state saving / image saving

The last one requires some thought. There needs to be a distinction between host and target code, so they can evolve together. The main question is how to make them merge properly.

Entry: random number generators
Date: Fri May 12 16:04:25 CEST 2006

I finally got it :) The easiest way to get a decent random number generator is to use a Galois field GF(2^N), consisting of polynomials with coefficients in GF(2), which is the set of elements {0,1} together with addition (XOR) and multiplication (AND). The polynomials are taken modulo m(x), an irreducible polynomial of degree N. There is more than one such polynomial. The algebraic structures corresponding to each such polynomial are all isomorphic -- there is only a single field GF(2^N). However, they are represented differently. For use as a random number generator, the easiest approach is to take an m(x) such that x will generate all the nonzero polynomials. Such an m(x) is called primitive.

Multiplication by x corresponds to a left shift. Suppose s(x) is the polynomial representing the current state of the generator; then there are 2 cases.

* The result xs(x) is of degree less than N. In this case it does not require a reduction.
* The result xs(x) is of degree N. In this case, it is possible to obtain a polynomial of degree less than N by adding m(x), since both have a monomial x^N which cancels out after addition.

This leads to a very simple algorithm.
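Jumping ahead to the N=8 case worked out in the next paragraph, a sketch in scheme (logxor is guile's bitwise xor; the constant 0x11D is the primitive polynomial derived below):

;; multiply the state by x: shift left, and if a 9th bit appears,
;; reduce modulo m(x) by XORing with 0x11D.
(define (gf8-next s)
  (let ((s2 (* s 2)))
    (if (>= s2 256)
        (logxor s2 #x11D)
        s2)))

;; Since m(x) is primitive, any nonzero seed cycles through all 255
;; nonzero field elements:
;; (gf8-next 1)    -> 2
;; (gf8-next #x80) -> #x1D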
For N=8, a primitive polynomial is m(x) = x^8 + x^4 + x^3 + x^2 + 1, which has the representation 0x11D, obtained by identifying each bit with the corresponding coefficient of m(x). After left shifting the state (multiplying by x), the result is either 0?? or 1??. Only the latter has to be reduced by adding m(x), which means performing an XOR with 0x11D. The high bit always results in zero, so the algorithm becomes:

- shift s left
- carry? -> s = s XOR 0x1D

or, in assembly, "clc rot ...

a process b process c process

Now, this makes sense, but the 'map' one does not. Without high level aggregates these constructs do not make sense. So, if the idea is to run this stuff on a microcontroller, it at least needs lists, or vectors, as built in types, so they can be loaded on the stack. In plain forth it is still possible to do

buffer for-each process

where for-each is the parsing word that compiles iterative calls (and that knows how to stop, so 'buffer' somehow returns a delimited object).

So, I think I have to conclude that, yes, it is possible to do this automatically, like a smart 'ifte' which compiles if possible. But doing this without explicit syntax for binding time might be confusing. In the end, that's why PF's { } is not the same as (). So this requires higher level data structures (lists/vectors). For true low level languages it might be better to have special purpose data structures (delimited arrays) and define high level functions/macros for them. Stick with forth. Which means I have 3 layers now:

* CAT (dtyped, symbolic, late, functional, GC)
* PF = FORTH + LISTS (dtyped, compiled/symbolic, early/late, imperative, reuse lists/trees)
* FORTH (untyped, compiled, early, imperative, vectors)

The middle one can be seen as a sort of transition layer. Both CAT and PF can be written in a lower level forth. I'm not really sure whether it is interesting to write CAT in PF.

Entry: roadmap
Date: Sun May 14 17:45:53 CEST 2006

So, what do I need to do next? Give things a name.

* CAT = functional concatenative language
* BADNOP = compiler
* POKE = in-image forth + linear lisp / packetforth experiment; contains the forth->C compiler (the compiler for the poke VM is trivial)
* PURRR18 = 8bit forth for 18f pic + crapsynth / sheepsint
* PURRR30 = 16bit forth for 30f pic

Brings me to things to do:
- move to the 18F1220 as base processor
- write the synth for the 18F1220
- start working on the 30F architecture
- start working on POKE + C macros

Entry: pic 18f chips
Date: Sun May 14 22:28:03 CEST 2006

discontinued:           252  452
replaced by, new code:  2520 4520
small, orig code:       1220 1320
new code, usb:          2550 4550

       usb  ins+  intosc
1220   -    -     +
2520   -    -     -
2550   +    +     -

Entry: some old notes
Date: Wed May 17 01:22:59 CEST 2006

Going through some old notes, and trying to distill the basic ideas from them. Some questions:

* How to efficiently implement type multimethods vs. typeless systems?
* Can fast-reuse cache optimization and general GC be combined?
* Functional versus imperative.
* GC lists vs. LINEAR lists vs. STATIC ARRAYS
* Static process core + dynamic config language.
* Synchronization for parallel processing.
* How to represent grids: where to move from arrays to lists?
* Grid combinator control flow (transpose symmetry and APL?)

The answer to the GC question is probably no: not explicitly freeing as soon as possible is in conflict with automatic (non refcount based) GC. Unless we use the ideas from linear lisp, functional programming seems to require general GC. There are some ideas about 'observation of data' which are really about fanout and data ownership / linear lisp.
Most of the ideas about static process and dynamic setup/compile are stable in my head now. However, the quest about data structures does not talk about the approach where a program has two phases: the first one is effectively lisp with GC, while the final stage runs on static data structures. This is mostly the idea behind CAT/PURRR, only there is a host/target distinction at the point where the dynamic/static distinction is made. Moving this toward the target might prove beneficial. (I briefly looked at a paper about this but lost the pointer..) Essentially this is again 'the interpreted compiler' pattern, where you have a slow flexible language to express compilation of rigid but fast specialized code.

The next set of questions is about dataflow optimization. This is something I should not attempt before educating myself a bit more on the state of the art in compiler technology.

Entry: monads
Date: Sun May 21 09:59:48 EDT 2006

This is killing me :) I probably need just a couple of hours of concentration to get it, and then hopefully rewrite the whole of CAT to incorporate the ideas.

Entry: dynamic scope
Date: Mon May 22 17:58:30 EDT 2006

So, I really should call dynamic scoping dynamic scoping, meaning reserving some syntax for it, like 'with-dynamic' or just 'with'. In general block based languages, lexical scoping is preferable. However, CAT is not block based, so there is no such thing as lexical enclosing, only run-time enclosing (function nesting). This means dynamic scoping is the only way to have 'local' data, where local here means 'alive for a certain function nesting depth'. Using combinators effectively amounts to the same thing: data stored on the parameter stack (or the return stack if it is accessible) in the form of an implicit 'zeroth' item. Again:
- no lexical scoping is possible, due to the language paradigm
- dynamic scoping is the only alternative left

I do wonder how to modify the language in such a way that lexical scoping does become possible. It seems to me that the OO approach is simpler here than the lambda/closure approach. I will probably some day need a name space mechanism for purely practical purposes. Making it hierarchical makes it an object/module system. Need to think about this a bit more.

Entry: chip state
Date: Wed May 24 17:25:26 EDT 2006

Need to have a way to automatically save the entire chip state for a single project. Now.. do I let this depend on
-- a project id present in the bootloader
-- a unique chip id

What about this:
1. a standard way to create a bootloader, given chip type and osc freq
2. this creates monitor.hex and a state file in a certain directory

So, a state dir, containing subdirs per project ID. The ID is present in the bootloader, so CAT can ping the chip, get the project ID, and load the corresponding state.

Continuing on this subject. Currently there are two 'home directories' for Brood. One is the system directory, which is used by 'brood-resolve' and 'use', and is taken from the environment variable BROOD_HOME, which is set automatically if not present. The other is the user state directory, which is by default $HOME/.brood, and which contains state associated with projects. The purrr projects are named like (purrr sheepsint), where the state file loaded by '(purrr sheepsint) restore' is ~/.brood/purrr/sheepsint/state.cat

So, how to initialize a monitor?

Entry: mode changes
Date: Mon May 29 13:31:51 EDT 2006

Syntax question. I already have 'asm (ins) (ins) ...' which is self-terminating.
Why not use this for other things, like 'vectors'? Maybe it's best to make mode changes a bit more explicit, since '[' and 'macro' are already implemented as such. Is it possible to share code here? Hmm... maybe this is a bit too much nitpicking..

Entry: name spaces
Date: Tue May 30 10:27:40 EDT 2006

This needs to be thought out a bit better. Since the dictionary is actually a set, name collisions are more frequent, so namespaces are a necessity. Look at this piece of code in brood/comp.cat

    # save only the compiler state
    (app-variables
      ( image         # binary code
        labels        # symbol -> address map
        target-words  # list of 'true' target words
        macro-dict    # compiler macros
        constant-map  # constants & variable names (ram addresses)
        forth-log     # log of entered forth code
        project       # project meta data
        prog-word     # code flash compilation point
        byte          # data ram byte allocation
        bit bucket    # data ram bit allocation
        deferred      # data ram deferred word vectors ))

    : (print-app app-variables (() cons print-var) for-each) : (save-app (print-app) with-output-to-file) :

Ugly innit? This really has to change. All this should go into the 'project' namespace. Things to do:

* get/set for hierarchical dictionaries.
* code/data distinction: be clean from the start
* interpreter: dictionary stack

GET  var dict find
SET  (var) set (dict) add

So, simple 'word' semantics doesn't change. However, it might be interesting to use explicit fetch/store. Or just to have both..

    (dup swap drop) (nop) !

which is really

    (dup swap drop) (nop) dict add-recursive

Alright. Everything is packed away in the (project) namespace. Much better.

Entry: todo
Date: Tue May 30 19:01:24 EDT 2006

* erase project (revert to clean monitor)
* boot hook (incremental)

Entry: flash programming
Date: Wed May 31 13:04:49 EDT 2006

One lesson i learned: this chip, though it has its charms, is a bitch to program! Using only flash, without abstracting it as ram, is quite something.. Maybe it is better to abstract it as ram anyway. There's a trick you can use with flash programming. There are two operations, which both work in blocks: erase (x->1) and program (x->0). For the 18f chips, erase goes in blocks of 64 bytes, while program goes in blocks of 8 bytes. It is possible to overwrite an already programmed word, but this will result in the already present word ANDed with the word to be programmed. Since both 0xFFFF and 0x0000 are nop instructions, this can be used to advantage: emulating ram does not require an erase-block operation every time. This seems to work.. Now about the boot monitor. What if I use the reset pin as a trigger to enable the monitor or not: high means run program, low means execute monitor.

Entry: icd2 linux
Date: Thu Jun 1 17:40:52 EDT 2006

Seems to work using icd2prog. I modded it a bit so the config works for the chips i'm using. Might be interesting to snarf some functionality to talk directly with the icd2 from within CAT.

Entry: time for reflection
Date: Fri Jun 2 10:46:11 EDT 2006

Pun intended. It's crazy how fast you start wanting to rebuild the language when starting an application. For the synth, i am in need of deferred words, like doer..make or defer..is Both of these constructs seem a bit hard to use for the simple things i'm doing: choosing different control/synth engines at runtime. The 12f synth in the previous badnop had a relatively simple design: it's an interpreter with a standard tree-structured decoder, where the last phase of decoding is a jump into a table.
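In C terms the decoder could look like the hedged sketch below (the opcodes and the literal/opcode split are made up; the real thing is forth on a 12f): a test or two walks down the decoding tree, and the leaf dispatch is a jump into a table.

    #include <stdint.h>

    typedef void (*op)(void);

    extern void push_literal(uint8_t n);   /* assumed to exist elsewhere */
    static void op_rest(void)  { /* ... */ }
    static void op_note(void)  { /* ... */ }
    static void op_noise(void) { /* ... */ }
    static void op_slide(void) { /* ... */ }

    /* the last phase of decoding: a jump into a table */
    static op const optable[4] = { op_rest, op_note, op_noise, op_slide };

    static void decode(uint8_t b) {
        if (b & 0x80)                  /* first branch of the tree: literal? */
            push_literal(b & 0x7f);
        else
            optable[b & 0x03]();       /* leaf: table jump */
    }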
The only disadvantage here is the fact that i'm using flash memory, so i can't just start overwriting things. And the other thing: one-word branch ops are 11 bit, meaning they reach +- 1<<10 words.

Entry: teaching
Date: Sun Jun 4 10:01:41 EDT 2006

The thing i want to teach in the sheepsint tutorial is: an application is an interpreter and a language. So, what i really need are tools to make this kind of approach a bit simpler.

Entry: flash
Date: Tue Jun 6 10:27:39 EDT 2006

Deferred words with addresses in ram are not a very nice way of doing things. They are clumsy to implement and are not persistent, so they need initialization from constants stored in flash. Why not store everything in flash in the first place? Using byte-based jump tables instead of word-based deferred words makes life easier. So, it's easier to go this way: support jump tables explicitly. They come in sizes of 2^n, and can be muted only from the monitor prompt, one table at a time. The operation that does so will erase an entire block, and then reprogram it. It's better to do something more general: redefine / overwrite a word, since that allows arbitrary hacks that keep bindings.

    (name code ...) overwrite
    -> get address of name, save it + org
    -> compile + assemble code
    -> get end address
    -> determine blocks + upload them

This seems to work, in the form:

    0x20C (a b +) hack

Where the numerical argument is the start address of the patch.

Entry: sweet noises
Date: Wed Jun 7 16:43:54 EDT 2006

So.. 1 bit sound. Time to go algorithmic. An interesting class of sounds are simple synthetic waves, reso filtered and then limited. If there's a way to do this without too much computation, i'd like to find out. It's probably fast harmonic sweeps, but in such a way that phase correlation is maintained? So let's see what we can do with harmonics only. Using an 8 bit counter does not give a lot of resolution. Anyway, time to get the thing done i've been postponing for ages, which is..

Entry: TODO
Date: Fri Jun 9 13:23:38 EDT 2006

* PIC usb
* pd interfacing tutorial
* sheepsynth tutorial
* cat language documentation
* patch the forth mode, so it's possible to upload or overwrite (hack) blocks of forth code

Entry: convenience
Date: Wed Jun 21 11:51:49 EDT 2006

Time to get some convenience things working first:

* forth mode for emacs, with upload keys etc..
* fix parse problem
* automatic darcs archive init

The first two are fixed now. Forth mode with built-in upload is indeed a lot easier to use. The parse problem i hacked around; it needs to be fixed properly some day. So, I'm about to start a new project: the seriousint, a bit more serious approach to squarewave synthesis. Going to use that one as a test for 'how to create a project'.

Entry: namespaces
Date: Thu Jun 22 12:48:14 EDT 2006

So.. time to clean up namespaces. Basically, you can't really do this without considering the binding scheme i'm using. Compared to (what i call -- i don't know the 'proper' name) forth-style linear binding, the kind of global name space approach i'm using breaks too easily. In linear binding, redefined words are only bound to new code in code loaded after the redefinition, which enables the re-use of 'internal' words: just redefine them, since they only serve a purpose at load time, and supposedly are not to be used by other modules. The main thing that needs to be patched is the interpreter. Instead of using a dictionary directly, it needs to use a list of namespaces to search. There could be one global variable indicating this: the search order.
    ((core) (assembler) (compiler) ()) (search) !

This seems to work. Now for task two: eliminate all self-modifying code, or at least isolate the things that are modified in a separate namespace. Done. The variables now live in the (var) namespace, how original! The next thing that could be done is to eliminate variables that are only meant to be temporary. To do this automatically would require a whole different name binding and scoping approach. Maybe leave it for later.

Entry: search cache / compiler
Date: Fri Jun 23 09:58:20 EDT 2006

Once a large part of the dictionary is marked EXPLICITLY as read-only, it can be cached in a faster lookup structure, or maybe even bound permanently, directly to the code/data itself. The first level of compilation is really straightforward: just perform recursive concatenation on each symbol until you reach primitives. A better approach would be to keep the nesting of words so they can be re-used. This would require a data structure that represents a compiled word. We could use just a list as before, but that requires an explicit 'run' every time such a list is executed. The second level is a lot more difficult, since i give no information at all on how to do other kinds of nesting. Even the simplest () () ifte can't really be compiled without proving that the two lists are not generated at runtime. So, while CAT is flexible due to its simplicity, it's also terribly hard to compile properly, because of the lack of meta information on the words. The FACTOR language is a good example of how to aim for compilation. It's also more forth-like, and has declarations to be used by the compiler. All in all, I think speeding up the compiler/assembler can best be done by writing it in some other sub-forth, piece by piece, and keeping CAT as simple as possible.

Entry: usb and indexed addressing
Date: Sun Jun 25 12:11:14 EDT 2006

Started working on the USB driver again. It seems relatively straightforward, but a bit boring. I do think it's a good idea to start using the extended instruction set for its indexed addressing when dealing with USB messages. It will ease coding a lot. This does require moving the stacks around, probably best to the region 0x000-0x05F, and it limits the globally accessible variables to the range 0x060-0x07F, which means 32 in total. In general, the choice between the extended set or not will mostly depend on the programming model of the application. The extended instructions themselves are interesting for more efficient moving of data, but usually a lot of moving can be avoided in simple self-contained apps. Before starting to implement that thing, it might be wise to do the config bits too.

Entry: usb decoder
Date: Mon Jun 26 18:51:00 EDT 2006

Since USB is going to be quite an important part of the PURRR project, it's a good idea to spend some time thinking about how to implement the decoder. Basically, USB is an RPC mechanism:

* host sends request, client responds
* all requests have the form of tables

This is not very exciting. Doing this directly in PURRR given a table of name->value maps is really really boring and error prone. So is there a way to specify the whole USB protocol in a more high-level fashion, and then compile it to forth code?
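The table shape is easy to see in C -- a hedged sketch, not PURRR: the request codes are the standard USB ones, but the handler names are hypothetical. A setup packet is 8 bytes, and dispatch is just a lookup on the first two.

    #include <stdint.h>
    #include <stddef.h>

    typedef struct { uint8_t type, request; void (*handle)(void); } usb_rule;

    /* assumed handlers, elsewhere */
    extern void get_descriptor(void), set_address(void), set_config(void);

    static const usb_rule rules[] = {
        { 0x80, 6, get_descriptor },  /* GET_DESCRIPTOR, device-to-host */
        { 0x00, 5, set_address    },  /* SET_ADDRESS */
        { 0x00, 9, set_config     },  /* SET_CONFIGURATION */
    };

    static void decode_setup(const uint8_t *pkt) {  /* pkt = 8-byte setup packet */
        for (size_t i = 0; i < sizeof rules / sizeof *rules; i++)
            if (rules[i].type == pkt[0] && rules[i].request == pkt[1]) {
                rules[i].handle();
                return;
            }
        /* unknown request -> stall endpoint 0 (not shown) */
    }

a high-level spec would essentially just generate this table plus its handlers.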
see usb.txt

Entry: sheepsint simplification
Date: Wed Jun 28 12:22:10 EDT 2006

- 3 frequency levels: synth, control (/40), note (/25)
- note sequencer uses programs

Entry: caching
Date: Wed Jun 28 13:57:56 EDT 2006

What about writing a caching mechanism for method lookup, instead of the very slow dictionary traversal employed now? CAT is impractical for larger projects. Even the monitor and sheepsint.f take quite a noticeable time to compile. How to solve this? I tried to cache the symbol binding at the level of 'state-find', which gives a significant speedup, in the order of 4-5x. I wonder if it's possible to cache things at the level of 'dict-find' or 'assoc-ref', using the dictionaries themselves as keys that map to hash tables. There are 4 distinct operations on a dictionary:

non-muting:
* FIND

muting:
* ADD
* REMOVE
* REPLACE

The goal is to find those dictionaries that are constant, cache them with hash tables, and ignore all the rest. The thing to avoid is piling up the cache with things that are normally GC'd, where the cache keeps the only reference, leading to a blowup. Doesn't seem to be so easy. Some ad-hoc approaches: every N searches, decrement the TTL of each dictionary; on each reference, set the refcount to some constant value. This will guarantee that dead dictionaries will be deleted, and often used ones will stay alive. There are just two tuning parameters. Complicating requirement: the hash needs to be able to reply 'false', so it should have a pretty clear picture of what's in the dict. Maybe this is even easier to do: a one to one correspondence between dict and hash, updating the hash from dict update operations. for example:

not to cache:
- toplevel & var

definitely to cache:
- core & badnop & purrr
- macros and asm after init

What can be done to avoid registering the main dictionary, or one of its clones? It would be good to start caching something only when it's been referenced a couple of times, and then keep it synced. What about randomly doing this? This is not going to work in an ad-hoc manner.. too many special cases. What i need is a way to detect constant dictionaries, and cache only those. How do i know if a dictionary is constant? If it keeps being referenced. So, what i have is a hash which contains all the dictionaries that are referenced; on each reference their count goes up, while once in a while every count goes down. Ok, some testing with the ad-hoc algo teaches me that it's not going to work if i include the top level in the searches, since it's the one that's always referenced, and it changes any time anything else changes. Plus, it's small. So the toplevel is out. What about moving the cache point? Things that need to be cached because they are largely constant are:

- all executable (core/badnop/purrr) words
- assembler: machine class-map
- compiler: macros (+ update)

What about explicitly locking/unlocking a dictionary? Then locking could turn it into a hashtable. This seems to work, but the speedup is not as big as i thought it would be.

Entry: caching / compiling
Date: Thu Jun 29 10:36:15 EDT 2006

Again, what's the deal? Making it run faster. Period. Turning things into hash tables doesn't seem to work that well: i get a factor 2 speedup for compiling the boot monitor. Maybe this does have to do with the compiler dictionary though. What about compiling after all? See bind.scm. It seems to work reasonably well. Don't even need hashes! The monitor compilation goes from 15 -> 5 seconds. Good, but i thought it would be better..
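The locking idea, in C terms (a hedged sketch -- the real code is bind.scm): give each dictionary a version counter that every muting operation bumps, so a cached binding is valid exactly as long as the version it was filled under still matches.

    #include <stdint.h>
    #include <string.h>

    typedef struct { const char *name; void *value; } binding;
    typedef struct { binding *slots; int n; uint32_t version; } dict;
    /* every muting op (add/remove/replace) must do d->version++ */

    typedef struct { const char *name; void *value; uint32_t version; } memo;

    static void *find(dict *d, const char *name, memo *m) {
        if (m->name == name && m->version == d->version)
            return m->value;                    /* hit: cache still valid */
        for (int i = 0; i < d->n; i++)          /* slow path: linear search */
            if (strcmp(d->slots[i].name, name) == 0) {
                m->name = name;                 /* assumes interned symbols */
                m->value = d->slots[i].value;
                m->version = d->version;
                return m->value;
            }
        return NULL;                            /* the required 'false' reply */
    }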
The compilation step still takes quite some time, so it could be that dictionary lookup for code words is not the only bottleneck. Maybe the macros should be compiled too. There is a clear difference between compiling the monitor code the first time, and doing it again with a 'dirty' project. It looks like a significant amount of time is spent in maintaining the functional data structures. Time to take a closer look at Okasaki's PhD thesis "Purely Functional Data Structures", section 6.2.1, Binary Random-Access Lists.

Entry: backtracking
Date: Fri Jun 30 01:31:32 EDT 2006

Since I went through the trouble of writing CAT's data structures purely functionally, why am I not using true backtracking for things that are not easily undone, like the for..next inner/outer drop? Maybe i'm afraid of efficiency issues? Maybe it still is overkill? Manipulating assembly is probably still cheaper than recompiling the same code on 'fail', but it would look better. I do use this mechanism, but only through contained single-direction 'exception' backtracking. So the real question is: backtracking on functional data structures, or linear logic and explicit coding of inverses? Thinking about it, this particular problem might benefit from treating the assembly code as a tree. If a particular loop's body is kept isolated, inserting it into the parent code is a simple matter.

Entry: profiling
Date: Fri Jun 30 15:08:12 EDT 2006

Eliminating the dictionary lookup for code was a safe guess as an optimization point. However, the rest is probably harder to guess, as I already made some wrong ones. So, time for some proper profiling. How to tackle that one?

Entry: return stack
Date: Fri Jun 30 15:11:12 EDT 2006

I'd like to find out how to gain access to the guile return stack, or how to implement CAT-only continuations. Mainly for debugging purposes: to at least know WHERE an error occurs.

Entry: todo
Date: Sat Jul 1 20:53:38 EDT 2006

* sheepsint note & controller stuff & docs (1 day)
* interrupt driven 3 oscillator hires timer synth (1 day)
* connect purrr to pd
* usable emacs configuration for purrr
* proper forth mode

Entry: disassembler
Date: Sat Jul 1 21:10:30 EDT 2006

About synchronization between the chip and the memory image. The (project image) dictionary should really be in perfect sync at all times. If it's foreign software, there's no monitor running. If it's self-modifying, the machine code should be read before disassembling. So disassemble =

* read code block
* verify
* dasm symbolically using the symbol table

This means the disassembler is connected to only one source, and the dirty ugly hack can be removed. Done. Looks cleaner now. Do need to add verify operations.

Entry: forth & puredata
Date: Sun Jul 2 11:25:06 EDT 2006

Might be best to get both working at the same time. What exactly is necessary to connect pd and purrr? Doing interactive dev in pd is probably not necessary. Getting data out as fast as possible might be. So i have two basic solutions:

* make a PD module understand the dictionary format
* add OSC to guile/CAT

Been working on the forth mode. It stays in forth mode now, even on error. And it behaves as a proper console, with the possibility to upload code through

-> interpretation of lists as CAT objects
-> import of some cat words, including 'r'

In short, '(: + + ;) r' still works in forth mode.

Entry: old badnop
Date: Sun Jul 2 17:00:53 EDT 2006

There are 2 features i didn't port yet:

* branch 'snapping'
* 14 bit core compiler

The latter i'm probably going to skip, since it has no support for the paging madness.
The former might be interesting though, but it is really an image level optimization: dasm branch? follow asm

Entry: sheepsint
Date: Thu Jul 6 15:57:24 EDT 2006

Time to simplify. The idea is this: split the synth design i did for the 14bit forth in the original badnop, which went as far as designing a new language for synth control, into two projects.

seriousynth: a serious attempt to generate high quality sound using the resources available on the 18f, like the 3 high resolution timers and the hardware pwm, adc's and the high frequency oscillator.

sheepsint: a scaled down version of the original 14bit design, not using interrupts, using the internal oscillator, and purely synchronous pure forth code.

Entry: todo go forth
Date: Fri Jul 7 14:05:52 EDT 2006

* emacs mode
* bootloader for all chips
* pd interaction?
* sheepsint: interrupts?

Entry: grids
Date: Fri Jul 7 14:08:58 EDT 2006

Here we go again: loop transformations for grid/matrix operations. I'm getting a bit sad over the code duplication in the matrix and image code in pf. Most of these things could be split up into some higher level constructs. What pf needs is:

* a better type system
* automatic generation of iterators
* better polymorphism

The first one means that executing type checks for each primitive should be minimized; the system should be designed in such a way that these checks can be disabled, so polywords need a different implementation. The second one means i need a way to represent iterators, ideally a subset of C that can be easily generated in the form of S-expressions. The third one is something that needs to be thought out well. Program specialization is important. Another thing i need to accept is that either i write an opengl app, or i write a video app. Both at the same time is too complicated.

Entry: pre-scheme
Date: Fri Jul 21 00:18:16 BST 2006

scheme48's low level stuff is written in pre-scheme. can't i do something similar in CAT? raising the abstraction could do things like:

* turn on late binding
* turn on GC

the main problem with cat is things like

    (bla bla bla) (boh boh boh) ifte

which, to be efficient, should have the lists bound as code, not data. more things can be added:

* moving from linear code to linked lists (different 'enter')
* moving from untyped to dynamically typed code (types as pre-xt's)

Entry: workshop
Date: Sat Jul 22 10:44:05 BST 2006

the first workshop went well. i'm updating doc/sheepsint.txt. the most important thing to change is to make the target mode a bit more mature, so you can at least reload the code: messing up the code happens a lot when you start to do the wrong things. so TODO:

* need a 'reload' word
* need run time 'toggle', 'high', 'low'

Entry: minimal osc listener
Date: Sat Jul 22 14:27:17 BST 2006

    (receive 10001) osc-receiver (osc-console receive unswons interpret osc-console) : (/purrr t) :

Entry: new purrr features and fixes
Date: Thu Jul 27 16:04:47 BST 2006

- stack underflow detection
- better synchronization (random numbers / counter?)
- recording a file from target mode
- host, target = idempotent

Entry: workshop notes
Date: Fri Jul 28 12:34:30 BST 2006

- either programming or electronics prior knowledge
- more tools: soldering, tweezers, screw drivers, ...
- not combined with pd/pf: too much scatter

Entry: mzscheme
Date: Thu Aug 10 19:08:54 CEST 2006

a problem with macros not being able to use functions defined in the same compilation unit, and some more things where macros are different.
but, as dave put it, it might be interesting to switch because of the more active community around plt scheme, compared to the almost dead guile.

Entry: done being lazy
Date: Sun Aug 27 19:10:51 CEST 2006

lots of things to do. most important is the simplification of pf to a lock-free design, which i'm going to accomplish by first making it completely single-threaded, then moving things to more threads using lock-free data transactions. this should lead to jack/pd/desiredata/pnpd interfaces. pf is going to be top-down simplification, while keeping the code running, while in cat/poke i'm going to do the same + base myself on some ideas from retroforth. second thing is the switch to mzscheme for cat. guile is dead. this requires macro cleanup. in summary:

* transactional memory + lockfree stuff
* mzscheme -> clean up macro code
* study retroforth design

Entry: building pf from the ground up
Date: Wed Aug 30 04:52:14 CEST 2006

start with poke:

- 4 instruction vm: PRIM/CALL/JMP/???
- single block of linear core memory

features:

- multimethod dispatch (generic functions)
- true object model
- refcount object management
- hashtable
- symbols
- lists or pairs? + memory management
- binary serialization + comm (all objects either serializable, or reconstructable by design. which means 'write' should write anything to a stream, such that 'read' will either verbatim reconstruct it, or reconstruct it using "trusted constructors", or just arbitrary code)

so.. to summarize, the 2 most important problems are

1) memory management for things that can actually be deleted. maybe i should explicitly avoid this? linear lisp mm + pure packet leaf nodes with reference counts based on their respective types are trivial to implement. the 'drop' method for a packet moves it to the global reuse-stack in a lock-free way. a (suboptimal) 'real' GC should then be possible to write on top of this by re-packing the core memory and discarding the tree.

2) multimethod dispatch. how to make sure this also works on composite types? (lists of packets)

Entry: todo september
Date: Sun Sep 3 20:18:38 CEST 2006

* move to plt scheme
* fix workshop bugs
* usb driver
* make it run on windows (easier with plt?)

Entry: eforth
Date: Tue Sep 5 03:04:29 CEST 2006

reading the eforth book "eForth and Zen, C. H. Ting", which can be found at http://www.eforth.com.tw/academy-n/basic.htm charles mentions it in the vaticForth source, which is modeled after Ting's eForth17. some comments:

* purrr sacrifices a lot of ease of use (16 bit numbers, single address space, ...) + simplicity of implementation (loads of special case macros for target machine code optimization). it is not really suited for teaching.
* i need a serially controlled, universal, programmable data interface i can plug into any old computer and use the terminal to do debugging, without needing a host system to run. this is an ideal place to test this eforth thing. this could be an interface to any app written in a more hard core language like purrr..

so this leads to

* purrr -> the 'visible' part of the forth project. relatively portable language, adhering to more conservative forth standard practice
* cat/badnop -> my playing ground

Entry: flashforth
Date: Wed Sep 6 03:11:30 CEST 2006

something mikael is doing in flashforth, and which purrr might benefit from, is to always have kernel words looked up before user words, to make the system more robust. one thing i was thinking about in addition to that, especially aiming at teaching, is to have a 'safe mode', where dup and drop are actually functions that perform some checking -- this was a common error. another, easier solution is to add some guard bytes, and have the interpreter check the state.. (a minimal sketch below)
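A minimal C sketch of what such a safe mode could check (everything here is made up for illustration):

    #include <stdio.h>
    #include <stdlib.h>

    #define DEPTH 32
    static int stack[DEPTH];
    static int sp = 0;                  /* index of first free slot */

    static void fault(const char *w) { fprintf(stderr, "%s: stack fault\n", w); exit(1); }

    /* checked versions of the usual words */
    static void push(int x) { if (sp >= DEPTH) fault("push"); stack[sp++] = x; }
    static void drop(void)  { if (sp < 1) fault("drop"); sp--; }
    static void dup_(void)  { if (sp < 1 || sp >= DEPTH) fault("dup");
                              stack[sp] = stack[sp - 1]; sp++; }

the guard byte variant would instead place known values just outside the stack area and have the interpreter verify them once in a while.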
Entry: stand alone 16bit forth
Date: Thu Sep 7 20:55:51 CEST 2006

After reading some things i see it like this:

* ease of implementation is most important
* need ans? maybe not. but a compatibility layer could be nice. (eforth?)
* dict in lower caps

some more:

* @ and ! are the only memory access words, even used in NEXT

Entry: todo update
Date: Fri Sep 8 17:34:15 CEST 2006

* usb!
* minimal ANS forth read docs
* sheepsint -> interrupt driven synth core
* sheepsint -> highlevel interface forth.. (note layer)
* small module system

Entry: name change
Date: Fri Sep 8 20:25:53 CEST 2006

so.. 18f, short for badnop/18f, is now the 8bit compiled 18f forth. a nice and clear name for the real deal. the 'sexy' name is reserved for the 16bit interpreted forth front end, which is in the dream-up phase, but will probably be very similar to, or a port of, charles' standalone 18f forth.

Entry: purrr 16bit vm
Date: Sat Sep 9 13:18:29 CEST 2006

got the memory model and execution model figured out. time to write the compiler. i was thinking.. maybe it's easiest to just write it in forth, and write a simulator to run the forth in cat? i managed to get caught again by the indirect threaded code deref story. somehow it's really unintuitive. but it looks like it works now..

Entry: usb
Date: Sun Sep 10 09:30:09 CEST 2006

another try. tomorrow johannes is coming over, so i need to get things in good shape! since i have strings now, i might just add a debug mechanism to send printouts over the serial cable. this might be one of those good ideas, tom! another thing i need is loadable modules.. about the loadable modules, i'm just using files now, which is good enough. loadable modules are only necessary when there are a lot of macros that need shared support code. cotton feature postponed till later. so.. how to go about getting this usb thing to work. i read the specs, and have some kind of idea about how to get things working.. basically, usb is a pull system: the host requests data. so, i got the 'init-usb' part right :) lol now, the usb reset interrupt UIR URSTIF apparently configures all the buffers. a nice opportunity to start looking into that relative addressing. hmm.. i'm not so sure any more.. i had it running through a 4 6 6 6 6 6 6 6 6 3 sequence (activate, reset, ..., suspend), but i get these kinds of errors in linux (-71 is EPROTO):

    usb 3-1: new low speed USB device using uhci_hcd and address 86
    usb 3-1: device not accepting address 86, error -71
    usb 3-1: new low speed USB device using uhci_hcd and address 87
    usb 3-1: device not accepting address 87, error -71
    usb 3-1: new low speed USB device using uhci_hcd and address 88
    usb 3-1: device descriptor read/64, error -71
    usb 3-1: device descriptor read/64, error -71
    usb 3-1: new low speed USB device using uhci_hcd and address 89
    usb 3-1: device descriptor read/64, error -71
    usb 3-1: device descriptor read/64, error -71

this seems to happen after only init-usb ??? sometimes it works and sometimes not. i guess i need to sleep..

Entry: random stuff
Date: Sun Sep 10 16:56:25 CEST 2006

* easier 'empty'?
* robust to ctrl-C during upload? disable?
* fix assembler jumps !!!! (ed. it's an error now. >8KB code is problematic)
* move to byte addressing model..
word addressing is crazy

Entry: standard forth
Date: Thu Sep 14 05:19:25 CEST 2006

do i really want to write one? actually pretty boring stuff.. what i'd like to have is a standard forth for teaching, mainly for documentation and 'api stability', and a simple standalone forth for myself in case i need standalone apps with a console. the real question is: can i live with IMMEDIATE? if so, then this is not a problem, and i should just implement a MAF kernel. (ed) actually, i probably do need this thing for myself. it would be a nice exercise to write it in a couple of layers:

1. inner interpreter + memory model (general vs flash only)
2. standalone interpreter
3. standalone compiler
4. ans layer

Entry: forth console + speed
Date: Thu Sep 14 06:26:47 CEST 2006

i didn't deem this so important at first, but it is. the console needs to be conceptually clean. cat is nice for actually implementing the whole system, and for writing tools that eventually become macros, but interacting with the machine should be as simple as possible, and should basically emulate a true forth. one important part is the uploading of code. blocks are a bit spartan if you don't have to use them, but as an interface -- '123 load' -- they work very well. another thing that's important is speed. cat is really really slow. i should find out how to make it a bit faster. the strange thing is that compiling words makes it only a tiny bit faster. i'm assuming that most of the other time is also spent in looking up macro and word dictionaries, or in rearranging the data tree. but i need a tool to actually measure this. some simple things that could speed things up: speed up the io, maybe add some buffering. terminal io is relatively fast. reading is terribly slow also.

Entry: roadmap
Date: Fri Sep 15 21:43:13 CEST 2006

1. interrupts
2. 3 x hires interrupt synth algos
3. controller in 16bit highlevel forth
4. interpreter
5. compiler

Entry: problem, solved? attempt to get organized.
Date: Sat Sep 16 03:57:34 CEST 2006

trying something new. if i write code that's not clean or has missing features, i usually add FIXME or FIXME: blablabla. what about starting to do this in the dev log? a place for more general, as opposed to one-line, problems.

FIXME: 18f block 0 init
FIXME: overflow in relative jumps
FIXME: no parens in forth mode
FIXME: code upload: paragraph instead of block? + define keys
FIXME: bootblock updater

Entry: boot stuff
Date: Sun Sep 17 01:20:36 CEST 2006

back on line.. so, the boot block. maybe it's best to use some special purpose instructions instead of 'org' to do the compilation.

FIXME: org messes up upload

Entry: byte addresses
Date: Sun Sep 17 19:19:36 CEST 2006

bite the sour apple. if you EVER use byte addresses, and the machine itself is byte addressable, NEVER use word addresses as representation. it makes life more difficult, even if at first glance things seem easier. how to tackle it.. (project labels) -> (project symbols) a name change is, i think, the best way of making the transition. ok.. looks like it's done. some bugs here and there; now the code looks a bit clearer. all addresses are byte addresses now. this would enable a better treatment of config bytes too.
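In C the rule is almost too simple to state, but here it is anyway (a hedged sketch; names made up): keep one canonical representation -- byte addresses -- everywhere, and convert only at the edge where some tool or register insists on words.

    #include <stdint.h>

    typedef uint32_t baddr;  /* byte address: the only representation used internally */

    /* convert at the boundary only; an 18f instruction word is 2 bytes */
    static inline uint32_t baddr_to_word(baddr a)    { return a >> 1; }
    static inline baddr    word_to_baddr(uint32_t w) { return (baddr)w << 1; }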
Entry: driving leds
Date: Mon Sep 18 06:48:44 CEST 2006

fun fun :) rediscovering the L,H,Z matrix trick: using N pins you can drive N(N-1) leds. for example:

       0 1 2 3
    0  . a b c
    1  d . e f
    2  g h . i
    3  j k l .

where for example a = H H Z Z and e = L Z H Z. number of leds = N * (N-1): 2 times (> and <) the number of ways to choose 2 terminals from N nodes.

coding hack: for time multiplexing purposes, it is possible to refresh independent (orthogonal) subspaces simultaneously. for 8 x 7 it is possible to use only 8 pins. for 4 x 7 = 28 leds, the closest is 5 x 6 = 30, so 6 pins is enough. then for the number of resistors.. a naive approach would be to use one resistor per 2 leds. however, it is possible to use one per pin, giving

    --/\/\/\---|X|---/\/\/\---

where the middle symbol is two leds antiparallel. however, whether this works depends on what access you have to the terminals. i'm using CA digits, one anode per digit, all segment cathodes free, so this won't work. the maximum number of shared anodes is 5. this means i need to use 8 pins. looks like that's going to be the D bus. something like:

    D port:  0 1 2 3 4 5 6 7
    digits:  + a b c d e f g
             g + a b c d e f
             f g + a b c d e
             e f g + a b c d

using rotates rather than inserts (a + b c ...) makes it easier to handle in software. a path from 1 -> 0 will light a led, but will not light 2+ leds connected in series between the same terminals. according to don lancaster's tech musings 152, only one led can be driven at a time. i don't think this is correct. you can not have both more than one 1 AND more than one 0, but it is possible to switch on a whole 'row' connected to a common 1 or 0 terminal, with the OFF condition represented by high Z. however, switching on more than one LED at a time does create a problem when using one current limiting resistor for each port. since the common 1 or 0 resistor is shared, it has a larger voltage drop proportional to the number of lit leds, resulting in less overall current: per LED ~ 1/(n+1), with the total current ~ n/(n+1). this can be solved by using a bypass diode for each resistor. this makes the circuit asymmetric, which isn't a problem, since there is already a preferred asymmetry: it's easier to drive all segments for one digit at a time as opposed to one segment for each digit. to get a decent brightness, it might be possible to drive the led with a higher current than its maximum rating, as long as it has a limited duty cycle. this does pose some problems for testing, since you can't have a non-cycling lit segment.

Entry: analog filters
Date: Mon Sep 18 17:40:05 CEST 2006

so.. why are 'musical' analog filters special? because of the varying time constants. the basic building block for a filter is an integrator, which is easily built using an opamp or ota. the reason CA3080 otas are used in old designs is their variable gm, which is ultimately a bipolar current bias. this translates into variable gain, and so a variable time constant for the integrators. so, the basic entity is a multiplication, in the disguise of a transistor bias which influences the voltage -> current gain. analog multipliers should be able to fix this. an analog multiplier can easily be built using switched circuits, in the form of a PWM modulator. this requires analog switches, a comparator, and a triangle or saw generator. i ordered some switched cap filters (MAX265) which have exposed sections, so i can give this a try.

Entry: 30f and 12f/16f
Date: Mon Sep 18 22:35:51 CEST 2006

just put a dspic on the breadboard.
since this story is mostly software, i can do it later.. lots of things to do. wonder how this can best be tackled. write the assembler first, or write the monitor and pre-forth first? anyway, the forth machine model needs to be determined, which requires mostly reading, for which i don't have time now. then, the 14bit cores. they have some drawbacks: banked program memory and non-writable flash. these make interactive dev really hard. i think it's safest to abandon them for now. an option could be a hardware incremental programmer. so, both are closed for the moment. too much work to actually get somewhere interesting in a couple of days. means i'm sort of done, except for the sheepsint portable radio mod, and can start working on sheepsint and the 16bit idtc forth.

Entry: standalone sheepsint
Date: Wed Sep 20 00:29:39 CEST 2006

the only thing i need to build before piksel is a little standalone sheepsint board. i'm not sure about the input methods yet.. maybe this is not necessary at all. maybe just a serial port is more than enough?

Entry: video output
Date: Thu Sep 21 05:30:04 CEST 2006

there's a lot of stuff on the web about making PIC chips and other microcontrollers drive PAL/NTSC/MGA/VGA/... for monochrome this is really not so hard. the only downside is that you can't really do much else while sending out bits, since pixel clocks are so close to the instruction frequency. so i was wondering if it's possible to use the PIC's hardware serial port to do this: that way it should be possible to drive it at byte intervals only. making the interrupt routine as fast as possible should leave open some space to do other stuff below the isr. if the clock freq is something like 2x the pixel clock, this leaves at least some room for other things to do, while giving full resolution output. first, a look at the serial output hardware on the most basic pic18f. the simplest version is the EUSART. the thing i need is synchronous master: it can send out bits at the instruction tick, clock 2x without interleaving. (i don't need the async uart: the standard start/data/stop with sync on the stop/idle->start transition, synchronous slave rx/tx, or synchronous master rx. also, in the bigger 18f chips there is an MSSP module which implements I2C and SPI. briefly looking at the timing diagrams, it should be possible to use these also, maybe together with address registers etc.. but it looks a lot more complicated, so i am going to forget about it.) so, text based output with flash character lookup is something like this:

    MOVF  POSTINC2
    MOVWF TBLPTRL
    TBLRD*+
    MOVF  TABLAT
    MOVWF TXREG

clearing of the interrupt flag is not necessary; it happens automatically when TXREG is written. using an extra layer of caching which gets the current pixels from ram instead of flash gives a simpler interrupt routine:

    MOVF  POSTINC2
    MOVWF TXREG

but this requires another task to actually fill that ram area from the rom characters and the ram character framebuffer. if it is possible to do this fast enough, i think this is the way to go. it would enable scan line doubling to leave room to do other things. the only things that still need to be done in the isr are the H and V sync pulses. the trickiest part is to do the 'end of scanline' check fast enough in the ISR. probably something like this:

    BTFSS INF2 6
    RETFIE 1

this would check the line readout for overflow, assuming the data is placed in memory such that the end of a line will cause a particular bit to change state. that seems pretty near optimal.
per-line code can run after that, and doesn't need to be so fast, since there's plenty of room between lines. if not, it is easily created by using fewer characters. then, what's left is the flash->ram copying. the inner loop of this is just

    TBLRD*+
    MOVF  TABLAT
    MOVWF POSTINC1

where POSTINC1 needs to be saved, so ordinary forth code can run outside this loop. (actually, this is not correct, since there needs to be something that determines the address to read from). the interrupt latency for any internally generated interrupt is 3 cycles. an iret is 2 cycles, and a btfss is only one if the branch is not taken. so, this gives

    3  entry
    2  load next char line + store TXREG
    1  check end of line
    2  return

totalling 8 instruction cycles for the isr. bummer. this means it's not possible to output at instruction cycle frequency and do something else too. it is very possible though to run at half that frequency, freeing half of the cycles for other stuff, but still 5 out of 8 cycles are overhead. it's probably easier to do it without interrupts. what about something like

    MOVF  POSTINC2
    MOVWF TXREG
    x x x x x x
    MOVF  POSTINC2
    MOVWF TXREG
    x x x x x x

with the x's filled up with payload. a line loop like that can be started by syncing on the TX interrupt bit. it is possible to squeeze the flash->ram copying in between this. TBLRD*+ takes two cycles, the other 2 take 2 also. this leaves 33% of a line to fill up with useful instructions. hmm... (error in reading the character set.. won't make it) back to square one. this code takes 6 cycles to run:

    MOVF  POSTINC2
    MOVWF TBLPTRL
    TBLRD*+
    MOVF  TABLAT
    MOVWF TXREG

the following takes the same number of cycles, but does not affect the W register, so W can be used to do useful stuff in the remaining instruction slots:

    MOVFF POSTINC2 TBLPTRL
    TBLRD*+
    MOVFF TABLAT TXREG

this leaves a 2 instruction slot to do useful stuff. only straight-line code can be used to fill the slot. filling this up could be automated using expanded forth code, combined with instruction timing tables. alternatively, the 2 cycle slot could be filled up with a jump, which can be interrupted / disabled by a 15kHz timer interrupt for vertical sync, giving more compact code. timings: dot clocks are only a guideline for analog video. you put on a line what you want. line frequencies are strict.

    proto    pixel (MHz)   line (kHz)
    ---------------------------------
    NTSC     13.5          15.75
    vga640   25.175
    vga720   28.175
    mda      16.257

conclusion: using a multisync vga monitor, it might be possible to use a nonstandard pixel clock. halving the dot clock makes it all very possible. really, the only thing that's important is to get the H/V timing right. lines are just analog. for tv and vga, from http://lipas.uwasa.fi/~f76998/video/modes/

"The harsh reality of analog video is that there is no fixed, defined, horizontal resolution. There is just a squiggly analog waveform which lasts 52 µs. If we want to turn it into pixels, we need to determine a sampling rate, and sample the signal at that rate - over the whole 52 µs active length."

so i guess there is quite a lot of freedom, even with vga. the important thing is that the horizontal sync is within tolerance. doing a 12MHz pixel clock for video still gives 640 pixels width, which is more than enough. probably too much to read on most tv sets. a simple old school 40x25 mode would be quite ok. forgot about this: the 'safe area'. a reasonable percentage of the width (and height) cannot be used due to overscan. so doing text automatically gives some black space which can be used to perform other mcu tasks. another hack:
when reducing the horizontal resolution, it's only natural to also reduce the vertical resolution. using the non-interrupt way of hardcoded interleaving, scan doubling can be used. for video, this is frame based, so there is no real gain (no space to cache a whole frame on a pic), but for vga it is possible to make one line write out the line data to ram (8 instructions filled), and have the other line just cache it, using up just 2 instructions, giving a net gain of 1 instruction per pixel. looks like interlaced/noninterlaced depends on the time you wait from vertical sync to starting your next scanline. for noninterlaced, always start a line at the same time; for interlaced, the 2nd field's first line should be delayed a bit, so the beam moves in between scanlines. see http://www.williamson-labs.com/480_tv.htm

Entry: pic sample player
Date: Thu Sep 21 09:28:09 CEST 2006

there's the PIC18F24J10, which is the cheapest 18f. it has a 31kHz internal oscillator and 16 kbytes flash. seems like the ideal 1 bit sample player. it's a 28 pin package, so plenty of room to add some flashing lights and knobs and switches.

Entry: fast multibyte math
Date: Sun Sep 24 04:14:48 CEST 2006

maybe that's one of the most important things in writing a kernel that will support the synth core audio part: it needs to do higher resolution calculations. immediately it becomes less optimal, since there's no real way around moving stuff out to registers, performing the calculations, and putting the result back on the stack. so it can't really help being slow.. start with the beginning. the 18f has an 8x8 -> 16 unsigned multiplication. to build a 16x16 -> 32 unsigned multiplication we have

    (a1 X + a0) (b1 X + b0) = a1 b1 X^2 + (a1 b0 + a0 b1) X + a0 b0

given the bytes on the stack, it's easiest to compute this low byte first.

Entry: more TV signal generation
Date: Sun Sep 24 14:15:23 CEST 2006

Thinking about using the SPI port, so it is possible to still use the serial port for comm. since it has a 1 byte buffer, it could be polled during line sync. nope.. won't work. SPI uses the same output pin as the USART. so, to do i/o too we'll need something different. the easiest is probably just a master synchronous port, done in software, running at line frequency, or 1/2 thereof. this could poll for commands. this does complicate things. for now, i can probably just multiplex it. it does need some resistors at the serial input though. wait.. i was looking at the wrong data sheets. MSSP/SPI uses SDI/SDO/SCK, which are separate from RX/TX on the 252 and 452. the 1220 does not have such a module, while the 4550 shares pins with RX/TX, but provides a streaming parallel port.

Entry: tasks
Date: Mon Sep 25 12:03:40 CEST 2006

very simple task switching. need to save the 3 stack pointers. the parameter stack and aux (control) stack pointer can be saved on the return stack, so a single byte containing the stack pointer is sufficient. this is a can of worms.. there are lots of options. if you run tasks from the isr, you need to save lots of state. better be safe: a task saves

    FSR0L FSR1L FSR2 STKPTR

need to make a distinction between cooperative tasks, and pre-emptive tasks which are interrupt driven. ok, byte the sour apple. what is needed is:

- a task is a piece of ram memory, reserved to save state
- each task has a 'suspend' and 'resume' word
- tasks invoked from an ISR are pre-emptive tasks: they wait for events (IO), and are woken up as soon as this happens
- tasks invoked from the mainloop are cooperative tasks, and are easiest implemented in a round-robin fashion (a sketch of the idea follows this list)
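A hedged sketch of the cooperative round-robin idea in C (using POSIX ucontext instead of the PIC trick of saving FSR0L/FSR1L/FSR2/STKPTR, but the shape is the same: 'suspend' saves the current state and resumes the next task):

    #include <ucontext.h>
    #include <stdio.h>

    #define NTASKS 2
    static ucontext_t ctx[NTASKS + 1];   /* ctx[NTASKS] = the starter context */
    static int cur = 0;

    static void suspend(void) {          /* save state, resume next task */
        int prev = cur;
        cur = (cur + 1) % NTASKS;
        swapcontext(&ctx[prev], &ctx[cur]);
    }

    /* two toy tasks; like mcu mainloop tasks, they never return */
    static void blink(void) { for (;;) { puts("blink"); suspend(); } }
    static void poll(void)  { for (;;) { puts("poll");  suspend(); } }

    static char stack_a[16384], stack_b[16384];

    int main(void) {
        getcontext(&ctx[0]);
        ctx[0].uc_stack.ss_sp = stack_a;
        ctx[0].uc_stack.ss_size = sizeof stack_a;
        ctx[0].uc_link = &ctx[NTASKS];
        makecontext(&ctx[0], blink, 0);

        getcontext(&ctx[1]);
        ctx[1].uc_stack.ss_sp = stack_b;
        ctx[1].uc_stack.ss_size = sizeof stack_b;
        ctx[1].uc_link = &ctx[NTASKS];
        makecontext(&ctx[1], poll, 0);

        swapcontext(&ctx[NTASKS], &ctx[0]);  /* start the first task */
        return 0;
    }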
implementation: the ram memory is easiest taken to be the data stack, since the return stack is a limited resource. storing saved registers in ram or on the data stack doesn't make any real difference, except that the data stack is faster. ok.. draft done, not tested yet. what i miss is initialization. so, each task needs rp, dp, xp, while the initial a reg can be assumed to be zero. wow. it works! it's really simple :) no more boring state machines!!

Entry: speed
Date: Wed Sep 27 05:58:55 CEST 2006

working on some applications lately, i found out that

* speed is more important than i thought before.
* pure functional design doesn't always work, since the microcontroller itself has a very pressing 'state' associated to it.
* 'mark' / 'empty' are valuable things.
* having CAT lists in forth code is really confusing.
* i need better emacs integration, especially for uploading

Entry: 23 key toy piano
Date: Thu Sep 28 12:07:44 CEST 2006

bought new for a couple of euro. crappy, doesn't work well. runs on 2 1.5V AA cells. has a chip-on-board package connected to 23 lines, one for each switch. all switches have a common connection to VDD. the speaker is connected to an S8550 PNP transistor \EBC/ configured as a switch, with the emitter to VDD and the collector to the speaker, which is connected to ground with the other terminal. now, how to use an 18f1220 for this..

    23 <= 25 = 5x5
    23 <= 24 = 6x4

so, 10 pins is the minimum, with 2 possibilities to connect them. it is necessary to cut some PCB tracks to connect the switches to output ports. software is easier if reading is 4 bits (a nibble), so 4x6, with 6 scanning phases, is the preferred configuration. PORTB has weak pullups. switch resistance is about 150 Ohms. choose pins by elimination.

    RB1 = TX/CK
    RB4 = RX/DT

    76543210
    ...x..x.

hm.. this already messes up the nice bit alignment :) however, 0765 is possible, which gives a nibble after a single rotate instruction, so we choose the following read ports with pullup resistors, giving up the associated functionality:

    RB5 = KBI1       white
    RB6 = P1C/KBI2   yellow
    RB7 = P1D/KBI3   orange
    RB0 = AN4/IN0    red

then, 6 driver ports are needed. RA4 is no good. candidates:

    RA2 = AN2/VREF-  white
    RA3 = AN3/VREF+  yellow
    RA1 = AN1/AN1    orange
    RA6 = CLKO       red
    RA7 = CLKI       purple
    RB2 = P1B/INT2   blue

leaving available 1 pwm output that can be used to drive the speaker:

    RB3 = P1A/CCP1

and 2 analog inputs for maybe 2 controller knobs:

    RA0 = AN0
    RA1 = AN1

Entry: ICD2 compatible monitor
Date: Fri Sep 29 05:38:41 CEST 2006

instead of having the forth monitor run over the serial port, why not use the PGD/PGC pins as a slave or master serial port? this frees up the real serial port for other tasks, and the interface protocol is ad hoc anyway.. it would require a single interface board with a pic that does host serial/usb and slaves to the target.

Entry: connectors!
Date: Fri Sep 29 07:11:08 CEST 2006

before i go into the previous, i probably need an intermediate solution, since the only connectors i can easily make myself are 5x2 ribbon cables. let's see:

    VDD VSS VCC PGD PGC
    PGM TX  RX  xxx xxx

if possible, make it compatible with the FTDI TTL serial cable. anyways, i'm sticking to just 2 connectors. but the moral of the story is: if a host connection is permanent, use the serial port. if it is debug only, use an ICD2 connector and create some code to use the PGD/PGC pins for serial debug comm.
since this involves an intermediate that translates to host serial/usb, i can't do that yet..

Entry: piano keyboard
Date: Sun Oct 1 09:58:13 CEST 2006

the piano keyboard seems to work fine.. the hardware itself is really crappy, and the wires are a bit in the way, but it works. now i just need to finish the synth itself, which brings me to the next:

Entry: interrupts + standalone boot
Date: Sun Oct 1 09:58:54 CEST 2006

i need a decent, safe, standard way of patching the boot record for interrupt routines. in addition to that, i also need a way to make an application boot standalone.

Entry: wanted features and changes
Date: Sun Oct 1 13:49:26 CEST 2006

- project macros = union of system / project macros. just like 'console' words and constants.
- automatic tethered / standalone mode for serial operation
- safe interrupt setting (not messing up the boot loader)
- upload of binary files, like font tables and audio samples, without going through source code.

the first 2 will break compatibility, but i guess the whole project is not really ready for stable config files, if ever :) for the monitor detection, i'm thinking about sensing a break condition or so.

Entry: cleanup
Date: Mon Oct 2 09:41:48 CEST 2006

- bulk of macros now reside in (badnop macros)
- no separate (18f) namespace: just dump in (badnop)
- command completion fixes: wait + forth words
- upload binary data
- isr install

Entry: piksel sheepsint gig
Date: Mon Oct 2 18:17:43 CEST 2006

target:
- piano
- 3 knob 16/ noise box

Entry: old sheepsint
Date: Mon Oct 2 18:52:25 CEST 2006

The clock/power/data thing is actually a lot easier to do using existing codes: DC-less codes, i.e. manchester codes, PSK, FSK, etc... This allows easy transport over audio channels + easy regeneration. Looks like differential manchester code is interesting, especially when reserving the possibility to control over audio channels.

Entry: sheepsint controller
Date: Tue Oct 3 09:40:17 CEST 2006

i'm switching to 3 x 16bit timers for sound generation. it would be a good idea to switch to a 16bit forth to control these thingies. the thing i got stuck on last time was the multiplier. it's not so hard, just need to be a bit carryful.

    (a x + b)(c x + d) = ac x^2 + (ad + bc) x^1 + bd x^0
                            x_32     x_321        x_10

things to realize:

- x^0 doesn't carry (max FE01)
- x^1 carries into the third byte (max 1FC02)
- x^2 doesn't carry

so the easiest thing to do is to compute x^0 and x^2 independently, and then compute the middle x^1 and have it carry over into the most significant 3 bytes. of course, there are a lot of people that have done this before, so i wonder if i can't write it as a serial update equation, instead of a parallel one. can't find much, and i can't see much myself either, so i guess i'll do it as described above. turns out the routine in the 18f datasheet is the same. good enough. now, in the synth i'm in need of multiplication only for some exponential/sinusoidal period modulation, so i don't need to solve the general problem -- that's for purrr some day. this means basically 1D and 2D discrete linear systems. poles are close to 1, so approximation is possible:

    1D:  x <- (1+d) x

    2D:  | x |      | 1-d^2   -d   |  | x |
         | y |  <-  |  d     1-d^2 |  | y |

Entry: equally tempered music
Date: Tue Oct 3 18:02:26 CEST 2006

from don lancaster's "The Saga of the Dripping Stalactite", an account of discrete optimization (i.e. magic sinewaves).
an 8bit optimum for an equally tempered scale is:

    116 123 130 138 146 155 164 174 184 195 207 219 232

Entry: binary time constant
Date: Tue Oct 3 20:49:47 CEST 2006

by Roman Black - December 2001 http://centauri.ezy.net.au/~fastvid/picsound.htm the page was down; i got it from the google cache. this idea is not new of course, it's classic sigma/delta A/D conversion. the philips super audio CD if you want. the interesting thing about this particular approach is the encoder: by using a 1 pole filter in the feedback loop which has a time constant that's easily implemented using shift + add, a "pic friendly" encoder can be built. since the 18f has a multiplier, this is probably not even necessary. however, this circuit could be interesting for using the pic as a digital delay in a digitally controlled analog circuit. from "Frequency Detection on an Ubicom SX Microcontroller" -- Chris Fogelklou: a s/d converter is easy enough to build as

                           R                R
    comparator out 0---/\/\/\/\---X---/\/\/\/\---0 low Z analog in
                                  |
    comparator in  0--------------X
                                  |
                                 === C
                                  |
                                  V

the interesting part here is that the 'comparator in' can be connected to a comparator, or an A/D to get more bits, but also to a digital input! this will behave as a comparator on half the supply voltage. the only problem being that near this threshold, substantial current is drawn.

Entry: synthstuff
Date: Tue Oct 3 19:59:59 CEST 2006

- xmod / reso
- sample playback
- wavetable
- lfsr noise

LFO modulation:
- 16bit sine wave / exponential

optional:
- ordinary 8 bit pwm: sawtooth/sine
- geigerclicks
- one-shot vs. transition: audible difference?

ok.. looks like it's time to get started. the time hierarchy is essentially the same, just a little more complicated due to the presence of the 3 timer state machine. i only have 2 algorithms:

* decoupled xor : 0 1 2 xor xor
* fake reso     : 0 1 2 xor and

the main deal is in the resynchronisation:

* decoupled
* 0 -> 1/2
* 0 -> 1 -> 2

now i know why i've been procrastinating. i didn't make up my mind yet about the ultimate design.. there's too much freedom and too many silly problems due to the 16bit numbers used. it's strange how easily you take atomicity for granted. working with 2x8 bit numbers, a lot more 'locks' are necessary. anyways, it can probably be limited to locking the store to the period and phase variables. sheepsynth used a 'bus system' where the algorithm determined how the outputs are combined. this can be reused.

Entry: timer update
Date: Thu Oct 5 08:50:47 CEST 2006

periods. it's probably easier and more precise if the timers just get added to, instead of reset, in the course of an isr. need to check the sheets on that. for the synth, a little bit of phase jitter/delay is probably not so bad. however, frequency drift is not a good thing, so i'm looking for a way to update a 16 bit timer register. for 8 bit it's simple: just add/subtract a value. for 16bit, an explicit read-modify-write is necessary. the counters need to be in 16bit read mode to give predictable behaviour. however, there's a hack. always a hack :) when a timer overflows from FFFF -> 0000, an interrupt is generated. in this interrupt we respond by resetting the timer and doing some other stuff. the reason for adding to the timer instead of setting it is that the exact point is not known, or is tedious to obtain by counting instructions every time something changes. however, it is a lot easier to guarantee a response before the low byte overflows for the first time. hold on.
it is possible to directly add to a timer, since the only thing you need to know is whether a carry occurred or not. so, adding to the low byte produces a carry, which is then used to add to the high byte. however, this requires a directly writable (read-modify-write) high byte for the timer. it turns out the latching behaviour of the 16bit timers is programmable for TMR1 and TMR3, but not for TMR0. so i guess the previous hack will be better: construct the timer value on the stack, then perform a 16bit write. this is a constant-time operation which can be compensated for. so, eventually it's something like this:

    : reset0
      TMR0L @     ( read low timer value. this also latches the high byte. )
      posc0l @+   ( advance timer lo, write carry bit )
      posc0h @    ( read timer high increment on stack )
      TMR0H ++!   ( update timer high latch )
      12 +        ( account for duration of this routine )
      0 TMR0H ++! ( update timer high latch again )
      TMR0L ! ;   ( write out timer lo, which also writes the latched high byte. )

butt ugly.. wonder if there's no more elegant way. haha! there is. since it's possible to assume TMR0H is zero, this is possible:

    : reset0
      TMR0L @     ( read low timer value. this also latches the high byte. )
      12 +        ( account for duration of this routine )
      TMR0H rot 0x80

i was thinking about using dither to give a bit more resolution: instead of using 0x80 for rounding, a pseudo random number can be used. for example, if aa x zz == 2, then 2 out of 0x100 times it will carry over, resulting in motion. this is a nice occasion to figure out the difference between PWM, dither, "nyquist PWM", S-D modulation and the Bresenham algorithm.

Entry: weird bug
Date: Tue Oct 10 10:52:43 CEST 2006

something which happened in the previous workshop, and now i have it again: when an 18LF1320 boots, it sends a 0x80 on the serial port. i thought this was a bad connection or a bad chip, but it has to be something else.. if it's received by the host system, it is a valid byte. such a byte physically looks like:

    idle    start  data      stop  idle
    1111111 0      00000001  1     1111111111111111

so it is nothing more than a pull to ground of the serial output for a certain period of time, possibly the time between clearing the TRIS bits and enabling the serial port. everything points in this direction, since the chip runs at a low frequency during boot. edit: indeed. setting the port to all ones before clearing TRIS fixes the problem.

Entry: pic dev station
Date: Tue Oct 10 11:10:25 CEST 2006

so. everybody wants to design a programmer. if i were to design a programmer, what would it look like?

- use the PGD/PGC pins for monitor comm: standard ICD2 header
- usb/serial interface to host
- multiple tethered targets
- charge pump, so smaller pics can be programmed too
- possibly ICD2 compatible (pure software problem)

Entry: serial port based app/mon select
Date: Tue Oct 10 12:54:07 CEST 2006

so.. the zero byte is set as nop. the monitor should boot up in case there is no byte within a couple of msec. why is this so hard? because i want to have some kind of failsafe. it's risky to overwrite the reset vector, so i need an approach that will

- launch the monitor when there's input data on the serial port
- launch the monitor when there's no application code (bulk nops)
- launch the application in other cases

the first dash is important (not done in the application) so that in case the application fails to start the monitor, the monitor can still be run. in short, the real reset vector should not be touched: the monitor is considered stable code, while the app might fail.
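The decision itself, as a hedged C sketch (the helper names are hypothetical):

    #include <stdbool.h>

    extern bool serial_byte_within_ms(int ms);  /* poll RX for a short window */
    extern bool app_code_blank(void);           /* application area still bulk nops? */
    extern void run_monitor(void), run_application(void);

    void boot_select(void) {
        if (serial_byte_within_ms(5))    /* failsafe: host is talking -> monitor */
            run_monitor();
        else if (app_code_blank())       /* nothing flashed -> monitor */
            run_monitor();
        else
            run_application();           /* otherwise start the app */
    }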
ok, it seems to work most of the time, but i can't pinpoint the errors. sometimes i get weird results, if for some reason the target starts sending bogus data due to some faulty init. this needs to be tweaked a bit.

Entry: 2^n ratio software interrupts
Date: Tue Oct 10 18:34:18 CEST 2006

if i run it at 8192 kHz and only use power-of-2 ratios, things get simpler. software interrupts can be generated by xoring one of the high bytes of the counter with a previous value.

Entry: piano
Date: Wed Oct 11 22:29:35 CEST 2006

what about using the left 5 keys and right 6 keys as modifiers, and the middle octave as a 2-tone poly synth?

Entry: poke
Date: Sat Oct 14 21:14:20 CEST 2006

- brood symbol lookup accelerator
- mole in pd
- writing cat in poke is insane -> switch to plt

Entry: optimization
Date: Sun Oct 15 12:37:11 CEST 2006

- write parse/unparse in C: the syntax is simple and load/save really needs to be fast. maybe plt scheme solves this problem? guile's strings are really slow.
- functional data structures: dict/dictroot/stack avoid consing. (need to profile, see if it really matters)

Entry: shallow binding
Date: Tue Oct 17 22:41:13 CEST 2006

i just read "Shallow Binding Makes Functional Arrays Fast (1991)" by Henry G. Baker. the only things i really use are

- the convenience of variable (imperative) programming semantics when there are too many variables to effectively use a purely functional concatenative approach.
- the ability to completely undo all side effects of a routine (exceptions + speculative execution / simple nondeterminism)
- map and accumulate (in the form of for-each)

what i really need is a state machine with a fancy undo. i wonder whether checkpoints aren't a lot simpler. somehow the first item is really ugly, but still terribly convenient. in the end, the program (incremental compiler) really deals with current state and mappings that bring it to the next state. the whole compiler application is really a database, but the (var) dir needs to be cleaned up. checkpoints only occur at a 'catch' boundary, meaning an assignment that's not enclosed cannot be undone. the problem is that every command is already enclosed in the interpreter's catch loop.

ACID: see http://en.wikipedia.org/wiki/ACID in databases, ACID stands for Atomicity, Consistency, Isolation, and Durability.

Entry: big jumps
Date: Tue Oct 17 23:44:06 CEST 2006

1. due to the incremental nature, big forward jumps are really scarce: mutual recursion is best kept as local as possible.
2. backward jumps can be resolved in the first pass.
3. big forward jumps are vectors, which can be handled separately.

Entry: cat optimization
Date: Wed Oct 18 00:07:16 CEST 2006

thinking about it, there can only be 2 places where cat is inefficient, apart from the underlying inefficiencies of guile:

1. name lookup
2. functional data structure maintenance

the first one can easily be cached (bound) for constant code. this already gives a significant speedup. the second one requires a better treatment of mutation and the dictionary. implementing the (var) and other often-changing variables as a vector which is part of the machine struct and is checkpointed on 'catch' might solve a lot of problems.

Entry: mzscheme tough nuts
Date: Sun Oct 22 11:40:22 EDT 2006

exceptions: things are different in mzscheme, and i'm really guile-centric. differences:
* raise takes a single object / throw takes a vararg list
* predicates instead of key objects
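a minimal compatibility sketch of what such a unification could look like, assuming all we need from guile's interface is (throw key args ...) and (catch key thunk handler):

(define (throw key . args)
  (raise (cons key args)))            ; pack guile-style varargs into one object

(define (catch key thunk handler)
  (with-handlers
      (((lambda (exn) (and (pair? exn) (eq? (car exn) key)))   ; predicate
        (lambda (exn) (apply handler exn))))                   ; unpack again
    (thunk)))

; (catch 'oops (lambda () (throw 'oops 1 2)) (lambda (k a b) (list k a b)))
; => (oops 1 2)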
Entry: tv sync generation
Date: Sun Nov 5 00:04:19 EST 2006

something i didn't think of: sync pulses can be generated using PWM. probably only makes sense for fast MCUs though.

Entry: conceptual problems
Date: Tue Dec 12 15:12:31 CET 2006

alright.. i've had some rest, working on PF. some reflections: the problem is state. the whole idea is to have incremental updates of state on the microcontroller. the functional/stateless approach of CAT is just to compute the update functions. currently, there is probably too much time wasted in keeping persistent state. the solution here is to clearly separate the functional and incremental parts. one of the core problems here is 'local context', which should really be solved with lexical variables instead of dynamic scoping. so, my questions are:

* is it possible to create a proper lexical hierarchy in a forth-based concatenative model, or do we need a different approach?
* what about viewing the problem as a database problem? have a coarser granularity in terms of transactions that either succeed or fail.

so, the buzzwords:
* transactions that can be rolled back (darcs?)
* core = functional, with lexical scoping: have a true scheme-like environment.

Entry: TV/VGA generation
Date: Tue Dec 12 21:10:02 CET 2006

thinking about how to simplify this video stuff to build a character terminal from a pic chip in a way that leaves some room for other work during drawing. what i didn't try yet is to use a simpler chip, and use its general purpose serial port for blitting out stuff. what i use now is the SPI port, which has a don't-care in the output. the generic serial port should really be able to draw just one line. however, using that one gives up the hardware serial port, so it would be cool to be able to receive serial or other bytes manually.

blitting out a bitmap image from ram is faster than an indirect text character image: 2 cycles versus 6. however, there's not enough ram to actually display an entire screen worth of characters. on the other hand, using two chips, this problem is not too hard to solve: have one chip do the display, and have it poll data from the other chip synchronously during sending.

serial/USB/... <-> [ I/O ] --> [ DISPLAY ] -> TV/VGA

maybe it's overkill.. maybe a simple high-speed shift register would do it better. the real problem here is the overhead. at 6/8 cycles per pixel, the overhead is too large to do anything useful that requires strict timing. as long as it's hardware, it's ok. maybe the comm should be SPI? use the SPI to pull in data from the I/O chip, and feed the VGA with the general purpose EUSART shift register. decreasing the clock by 2 will already give 6/16 instead of 6/8. maybe that's a better approach: just reduce resolution.

so, if i'm using two chips, why not use something that can dump out serial data as the main blitter, like a serial memory, and have the character set in memory? problem is: they are dumb. what i need is on-the-fly generation of the bitstreams, where one byte in gives say 8 to 16 bits out (character width), or, when the scanning is made horizontal, this can be increased. this is an interesting problem by itself. output wavelets..

ok. shift registers. that's stupid. they need to be loaded too, and require lots of leads.. the only reason to have them is to clock them at a higher frequency. however, generating the write strobe in this case is problematic.. so forget about it.
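going back to the overhead numbers above, a trivial scheme sanity check of the fraction of cpu time the blit loop eats, given a blit cost and a cycle budget per pixel:

(define (blit-load cost budget)
  (exact->inexact (/ cost budget)))

; (blit-load 6 8)  => .75    ; full pixel clock: 75% of the cpu is gone
; (blit-load 6 16) => .375   ; pixel clock / 2: breathing room, half the resolution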
Entry: mesh crickets
Date: Tue Dec 12 21:46:53 CET 2006

was thinking about this: what about

- pic chip
- single speaker
- oscillator
- battery

and create a mesh network of those? using simple but robust BPSK this should really work. i'm not sure about the linear distortion though, since there's not enough horsepower to do adaptive filtering.

* reflections could be a problem, have to see. keeping packets very short might solve this problem.
* poles/zeros: probably no problem for higher frequencies. using narrow bandwidth channels reduces the risk of linear distortion. this would at least sound nice. maybe different channels can already solve most of the linear distortion problems.

gert says:
* it already exists: PSK31, but try something simpler first
* crosstalk is going to be a problem
* hidden node effect
* ignore distortion, you're doing low bit rate
* use simple error correction (reed-solomon)
* use some frequency hopping to avoid problems (FHSS)

let's see if i understand. BPSK is suppressed carrier (dual sideband) AM. the widest signal is 01010101.. represented as a minimal bandwidth AM signal this is sin(Wc t) sin(Wb/2 t), where Wb is the bit frequency. due to the two sidebands, the bandwidth is Wb.

PSK31
- uses cosine shaping for transmit
- receiver shaping: sort of a 'matched shaper' wrt the transmit

the idea is: the transmit shaper is kept simple and standard (a shaper instead of a filter) so the receiver can be improved upon later.

Entry: FSK/ASK
Date: Sat Dec 16 23:30:11 CET 2006

PSK might be nice, but requires coherent demodulation (carrier phase recovery). FSK and ASK require only power recovery, and are a lot easier to implement. ASK: the output is the power of the output of a bandpass filter, or the output power of a shifted version of the signal. FSK: 2 such filters to detect silence / 0 / 1 / (noise = both). since i don't have a lot of control over amplitude, FSK is probably best. the limiting part is going to be the mixer + decimating filter, since that's the only component that runs at audio rate. the filter could be a very crude flat average. this gives an inner loop like this (quadrature receiver with product integrator):

x = sample()
x_i = x * I
x_q = x * Q
update(I,Q)    (-> use tables)
y_i += x_i
y_q += x_q

the only thing that's problematic here is the multiplication: 8x8 signed. orthogonality: if all signalling functions are harmonics of 1/T, then the integral over a period of T will cancel out all frequencies except the one we're interested in. hmm.. thinking error: the 2 carriers have to be mixed down to DC separately.

the above is too complicated: talking about it a bit with axel and gert gives the following:
- non-synchronous FSK
- goertzel algo (2nd order filter with 1 multiplication per sample)
- dtmf frequency relations to reduce the effect of nonlinearities
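the goertzel recurrence, sketched in scheme on the host just to check the math; the pic version would be fixed point with the 8x8 signed multiply, and 'pi' is defined locally since it may not be bound:

(define pi (* 4 (atan 1)))

; power of one detection bin at frequency f, sample rate fs.
; inner loop: s = x + c*s1 - s2, i.e. one multiplication per sample.
(define (goertzel-power samples f fs)
  (let ((c (* 2 (cos (* 2 pi (/ f fs))))))
    (let loop ((xs samples) (s1 0.0) (s2 0.0))
      (if (null? xs)
          (+ (* s1 s1) (* s2 s2) (- (* c s1 s2)))
          (loop (cdr xs)
                (+ (car xs) (* c s1) (- s2))   ; new s
                s1)))))                        ; old s1 becomes s2

; fsk bit decision: compare (goertzel-power block f0 fs)
; against (goertzel-power block f1 fs).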
Entry: plt porting
Date: Thu Dec 21 20:13:15 CET 2006

let's see.. if i recall, the problem atm is the exception handlers. what's needed is a way to unify guile's and plt's system error handling, and possibly the language's catch/throw. the things i run into first:
- error: ok, fixed: object->string
- output flushing -> set by default
- need a 'system' function.. OK

Entry: i/o files
Date: Fri Dec 22 16:59:47 CET 2006

looks like plt does not use I/O files. each direction has a separate port. however, it does define 'open-input-output-file', which gives back 2 ports associated to the same device. this function returns multiple values. as a recap, this works like this:

(values 1 2) ; returns multiple values

(call-with-values
  (lambda () (values 1 2))   ; thunk that returns multiple values
  (lambda (x y) (list x y))) ; function that processes the values

so, to deal with it one could pass 'list' or 'cons' as the consumer:

(call-with-values producer list) ; return multiple values as a list
(call-with-values producer cons) ; return 2 values as a pair

next problem: string-match. apparently, i just ignored this:

(define (string-match a b) 'failed)

looks like i need the perl regexps here. OK

next problem: select(). looks like mzscheme is trying very hard to be portable.. and apparently select() doesn't go there. so what i need is something that's more posix-like, the way guile does it. there should be at least some module i can use. it looks like mzscheme uses select only in its thread implementation. so there is a higher level api to do what i would do with select. in brood, select() is used for polling for serial port input. maybe it's best to focus on abstracting this polling behaviour, i.e. 'busy looping'. i don't really need tasks yet.
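a sketch of that abstraction: wrap the 'try, then back off' pattern once, so the rest of the code never mentions select or threads. poll is any thunk returning #f for 'nothing yet':

(define (busy-loop poll)
  (let again ()
    (or (poll)
        (begin (sleep 0.01)   ; plt's sleep takes seconds and yields
               (again)))))

; e.g. reading one byte from the serial port's input side:
; (busy-loop (lambda () (and (char-ready? port) (read-char port))))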
Entry: kerstboom (christmas tree)
Date: Fri Dec 22 18:30:41 CET 2006

so i got my 452 board, a led connected to two naked copper wires, and two balls of aluminum foil. the idea is to make a 4 wire maxcon network, which is topologically a tetrahedron: 4 alu balls and 6 edges, with 2 leds per edge. this is connected to the micro through a couple of resistors. next: port selection. PORTD looks most interesting. soldering an 8 pin DIP socket to PORTD. got 100 ohm resistors connected to the leds now. this gives about 10 milliamps through a led, assuming the led drops between 2.5 and 3 volts, and the power supply is 5V.

Entry: sheepsint
Date: Wed Jan 3 02:59:48 CET 2007

chatting with aymeric about the future of sheepsint. i need to figure out where to put the stress. the pf interpreter cleanup is probably best forgotten. turbulent time thing.. the board is important. USB not so important. maybe for later gadgets, but probably boards and how to design them is more important. from the software side, i need to really fix the mzscheme port, and then do some benchmarking. so 2 sides of the coin:

HARDWARE
- build a board from scratch
- smd soldering (optional, board could be dip + perf)

SOFTWARE
- mzscheme port
- profiling (slow speed is really annoying, it should be possible to get 10x)

!! check aymeric's feature list in the irc logs

Entry: profiling
Date: Fri Jan 5 17:19:50 CET 2007

ok. time to make it a bit faster. i need to know where the thing spends most of its time. i would guess it's in binding and searching, but it could also be data structure updates. something i can't put into real words yet: i want to eliminate global state variables that have nothing to do with the state of the microcontroller. in other words, i'd like the compiler to be purely functional, with state updates only occurring when a compilation is done. in short, all the stuff in (var) should be replaced by some other construct, preferably lexical scoping.

Entry: brood internals
Date: Wed Jan 10 17:22:06 CET 2007

we're almost there. i'm left with one question: how do i eliminate dynamic variables, and how do i implement a forth with syntactic scope? brood is nice and all, but the use of dynamic variables is sloppy. i should check whether it's possible to write a concatenative language as a pure syntactic extension of a (lazy) scheme, so that i can bootstrap better on scheme. how do i get there?

A* what about lambda?
B* understand what the essence is of transactional updates (the microcontroller has state, there's no way around that) in a functional language.
C* understand what exactly is needed to make scheme lazy.
D* to what extent could i switch to haskell?

now, what do i mean by lexical scope.. lambda. i believe retroforth has something with local functions, somewhere.. i have the impression that the danger lies somewhat deeper.. in the end this is simply about explicitly named local variables, and forth/stack languages are combinatorial instead of random-access. but.. what i'm doing is introducing random access anyway. so it's probably best to do it 'right' then, in the form of static scope.. proper lambda, that is.

http://lambda-the-ultimate.org/classic/message11828.html :

"There is nothing preventing a `stack-based language' (this is a meaningless term, BTW, like `interpreted language') from implementing call-by-name, if that's what you mean."

"And I don't think that their approach `is not expressed in the same way' in languages like ml, scheme and haskell', since, in all those languages, one can define combinators. It's just that no one wants to do it because taking advantage of lexical scope makes things much easier."

point taken.. the reason it does work for me is that i think (and see) that there is a 'natural order' in the use of arguments, so that in most cases naming names plays the role of 'annoying overhead' instead of an extra feature that makes programming easier. there is a fairly important remark to make about 'dup': it makes fanout explicit in a fairly elegant way: where does my data go? with named variables that's not always clear.

"Speaking of Forth, it's interesting to note that Curry's combinators B, C, K, and W, which form a basis for the lambda calculus, are essentially the Forth words execute, swap, drop, and dup. B, C, and K are present in the Haskell Prelude (as $, flip, and const)."

more interesting.. need to look into that. anyway.. i'm confused again.

combinatorial:
  pro = composition is easy
  con = 'routing' is problematic
symbolic:
  con = composition can be cumbersome (lots of parentheses)
  pro = 'routing' is trivial

in the end this is really about TASTE. which is better? well. beer or wine? this is a nice starting point for the foam paper though. but i don't think there's much room for discussion about the fact that lexical scope is better than dynamic scope.

i'm struggling a bit with BROOD. to simplify enormously, i want a syntax like

(a b c) (bla1 bla2 bla3) lambda

to represent

(lambda (a b c . stack)
  (apply bla3 (apply bla2 (apply bla1 stack))))

this can probably be plugged in directly.. i just have to add lambda. so much for *A.

interlude. since this whole experiment mainly serves to give me a feel for functional programming, it seems appropriate to go on a bit about types, or at least to come back to them later.

now. transactions. in fact, compiling a chunk of code is nothing more than computing a 'diff' for the microcontroller state. the starting point is incremental compilation, so i'd better make that real. the problem right now is that this isn't explicitly built in anywhere. i have to come back to this. scheme and lazy eval are not really on the agenda for now. and haskell is for later.
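a quick sanity check of that expansion in plain scheme, under the convention (used in the expansion above) that a word takes the stack as its argument list and returns the new stack; the words here are toys, not the real CAT primitives:

(define (add x y . stack) (cons (+ x y) stack))    ; ( y x -- y+x )
(define (dup x . stack)   (cons x (cons x stack))) ; ( x -- x x )

; '(a b) (a b add) lambda' could then expand to:
(define f
  (lambda (a b . stack)
    (apply add (cons b (cons a stack)))))  ; push a, push b, then run add

; (f 2 3) => (5)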
Entry: lambda
Date: Sat Jan 13 15:25:37 CET 2007

the car trip taught me the following: what i'm doing is functional (the way ! and @ work), but it is not referentially transparent. those two are not the same. i can't say my code has 'no side effects'. i can, however, make sure the side effects can be undone. all in all this is a strange situation. now i wondered whether it isn't possible to keep it that way, and create a 'real' lambda on top of it:

* a function creates a new env (dict) by catting its dict with the global dict received from the caller
* when this function calls other functions, only the parent dict is passed on. the local dict is only passed to a new lambda struct.

that's a big difference. maybe not well explained..

Entry: practical lambda
Date: Sat Jan 20 11:48:09 GMT 2007

* reading the paper by Lynas (hmm.. not what i needed. too clumsy)
* the expression problem?

maybe i should first look at how to make the badnop compiler more stateless (for referential transparency). this means: eliminate (var). this goes a bit deeper than i thought:

- ihex -> could possibly be replaced by lambda
- read / write -> 'user interface' state
- asm queue -> dynamically bound everywhere.. (like "print")

starting at the beginning: the ihex variables actually have a fairly local scope. conclusion: distinguish these

- 'real' state variables
  -> target state
  -> user interface state (current inspection address, asm buffer)
- 'dirty' state variables
  -> ihex printing state

if state variables never affect the user interface state, they should be part of lambda expressions (local variables).

Entry: patch / transactions
Date: Sat Jan 20 12:37:37 GMT 2007

is it possible to get to a point where the assembly of a line of forth code translates into a single transaction (patch) object? so

  env, code -> env

becomes

  env, code -> env, patch

in the evaluation of this function, some local dynamic scope might be introduced, but it is cleaned up after the compilation. rationale: the whole idea of BROOD is to have an incremental compiler. the proper way to solve this in a FUNCTIONAL language is to compute the increment of the state instead of the new state. what i'm doing is 'functional database programming'. i should really have a look at the 'darcs algebra of patches'.

reading: "Database States in Lazy Functional Programming Languages: Imperative Update and Lazy Retrieval", by Yoshihiko Ichikawa.
- "state transformer" approach for a referentially transparent interface

the rest is a bit too hard to read atm.. no patience :) i probably need to read a bit more on the 'state-transformer' approach in haskell.
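a toy version of that signature, assuming env is an assoc list that carries the assembly pointer 'here, code has already been reduced to a list of target cells, and a patch is just a replayable list of (address . cell) writes:

(define (compile-patch env cells)
  (let loop ((cs cells)
             (addr (cdr (assq 'here env)))
             (patch '()))
    (if (null? cs)
        (values (cons (cons 'here addr) env)   ; functional update: shadow old 'here
                (reverse patch))
        (loop (cdr cs) (+ addr 1)
              (cons (cons addr (car cs)) patch)))))

; commit = send the patch to the target and keep the new env;
; abort  = drop both, and nothing ever happened.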
Entry: apply
Date: Sat Jan 20 13:54:13 GMT 2007

just introduced the word 'apply': it works like 'run' but ignores all changes to the environment: this is to obtain referential transparency manually. (apply for referential transparency). this effectively introduces dynamic variables with limited scope. the lambda solution would be a lot cleaner, but this gets us on the way to introducing more transparency. looking at ihex now, there's still the ad-hoc symbol prefix namespace i'm using, which would be a lot cleaner if there was a real local environment. maybe i should give changing ihex to a local function approach a try.

ok, this seems to work.. there are a couple of performance issues though: when i'm using '!', which happens implicitly in ':', the whole environment gets searched, which will replace occurrences instead of just hiding them. so i need a different word, i.e. 'let', which will just shield without replacing, to avoid costly substitutions and searches. this might be another gentle ramp-up to proper lambda. (it's still dynamic though, since this env propagates..) it seems to me that 'lambda' requires some core changes to the current CAT vm, since local names should not be propagated to functions called from the body. i think i'm missing something.

Entry: 18f / 30f pinouts
Date: Sat Jan 20 16:35:54 GMT 2007

a bit disappointing, they are completely different for the 18 pin versions.. this means a sheepsint board has to be redesigned for the dspic. looks like i should call them pic18 and pic30 though.

Entry: lexical scope
Date: Sun Jan 21 12:15:19 GMT 2007

after trying to implement a dynamic scoping 'with' and having it clash with my namespace searching mess, i think it's time to add proper static scoping. there is only one huge problem: i don't have the lisp-like syntactical thing in cat: code is data until it's interpreted.. i do wonder how the substitution problem is solved in scheme though. basically this is the problem:

((a 1) (b 2)) (a b +) lambda

this will return a function on the stack which accepts 2 arguments. the code of this function will look something like:

( + )

Entry: scheme piggy back
Date: Sun Jan 21 13:17:01 GMT 2007

it might be interesting to write a concatenative language as a macro extension of scheme, using a module namespace, and re-using the scheme scoping rules. in some ways this is back to basics.

* all functions return a stack

(a b c) -> (apply c (apply b (apply a)))

(define _lambda (body parameters . stack))

thinking about it a bit, this absence of true syntax makes it really hard to actually do something. as an exercise, i'm trying to write a CAT as a pure syntactic extension of scheme:

(define-for-syntax (code? thing)
  (symbol? thing))

(define-syntax compile
  (lambda (stx)
    (syntax-case stx ()
      ((_ thing stack)
       (if (code? (syntax-object->datum (syntax thing)))
           (syntax (thing stack))
           (syntax (cons 'thing stack)))))))

(define-syntax cat
  (syntax-rules ()
    ((_ stack) stack)
    ((_ stack first rest ...) (cat (compile first stack) rest ...))))

still, how do you introduce lambda? i can do this as a prefix form like:

lambda p (p p p)

this works reasonably well.. it does seem that a 'parsing word' is necessary, since this is a syntactic operation which happens at compile time. this should be easily extendable to

lambda (p q) (p q +)

it is, only i've got the order wrong.

Entry: remarks about lexical scope
Date: Mon Jan 22 12:52:21 GMT 2007

introducing lambda necessarily introduces block structure. scoping is about introducing block-dependent names. as long as the interpreter is not aware of 'activation records' or something similar, lexical scoping does not make sense. in CAT there is no such thing. it's easy enough to add, but it requires modification of the interpreter itself. it's probably easiest to change the namespace structure in such a way that it will support proper closures = (code . env), and to make sure each word is bound to its own environment. this is quite a deep change, but the only one that will solve the problem. currently in CAT, the environment is a run-time thing (dynamic). what i need for proper lambda is proper closures, where the environment is a lexical thing: whenever a lambda expression is evaluated, an environment is frozen and tied to a function. in order to have the best of both worlds, i need to keep the current (persistent) dynamic environment (which is a property of the interpreter), and a static environment (which is a property of a function).
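the smallest version of that change i can think of, as a sketch with invented helper names: a word becomes a (code . env) pair, and name resolution tries the frozen static environment before the interpreter's dynamic one:

(define (make-word code static-env) (cons code static-env))
(define (word-code w) (car w))
(define (word-env  w) (cdr w))

(define (resolve name static dynamic)
  (cond ((assq name static)  => cdr)   ; lexical: frozen at lambda time
        ((assq name dynamic) => cdr)   ; fallback: interpreter environment
        (else (error "undefined word:" name))))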
Entry: static AND dynamic
Date: Fri Jan 26 19:27:49 GMT 2007

looks like this:
- static binding: environment tied to a function (lambda)
- dynamic binding: environment tied to the interpreter

the latter is as if it were tied to a continuation. also, this is closely related to the monad thing in haskell, which i understand only very remotely as 'passing along hidden variables without violating the pure functional paradigm.'

now a note on efficiency.
- once a bunch of symbols is associated with a (static) lexical environment, the code can be compiled. since names will never refer to other objects, the indirect references (symbol names) can be removed.
- all symbols that are not in the lexical environment are necessarily in the dynamic environment. they cannot be compiled and have to be resolved at run time.

if i implement proper lexical scope and use it for the most important words, the speedup will be significant. i think i found my main optimization point: both for speed and sanity :) but somehow i don't really understand how scheme does it though.. maybe this is just because define is actually 'set'?

Welcome to MzScheme version 352, Copyright (c) 2004-2006 PLT Scheme Inc.
> (define (AAA x) (+ x 1))
> (AAA 1)
2
> (define (BBB x) (+ (AAA x) 10))
> (BBB 0)
11
> (define (AAA x) (+ x 100))
> (BBB 0)
110
>

so, i guess i can't escape monads now. because it really looks like that's what i'm trying to do, but then more general.

Entry: R.W. Floyd's Turing Award Lecture
Date: Sat Jan 27 12:17:41 GMT 2007

http://www.ias.ac.in/resonance/May2005/pdf/May2005Classics.pdf

I now see clearly and concretely the force of Minsky's 1970 Turing lecture [19], in which he argued that Lisp's uniformity of structure and power of self reference gave the programmer capabilities whose content was well worth the sacrifice of visual form. I would like to arrive at some appropriate synthesis of these approaches.

To sum up, my message to the serious programmer is: spend a part of your working day examining and refining your own methods. Even though programmers are always struggling to meet some future or past dead-line, methodological abstraction is a wise long term investment.

To the teacher of programming, even more, I say: identify the paradigms you use, as fully as you can, then teach them explicitly. They will serve your students when Fortran has replaced Latin and Sanskrit as the archetypal dead language.

To the designer of programming languages, I say: unless you can support the paradigms I use when I program, or at least support my extending your language into one that does support my programming methods, I don't need your shiny new languages; like an old car or house, the old language has limitations I have learned to live with. To persuade me of the merit of your language, you must show me how to construct programs in it. I don't want to discourage the design of new languages; I want to encourage the language designer to become a serious student of the details of the design process.
Entry: haskell and monads
Date: Sat Jan 27 12:18:28 GMT 2007

http://sigfpe.blogspot.com/2006/08/you-could-have-invented-monads-and.html

- monads have something to do with side effects. many types of container object can be viewed as monads; some tutorials find it hard to reconcile the two.
- 3 examples of a -> m b: Debuggable, Multivalued, Randomized
- We're given a function a -> m b, but we need to somehow apply this function to an object of type m a instead of one of type a.
- In each case we do so by defining a function called bind, of type (a -> m b) -> (m a -> m b), and introducing a kind of identity function unit :: a -> m a. The triple of objects (m, unit, bind) is the monad.
- You don't even have to explicitly use the bind function; you can ask Haskell to insert it into your code automatically. In Haskell the names are '>>=' for bind and 'return' for unit, and in order to overload the names >>= and return, we need to make use of type classes. The 3 patterns above are called the Writer, List and State monads in Haskell.

it looks like what i'm trying to do is something like a writer monad, but since the assembler can optimize, it becomes the more general state monad. it looks like 'do' in Haskell is actually that strange thing in concatenative languages where functional and imperative seem to collide: sequential statements as composition of functions. however, in Haskell the state that's being updated is always properly isolated, while in an imperative machine, the state is the entire machine. the pattern i'm after is something like

forthcode -> forthcode, assembly

a sort of 'shit a turd' kind of approach.

http://sigfpe.blogspot.com/2006/01/eleven-reasons-to-use-haskell-as.html

Haskell isn't very good at dealing with arrays. in most programming languages it takes time O(1) to access the ith element of an n element array. in Haskell you can get this kind of performance if you don't mind using the various painful-to-use array operators available, or you can sacrifice performance, keep the elegance, and get O(log n) performance. on the other hand, you'd be surprised just how many useful matrix algorithms don't require random access to the ith element.
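to see whether the writer shape fits, a rough scheme transliteration: a computation is a (value . output) pair, and bind chains computations while concatenating the assembly output behind their back. the names and the instruction syntax are made up:

(define (unit v) (cons v '()))       ; return: no output yet

(define (bind m f)                   ; m : (value . output)
  (let ((m2 (f (car m))))            ; f : value -> (value . output)
    (cons (car m2) (append (cdr m) (cdr m2)))))

(define (emit ins)                   ; a 'word' that only emits assembly
  (lambda (v) (cons v (list ins))))

; (bind (bind (unit 'start) (emit '(movlw 1))) (emit '(movwf x)))
; => (start (movlw 1) (movwf x))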
Entry: static scope
Date: Sun Jan 28 12:24:02 GMT 2007

how to add static scope in CAT?

* add a static environment to each function = what lambda does
* adapt the interpreter to search the static environment

or

* make lambda compile all static refs immediately

the latter is actually pretty elegant, since it doesn't require any changes to the interpreter. it probably does get tricky when deep binding occurs, since there is really no lexical structure (i'm thinking 'ifte' here):

(a b c) ((la la a) (jam jam b c) ifte) lambda

hmm..... something's not right. what about jit? whenever code is run, it is compiled using the current stack of activation records.

first: create explicit closures as (env . code)
second: find out how to add the toplevel environment to this

Entry: restate objectives
Date: Sun Jan 28 14:00:42 GMT 2007

* basis = forth VM on a microcontroller, for simplicity
* write the cross compiler in a language similar to forth
* use an FP approach for the compiler language

Entry: confused
Date: Sun Jan 28 18:09:05 GMT 2007

i'm learning new things, which comes with a lot of confusion and the desire to start over. let's see if i can recall the road again?

-> wrote PF to have explicit symbolic code + processing
-> found out that extreme late binding might be an interesting addition
-> discovered the benefits of functional programming (basically 'no dangling references')
-> came up with the idea of writing a lowlevel language with a very powerful highlevel meta language: cat/purrr, where the combinatorial language is a clone of Joy.
-> found out that having no variables at all is quite a drag, so worked around it with a 'functional objects' model where mutation is allowed (so it feels imperative) but underneath the whole previous state is preserved. i'm not sure what this is called, but it has a lot of advantages.
-> found out that the simulated imperative programming actually has some of the disadvantages of real imperative programming: it is not referentially transparent. basically, i'm using dynamic binding where i shouldn't, and shoot myself in the foot plenty.
-> found out that i really need static binding and a proper lambda. learned some more about how to implement lexical scoping.
-> found out that a fully dynamic implementation is way too slow.
-> thought that it might be interesting to start over. started to experiment with a concatenative language on top of scheme, and some ideas for writing an interpreter which uses static scope. found out that it's not really possible to change the current CAT to static scope without deep changes.
-> came back to an old idea of specifying the optimizer as a declarative language (a formula rewriter), i.e. (dup drop) -> ()
-> learned about haskell monads and started thinking about how i could use this to implement a compiler.
-> started thinking about a pure functional approach where state is really just what's on the microcontroller.
-> started thinking about eliminating cat entirely, and keeping only its spirit: the interface to the compiler should be from the perspective of the target forth, i.e. compilation is a task spawned on the host from the target console.

Entry: incremental improvement day
Date: Wed Jan 31 11:54:26 GMT 2007

home alone today, so time to get some work done. i've been ranting about rewriting CAT. this ranting, full of pathos about rewriting and starting over, seems to be an essential part of my development process. i've learned not to give in to it, but instead to fork off test projects to see if the core ideas are actually valid. it's not good to act on your impulses immediately. i've been doing that too much. there is the practical element of bugfixes. so, to crystallize, what needs to be changed?

* static scope in CAT
* a new optimizer using rewrite rules

before i can do any of these, i need to clean up the current interpreter code. so, what if i change the VM implementation to one which uses 4 registers instead of a list, after the classic SECD design? State Environment Code Dump. to make things a bit more readable, it might be wise to change all macros to define-syntax form. DONE.

ok, let's continue not slashing it. if i change dict in a way that is compatible with the current prototype, i should be able to keep most of the code intact. the main change being the following:

- the dynamic dict is always passed along
- the static dict is
  * re-instantiated on 'call'
  * augmented in 'lambda'

actually, i can just hack it in! if i use a path called 'static' which has precedence in all searching, and a binding structure like run-static which just stores defs in the (static) variable and restores it afterward, i have all i need! is this true? something i don't get... fuzzy. getting tired.

i'm thinking.. the functions themselves don't need access to the static environment, since static effects don't change when code is running, so the VM prototype can remain just like it is.. the change is in the interpreter completely. i'm going to remove the tracer to make things a bit clearer. indeed. i've isolated the appearance of a lexical binding to the point where a symbol is resolved to a (lex . code) closure pair. now i need to update the dictionary format to store code as either compiled scheme functions (primitives) or symbolic closures. ok.. this seems to work. the next part is to implement lambda in a way that makes sense... since environments are one thing, but the semantics for cat and scheme are different..
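a sketch of that 'hack it in' idea: run-static as a dynamic-wind that installs defs in the (static) slot and restores it afterward; get-static and set-static! are stand-ins for whatever the dict accessors turn out to be:

(define (run-static defs thunk)
  (let ((saved '()))
    (dynamic-wind
      (lambda ()
        (set! saved (get-static))             ; hypothetical accessor
        (set-static! (append defs saved)))    ; new defs shadow, nothing is replaced
      thunk
      (lambda ()
        (set-static! saved)))))               ; undone even on a non-local exit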