Computer Science.  This is a collection of blog articles about misc
computer science topics.  Practically though, this is 50% about
re-wiring the brain to learn functional programming (Haskell).

Entry: Parsing and Automata
Date: Mon Feb 16 11:21:54 CET 2009

Before storming into a dark room writing yet another ad-hoc recursive
descent parser, it might be a good idea to turn on the lights and
look at languages first.

AUTOMATA
--------

Automata are classified by the class of formal languages they can
recognize.

Types of Finite Automata[1]:

  DFA: deterministic finite: each state has exactly one transition
       for every symbol in the alphabet.

  NFA: nondeterministic finite: ... zero or more transitions ...
       (symbol with no transition: input is _rejected_)

It can be shown that DFAs and NFAs accept the same languages.  These
languages are called _regular_.

Extensions to Finite Automata:

  PDA: push-down: FA + stack.  NPDAs accept the _context free_
       languages.

  LBA: linear bounded: a Turing machine with a tape length
       proportional to the input string.  LBAs accept the _context
       sensitive_ languages.

  TM:  Turing machine: equivalent to algorithms.  Turing machines
       decide/accept the recursive languages and recognize the
       recursively enumerable languages.

FORMAL LANGUAGES
----------------

Let's stick to formal languages defined by a formal grammar[3].  A
grammar is a set of rules for transforming strings.  These rules are
called _productions_.  All the strings in the language are generated
by applying the grammar rules to a collection of _start symbols_.  If
there are multiple ways of generating the same string, the grammar is
said to be _ambiguous_.

Note that from a practical point of view (parsing) this is upside
down: parsing consists of two steps:

  - validate a string (is it part of the language?)

  - as a side effect: find the exact production rule(s) to build a
    parse tree representation of the input (convert concrete syntax
    to abstract syntax)

Ambiguity is problematic for the latter part: you really want a
single parse tree to which to attribute semantics later.

Note: The production rules approach is very different from
recognition-based PEG - parsing expression grammars, where the
language is the set of inputs recognized by the parser expression (a
formal representation of a recursive descent parser).

A context-free language can be recognized in O(n^3) time, but there
are a couple of subclasses for which linear algorithms exist:

  LR     Left-to-right, Rightmost derivation
  LALR   Lookahead LR
  LL     Left-to-right, Leftmost derivation
  LL(k)  LL with k symbols of lookahead, without backtracking

LL(k) can be implemented with recursive descent parsers.  Lisp is
LL(1).

A derivation[4] is a convenient way to express how a particular input
string can be produced: fix a replacement strategy (i.e. always
replace the leftmost or rightmost non-terminal first) and list the
rules applied using that strategy.  This is not unique for an
ambiguous grammar.

[1] http://en.wikipedia.org/wiki/Automata_theory
[2] http://en.wikipedia.org/wiki/Context-free_languages
[3] http://en.wikipedia.org/wiki/Formal_grammar
[4] http://en.wikipedia.org/wiki/Context-free_grammar#Derivations_and_syntax_trees

Entry: YACC
Date: Fri Feb 27 11:07:08 CET 2009

"Why Bison is Becoming Extinct"
http://www.acm.org/crossroads/xrds7-5/bison.html

Generic parser references:

  http://www.meta-environment.org/
  http://accent.compilertools.net/   (works with LEX)

The first one seems quite interesting.
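For contrast with the generated parsers above: the hand-written
recursive descent approach from the parsing entry, for a toy LL(1)
grammar.  A sketch in PLT Scheme; the grammar and token names are
made up for illustration.

  ;; Toy LL(1) grammar:   E -> NUM | '(' E '+' E ')'
  ;; One token of lookahead picks the production; no backtracking needed.
  ;; Tokens are a flat list like (lp 1 + 2 rp); checks on '+'/'rp' omitted.
  (define (parse tokens)                     ; -> (values ast remaining-tokens)
    (cond
      ((number? (car tokens))
       (values (car tokens) (cdr tokens)))
      ((eq? (car tokens) 'lp)
       (let*-values (((a rest1) (parse (cdr tokens)))
                     ((b rest2) (parse (cdr rest1))))  ; (cdr rest1) skips '+'
         (values (list '+ a b) (cdr rest2))))          ; (cdr rest2) skips 'rp'
      (else (error "unexpected token" (car tokens)))))

  (parse '(lp 1 + lp 2 + 3 rp rp))   ;; => (+ 1 (+ 2 3)) and ()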
Entry: coroutines and "join"
Date: Tue Mar 10 14:47:23 CET 2009

Start with a simple 2-coroutine network: a process coupled to a
controller:

   |   | -->  |   |
   | C |      | P |
   |   | <--  |   |

The controller's outputs are the process' inputs, and vice versa.
This can be executed using synchronized channels (a read is woken up
by the write on the other end).

1. This can be implemented using globally named variables and
   globally accessible synchronization events (signal/wait).  A
   disadvantage here is that naming such a variable "input" or
   "output" is ambiguous.

2. The ambiguity can be removed by introducing "positive" and
   "negative" coroutines.  Negative coroutines read from "output" and
   write to "input".  It is natural to call these negative coroutines
   "busses".

3. Coroutines with multiple inputs that are read at the same time
   need a synchronization mechanism.  This is usually called "join":
   a process that continues when all its inputs are available (aka a
   "barrier").

It is possible to avoid "join" by buffering all "output" registers
(bus task read channels) in one direction and adding explicit clocks
that only occur when all data is guaranteed ready.  This basically
clocks the "input" -> "output" state machines.

Another question: given a "signed" network of coroutines (every
coroutine is connected only to coroutines of opposite polarity), is
it sufficient to start the even ones in output and the odd ones in
input (or vice versa) to avoid deadlocks?

Entry: data - codata
Date: Sun Mar 22 22:13:57 CET 2009

http://blog.sigfpe.com/2007/07/data-and-codata.html
http://en.wikipedia.org/wiki/Corecursion

Well-behaved recursion for data and codata:

  - structural recursion: recurse on strict subparts only
  - guarded recursion: recurse only inside constructors

Entry: CTM
Date: Fri Apr 3 16:51:13 CEST 2009

Partial values <-> Complete values.

The single-assignment store is remarkable (2.2 p. 42).  Especially
the use of assignment to both construct data structures and take them
apart.  See p80 2.6: the binding operation performs [unification],
which is a symmetric operation.

Maybe I should try to implement it?  CTM 2.8.2.2 p. 101 has the
algorithm.  In the context of an evaluator, the nontrivial part is
the implementation of the data structure (i.e. functional with
sharing vs. imperative/hash).

[unification] http://en.wikipedia.org/wiki/Unification

Entry: Evaluation Strategies or Lambda Calculi?
Date: Wed Apr 8 15:17:26 CEST 2009

Been reading a bit on Dave's blog, the posts surrounding this[1].
The main idea is that it makes little sense to talk about a single
lambda calculus (LC) with different reduction strategies like
applicative order or normal order.  It's better to make a
special-purpose calculus to model call-by-value and call-by-name
languages (the CBV-LC and CBN-LC).  This generalizes to other
language extensions.

This one[2] was informative:

  In call-by-name lambda calculus, the definition of a reducible
  expression is an application whose operator is a value.  The
  operand can be any arbitrary expression.  As a result, non-values
  end up being passed as arguments to functions.  In call-by-value,
  we likewise only perform reduction when the operator is a value,
  but we also require that the operand be a value as well.
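In Scheme terms (a small illustration of my own, not from [2]):
Scheme itself is call-by-value, but call-by-name can be simulated by
passing the operand as a thunk.

  ;; Call-by-value: the operand is reduced to a value before the
  ;; application, so the error shows even though the function ignores
  ;; its argument.
  ((lambda (x) 'done) (error "boom"))              ;; => error: boom

  ;; Simulated call-by-name: pass the (non-value) operand as a thunk
  ;; and only force it where it is used -- here it never is.
  ((lambda (x) 'done) (lambda () (error "boom")))  ;; => done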
[1] http://calculist.blogspot.com/2009/02/history-lesson.html
[2] http://calculist.blogspot.com/2006/03/observable-differences-between.html

Entry: invertible data structure pack/unpack
Date: Mon May 4 12:51:53 CEST 2009

Instead of using a zipper, use polymorphic accessors that will
automatically perform the correct pack/unpack when modifying a data
structure.  I.e. an operation becomes:

  (unpack dosomething pack)

but instead it will be left at

  (unpack dosomething lazy-unpack)

such that the next 'pack will cancel the lazy-unpack.  Now explain
this a bit better..

Entry: Futamura Projections
Date: Mon May 4 21:03:55 CEST 2009

sigfpe is talking[1] about the Futamura projections.  I find it a bit
hard to follow, so let's try to explain it here again.  (or not?)

I'm actually more interested in determining why partial evaluation is
not a trivial problem.  If it is just about folding constants, it
really shouldn't be too hard.  The problem, as I understand it, is
recursion: you can't unfold general recursion as it will lead to
infinite code structures.  At some point, run-time recursion needs to
be introduced to break the loop.  This is mentioned in Wadler's
paper[2] on deforestation (blazed treeless form).  As far as I
understand, doing this in general can be quite involved.  Using
transformations of higher order functions instead of raw recursion
seems a better approach.

[1] http://blog.sigfpe.com/2009/05/three-projections-of-doctor-futamura.html
[2] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html
[3] http://www.itu.dk/people/sestoft/pebook/

Entry: The 2 x 2 of functional programming.
Date: Mon May 4 21:24:54 CEST 2009

Pick any two and see what they have in common.

     (bind)       (reference)
   abstraction  |  application    (code)
   -------------|-------------
   destruction  |  construction   (data)

There is also a diagonal dimension: creation and consumption of the
abstracted object itself.

Entry: Stacks
Date: Fri Jun 5 12:04:02 CEST 2009

I'm trying to figure out some abstract structure of 2-stack Forth
systems, combined with parsing (input and output stream).

The structure of Forth is quite remarkable.  It is implemented very
compactly by using different stacks and streams.  However, its
(sparse) use of direct mutation to break cycles during compilation
makes it difficult to analyse and cast into a stream / stack / tree /
directed-graph / graph structure.

The idea I'd like to develop serves to answer the question: "Why are
2 stacks different from 1 stack?"  From programming in Forth I can
answer this in vague terms as: "It's possible to save work to the 2nd
stack when you need to perform a subtask using the 1st one."  This
can probably be generalized to: "It's possible to save work to the
Nth stack when you need to perform a subtask using the other N-1
stacks."  This doesn't need to be ordered, so you could see all
stacks as equivalent and pick one to save work while the others are
used in a subproblem.

The question then is: can this be formalized a bit more?  Can Forth,
its (self-hosted) compilation algorithm and 2-stack computation and
composition model be expressed as some structure with morphisms, and
are there relations transforming N-stack machines to N-1 stack
machines?  Pardon my rambling..

A stack might be seen as a prototypical complexity reducer: the most
basic data structure to encode a composition mechanism in a way that
is simple to express in hardware.  A stack is the "mother of
locality".
And locality is an essential concept in (physical) computation: if
the data ain't in your hand, you can't do anything with it.  Stacks
connect "now" with "later" in a flexible way.

The mutation that's present in Forth actually only happens as part of
a mechanism to compute forward and backward loops.  It is what turns
a flat code stream into a directed (control flow) graph.  So, in
order to simplify Forth, this mechanism needs to be replaced by
something that can be _reduced_ to it, i.e. higher order functions
and their partial evaluation (jumps are PE of applications).

Entry: Cached Continuations and Synchronized VMs
Date: Mon Jun 15 13:19:05 CEST 2009

I've been thinking a bit about continuations and how they can be used
in situations where you have two parties that communicate without
shared state, with the following constraints:

  * There is a possibility of one side giving up without notice.
  * The link is expensive.

This is the typical interaction scenario of a web server talking to a
web client, but there is no need to make it asymmetrical.

Communicating parties are adequately modeled using continuation
passing style.  In general, if the two communicating parties just
exchange the whole continuation (placing the ball in the other
court), the first constraint isn't a problem: if one of them dies,
the conversation simply stops without any side-effect on the other
party.

However, because the link is expensive and because non-trivial
continuations tend to be quite large, one tends to keep the
continuation stored on one of the parties, call this the "server",
and exchange references over the channel.  The problem here is that
if the other party (the "client") dies, this creates garbage on the
server.

If you separate this problem in two it might be easier to manage:

  * Logically, continuations are always exchanged fully between the
    two parties.  There is never any local state.

  * A caching channel with knowledge of the protocol can use an
    aggressive expiry strategy to manage the communication.

An effective caching channel requires analysis of the continuations
by the caching mechanism: i.e. two subsequent continuations usually
share a lot of common data.  Exploiting this redundancy is the task
of the cache.  Also, the expiration strategy probably needs to be
based on common usage.  This needs to be tuned in the field.

So I wonder, can't this be solved by "synchronizing" two virtual
machines, say two CEK machines, one on the server and one on the
client?  This would be quite similar to the way two humans talk: we
don't really have a mechanism to transfer our continuations, but we
can try to model the other's state to get by with very little
information traveling across.

TODO:

  * Look at some work on continuations in web programming
    (i.e. Shriram Krishnamurthi[1] or Jay McCarthy[2]) to see if this
    idea is already there.

  * Look at prog@vub work on this[3][4].

[1] http://www.cs.brown.edu/~sk
[2] http://faculty.cs.byu.edu/~jay/home/
[3] http://prog.vub.ac.be/amop/research/dgc
[4] http://prog.vub.ac.be/amop/research/ambientrefs

Entry: Filesystems as Graphs
Date: Mon Jun 15 13:48:49 CEST 2009

I ran into an interesting pattern trying to solve .tex -> .dvi ->
.png conversion.  It is a way to manage temporary files used in
orchestrating the invocation of external programs.

Classical file systems, by the way they are _used_ in unix-like
utilities, are quite low-level data structures.  They cannot support
garbage collection because _references_ to files are not explicit.
A filesystem is a finite function (hash table).  Since it cannot be
guaranteed that this function won't be evaluated at some arbitrary
point, it has to be kept around in its entirety.  This kind of late
binding makes garbage collection impossible.

By replacing this data structure with a graph (a Scheme code/data
structure), files can be managed using the graph memory manager.
Wrap temporary files in graph nodes, and ensure a 1-1 correspondence
between these (meta)objects and the file's content, either in memory
or on disk.

Practically:

  * This is essentially independent of the data storage / caching
    strategy.  It is possible to perform operations on objects by
    temporarily serializing them to disk, running external programs
    that produce more files, and bringing those back into memory.
    The most elegant solution would be a filesystem interface towards
    external programs, but simply save+execute+load is good enough as
    a first attempt to implement the essential logic.

  * The effect of external programs can be localized.  Filesystem
    operations (unix utilities) still work in this view.  What is
    better though is that effects can be managed locally: create a
    temp directory with files, perform some external processing on
    it, and map the relevant results back into the graph store.

Entry: Goodbye Smalltalk?
Date: Mon Jun 15 14:07:43 CEST 2009

The previous post[1] advocates the explicitness of all references.
Instead of using this just for temporary file management, can't we
use it for _all_ file management?  Can we view all files as
temporary?

This is a shift in paradigm about how to think about data: instead of
looking at data as a dumb collection of bits, implicitly connected to
a program that uses it (an interpreter), you never disconnect it from
its use (semantics).  More generally, it sort of advocates the
abolishment of the principle of late binding.  Goodbye Smalltalk?

Actually, it seems to put late binding into a clear perspective: it
acknowledges that data use cannot be anticipated.  However, it seems
to be exactly that which makes Turing machines so hard to understand.
This is like making the leap from Turing machines to (static) boolean
circuits, as is done in complexity theory to make the subject more
manageable.

[1] entry://20090615-134849

Entry: GC and Cache
Date: Thu Jun 18 13:50:28 CEST 2009

They are related but not quite the same.  What if it were possible to
use non-GCd data as a cache?  I.e. define a collection of objects
that are essentially stored on disk, but which sometimes need to be
loaded into memory.  The object in memory is GC-able, but is it
possible to provide some special mechanism to see if it got GC'd?

I guess I'm looking for weak references.

[1] http://docs.plt-scheme.org/reference/weakbox.html

Entry: 64bit
Date: Thu Jul 9 16:52:40 CEST 2009

Oh yeah, I forgot.  This is a mess[1].  The different ways of being
incompatible:

  Data Type          LP32  ILP32  ILP64  LLP64  LP64
  ----------------------------------------------------
  char                  8      8      8      8     8
  short                16     16     16     16    16
  int32                              32
  int                  16     32     64     32    32
  long                 32     32     64     32    64
  long long (int64)                         64
  pointer              32     32     64     64    64

  In 1995, a number of major UNIX vendors agreed to standardize on
  the LP64 data model for a number of reasons...

So, basically you only need to care about ILP32 and LP64, where long
is the size of a pointer.

[1] http://www.unix.org/whitepapers/64bit.html

Entry: Partial Evaluation
Date: Wed Jul 15 09:37:11 CEST 2009

Program analysis (PA) is undecidable because of state dependent
control flow.
The culprit is the "IF" statement, or anything that boils down to
turning a piece of data into an execution branch target.

Partial evaluation (PE) is a form of PA.  Effective partial
evaluation is essentially about figuring out which "IF" statements
can be decided at specialization time and which have to be left to
run time.  Unfolding the wrong one can lead to infinite code size.
An interesting approach is to mix laziness with PE to keep infinite
code structures under control.

--

The problem with PE is time/space resource analysis.  It is not
always possible to assess how much time or space recursive/looping
code will take.  Given the right representation I think it becomes
more practical to get close to some reasonable definition of optimal
behaviour.  You need to `dodge the recursion' by concentrating on
combinators with easy-to-manage time and space properties.

Entry: Small-step vs. Big-step operational semantics
Date: Thu Jul 16 14:20:11 CEST 2009

Wikipedia[1] isn't very clear here:

  In computer science, small step semantics formally describe how the
  individual steps of a computation take place in a computer-based
  system.  By opposition big step semantics describe how the overall
  results of the executions are obtained.

From Pierce[2] 3.4 p.32 (reformulated): the small-step style of
operational semantics is sometimes called "structural operational
semantics" and specifies reduction by means of a transition function
for an abstract machine.  The meaning of a term is the halting state
of iterative application of the transition function.  The big-step
style evaluates a term to the end result in a single transition.

[1] http://en.wikipedia.org/wiki/Small_step_semantics
[2] isbn://0262162091

Entry: Constraint Programming
Date: Thu Jul 16 15:48:19 CEST 2009

Constraint programming is roughly based on replacing functions as a
primitive building block with equations (relations or
multi-directional functions).  To make things feasible, constraints
are supposed to be _locally enforced_, with some global issues
handled using backtracking.

It looks like in first approximation it's best to start with [2], as
it contains virtually the same as the introduction of [1].

[1] http://www.ai.mit.edu/publications/pubsDB/pubs.doit?search=AITR-595
[2] http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-22.html#%_sec_3.3.5

Entry: Managing External Resources
Date: Sun Aug 2 09:28:46 CEST 2009

I'm wondering if it is possible to replace external resource
management (open/close) with garbage collection.  Is this remote
resource management a `real' problem or is it merely an
implementation artifact?

It seems to me that the reason is that "reachability" isn't always
observable[4].  A good example is webserver continuations.  Will the
client come back, or did it lose interest?  We can simulate
reachability using timeouts.  I've talked about this before: it seems
to be essentially a caching[5] problem.

I wonder what the link is with linear languages (possibly extended
with lazy copy), where open/close resource management is quite
natural.

[1] http://okmij.org/ftp/Haskell/misc.html#fp-arrays-assembly
[2] http://okmij.org/ftp/Scheme/enumerators-callcc.html#Finalization
[3] http://okmij.org/ftp/papers/LL3-collections-enumerators.txt
[4] http://prog.vub.ac.be/amop/research/dgc
[5] entry://20090615-131905

Entry: Connected Ideas
Date: Sun Aug 2 09:54:52 CEST 2009

Explain how these are related:

  Linear / stack-based memory management.
  Deforestation.
  Task scheduling and dependency analysis.
  Enumerator inversion and finalization.
  Distributed GC and Remote continuation/context cache.
  Concurrency-oriented programming.
  Message passing concurrency.

Entry: Engine vs. Coroutines
Date: Mon Aug 3 11:12:34 CEST 2009

The difference between an engine[1] and a coroutine is that an engine
uses timed preemption, while a coroutine only uses voluntary
preemption.  ``An engine runs until its fuel runs out''[3].

It seems to me that this is somewhere between full nondeterministic
preemption and cooperative multitasking: the preempt points happen
only at control points with consistent state -- i.e. not in the
middle of some low-level routine that uses a shared resource.

[1] http://en.wikipedia.org/wiki/Engine_%28computer_science%29
[2] http://list.cs.brown.edu/pipermail/plt-scheme/2002-September/000620.html
[3] http://www.scheme.com/tspl2d/examples.html#g2433

Entry: Task-based C interface
Date: Mon Aug 3 14:36:10 CEST 2009

Here's the basic idea I'm trying out for writing reusable C
primitives for different kinds of scripting languages (operating
systems, in essence Scheme and PF, a linear concatenative language).

C code shouldn't CONS.  C code should only communicate with the
outside world using _channels_ which have a limited number of
primitive types, but do not allow for aggregates.  All aggregate data
types should be transferred using _protocols_: explicit sequencing of
primitive types to represent data structure.

Doing it this way makes it possible to write C code that doesn't
perform any memory management, except for allocation of local
variables.  This makes automatic wrapping very simple, and allows a
single abstract object: the task (zipper).  Moreover, the code itself
can be incorporated in a static scheduling policy (whole program
optimization: compile time weaving).  This technique is essentially
premature deforestation: eliminating intermediate data structures
using compile-time transformation.  The slogan is something like
this:

  Replace data structures with protocols.

From the outside, it doesn't matter that C code is stateful, as long
as _all_ the state is contained in the continuation.  The essential
insight is that all memory and control management can be abstracted.

Funny how I got here.  I've rediscovered concurrency-oriented
programming by looking for the simplest way to interface C code with
Scheme.

Entry: EWD concurrency
Date: Mon Aug 3 23:58:57 CEST 2009

I believe these are the original semaphore papers.

[1] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD51.html
[2] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD54.html
[3] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD57.html
[4] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD74.html

Entry: Finalizers
Date: Sun Aug 16 16:37:50 CEST 2009

It's generally agreed that automatic, non-synchronous GC and
finalizers interact badly.  Where does the problem lie?  That depends
on how you look at it.

  - GC should be synchronous.
  - Represent resources as ``pooled-with-spare'' to behave more like
    memory.

Which one is best?  Ultimately I believe that representing all
resources as pooled-with-spare is not realistic: essentially we're
bounded by _physical_ resources, and if there's only one, you better
have synchronous GC.

But, if these resources _are_ modeled as memory, it might work: any
pooled resource that runs out of candidates to hand over will issue a
_global_ GC to determine if it is still reachable.  The global nature
of GC makes it rather hard to manage.
This makes me think that RC based management really isn't going to go
anywhere, unless you can propagate the ``out of'' event all the way
up from device drivers to the toplevel memory GC.

It is probably possible to remove some of the arbitrary RC managed
resources (i.e. Unix file handles) by ``peeling them open'' to reveal
the real resources (hardware signalling the device driver), and have
them propagate these signals all the way to the top level GC whenever
they occur.

Entry: Synchronization
Date: Mon Aug 17 17:44:46 CEST 2009

So what is a monitor[1] exactly?

[1] http://en.wikipedia.org/wiki/Monitor_(synchronization)

Entry: Concurrency
Date: Mon Aug 17 17:44:58 CEST 2009

Explain the difference between these control structures:

  - Pre-emptive unix tasks and threads
  - Cooperative tasks (yield)
  - CSP processes
  - symmetric coroutines
  - asymmetric coroutines
  - partial and full continuations
  - one-shot continuations
  - Icon's generators[1] and goal-directed evaluation[2]
  - Iterators[3]

[1] http://lambda-the-ultimate.org/classic/message1851.html
[2] http://www.cs.arizona.edu/icon/books.htm
[3] http://home.pipeline.com/~hbaker1/Iterator.html

Entry: Tagless Interpreters
Date: Sun Aug 23 12:16:37 CEST 2009

I am going to try to understand this[1].  It is about building a
``tagless staged definitional interpreter''.  It touches on some
ideas that I've seen vague hints of while writing the PF and SC
interpreters and trying to see the tradeoffs in
compiling/interpreting LC-based languages.

First, some terminology:

Initial algebra[2]: ``In mathematics, an initial algebra is an
initial object in the category of F-algebras for a given endofunctor
F.  The initiality provides a general framework for induction and
recursion.''  It seems to be used in relation to recursive types,
which are the yin of a yang: recursive functions operating on the
types.  I guess the coalgebraic[3][4] structure is that of recursive
functions?

HOAS: higher order abstract syntax.

COGEN: code generator.

1.1 TAGS

The paper starts by explaining the use of a universal type `u' (a
tagged union) to represent a dynamic type, to be able to write
something like:

  eval : u list -> exp -> u

where `u list' is a De Bruijn environment (variables are then De
Bruijn indices).  The disadvantage is that in this representation,
`eval' is a partial function: i.e. it needs to handle cases where it
is passed invalid input, i.e. non-closed terms or ill-typed ones.  In
practice however, when a term is closed and well-typed these cases do
not occur.  Essentially, the algebraic types fail to express in the
meta language that an object expression is closed and well-typed.

1.2 TAGLESS

Current approaches to solve this use complex data types like GADTs or
dependent types.  The paper presents an approach that doesn't require
this, by representing object programs using ordinary functions
instead of data structures.  This approach turns evaluation of open
object terms into _ill-typed_ terms in the meta language.  Neat!

REMARKS

There is a link between this kind of representation and Staapl's Coma
abstraction: representing target code as procedures operating on a
stack machine code stack.

[1] http://okmij.org/ftp/tagless-final/APLAS.pdf
[2] http://en.wikipedia.org/wiki/Initial_algebra#Use_in_programming_theory
[3] http://en.wikipedia.org/wiki/Coalgebra
[4] http://en.wikipedia.org/wiki/F-coalgebra
[5] http://okmij.org/ftp/Computation/tagless-typed.html
[6] http://lambda-the-ultimate.org/node/2438
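To get a feel for the ``ordinary functions instead of data
structures'' idea above, stripped of the typed machinery (which is
really the paper's point): an untyped final-style encoding in Scheme,
my own illustration, not the paper's.  A term is represented directly
as its evaluator, a function from a De Bruijn environment to a value.

  (define (lit n)   (lambda (env) n))
  (define vz        (lambda (env) (car env)))     ; De Bruijn index 0
  (define (vs t)    (lambda (env) (t (cdr env)))) ; shift an index up by one
  (define (lam b)   (lambda (env) (lambda (x) (b (cons x env)))))
  (define (app f a) (lambda (env) ((f env) (a env))))
  (define (add a b) (lambda (env) (+ (a env) (b env))))

  ;; ((lambda (x) (+ x 1)) 41)
  (define ex (app (lam (add vz (lit 1))) (lit 41)))
  (ex '())   ;; => 42

A second interpretation (say, a pretty printer) is just another set
of definitions for the same combinator names, which is where the
final approach starts to pay off.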
Entry: Eager rewriting vs. ``something more general''
Date: Mon Aug 24 19:59:35 CEST 2009

In the light of peephole optimizations and the Joy machine in [1]:
how can you keep a rewriting system manageable if you don't pin it
down manually by performing only eager substitutions?

I guess what I'm looking for is confluence[2].  And I have this hunch
that _interesting_ rewriting systems for optimizations are _not_
going to be confluent: they probably lead to a large number of
possible irreducible forms which need extra constraints to isolate a
single solution.

I read in Muchnick[3] (Chapter 6: Producing Code Generators
Automatically, 6.2.1 p.140) that the standard peephole rewriting
techniques are essentially SLR(1) parsers.  Does this have anything
to do with that?

[1] http://zwizwa.be/darcs/libprim/pf/joy.ml
[2] http://en.wikipedia.org/wiki/Confluence_%28abstract_rewriting%29
[3] isbn://1558603204

Entry: Monad transformers / Arrows
Date: Thu Aug 27 19:12:49 CEST 2009

I'm interested to find out the link between:

  - State threading using Monads and Arrows.
  - Monad / Arrow transformation (building new such constructs by
    composing others).

More specifically, I've run into the case in concatenative
programming where you want to thread things other than the stack, and
where you also want to combine different threading mechanisms
(i.e. a data stack and a compilation `writer' monad).  However, the
way (I understand) that monad transformers work is that you always
need to pick an _order_ of wrapping: it isn't a side-by-side thing
like i.e. linear operators and vector subspaces.  My hunch is that
the latter is about Arrows.

Entry: Parser Combinators
Date: Fri Aug 28 14:49:21 CEST 2009

[1] http://en.wikipedia.org/wiki/Parser_combinator
[2] http://shaurz.wordpress.com/2008/03/11/haskell-style-parser-combinators-in-scheme/

Entry: Partial Continuations
Date: Fri Aug 28 14:49:55 CEST 2009

From the bit-twiddler's perspective, a partial continuation is not
much more than a segment of the call stack represented as a function,
where the full continuation is the whole call stack, represented as a
``function that doesn't return''.  The nuances come from how you
specify the marking that enables you to isolate the segment between
the current point and the one marked, and how you execute code in the
continuation that remains when this segment is removed.

From [3]: ``The operators shift, control, control0, shift0 are the
members of a single parameterized family, and the standard CPS is
sufficient to express their denotational semantics.''

According to Oleg in [3], ``Shift to Control''[4] is the one to read.
This paper is about CPS transforms, and not directly of interest to
me atm.  I'm already quite happy with an understanding of how to
implement it dynamically.

Anyways, the gang of 4:

  +F+  shift
  +F-  control
  -F+  shift0
  -F-  control0

Here eFk denotes whether the shifted expression will be run inside a
delimited context (e=+) or not (e=-), and whether the continuation
will be delimited (k=+) or not (k=-).

[1] http://okmij.org/ftp/Computation/Continuations.html#generic-control
[2] http://lambda-the-ultimate.org/node/966
[3] http://lambda-the-ultimate.org/node/606
[4] http://www.cs.rutgers.edu/~ccshan/recur/recur.pdf
[5] http://docs.plt-scheme.org/reference/cont.html

Entry: Monads and shift/reset
Date: Fri Aug 28 17:23:03 CEST 2009

Two directions: Representing Monads[1] by Filinski, and a paper by
Wadler[2] about CPS / Monads / Delimited control / ...
From a very high level of intuitive understanding: partial
continuations are abstractions around segments of execution context
and jumps -- Monads are abstractions around threaded state and
operation sequencing.  Put in those words, a link seems plausible.

EDIT: see a Scheme example in [4].

[1] http://eprints.kfupm.edu.sa/62283/1/62283.pdf
[2] http://www.brics.dk/~hosc/local/LaSC-7-1-pp39-56.pdf
[3] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz
[4] entry://20090906-105507

Entry: Linear Lisp and Uniqueness Types
Date: Sun Aug 30 10:59:41 CEST 2009

Look at Clean[1] and Uniqueness Types[2] and try to see the relation
with libprim's linear PF machine.

[1] http://en.wikipedia.org/wiki/Clean_Language
[2] http://en.wikipedia.org/wiki/Uniqueness_type

Entry: Linear Type Systems
Date: Sun Aug 30 11:24:57 CEST 2009

Main idea: variables disappear from scope once referenced.

From the logic point of view: linear logic is concerned with the
_transformation_ of resources as opposed to an ever growing
accumulation of facts (classical logic).  In a nutshell, a logic is
built of a set of axioms and a set of combination rules.  In
classical logic, you can start from a set of propositions and
construct larger sets of propositions by chaining combination rules
(inference rules).  In linear logic, to add a new proposition to a
``context'' (multiset), you need to _consume_ one you already have,
preventing you from ever using it again.

It is mentioned in the wikipedia article[2] that linear logic can be
seen as ``the interpretation of classical logic by replacing boolean
algebras by C*-algebras''.  This vaguely rings a bell, conforming to
my intuition of computation as an endomap of a constant-size space
instead of an ``explosion'' of an initial seed.  Conservation of
resources in linear logic works a bit like conservation of energy in
a classical hamiltonian system[4].

Up to now, I've mostly been talking about ``linear memory
management'' as ``conservation of memory'', mostly in the form of
binary tree nodes (CONS cells).

[1] http://en.wikipedia.org/wiki/Linear_type_system
[2] http://en.wikipedia.org/wiki/Linear_logic
[3] http://en.wikipedia.org/wiki/C*-algebras
[4] http://en.wikipedia.org/wiki/Hamiltonian_mechanics

Entry: Proof Calculus
Date: Sun Aug 30 15:44:55 CEST 2009

A proof calculus[2] corresponds to a family of formal systems that
use a common style of formal inference for its inference rules.  A
rule of inference[3] (also called a transformation rule) is a
syntactic rule used in a formal system to produce valid statements
within that system.

Examples of proof systems:

  * Hilbert-style deduction system[4], where a formal deduction is a
    finite sequence of formulas in which each formula is either an
    axiom or is obtained from previous formulas by a rule of
    inference.

  * Natural deduction[5] is an approach to proof theory that attempts
    to provide a deductive system which is a formal model of logical
    reasoning as it "naturally" occurs.  This approach is in contrast
    to axiomatic systems, which use axioms.

The sequent calculus[1] is a widely known proof calculus for
first-order logic.  The term "sequent calculus" applies both to a
family of formal systems sharing a certain style of formal inference,
and to its individual members, of which the first, and best known, is
known under the name LK, distinguishing it from other systems in the
family.  The sequent calculus LK was introduced by Gerhard Gentzen as
a tool for studying natural deduction.
In the sequent calculus all inference rules have a purely bottom-up
reading.  In natural deduction the flow of information is
bi-directional: elimination rules flow information downwards by
deconstruction, and introduction rules flow information upwards by
assembly.  Thus, a natural deduction proof does not have a purely
bottom-up or top-down reading, making it unsuitable for automation in
proof search, or even for proof checking (or type-checking in type
theory). [5]

In Kleene's Metamathematics, Chapter V, p.86[6]:

  ``The labour required to establish the formal provability of
  formulas can be greatly lessened by using metamathematical theorems
  concerning the existence of formal proofs.''

-- remarks

The way I understand this is that natural deduction has no axioms (no
primitive assumptions) but contains all its structure in the
inference rules: the way propositions can be combined to create new
propositions.  In contrast, a Hilbert-style deduction system has both
axioms and inference rules (i.e. Modus Ponens).

Reading more in [6].  It seems that metamathematics is quite similar
to lisp macros: the meta math serves to abstract over the
_construction_ of formal proofs = meaningless pure syntax.  Because
formal logic's composition mechanisms at the foundations of
mathematics are so barbaric, a macro system makes sense to lift the
tedium.

What I find surprising however is that this meta system itself is
_not_ formal: proofs in the meta system are not formal proofs, but
are intuitive justifications of correctness.  I'd say this is because
of the chicken-and-egg problem: you can't use a formal system to do
this, because you don't _have_ one yet (remember, you're using this
meta-math to _build_ a formal system).  This sounds like compiler
bootstrapping to me.

[1] http://en.wikipedia.org/wiki/Sequent_calculus
[2] http://en.wikipedia.org/wiki/Proof_calculus
[3] http://en.wikipedia.org/wiki/Inference_rules
[4] http://en.wikipedia.org/wiki/Hilbert_system
[5] http://en.wikipedia.org/wiki/Natural_deduction
[6] isbn://0720421039

Entry: Dependent Types
Date: Sun Aug 30 19:13:49 CEST 2009

In [1] I read:

  ``Dependent type theory in full generality is very powerful: it is
  able to express almost any conceivable property of programs
  directly in the types of the program.  This generality comes at a
  steep price -- checking that a given program is of a given type is
  undecidable.  For this reason, dependent type theories in practice
  do not allow quantification over arbitrary programs, but rather
  restrict to programs of a given decidable index domain, for example
  integers, strings, or linear programs.''

In [2] I read:

  ``If the user can supply a constructive proof that a type is
  inhabited (i.e., that a value of that type exists) then a compiler
  can then check the proof and convert it into executable computer
  code that computes the value by carrying out the construction.  The
  proof checking feature makes dependently typed languages closely
  related to proof assistants.  The code-generation aspect provides a
  powerful approach to formal program verification and proof-carrying
  code, since the code is derived directly from a mechanically
  verified mathematical proof.''

[1] http://en.wikipedia.org/wiki/Natural_deduction
[2] http://en.wikipedia.org/wiki/Dependent_types

Entry: Dynamic vs. Static Types
Date: Sun Aug 30 19:18:57 CEST 2009

I've been reading up a bit about logic and type systems.  This is
seriously complex stuff, in the sense that there are a great number
of different ways to approach static structure.
If you compare this to the simplicity of dynamic typing (predicates /
set membership without static structure), I sometimes wonder whether
people that use this heavy machinery get anything done at all.  I do
see that there can be a lot of payback if your applications are
complex, but exhibit some arbitrary but well-defined static
structure.  Heavy types allow some of the correctness burden to be
carried by the compiler, at the expense of 5+ years of graduate
studies for the programmers :)

It seems that typing is about defying undecidability.  What you
really want is the machine to write your program for you.  Since
that's quite difficult, maybe you scale down expectations and ask the
machine to at least tell you that what you just did is not what you
really intended.  When this ``intention'' can be codified in a
structure that doesn't lead to undecidable problems when trying to
interpret it, you can offload some of the thinking to the machine.

For dependent types this can be taken quite far: as long as you (the
programmer) can help the verification system to solve the undecidable
part of the problem (provide a proof for certain parts), then it can
use this to check the rest.

Entry: Recent discussions with Dominikus about Concatenative Languages
Date: Mon Aug 31 14:14:50 CEST 2009

A _lot_ of topics have been covered.  I think this counts as the most
exciting discussion I've ever had with anyone on the topic of
concatenative languages.  Such intensive discussions tend to drive
you right to the center of your ignorance.  I'm going to try to list
what I've learned, relative to my own endeavors:

  * It pays to distinguish 3 kinds of Joy machine variants: linear
    with intensional quotations, nonlinear with extensional
    quotations, and a staged linear/nonlinear version with
    extensional quotations and linear run-time code structures
    (continuations, compositions, partial applications, ...)

  * Continuations and tasks deserve to be treated as different things
    in a concatenative stack language.  The former doesn't include
    the parameter stack (partial continuations are stack->stack
    functions) and the latter does.

  * Phase separation is important: intensional code quotation is
    difficult to specify other than as a VM with late binding.
    Static / early binding helps here.

  * The pattern matching approach in Staapl isn't so bad.  It would
    be nice to find a more elegant rewriting _syntax_ for it, but the
    semantics seems to be just what I need to get the desired compile
    time reductions.  Generalizing the rewrite semantics seems to
    open up a can of worms.  However, it might be beneficial to do
    this for _optimizations_, since the interesting ones tend to be
    non-confluent.

Then, about what I don't understand (warning: buzzwords):

  * The link between stack languages and other state threading
    mechanisms.  The key ignorance seems to concentrate on Monads,
    Monad transformers and Arrows.

  * How to use the above pure functional description to build a lazy,
    typed, partially evaluated system.

Entry: Stackless Extensional Joy
Date: Mon Aug 31 16:36:23 CEST 2009

I'm involved in a discussion with Dominikus about writing a
specification of Joy in terms of rewrite rules.  It is his opinion
(and I believe also Manfred's) that it is necessary to explicitly use
a data stack in the description.

EDIT:

  * The discussion was resolved by making a clear distinction between
    language semantics (a map from syntax to some representation
    domain, i.e. unary functions) and purely syntactic (meaningless)
    manipulations.
    The original MvT paper[2] doesn't make this distinction either
    (it talks about rewriting to specify _semantics_, not to specify
    legal syntactic operations in a formal system without involving
    semantics).  In the rules below, a purely syntactic rewriting
    system only gives you the ability to state that two concatenative
    forms are equivalent.  Then later you can attach a meaning to
    that (they represent the same program).

  * Proper treatment of quotations is important as a language design
    issue (early / late binding), but when you're specifying an
    operational semantics, of course any one would work.  I.e. in the
    discussion it was irrelevant and this put me on a side track.

  * In my understanding, the discussion needed a clear definition of
    value (irreducible expression) and redex (reducible expression).
    Using a stack to specify this makes things simpler, but is not
    necessary.  However, in the specification I produce, a stack can
    be immediately seen to emerge.  Conclusion: if (purely syntactic)
    reduction rules are of the form

      a .. m A -> n .. z

    with `a' .. `z' values and `A' a non-value, then the values on
    the left of `A' can be _interpreted_ as a stack, as can the
    values on the right of '->'.

In that light, here's what I wrote (the first section about
quotations is irrelevant).

* * *

My position is that this is not necessary.  What _is_ necessary is a
clear treatment of quotations, respecting the non-isomorphic map from
syntax to meaning.  Thus far this has been a series of hunches.  I'd
like to see why I really believe this (me being wrong is OK too..)

The problem appears to lie with the fact that the homomorphism S
which attributes meaning (a function) to syntax (a concatenation of
primitive symbols) is not one-to-one.  I.e.:

  S( [ 1 dup ] ) = S( [ 1 1 ] )

Is this related to ``never reduce under lambda''?  I.e. there is only
one _expansion_ operator: `i'.  All the others are combinators that
_re-arrange_ and _contract_ things.  It is not allowed to perform
_any_ substitutions _inside_ quotations.  The only place where this
is legal is in the current expansion of the program: a flat list of
tokens and quotations.

Does this help to eliminate the data stack?  I think I'm on the wrong
track: to eliminate the data stack it probably doesn't matter much
whether code is intensional or extensional.  What matters is that it
needs to be really well-defined what a _value_ is.  I.e. what does
`swap' actually do?  Can you explain that _without_ referring to the
stack?  Syntactically, the quotation "a b swap" can be reduced to
"b a", but _only_ if both `a' and `b' are values.  Isn't this really
just LR(1) parsing?

In [2] Manfred alludes to all this..  However, going on about
stackless semantics, he says:

  ``This is the key for a semantics without a stack: Joy programs
  denote unary functions taking one program as arguments and giving
  one program as value.''

This seems to be the ``staged'' interpretation.  A program denotes a
function, and evaluation is finding the simplest syntactic form of
this function.

* * *

1. values

It seems that what really matters is to distinguish values from
non-values, not necessarily using a stack.  Applicability of rewrite
rules representing `stack words' depends on leading symbols being
_values_.

  a b swap == b a

only if `a' and `b' are values.

If reduction order doesn't matter (confluent rewrite rules), then
picking an order that leads to a simple algorithm seems like a
reasonable default approach.  However, my position is still that the
stack is purely an artifact of the way you sequence the reductions.
In MvT's words[2]:

  ``It is clear that such a semantics without a stack is possible and
  that it is merely a rephrasing of the semantics with a stack.
  Purists would probably prefer a system with such a lean ontology in
  which there are essentially just programs operating on other
  programs.''

So I guess I'm wearing the purist hat for a change..  The reason I do
is that there might be a benefit in generalizing this to
non-confluent systems that will give you a set of possible
reductions.

2. left to right order

One way to do the reductions is from left to right.  Values are
skipped while redexes recombine with the values.  This will then
effectively yield a stack of values as a result.  I.e. in the
following, everything on the left of '|' is fully reduced (the stack)
and the rest are possible redexes (the code).

        | 2 5 + 3 *
      2 | 5 + 3 *
    2 5 | + 3 *
      7 | 3 *
    7 3 | *
     21 |

This left-to-right approach guarantees that a rule that needs values
on its left side will always be applicable.  The general rewrite
system that starts at any place in the code can't do this: some
symbols that need values on the left side won't have them (yet).

3. other orders?

So, what is the benefit of general rewrite rules, without specifying
an explicit sequencing order?  It seems that Forth's parsing words
(i.e. `variable') use the _other_ direction: here symbols can modify
the semantics of symbols to the right.  Also, non-confluent rewriting
systems that describe optimizations and not language semantics seem
to need a non-local approach.

Another useful point is to use global pattern matching to find
isolated words: if the reduction order doesn't matter, then (in a
pure functional setting) these segments are value-isolated and can be
run in parallel.

* * *

So, words are not values, but values are words.

DEFINITION: A value is a word that does not produce a reducible
expression when it is right-appended to a sequence of values.

It's a bit anti-climactic maybe, but I think that's the essence..
Dominikus calls these ``constructor functions'', but that requires a
definition that depends on a data stack: a domain that is used to
chain functions of the semantic space together.

[1] http://www.latrobe.edu.au/philosophy/phimvt/joy/j00ovr.html
[2] http://www.latrobe.edu.au/philosophy/phimvt/joy/j07rrs.html

Entry: Syntax vs. Semantics
Date: Mon Aug 31 20:46:22 CEST 2009

What I get: ``running'' a program is different from ``compiling'' it.

What I don't get: where is the bottom-line semantics?  Is it really
just the physical processes that implement the interaction of a
concrete machine with the world?  Is it the model of physics?  Is it
a simplified model of combinatory logic and memory?  It looks like at
some point you have to stop this silliness and attach a semantics,
associating mathematical functions with the syntax.  The higher up
the chain you can do this, the more structure it will probably have
and the easier it becomes to reason about the meaning of programs.

EDIT: I found this on [1]:

  ``... critics [of operational semantics] counter that the problem
  of semantics has just been delayed.  (who defines the semantics of
  the simpler model?)''

It looks like the confusion is about operational semantics being
_relative_, while at some point you do need something tangible to
have any meaning at all.  However, it is possible to allow for syntax
transformation based on preservation of relative semantics.

What I still can't make precise: how can rewriting (a syntactic
operation) be the specification of the semantics of a formal
language?
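One way to make this concrete for the previous entry's toy example is
to implement the left-to-right reduction directly.  A sketch of my
own (tiny word set, numbers as the only values, PLT Scheme); the
fully-reduced prefix `done' is exactly the emerging stack.

  (define (reduce done code)
    (cond
      ((null? code) done)                  ; reduced: `done' is the stack
      ((number? (car code))                ; values are skipped over ...
       (reduce (cons (car code) done) (cdr code)))
      ((eq? (car code) '+)                 ; ... redexes recombine with them
       (reduce (cons (+ (cadr done) (car done)) (cddr done)) (cdr code)))
      ((eq? (car code) '*)
       (reduce (cons (* (cadr done) (car done)) (cddr done)) (cdr code)))
      (else (error "unknown word" (car code)))))

  (reduce '() '(2 5 + 3 *))   ;; => (21)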
[1] http://en.wikipedia.org/wiki/Formal_method

Entry: Types vs. Staging vs. Abstract Interpretation
Date: Thu Sep 3 08:43:56 CEST 2009

I'd like to know more about the formal relation between types and
abstract interpretation[1][3], and types and staging[2].  On an
intuitive plane it does seem reasonable to blur the 3 concepts: type
checkers / inference engines interpret programs in the compilation
stage.

Following some links I ended up here[4].  Frank Atanassow:

  ``Personally, I think the hits of the near future (ten or fifteen
  years) in the statically typed realm will be staged/reflective
  programming languages like MetaML, generic programming as in
  Generic Haskell and languages which support substructural logics
  like linear logic.  I see opportunities for combining these in a
  powerful way to support lawful datatypes, i.e., datatypes which
  satisfy invariants in a decidable way.  The underlying ideas will
  probably also pave the way for decidable behavioral subtyping in
  OO-like languages.

  A unified foundation for functional, procedural, OO and logic
  languages is also something I predict we will see soon.  The
  aspect-oriented stuff will be probably sorted out and mapped onto
  existing concepts in current paradigms.  (I don't think there is
  anything really new there.)

  Also, I think that in twenty years there will no longer be any
  debate about static vs. dynamic typing, and that instead most
  languages will instead provide a smooth continuum between the two,
  as people realize that there is no fundamental dichotomy there.
  Also languages will have more than one level of types: the types
  themselves will have types, and so on.  But we will be able to
  treat types in a similar way to the way in which we treat values:
  we will be doing computations with them.

  There will be an increased emphasis on efficiency in the future,
  and I think we will see fewer aggressively optimizing compilers;
  instead it will be the programmer's job to write his programs in
  such a way that they are pretty much guaranteed to be efficient
  with any reasonable compiler.  This sounds like a step backward,
  but it won't be because we will be better able to separate concerns
  of specification from implementation.  (The reason I see more
  emphasis on efficiency is partly because I think that wearable
  computers and other small computers will become ubiquitous, partly
  because I think we will have the technology to do it when
  substructural logics invade static typing, and partly because we
  are starting to understand how to assign static type systems to
  low-level languages like assembly language.)''

[1] http://www.di.ens.fr/~cousot/COUSOTpapers/POPL97.shtml
[2] http://lambda-the-ultimate.org/node/2575
[3] http://lambda-the-ultimate.org/node/220#comment-1695
[4] http://lambda-the-ultimate.org/classic/message6475.html#6506

Entry: Metaprogramming Patterns
Date: Thu Sep 3 11:27:47 CEST 2009

Two topics:

  - I am focussing on building stagable abstractions on top of
    combinatory circuits for real-time DSP applications.  This should
    yield a series of small DSLs and DSL -> C compilers.

  - Jacques Carette's approach to finding a list of metaprogramming
    patterns[1].

The latter paper talks about 1. the need for CPS-style programming to
assure proper name generation (``let insertion'') for storing
intermediate results, and 2. a way to solve the notational problems
using a monad (which can then accommodate other effects).
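(To make ``let insertion'' from point 1 concrete: a sketch of my own
in PLT Scheme using shift/reset on s-expression code, not the paper's
MetaOCaml.  The generator names an intermediate result once and wraps
the rest of the generated code in the binding.)

  (require scheme/control)   ; reset / shift

  ;; Generate a let binding for code fragment `e', naming it with a fresh
  ;; variable; the rest of the code generation (the continuation k) runs
  ;; inside the binding's scope.
  (define (genlet e)
    (shift k
      (let ((v (gensym 't)))
        `(let ((,v ,e)) ,(k v)))))

  ;; Generate code for (x*x) + (x*x) that evaluates x*x only once.
  (define (gen-square-sum x)
    (reset
      (let ((t (genlet `(* ,x ,x))))
        `(+ ,t ,t))))

  (gen-square-sum 'a)   ;; => (let ((t1234 (* a a))) (+ t1234 t1234))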
I currently don't see how this can be used to bring these techniques
``to the people'', for the simple reason that it takes me quite some
effort to follow the notation, and I already spent considerable
effort reading about the field.  Types bring security, but complicate
matters quite a bit.  The payoff might be large, but the investment
isn't negligible: sometimes it takes a whole lot of maneuvering to
express the static structure you want in the type system.  The
monadic style can be relaxed by using control operators[2].

For practical purposes it seems that untyped
abstract-evaluation-based approaches are a better way to gently add
this to the toolbox of nuts&bolts embedded software engineering, with
the typed approach currently limited to the construction of software
tools by experts in both the domain _and_ typed functional
programming.  What matters in practical / simple DSLs is to provide a
good abstraction (semantics) and notation (syntax), and to allow for
static analysis.  Whether the generators _themselves_ are statically
verified is an added safety I see only paying off in very specialized
and error-prone generator applications, unless the notational and
conceptual overhead can be reduced (as seems to be the idea of [2]).

[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf
[2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf

Entry: Local consistency
Date: Thu Sep 3 13:47:40 CEST 2009

Context: I'm looking at the jump from data flow programming to local
consistency rules[1].  More specifically, in my application at hand,
are local constraints enough (as opposed to global constraints like
simultaneous sets of equations)?  If so, then (staged) constraint
propagation is a good solution.

Let's look at the ``trivial constraint language'' described in
chapter 2 of [2].  It focusses on local propagation only.  Apart from
simple arithmetic constraints, it is useful for allowing constraints
that can't be directionalized fully, i.e. those written in terms of
inequalities, like

  m = max(a,b)

The remaining chapters then talk about removing some of the
deficiencies: data types, abstraction mechanisms, multiple use
(tracking changing data), global issues and dependencies of
constraints (why is there a contradiction?).

In [1], a constraint satisfaction problem is defined as a set of
variables, a set of domains, and a set of constraints.  Variables and
domains are associated: the domain of a variable contains all values
the variable can take.  A constraint is composed of a sequence of
variables, called its scope, and a set of their evaluations, which
are the evaluations satisfying the constraint.

[1] http://en.wikipedia.org/wiki/Local_consistency
[2] http://www.ai.mit.edu/publications/pubsDB/pubs.doit?search=AITR-595

Entry: Monads vs. delimited control
Date: Sun Sep 6 10:55:07 CEST 2009

So, practically: I have a need to turn a function that uses explicit
mutation of a dynamic variable into a pure function.  How to do this
without adding explicit state threading?

Simplified: you want the dynamic parameter to be included in between
the prompt and the shift / control operator that reifies the dynamic
context into a function.  I.e.
something like this:

  #lang scheme/base
  (require scheme/control)
  (provide make-counter state)

  (define state (make-parameter #f))

  (define (make-counter)
    (reset
     (parameterize ((state 0)) ;; param sandwiched between `reset' and `shift'
       (let loop ()
         (printf "state = ~s\n" (state))
         (shift k k)
         (state (add1 (state))) ;; mutate
         (loop)))))

The problem is that it doesn't give you referential transparency: the
mutation is still visible when a certain continuation gets invoked
multiple times, as the parameter's storage location is a shared
value.  Every continuation should somehow include its own value of
the parameter.  This can be assured by saving the parameter's value
whenever a continuation is created, and resetting it whenever it is
resumed:

  #lang scheme/base
  (require scheme/control)
  (provide make-counter state invoke)

  (define state (make-parameter #f))

  (define (make-counter)
    (reset
     (parameterize ((state 0))
       (let loop ()
         (printf "state = ~s\n" (state))
         ;; capture the continuation paired with the current parameter
         ;; value; `invoke' restores that value on re-entry
         (state (let ((s (state)))
                  (shift k (cons k s))))
         (state (add1 (state))) ;; mutate
         (loop)))))

  (define (invoke ctx) ((car ctx) (cdr ctx)))

Here the continuation is extended with a context value `s' which is
passed to the continuation by `invoke' and used to reset the `state'
parameter upon context entry.

It looks like this is related to the ``continuations + storage cell''
approach in [1], though a bit too dense for me atm.  Some related
work: [2] investigates the interaction between DC and `dynamic wind',
while [3] focusses on DC and dynamic binding.  I must admit I fail to
understand the subtleties, as it all seems ``obvious'' to me from the
pov. of continuation marks.  I'll come back to this after using it in
practice.

To summarize: partial continuations capture control, while parameters
are useful for ``locally global'' threaded state in case it's not
_practical_ to implement that lexically.  You locally give up
referential transparency to increase modularity and simplicity of
function interfaces.

[1] http://eprints.kfupm.edu.sa/62283/1/62283.pdf
    md5://e60a51d38011e8dca44540f590643001
[2] http://people.cs.uchicago.edu/~robby/pubs/papers/icfp2007-fyff.pdf
[3] http://okmij.org/ftp/Computation/dynamic-binding.html#DDBinding

Entry: APL & J
Date: Mon Sep 7 22:51:52 CEST 2009

Time for something different.  Hmm.. there's no open source
implementation?

[1] http://en.wikipedia.org/wiki/J_programming_language

Entry: Towards the best collection traversal interface
Date: Wed Sep 9 09:50:14 CEST 2009

I find this idea quite intriguing, as it seems to be central to a lot
of things I'm trying to understand.  I'm putting the idea to the test
by trying to avoid lists wherever possible, and use left fold
instead.  Enumerators are easily bridged to SRFI-41 lazy lists, eager
lists and PLT Scheme sequences.  The `enum->stream' operation uses
reset/shift to invert control, as in:

  (define (enum->stream enum)
    (reset
     (enum (lambda (el)
             (shift k (stream-cons el (k #t)))))
     stream-null))

Example: the `choice' operator[3] in Staapl, which uses enumerators
for representing choices and results, making it easier to compose
searches.  (Internally, the choice enumerator is translated into a
lazy stack of resume points.)

[1] http://lambda-the-ultimate.org/node/1224
[2] http://okmij.org/ftp/Streams.html#enumerator-stream
[3] http://zwizwa.be/darcs/staapl/staapl/machine/choice.ss
[4] http://www.eros-os.org/pipermail/e-lang/2004-March/009643.html
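A small usage sketch of my own (it assumes an SRFI-41 style stream
library providing stream-car / stream-cdr next to the stream-cons /
stream-null used above): an enumerator is just a procedure that
applies a visitor to each element.

  ;; An enumerator over a list: apply `visit' to each element in turn.
  (define (list-enum lst)
    (lambda (visit) (for-each visit lst)))

  (define s (enum->stream (list-enum '(1 2 3))))
  (stream-car s)               ;; => 1
  (stream-car (stream-cdr s))  ;; => 2   (the enumerator is resumed lazily)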
Param stack / Return stack Date: Wed Sep 9 13:49:24 CEST 2009 The analogy doesn't hold, but there is some relation.. Can this be made more precise? The explanation could be centered around the following observation. In a concatenative language, there is no `lambda' to introduce new functional abstractions closed over a surrounding lexical environment. However, in a concatenative language it is possible to do something similar using the quotation-equivalent of `cons', `list' and `append'. More precisely, in a concatenative language one can take two quotations and `compose' them, and one can take a data item and turn it into a quotation that will reproduce it. In some sense, `lambda' performs a `cons' of code and environment. With a bit of a stretch of imagination, one could then see a data stack as an environment, and a quotation as a code body containing free variables. Entry: Bananas, Lenses, Envelopes and Barbed Wire Date: Fri Sep 11 12:55:25 CEST 2009 It's probably not very smart to try to do anything with ``algebra of programs'' without this work[1]. It's interesting to also look at Fokkinga's introduction to category theory[2]. From my particular perspective (emphasis on vectors) recursive data types are probably too general. However, the theory here could serve as some guideline, as recursive types will probably re-emerge as an implementation issue. Ok, the paper[1]. operator recursion pattern for example ------------------------------------------------------ bananas catamorphism fold, map lenses anamorphism unfold, map envelopes hylomorphism[3] (ana->cata) factorial barbed wire paramorphism Translated to the talk of mere mortals: a catamorphism is a data consumer, an anamorphism is a data producer, and a hylomorphism is a producer feeding into a consumer without an intermediate representation of the data structure that decouples them. A paramorphism is a catamorphism that ``eats its argument and keeps it too''. Functor: map types to types, and functions to functions. Arrow: ``wrap'' a function to operate on a different space. ... (etc.. glossing over details for a moment) The point in section 4. is to program in terms of the morphisms instead of using explicit recursion on data types. For each cata-, ana- and paramorphism W, 3 rules are provided: - evaluation rule - uniqueness property (induction proof to be a W) - fusion law The fusion laws are based on the concept of ``fixed point fusion'': f ( u g ) = u h <= f strict ^ f . g = h . f here u : (A -> A) -> A is the fixed point operator. In expanded form the LHS is quite obvious: f . ( u g ) = f . g . g . g . ... = h . f . g . g . ... = h . h . f . g . ... = u h The take-home argument (ignoring strictness issues related to the fixed point combinator) is that fusion for ana- and catamorphisms is left/right symmetric: ana: |( x )| . f = |( y )| <= x . f = f_L . y cata: f . (| x |) = (| y |) <= f . x = y . f_L Where f_L is obtained from f as some fixed point operation.. (???) (... I'm going to give this up for now, but this paper gives enough ideas to construct a less abstract version.)
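To make the mere-mortals reading concrete, here is a minimal Haskell sketch (my own, using plain lists instead of the paper's generic functors; cata and ana are just foldr and unfoldr): the hylomorphism is the producer feeding the consumer with the intermediate list fused away, and factorial is the paper's example of it.

cata :: (a -> b -> b) -> b -> [a] -> b          -- consumer (foldr)
cata _ z []     = z
cata f z (x:xs) = f x (cata f z xs)

ana :: (b -> Maybe (a, b)) -> b -> [a]          -- producer (unfoldr)
ana g b = case g b of
            Nothing      -> []
            Just (a, b') -> a : ana g b'

hylo :: (a -> c -> c) -> c -> (b -> Maybe (a, b)) -> b -> c
hylo f z g b = case g b of                      -- ana feeding cata, fused:
                 Nothing      -> z              -- no intermediate list is built
                 Just (a, b') -> f a (hylo f z g b')

fact :: Integer -> Integer                      -- factorial as a hylomorphism
fact = hylo (*) 1 (\n -> if n <= 0 then Nothing else Just (n, n - 1))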
[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125 [2] http://www.cs.utwente.nl/~fokkinga/mmf92b.pdf [3] http://en.wikipedia.org/wiki/Hylomorphism_%28computer_science%29 Entry: Staging Control Flow Date: Fri Sep 11 14:20:53 CEST 2009 It looks like all the things I'd like to do (making DSP/Control prototyping and finding correct implementations two orthogonal problems that do not involve duplication of effort) have to do with staging control flow: how high can you make the level of abstraction while still guaranteeing that the eventual product is a bounded-time combinatorial circuit / state machine. Most of the DSP/Control applications have a very functional, parallel data flow character. What makes them difficult to implement is that they need to pass through the von Neumann bottleneck. There are a lot of choices to be made turning 1. equations into directed functions and 2. sequencing operations (control) and managing intermediate results (memory). A key paper is going to be this one[1]. Both from the perspective of making it possible to express the original algorithm in a straightforward way, and from the perspective of making all the design decisions explicit. Above I'm talking about moving from higher level languages down to some ideal low-level machine architecture. Mapping to _real_ hardware is then another problem that might involve ``lobotomizing'' a compiler and bringing decisions to the surface so they can take part in a global optimization process. [1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf Entry: Control-Flow Analysis of Higher-Order Languages (Shivers) Date: Sun Sep 13 10:22:16 CEST 2009 Based on the techniques of CPS and non-standard abstract semantic interpretation (NSAS). The basic problem is described as an interdependency of two analysis phases: flow analysis needs a control flow graph, but because code can be bound to variables, the construction of the control flow graph needs flow analysis. In CPS Scheme, the problem is reduced to determining which call sites call which lambdas. This is because all control transfers are represented by procedure calls. This problem is represented as the search for a function L(c) which maps a call site c to a minimal set of lambda expressions that are possibly called at c. NSAS is described as a technique to construct a computable analysis for a certain property X. - Start with a denotational semantics S - Construct a _non-standard_ semantics S_X derived from S, that precisely expresses X. - Construct an _abstract_ version of S_X that trades accuracy for compile-time computability. Note that this is one of (the main?) practical reasons why denotational semantics are important. The denotational semantics for CPS Scheme is presented as an ``interpreter written in a functional language''. [1] http://www.ccs.neu.edu/home/shivers/citations.html#diss Entry: CPS vs. A-normal form Date: Sun Sep 13 10:39:18 CEST 2009 From [1] (print (* (+ x y) (- z w))) in CPS, where `k' is the continuation of the expression: (+ x y (lambda (xy) (- z w (lambda (zw) (* xy zw (lambda (prod) (print prod k))))))) in A-normal[2] form, which I've called ``nested let'' before: (let* ((xy (+ x y)) (zw (- z w)) (prod (* xy zw))) (print prod)) [1] http://www.ccs.neu.edu/home/shivers/citations.html#diss [2] http://en.wikipedia.org/wiki/Administrative_normal_form Entry: Delimited Continuations and Staging Date: Mon Sep 14 11:27:48 CEST 2009 Avoiding monads using delimited continuations[1][2].
In the latter, ``scope extrusion'' (the possibility of bringing variables outside of their scope using assignments) is illustrated using the following MetaOCaml example: let r = ref .<1>. in .<fun y -> .~(r := .<y>.; .<()>.)>.; !r => .<y_1>. Inside the escape .~( ) a code value is assigned to the reference r and referenced outside of the quotation .< >. to be returned as the value of the whole expression. This code value is ill-formed: the variable y_1 (y renamed) is no longer bound. Arbitrary shift/reset will give similar problems: variables can be transported outside of their scope. The paper suggests an approach where shift/reset is still used, but escapes are limited up to the binding site. So the thing to figure out is how this translates to scheme: is it worth limiting control effects (since we have no typing, but do want to have correct scoping). [1] http://lambda-the-ultimate.org/node/3112 [2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf Entry: Proof Assistants / Constructive Analysis Date: Tue Sep 15 21:05:47 CEST 2009 Proof assistants have a small ``trusted core'' that's manually verified, which is then used to bootstrap other theories. Correctness can be proven _mostly_ in the system itself (but of course not fully, due to Goedel's incompleteness). An interesting remark about types vs. set theory: in set theory one can ask the question: is the set \pi an element of the set \sin? (De Bruijn) The answer depends on the encoding. You need a layer of type theory over the set theory. Modern proof assistants are based on type theory directly: it serves both as a foundation of mathematics and as a programming language. Exact reals scream for constructive logic because there is no zero-test. You want to avoid taking the exact decision P or not-P while programming. However for naturals, rationals or other decidable structures you do have this decision. Programming in type theory used for real programs: Leroy built a machine-verified compiler for Cminor (subset of C) to machine language. The idea is to extend this to Coq (built on Ocaml (built on C)). [1] http://videolectures.net/aug09_spitters_oconnor_cvia/ Entry: Sparse Conditional Constant Propagation Date: Fri Sep 18 08:45:58 CEST 2009 These 3 are very related: - constant folding (1 + 2 -> 3) - constant propagation (A = 3; B = 4; A + B -> 3 + 4 -> 7) - function inlining (f x y = x + y; f a b -> a + b) Abstract interpretation can do constant prop+fold and function inline all at the same time, as long as all operations / functions have a staged behaviour. In a functional (SSA) setting, this isn't such a fuss. Mutation however complicates matters. The Wikipedia page about constant folding[1] talks about reaching definition[2] analysis, which in SSA is of course trivial. So, sparse conditional constant propagation[3] uses abstract evaluation of SSA form: ``The crux of the algorithm comes in how it handles the interpretation of branch instructions.'' Basically, conditional branches depending on known data can be picked at compile time. [1] http://en.wikipedia.org/wiki/Constant_folding [2] http://en.wikipedia.org/wiki/Reaching_definition [3] http://en.wikipedia.org/wiki/Sparse_conditional_constant_propagation Entry: Controlling Effects Date: Fri Sep 18 15:25:29 CEST 2009 Filinski PhD: representing monads w. delimited control[1]. [1] http://www.diku.dk/hjemmesider/ansatte/andrzej/papers/CE.ps.gz Entry: Algebra of Programming Date: Sun Sep 20 10:27:49 CEST 2009 Oege De Moor[1] and Richard Bird[2], their book[3] and a LtU thread[4].
That thread contains some interesting links. Looks like this is the place to start for getting some more information on the subject. I also asked John Nowak what he's up to, since it seems to be related to [3]. The thread[4] mentions that Oege stopped pursuing this line of research because it is too abstract. In any case, a bit more knowledge of category theory would help. See Maarten Fokkinga[5]'s [6]. [1] http://www.comlab.ox.ac.uk/people/oege.demoor/ [2] http://www.comlab.ox.ac.uk/people/Richard.Bird/index.html [3] isbn://013507245 [4] http://lambda-the-ultimate.org/node/1117 [5] http://wwwhome.cs.utwente.nl/~fokkinga/ [6] http://www.cs.utwente.nl/~fokkinga/mmf92b.pdf Entry: Galois Connection Date: Sun Sep 20 10:51:08 CEST 2009 A Galois connection is a kind of morphism between posets. Starting with Lecture 10b [1] from Cousot's course at MIT[2]. The basic property is expressed as: a(x) [= y iff x <= g(y) Galois connections are interesting when they are used to relate sets of different ``sizes''. Let's take a to be the abstraction operation which maps the larger poset P,<= to the smaller poset Q,[=. Page 112 in the slides, (page 28 in the pdf/4) has a picture of a Galois connection. From this, the point is that while a roundtrip g . a can shift around elements in P, it will not change the order relation <= in P. I.e. some elements in P are promoted (or dually demoted), but never in a way that they switch order. In other words, g . a is extensive. This is captured in the following theorem: From the properties: a,g monotone, a . g reductive, g . a extensive follows that a, g form a Galois connection. f extensive: x <= f(x) f reductive: f(x) <= x Notes: * as with most things related to posets, each GC has a dual which reverses the order relations in P and Q and exchanges the roles of a and g. * GCs compose using ordinary function composition of the a's and g's (as opposed to Galois correspondences) * GCs can be combined as sums (linear, disjoint, smashed), products and powers. [1] http://web.mit.edu/16.399/www/lecture_10-maps1/Cousot_MIT_2005_Course_10b_4-1.pdf [2] http://web.mit.edu/16.399/www/ Entry: Software pipelining: An effective scheduling technique for VLIW machines Date: Sun Sep 20 12:08:39 CEST 2009 From the abstract-interpretation based ``flattening'' of matrix operations (Gauss-Jordan elimination) to data flow graphs it dawned on me that ``sorting'' this network might be an interesting optimization for VLIW processors (such as the TI DaVinci / C64x DSP, or NXP TriMedia). Of course, it turns out I have it all backwards: the VLIW architecture was introduced to take advantage of this pattern. From [1] (which I found in a collection[6] of must-read CS papers): ``The key to generating efficient code for the VLIW machine is global code compaction. In fact, the VLIW architecture is developed from the study of the global code compaction technique, trace scheduling''. On this subject Ellis' PhD[4] seems to be the definitive reference, however I can't find an electronic copy. Looks like this book[5] by Fisher et al. might have some answers too. This one[2] is on Trace-Scheduling-2, a non-linear extension. The basic idea in trace scheduling is to optimize the most frequently executed traces by turning them into straight-line code. This in turn allows for plenty of opportunities to find parallelism. So, trace scheduling attempts to _create_ the conditions I mention in the first paragraph, by picking traces that are highly probable. However, it has its problems.
The most important one is exponential code explosion. So, [1] talks about software pipelining, which is an alternative to trace scheduling. It uses hierarchical reduction to straighten branches: a conditional branch is split into streams of conditional code, where the shorter branch is padded to fit the size of the larger one. The rationale is that because jumps are particularly expensive in VLIW (i.e. #FU x pipeline-depth), this approach (disabling units from the unused side of the branch) is often better than performing a jump. [1] http://reference.kfupm.edu.sa/content/s/o/software_pipelining__an_effective_schedu_8310.pdf [2] http://www.hpl.hp.com/techreports/93/HPL-93-43.pdf [3] http://courses.ece.illinois.edu/ece512/Papers/trace.pdf [4] http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2 [5] isbn://1558607668 [6] http://www.cs.utexas.edu/users/mckinley/20-years.html Entry: DSLs as compiler hints Date: Sun Sep 20 16:05:47 CEST 2009 ( In the context of previous post[1] about VLIW optimizations. ) Coming back to the combinator approach: you loose some, because not all programs can be expressed, and you win some, because specification can be separated from implementation. An essential part is that you fix the specification of the solution on a higher abstraction level such that the compiler doesn't need to _infer_ properties of your solution (to choose a different implementation). By using a high-level description, properties can be made explicit, independent of the meaning (correctness) of the program (i.e. as ``aspects''). These could then be used by a compiler to optimize over: it can concentrate on searching instead of spending time on inference. The real problem however is to find a good collection of combinators (the ``DSL''), and the specification of an escape hatch to lower, more general purpose levels. Here _good_ means that it can express most of the solutions, and provides a good parametrization of possible implementations. [1] entry://20090920-120839 Entry: Theorems for free! Date: Sun Sep 20 20:07:47 CEST 2009 Wadler's free theorems[1] and the algebra of programs. A remark that caught my attention: ``in general the laws derived from types are of a form useful for algebraic manipulation''. (I.e. push `map' through a function). These theorems depend on _parametricity_ : because of the _hole_ in the specification, it can't do much else than be about _structure_. The theorems then reflect structural theorems (i.e. commutation laws). [1] http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps Entry: The Design and Implementation of Typed Scheme Date: Mon Sep 21 09:45:47 CEST 2009 Based on occurence typing: assigning distinct subtypes based on the control flow of the program. This is based on the observation that Scheme programmers often use flow-oriented reasoning: distinguishing types based on prior operations. [1] http://www.ccs.neu.edu/scheme/pubs/popl08-thf.pdf Entry: Understanding Expression Simplification Date: Mon Sep 21 12:03:33 CEST 2009 .. in the light of Minimum Description Length. [1] http://www.cas.mcmaster.ca/~carette/publications/simplification.pdf Entry: Dynamic Programming Date: Mon Sep 21 12:54:33 CEST 2009 Subdivide and memoize to avoid exponential explosion. What I never realized is that Recursive Least Squares (RLS) falls in this category, and in general all ``update'' stream-based algorithms. In [2] a technique is mentioned that centralizes memoization in a y-combinator. For more about this see [3]. 
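To make the memoizing-fixpoint idea concrete, here is a small Haskell sketch of my own (it is not the code from [2]): the recursion is written ``open'', so the fixpoint combinator alone decides whether subproblems are shared.

-- Open-recursive Fibonacci: the recursive call is a parameter.
fibOpen :: (Integer -> Integer) -> Integer -> Integer
fibOpen _    0 = 0
fibOpen _    1 = 1
fibOpen self n = self (n - 1) + self (n - 2)

-- Plain fixpoint: every subproblem is recomputed, exponential time.
fixPlain :: ((a -> b) -> a -> b) -> a -> b
fixPlain f = f (fixPlain f)

-- Memoizing fixpoint: results are shared through a lazy table, so each
-- subproblem is computed only once.
fixMemo :: ((Integer -> Integer) -> Integer -> Integer) -> Integer -> Integer
fixMemo f = get
  where get n = table !! fromInteger n
        table = map (f get) [0 ..]

-- fixMemo fibOpen 50 returns quickly; fixPlain fibOpen 50 does not.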
[1] http://en.wikipedia.org/wiki/Dynamic_programming [2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf [3] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.495&rep=rep1&type=pdf Entry: Partial Evaluation for the Lambda Calculus Date: Mon Sep 21 14:03:08 CEST 2009 [1] http://eprints.kfupm.edu.sa/20847/1/20847.pdf Entry: Constraint Programming from CTM Chap. 12 Date: Tue Sep 22 12:29:12 CEST 2009 Combination of propagation and search. Try local deductions first, and solve the rest with search. Choice can be introduced by transforming a constraint program P into P^C and P^(~C). Local deductions are implemented as propagators. ( Equations, i.e. sets of linear equations, are propagators that can be turned into functions, but there is a more general relational framework. ) Choice is inserted implicitly using a heuristic ``distribution strategy''. Entry: Staging & Typing Date: Wed Sep 23 11:51:03 CEST 2009 This one[1] should be an eye-opener, next to the first paper on tagless interpreters[3]. [1] http://okmij.org/ftp/Computation/staging/metafx.pdf [2] http://lambda-the-ultimate.org/node/2575 [3] http://lambda-the-ultimate.org/node/2438 Entry: FISh & Squigol Date: Wed Sep 23 10:20:53 CEST 2009 Functional = Imperative + Shape[1][2]. Another one of those little gems to read before attempting the DSP / vector extension to Staapl. And a tutorial on squigol[3]. [1] http://www-staff.it.uts.edu.au/~cbj/FISh/index.html [2] http://linus.socs.uts.edu.au/~cbj/Publications/latest_fish.ps.gz [3] http://ti.arc.nasa.gov/m/profile/ttp/squigol.pdf Entry: Recent Scheme papers from NU Date: Thu Sep 24 14:43:15 CEST 2009 [1] http://www.ccs.neu.edu/scheme/pubs/ Entry: Scheme implementation Date: Thu Sep 24 15:06:09 CEST 2009 Following advice from [1], here is Kranz's PhD about Orbit[2][4] and Dybvig's PhD about implementing Scheme[3]. For the run-time side: David Gudeman about representing dynamic typing [5]. [1] http://news.ycombinator.com/item?id=835020 [2] http://repository.readscheme.org/ftp/papers/orbit-thesis.pdf [3] http://www.cs.indiana.edu/~dyb/papers/3imp.pdf [4] md5://3a9e0bba8f636d5a9fcdd3d19fc09216 [5] ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/typeinfo.ps.gz Entry: Pico Date: Fri Sep 25 16:48:38 CEST 2009 I'm looking back at Pico[1], a language developed at prog@vub by De Meuter and d'Hondt. A ``lisp for mere mortals''. [1] ftp://prog.vub.ac.be/Pico/Docs/LispWS.pdf Entry: Logic Date: Sun Sep 27 17:28:14 CEST 2009 If first order logic can be used to create structured domains (the space where the predicate parameters live), propositional logic represents just structure, and isn't `about' anything. What does higher order logic represent? It allows quantification over predicates and higher order types. [1] http://en.wikipedia.org/wiki/First_order_logic Entry: Formal methods Date: Sat Oct 3 20:33:50 CEST 2009 SPIN/Promela and TLC/TLA+ by Jose Faria. The paper[1] compares two formal specification tools based on a case study: an algorithm to deal with non-blocking linked-lists based on the compare-and-swap (CAS addr old new) instruction. CAS atomically compares the contents of a location addr with an old value and replaces it with a new value in case of a match, returning a boolean to indicate whether the substitution took place or not.
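As a way to pin down that specification, a Haskell sketch of my own (modelling CAS on an IORef; the real thing is a single atomic machine instruction, and the paper works with Promela/TLA+ models of it):

import Data.IORef

-- (cas addr old new): atomically compare the contents of addr with old;
-- on a match store new; report whether the substitution took place.
cas :: Eq a => IORef a -> a -> a -> IO Bool
cas addr old new =
  atomicModifyIORef' addr $ \cur ->
    if cur == old then (new, True)
                  else (cur, False)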
This primitive makes it possible to detect significant conflicts in the case of list operations: some other process might have modified the list in such a way that it is incompatible with the intermediate state of a list operation, in which case the whole operation can simply be restarted based on the result of the CAS. TLA+[4] is a specification language for describing and reasoning about asynchronous nondeterministic concurrent systems. TLC is an explicit-state model checker for specs written in TLA+. A spec in TLA+ is summarized in a single formula that describes a state machine in terms of an initial condition, a next state relation and possibly some liveness conditions. Promela is the specifically developed input language of SPIN, an on-the-fly explicit-state model checker (as is TLC; TLC, in contrast, was designed after TLA+). The Wikipedia page on linear temporal logic (LTL)[5] summarizes the useful properties as the ability to express safety (something bad never happens) and liveness (something good keeps happening). [1] http://www.openlicensesociety.org/docs/FMethodsReport_ComparisonTLASpin.pdf [2] http://en.wikipedia.org/wiki/Promela [3] http://en.wikipedia.org/wiki/Temporal_logic [4] http://en.wikipedia.org/wiki/Temporal_logic_of_actions [5] http://en.wikipedia.org/wiki/Linear_temporal_logic Entry: Funmath Date: Sun Oct 4 10:16:46 CEST 2009 The previous article led me to funmath[1]. Funmath stands for Functional mathematics (for other uses of the name, look here). The underlying principle consists in defining mathematical concepts as functions (hence the name) whenever doing so is appropriate. This turns out to be especially convenient where it has not yet become common practice. The idea is to build a defect-free notation for mathematics. It reminds me of what Sussman is up to with his recent work on Scheme + differential geometry. By formalism we mean a framework for reasoning comprising two elements: (a) a symbolic language or notation, (b) rules for symbolic manipulation. (a) The language is usually characterized by its form, typically specified by a formal syntax, and its meaning, typically specified by a (denotational) semantics. (b) The rules are typically specified by a formal system, which can be seen as the axiomatic semantics of the language if we borrow the term from programming languages. Gries advocates calculational reasoning. This means that logical arguments are presented as symbolic calculations, stepping from one equation to the next using appropriate rules, and linking them by (in)equalities. [1] http://www.funmath.be/ [2] http://www.funmath.be/LRRL.pdf [3] http://www.cs.utexas.edu/~EWD/ewd10xx/EWD1073.PDF Entry: The Two Towers. Date: Sun Oct 4 11:54:03 CEST 2009 Two important things happened to me in the course of the last 10 years. Rising from the puddle of asm and C, I discovered Forth and its algebraic nature, and Scheme, lambda calculus, macros, types, ... The practical pillar was compilation. Through manipulation of language as just another data object, a different kind of math was brought within my reach - quite different from the isolated world of linear algebra and calculus useful for numerical applications. Instead of math being something that's done on paper, it came to life in my hands through the manipulation of code objects. This took a couple of years to really sink in.
Even though the principle is simple - formal languages, axioms and inference rules, semantics represented by functions mapping language objects to other objects - there is an incredible range of possible ways to bring mathematical structures to physical computing machinery. So is logic the ultimate programming language? I guess it depends on what your goals are. Currently I see really only two camps in software development: the mathematical / logical camp which limits power by introducing formal abstractions that have provable or verifiable properties, and the biological camp which uses evolutionary techniques to approach correctness (and specification!) by removing constraints and looking only at observable behaviour. I guess the point is something like this: since computers themselves behave mathematically (within certain bounds), do you want to propagate this kind of exactness to very high level claims (proofs in logic), or are you more interested in using the computer to give you a system that implements limitless power (reflection / programmable semantics)? Maybe, stretching it a bit, the former could be called the tree / directed graph approach (proof / transport of truth), while the other is the full graph approach (connected objects, the internet model). Entry: Signal Processing Functions, Algorithms and Smurfs: The Need for Declarativity Date: Sun Oct 4 13:06:26 CEST 2009 Following the links on [2] brings me to a paper[1] by Boute ``Signal Processing Functions, Algorithms and Smurfs: The Need for Declarativity''. This is brilliant stuff. ``The main cause [of decline of declarative thinking] in DSP is a shift from essentially declarative mature engineering formalisms to ``algorithmic thinking'' induced by computer implementation, ignoring the declarative mathematical methods for software.'' Boute mentions SILAGE[3], a DSP dataflow language. Our own research over the past 15 years is also aimed at unifying EE and CS, starting with mathematical modeling and reasoning. I have a new hero. Looks like I need to shut up and read for a while. Most of my plans for Staapl's DSP language are probably best put in this framework. [1] http://www.funmath.be/SmurFinl.pdf [2] http://www.funmath.be/LRRL.pdf [3] http://www.cosic.esat.kuleuven.be/publications/article-756.pdf Entry: Coq & Dependent types Date: Wed Oct 7 09:53:16 CEST 2009 Let's pick up again at Coq and dependent types. Entry: Linear types Date: Wed Oct 7 10:04:35 CEST 2009 I'm having a look at this survey[1] about linear types, regions and capabilities. At any point in time, the heap consists of a linear upper structure (a forest), whose leaves form the boundary with a nonlinear lower structure (an arbitrary graph). It is possible to lift this requirement using _focus_. Wadler noted that this is extremely restrictive. ... This leads to an explicitly threaded programming style (i.e. `uncons' and a linear stack of values) which is heavy and over-sequentialised. Temporary aliasing is possible, provided there is only one remaining pointer when a variable recovers its original linear type. (Wadler's ad-hoc `let!' is essentially the style in which the PF primitives are implemented.) Apparently there is a cleaner idea hidden. Only state has to be linear, a state transformer can safely be nonlinear. Monads are a language design in which state is implicit and only state transformers are first-class values.
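To make that last remark concrete, here is the usual state monad written out by hand (standard Haskell, not the survey's formalism): client code only ever manipulates first-class state transformers of type State s a, while the state itself stays inside the implementation, which is the only part a linear type system would need to check.

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

get :: State s s                  -- the only access to the hidden state
get = State $ \s -> (s, s)

put :: s -> State s ()
put s' = State $ \_ -> ((), s')

tick :: State Int Int             -- a client: composes transformers only
tick = do n <- get; put (n + 1); return n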
In principle one could type-check a monad's implementation using a linear type system (it allows in-place updates) and type-check its clients using a standard type system. Linearity is meant to enforce the absence of aliasing. Regions are intended to control aliasing. Then, Baker's regions provide some annotation that can be used to perform 1) collection at function exit and 2) in-place update for intermediate data. Then there is some material about `letregion': regions that coincide with lexical scope, which with proper escape analysis can be implemented as a stack. This then leads to type-and-effect[2] systems. The calculus of capabilities is a low-level type-and-effect system. Then it becomes a bit too abstract. I follow the general idea though: distinguish pointers (shared environment) from the right to deallocate or dereference (linear capabilities). Skipping to the practical: Cyclone, a ``safe C'', provides fine grained control over allocation/deallocation without sacrificing safety. Cyclone is a complex programming language. Simplified calculi describe its foundations and discuss interesting connections between linearity, regions and monads. [1] http://lambda-the-ultimate.org/node/3581#comment [2] http://en.wikipedia.org/wiki/Effect_system Entry: Component Based Software Engineering Date: Fri Oct 9 16:44:58 CEST 2009 I'm trying to reconnect to mainstream OO-world terminology. The discipline of component-based software engineering[1] talks about interfaces and dependencies. Translating to PLT Scheme's large scale compositional structure (ignoring the small-scale class / inheritance / mixin functionality) this reflects two parts. Both require and provide interfaces and represent compilation units - units: cyclic, no macros - modules: acyclic w. macros Presence of macros makes module compilation follow the dependency graph (functionality defined in one module might influence language semantics of another). Module dependencies follow a directed acyclic graph. Units are more like the components in CBSE. [1] http://en.wikipedia.org/wiki/Component-based_software_engineering Entry: OpenComRTOS Date: Mon Oct 12 11:49:06 CEST 2009 Best point of entry is the white paper[1]. The kernel is communication based, and has several semantic levels: L0 Priority based pre-emptive multitasking (packets & ports). L1 Higher level RTOS: semaphores, events, queues, resources. L2 Dynamic features: code mobility, ... The system is modular with kernel & drivers implemented as tasks. [1] http://www.altreonic.com/sites/default/files/Whitepaper_OpenComRTOS.pdf Entry: Java vs. Other Date: Wed Oct 14 14:14:57 CET 2009 * primitive types Primitive types are not objects, however they can be wrapped as such. int -> Integer float -> Float double -> Double * Java and functional programming Maybe Java generics[1] are the best place to link FP and Java. Generics are FP-style parametric polymorphism (PP). This is different from inheritance based polymorphism. I.e. the most general container without PP needs the least common denominator: the universal reference type `Object'. This however does not enforce that all elements in the container are of a more specific type. Generics allow a solution to that by providing a generic template that can then be specialized to a more specific type. This is akin to C++ templates, except that Java generics type-check in _parametric_ form, and all their specializations are well-typed, in contrast with C++ templates which type-check in expanded form using a heuristic type checker.
* OO vs FP in context of Java FP: - algebraic data types somewhat analogous to class hierarchies - parametric polymorphism abstract collections - higher order functions parameterize computational processes The main difference is that in class-based, single dispatch OO, methods are _tied_ to classes, while in FP this coupling is less strict. [1] http://www.javaworld.com/jw-02-2000/jw-02-jsr.html?page=2 Entry: Linearizability Date: Sat Oct 24 10:28:13 CEST 2009 From [1] Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be give by pre- and post-conditions. Maybe better is The art of multiprocessor programming[2], chapter 3 about Concurrent objects. Quiescent Consistency: any time an object becomes quiescent, then execution so far is equivalent to some sequential execution of the completed calls. I.e. A--- B---- C-- D-- .. . .. This can mean all possible permutations of A,B,C followed by D, because there is no quiescent time among the invocations of A,B,C but all of them are separated from D. QC doesn't necessarily respect program order (= single thread's sequential order) Sequential Consistency: In any concurrent execution, there is a way to order the method calls sequentially such that they (1) are consistent with program order and (2) meet the object's sequential specification. SC is _not_ composable. QC and SC conditions are incomparable. Linearizability: each method call should appear to take effect instantaneously at some moment between its invocation and response. L => SC. [1] http://www.cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf [2] isbn://0123705916 Entry: Language oriented development Date: Tue Oct 27 11:16:52 CET 2009 As expresses in [1]: ``As a result of my academic and professional training i have come to rely heavily on types as a development discipline. In fact, if i cannot devise a sensible type "algebra" for a given (application) domain then i feel i don't really have a good understanding of the domain. One way of seeing this from the Schemer point if view is that the deep sensibility embodied in the Sussman and Abelson book of designing a DSL to solve a problem is further refined by types. Types express an essential part of the grammar of that language.'' A strange pattern emerges when you think of it in this way: 1. ML and its algebraic data types (ADT) designed as a meta language, a system to represent another formal language. 2. Haskell: Types and lambda calculus as the only vehicle for writing any kind of computer program. 3. Thinking about programming as writing a programming language for describing a problem. It's so simple and straightforward. [1] http://www.haskell.org/pipermail/haskell-cafe/2008-April/041239.html Entry: LL and LR parsing vs. binary protocol design Date: Mon Nov 9 08:21:38 CET 2009 Both LL(k) and LR(k) have finite lookahead and no backtracking. This means that the parsing decisions need to be made based only on the next k input symbols. LL is top-down (recursive descent) and LR is bottom-up (recursive ascent). In their simplest forms (only one state?) they correspond to a prefix and a postfix language. This is useful for building serialization protocols optimized for minimal parser complexity, i.e. to run in hardware or small 8-bit uCs. I've determined 4 important design decisions[1] for a protocol: - delimited vs. 
prefixed token stream - representing quotation + construction tokens - bottom-up (postfix) or top-down (prefix) structure - constructor arity tagging [1] entry://../libprim/20091107-113002 Entry: Peter Landin Date: Mon Dec 21 22:17:56 CET 2009 - functional programming languages - domain-specific languages - syntactic sugar - SECD machine - function closures - program closures (continuations) - streams - connection between streams and coroutines - delayed evaluation - partial evaluation - circularity to implement recursion (tying the knot) - graph reduction - sharing - strictness analysis - where expressions - disentangling nested applications into where expressions [1] http://www.vimeo.com/6638882 Entry: Two kinds of optimizations Date: Sun Jan 3 13:01:13 CET 2010 Let's see if I can find the quote again: There are only two kinds of optimizations: * Not performing the work (yet), i.e. performing it lazily at run-time, or eliminating it at compile-time. * Performing the work only once and reusing the result. I.e. run-time memoization and compile-time evaluation. I think this was attributed to Mich Wand by Dave Herman, but I can't find the reference. Entry: Java and CPS vs callbacks. Date: Tue Jan 5 10:45:52 CET 2010 I'm working on Android lately, which has a lot of asynchronous message passing going on. Using this without anonymous classes is a pain: the alternative is to extend the calling class with callbacks implementing a particular callback interface. It's much easier to use anonymous objects. This is essentially CPS: call a function, and provide a context it needs to invoke whenever it sends it reply. What this really shows me is the arbitraryness of designing with objects and classes. I think I understand why ``patterns'' are so big in OO: they are essentially an informally specified set of rules to adhere to to not get bogged down in mind numbing low-level decisions. However, the patterns are in the design doc, not in the source code, and the programmer is supposed to recognize them, looking past the boilerplate code. In Functional programming this is less so. It seems that it's easier to abstract away boiler plate code: just add yet another higher order function. Entry: Avi Bryant: Don't build stuff for developers Date: Sat Jan 9 00:27:33 CET 2010 If you want to use all the cool stuff, don't build stuff for developers, because they will get in your face about it. [1] http://2010.cusec.net/01-08/from-cusec-2009-avi-bryant-bad-hackers-copy-great-hackers-steal/ Entry: Applicative Functor Date: Sun Jan 10 11:40:33 CET 2010 To make things more intuitive I'm calling the parameterized data types that are members of the type class Functor and Applicative ``collections''. (I find a fixed-size array most intuitive.) Functor: A functor is a collection that supports an operation `fmap' which maps A SINGLE transformation of elements to a transformation of collections. class Functor f where fmap :: (a -> b) -> f a -> f b Applicative: An applicative functor is a collection that supports the operation `<*>' which maps A COLLECTION of transformations to a transformation of collections. class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b In addition a function `pure' is required that wraps an element into a collection. The operations need to satisfy some laws: pure id <*> v = v -- Identity pure (.) 
<*> u <*> v <*> w = u <*> (v <*> w) -- Composition pure f <*> pure x = pure (f x) -- Homomorphism u <*> pure y = pure ($ y) <*> u -- Interchange In [3] it is mentioned that this can be used for side-effects -- hence the name `pure'. I don't quite get that. [1] http://en.wikibooks.org/wiki/Haskell/Applicative_Functors [2] http://learnyouahaskell.com/functors-applicative-functors-and-monoids [3] http://www.soi.city.ac.uk/~ross/papers/Applicative.html Entry: Recursive make Considered Harmful Date: Tue Jan 12 08:38:55 CET 2010 The general idea: don't (artificially) break up the dependency graph in separate components. If there are _any_ dependencies between different components of a project, a Makefile best describes these in a central way. What I've learned: some key tricks are based on the fact that make is _string based_, and that := and = have different meaning: the first is strict expansion - it is evaluated immediately, and the second is deferred expansion - it saves the string literally, expanding only when triggered trough strict expansion or evaluation of a rule. From [2]: Basically, what's needed to tackle these issues is a variable that tracks the 'current directory' while the source tree is traversed and makefile fragments are included. This variable can then be used in describing dependency relations in a relative fashion, and in the include path for the compiler in build recipes. As far as I understand: using strict assignment (:=) you create some mutables state during inclusion of rule files. Any rules that use these variables get immediately expanded to strings and stored to build the dependency graph. [1] http://miller.emu.id.au/pmiller/books/rmch/ [2] http://www.xs4all.nl/~evbergen/nonrecursive-make.html [3] http://aegis.sourceforge.net/auug97.pdf Entry: Beautiful differentiation Date: Wed Jan 13 08:37:20 CET 2010 I'm trying to understand this[1] magnificent paper by Conal Elliott. Click trough for a video presentation from ICFP 2009. The basic idea is that by using an abstract, recursive definition of the chain rule and some clever function overloading, it is possible to implement AD using very general high level code. The code that can be found here[2] doesn't seem to have the most general version of the chain rule (multiplication replaced by composition with linear map). Or do I miss something? From Dif.hs: -- The chain rule infix 0 >-< (>-<) :: (Num a) => (a -> a) -> (Dif a -> Dif a) -> Dif a -> Dif a f >-< d = \ p@(D u u') -> D (f u) (d p * u') [1] http://conal.net/papers/beautiful-differentiation/ [2] http://conal.net/blog/posts/beautiful-differentiation/ Entry: Can functional programming be liberated from the von Neumann paradigm? Date: Fri Jan 15 08:02:42 CET 2010 I like some of Conal Elliott's ideas, especially his critique on IO in Haskell[1]. The post is about composability. I.e. a "program" with IO is an artificial notion. Really, the unit of composition on that level should be the module (a bunch of functions and data structures augmented with static meaning). Can the "program" be eliminated? I love this: Roly Perera noted that ... you never really need to reach `the end'. (It really is all about composition.) To extend an example in the previous post, after numbers, then strings, and then pixels, are phosphors the end? (Sorry for the obsolete technology.) After phosphors come photons. After photons comes retina & optic nerve. Then cognitive processing (and emotional and who-knows-what-else). Then, via motor control, to mouse, keyboard, joystick etc. 
Then machine operating system, and software again. Round & round. More importantly, our interactions with other wetware organisms and with our planet and cosmos, and so on. Roly added: What looks like imperative output can be just what you observe at the boundary between two subsystems. Which is exactly how I look at imperative input/output. [1] http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm/ Entry: Functional programming with GNU Make Date: Tue Jan 19 08:24:53 CET 2010 GNU make's "call" is Scheme's "apply". However, I don't see if lexically nested procedures are possible. I guess this would require proper quoting/unquoting (functions are represented as strings). More specifically VAR = value <-> (define VAR (lambda () value)) VAR := value <-> (define VAR value) The former is a recursively expanded, while the latter is a simply expanded variable. The analogy with a lambda thunk isn't completely correct, as it depends on how is applied. I.e. VAR = ... $(1) ... $(2) ... <-> (define VAR (lambda (v1 v2) ... v1 ... v2 ...)) To invoke : $(call VAR,arg1,arg2) [1] http://lambda-the-ultimate.org/node/85 [2] http://okmij.org/ftp/Computation/Make-functional.txt [3] entry://20100112-083855 Entry: Common expression elimination Date: Fri Jan 22 14:48:58 CET 2010 Here is a pattern I ran into today. It is somewhat related to loop exchange and memoization. Translated to a Scheme program transformation problem it is: (begin (let ((a 1) (b 2) (c 3)) ...) (let ((a 1) (b 7) (c 19)) ...) (let ((a 1) (b 5) (c 3)) ...)) -> (let ((a 1)) (begin (let ((b 2) (c 3)) ...) (let ((b 7) (c 19)) ...) (let ((b 5) (c 3)) ...))) I.e.: if all the bindings of `a' are the same, pull them out. This is useful for presenting a hierarchical view of a database table. The solution is straightforward, especially if this needs to be done over only one level: transpose the nesting and pull out rows that have the same values. Now, the interesting thing is that the table view comes from a different hierarchical nesting that's been flattened as the variables visible in the deepest nesting level. So then this gives a way to invert certain nested namespaces into another nesting. Entry: Relational lenses and partial evaluation (generating VMs) Date: Sat Jan 23 11:45:17 CET 2010 Is it possible or feasible to formulate the specification of a machine such that different optimizations can be described and implemented by an automated procedure? I.e. A straightforward example is to use a cons list as a value rib during the evaluation of the expressions in a `let' form. This has the advantage of maximal sharing in the case a continuation is captured during the evaluation of one of the expressions, i.e. in <*> below: (let ((a e_a) (b e_b) ;; <*> (c e_c)) (...)) Using a vector to represent a value rib imperatively is more efficient, but requires a copying operation on let/cc to avoid the mutation to have observable side-effects. The value rib is conceptually a part of the environment. The same argument goes for the activation stack. I think I ran into this before, and the pattern is called "lazy stacks". I guess the advantage of a CPS representation is then that there is only one kind of stack: the environment stack. 
Entry: Convert a static library to a shared library Date: Sun Jan 24 13:18:52 CET 2010 Suppose foo.a is made of bar.o baz.o I keep on running into this problem: gcc -shared foo.a -o libfoo.so is not the same as gcc -shared bar.o baz.o -o libfoo.so In the first example, all the objects in foo.a are ignored because nothing depends on them! Entry: Representing control: a study of the CPS transformation. Date: Mon Jan 25 08:22:17 CET 2010 By Danvy and Filinsky[1]. This seems to be an important work to understand where ANF and `shift' and `reset' come from. Main property of CPS term = independence of evaluation order. I.e. it is a sequential program. [1] http://www.cs.tufts.edu/~nr/cs257/archive/olivier-danvy/danvy92representing.ps.gz Entry: Higher order functions in Java Date: Mon Jan 25 10:17:12 CET 2010 The simplest approach seems to be to use a 'forEach' function for each class that lifts a function. I.e. partially applied map/fold. Currently, my main concern is to abstract database queries in Android. There are essentially 3 main strategies to do this: - Eagerly convert to concrete lists/arrays/... - Iterator - universal traversal function (left fold with termination) For database traversal the Iterator abstraction isn't very good as it doesn't have a close() method, which might leak resources. The other two are fine. Inversion of control ala `shift' and `reset' doesn't seem to be straightforward to emulate, so this leaves forEach and lists, which can be generated from forEach. Entry: Type aliases for Java Generics Date: Mon Jan 25 10:33:33 CET 2010 [1] http://stackoverflow.com/questions/683533/type-aliases-for-java-generics Entry: Indulge yourself: Scheme literature Date: Mon Jan 25 16:04:13 CET 2010 From [1]: Indulge yourself: http://library.readscheme.org/ The must reads are Keny Dybvig's thesis, "Three Implementations of Scheme". The original lambda papers can wait until you read the Orbit paper, an optimizing compiler for T by Kranz, Rees et al. The Lisp implementation bibliography pretty much runs through PL research like a vein. Some of the stuff you must read for Lisp are typically in "books"; Christian Quinnec's Lisp in Small Pieces is the most important work, but you will need a good foundation in denotational semantics (you can get by with the one chapter in the little book by Nielson and Nielson, "Semantics with Applications: A Formal Introduction". Somewhere in there you will brush against various compilation methods and IRs for the lambda calculus, most importantly continuation passing style. Most semantics text introduce lambda calculus and its three rules, but none go in depth into this like the tall green book by Andrew Appel, "Compiling with Continuations", a good chunk of which can be read in Appel's other papers. Appel's work is MLish in nature, but don't let that stop you; most optimizing Lisp compilers are MLish down underneath anyway. CMUCL does very good type inference but gets short of implementing a full Hindley-Milner. Felleisen et al's "The Essence of Compiling with Continuations" might also come handy, though it's heavy on the theory. Andrew Kennedy continues the saga with "Compiling with Continuations Continued", this time CPS gives way to A-Normal Form, another IR. He describes the techniques used by a compiler targeting .NET. Most compiling "meat" can be found in the bits-and-bytes type papers. 
Wilson's GC bibliography "Uniprocessor Garbage Collection Techniques" is a must, it should have been called "What Every Programmer Should Know About Garbage Collection". Not to be confused with Richard Jones' "the Garbage Collection Bibliography". Boehm's "Garbage Collection in an Uncooperative Environment" is sheer hacking bravado, perhaps second only to "Pointer Swizzling at Page Fault Time", which should introduce you to memory management for disk-based heaps (i.e. object stores) among other things. Your start in hacking runtimes will probably be David Gudeman's "Representing Type Information in Dynamically Typed Languages"; this is where you learn how stuff looks inside the computer when you no longer need to malloc and free. A previous hacking of a Pascal dialect prepared me for this wonderful paper. Implementations of runtimes are documented by Appel, for SML/NJ, Robert MacLaclahn's "Design of CMU Common Lisp" (also perhaps Scott Fahlman's CMU report on CMUCL's precursor, "Internal Design of Spice Lisp", but that confused the crap out of me as I don't know the machine architecture they're talking about.) You will also enjoy the Smalltalk research starting with L. Peter Deutsch's first optimizing Smalltalk compiler, documented in "Efficient Implementation of Smalltalk-80", follow the Smalltalk lineage btw, all they way up to David Ungar's "The Design and Evaluation of a High Performance Smalltalk System" making sure NOT to ignore Self and its literature, also spearheaded by Ungar (Start your Smalltalk hacking career with Timothy Budd's "A Little Smalltalk", should take you about a weekend and will absolutely prepare you for dynamic languages; a similar system is described by Griswold and Griswold, compiler, intermediate representation and VM, but that one is for ICON.) Dynamic type inference and type-checking (TYPEP and SUBTYPEP, CLASS-OF, INSTANCE-OF, etc) you can learn a good chunk of how CLOS should look like to the runtime system from Justin Graver's "Type-Checking and Type-Inference for Object Oriented Programming Languages". He scratches the surface, and you should supplement this with a selection from Smalltalk and Self, though neither will prepare you for multiple-dispatch, for that peer into Stanley Lippman's "Inside the C++ Object System". I have deliberately avoided "classics" on Lisp, compiler construction, optimization, and other stuff. None of the books and papers I have recommended are as popular as SICP, PAIP, or AMOP. Or even the popular PL books, like EoPL, van Roy and Haridi, both of which you should read by the way, but they're stuff that you need to read and understand to be able to implement a practical Lisp implementation, or at least satisfy your curiosity. More here: http://www.reddit.com/r/programming/comments/9220o/ask_proggit_recommender_a_compsci_paper_for_me_to/ [1] http://news.ycombinator.com/item?id=835020 [2] http://news.ycombinator.com/item?id=834175 Entry: Scheme compilers Date: Mon Jan 25 16:44:09 CET 2010 Slava Pestov[1]: If you compare performance on benchmarks, then Gambit-C and Ikarus are closer to the performance of C, whereas PLT Scheme is a bit faster or equal to Python. I prefer the design of Ikarus over Gambit-C. Compiling to C seems like a big hack on the other hand. Ikarus reminds me of SBCL in a lot of ways, and SBCL's compiler is one of the best dynamic language compilers of all time. Another nice Scheme compiler is Larceny[2]. 
The source is very easy to read, and if you haven't seen a compiler that uses ANF as intermediate representation it's worth checking out. [1] http://www.reddit.com/r/programming/comments/9tek5/were_learning_scheme_in_our_introduction_to/ [2] http://www.ccs.neu.edu/home/will/Larceny/overview.html [3] http://ikarus-scheme.org Entry: Open, extensible object models Date: Sat Jan 30 11:43:26 CET 2010 Everything dynamic. A very nice video presentation here[3], slides and other info here[4]. [1] http://piumarta.com/software/cola/objmodel2.pdf [2] http://piumarta.com/software/cola/ [3] http://www.youtube.com/watch?v=cn7kTPbW6QQ [4] http://www.stanford.edu/class/ee380/Abstracts/070214.html Entry: Name my recursion pattern Date: Sat Feb 13 14:37:50 CET 2010 What is this called: 1. start with a list: [a] and a context c 2. for each a <- [a], map (a,c) -> (a',c') 3. collect [a'] and c' Functor? Applicative functor? Monad? Arrow? a0 a1 ... | | v v s0 -> s1 ... | | v v b0 b1 ... Does it fit in one of the following? class Functor f where fmap :: (a -> b) -> f a -> f b class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b Entry: Recursion and Co-recursion for filters (s,a) -> (s,b) on a list [a] Date: Sat Feb 13 17:54:06 CET 2010 -- CORECURSIVE -- The intermediate/end state s is never observed, so [a] can be infinite. iimap :: ((s,a) -> (s,b)) -> s -> [a] -> [b] iimap fn = f where f s [] = [] f s (a:as) = let (s',b) = fn (s,a) in b:(f s' as) -- RECURSIVE -- State s can be observed, so [a] needs to be finite. This can't be -- written as co-recursion, so we write it as a fold where the results -- are accumulated in reverse. iifold :: ((s,a) -> (s,b)) -> (s,[a]) -> (s,[b]) iifold fn (s,as) = f s as [] where f s [] bs = (s, bs) f s (a:as) bs = let (s',b) = fn (s,a) in f s' as (b:bs) integrate (s,a) = let s' = a+s in (s',s') -- In the RECURSIVE pattern, the fact that the `bs' accumulator is a -- list is irrelevant. In the CORECURSIVE pattern, the fact that the -- result is a list is essential: it is the recursion inside the list -- constructor that makes it possible to present iimap with infinite -- data. Entry: Haskell pointer equality Date: Sun Feb 14 11:29:56 CET 2010 import System.IO.Unsafe import System.Mem.StableName ptrEqual :: a -> a -> IO Bool ptrEqual a b = do a' <- makeStableName a b' <- makeStableName b return (a' == b') termRefEq :: (Eq a) => (Term a) -> (Term a) -> Bool termRefEq x y = unsafePerformIO $ ptrEqual x y Entry: Typeful symbolic differentiation of compiled functions Date: Wed Feb 17 11:20:43 CET 2010 The interesting part about this[1] paper is the `reflect' function: > class Term t a | t -> a where > reflect :: t -> a -> a > > newtype Const a = Const a deriving Show > data Var a = Var deriving Show > data Add x y = Add x y deriving Show > newtype Sin x = Sin x deriving Show > > instance Term (Const a) a where reflect (Const a) = const a > > instance Term (Var a) a where reflect _ = id > > instance (D a, Term x a, Term y a) => Term (Add x y) a > where > reflect (Add x y) = \a -> (reflect x a) + (reflect y a) > > instance (D a, Term x a) => Term (Sin x) a > where > reflect (Sin x) = sin . reflect x ... This is the straightforward emulation of GADT. The function `reflect' removes the `tags' after the symbolic differentiation. Actually, `Sin' is a newtype constructor, so there is no run-time tag to eliminate in this case. Note that the different kinds of terms are types collected in a class, not constructors collected in a type.
This is compile-time type dispatching vs. run-time tag dispatching. This is an interesting trick, and also seems to be at the basis of a lot of later work. So let's have a look at this GADT trick in isolation [2][3]. [1] http://okmij.org/ftp/Haskell/differentiation/differentiation.lhs [2] http://okmij.org/ftp/ML/#GADT [3] http://lambda-the-ultimate.org/node/1293 Entry: Generalized Algebraic Data Type (GADT) Date: Thu Feb 18 08:10:18 CET 2010 From Wikipedia[2]: ... the parameters of the return type of a data constructor can be freely chosen when declaring the constructor, while for algebraic data types in Haskell 98, the type parameter of the return value is inferred from data types of parameters; From HaskellWiki[3]: Generalised Algebraic Datatypes (GADTs) are datatypes for which a constructor has a non standard type. and [4] explains it in mortal terms: why Haskell don't yet supports full-featured type functions? Hold your breath... Haskell already contains them and at least GHC implements all the mentioned abilities more than 10 years ago! They just was named... TYPE CLASSES! Together with the multiparam typeclass extension this gives a way to represent quite powerful type functions. (for "data" constructs) Lack of pattern matching means that left side can contain only free type variables, that in turn means that left sides of all "data" statements for one type will be essentially the same. Therefore, repeated left sides in multi-statement "data" definitions are omitted and instead of data Either a b = Left a Either a b = Right b we write just data Either a b = Left a | Right b ... And here finally comes the GADTs! It's just a way to define data types using pattern matching and constants on the left side of "data" statements! How about this: data T String = D1 Int T Bool = D2 T [a] = D3 (a,a) Amazed? After all, GADTs seems really very simple and obvious extension to data type definition facilities. It seems the main trick that GADTs facilitate is to replace constructor tag matching for sum types with type-level pattern matching. This moves information from run-time to compile time and so can provide more safety. [1] http://lambda-the-ultimate.org/node/1293 [2] http://en.wikipedia.org/wiki/Generalized_algebraic_data_type [3] http://www.haskell.org/haskellwiki/GADT [4] http://www.haskell.org/haskellwiki/GADTs_for_dummies Entry: How to explain Monads Date: Mon Feb 22 15:47:33 CET 2010 Best one until now[1]. Given a couple of functions a -> m b where m is a type constructor, how can one construct a composition of these functions given that the wrapped types b correspond to the input type a of the next function? Solution: Impose two requirements: - m is a Functor, i.e. it has a map function that can lift a -> m b to m a -> m (m b) - there is a function join : m (m a) -> m a that can combine layers The function that computes a Kleisli composition is: (>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c In this picture a Comonad is also simple to understand. Given a couple of functions w a -> b where w is a type constructor, how can one construct a composition of these functions given that the wrapped types a correspond to the output type b of the previous function? Solution: Impose two requirements: - w is a Functor, i.e. it has a map function that can lift w a -> b to w (w a) -> w b - there is a function duplicate : w a -> w (w a) that can add a layer So why does Haskell use bind (>>=) and not join? My guess is that bind allows CPS-style code to look like assignments or list comprehensions. So from a _usage_ point of view, bind seems more natural than join.
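A small sketch of that picture in plain Haskell (standard definitions, just spelling out what [1] describes): Kleisli composition and bind both fall out of fmap plus join, and with bind the chained code reads like a sequence of assignments.

import Control.Monad (join)   -- join :: Monad m => m (m a) -> m a

-- Kleisli composition from the two requirements above: lift with fmap,
-- then flatten the two layers with join.
kleisli :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisli f g = \a -> join (fmap g (f a))

-- bind expressed the same way; this is what (>>=) does.
bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)

-- With bind, the chain reads like assignments (here in the Maybe monad):
example :: Maybe Int
example = Just 3 `bind` \x ->
          Just 4 `bind` \y ->
          Just (x + y)        -- Just 7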
So from a _usage_ point of view, bind seems more natural than join. [1] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html Entry: Applicative programming with effects. Date: Wed Feb 24 21:16:14 CET 2010 I'm reading [1] again. It's about ``pure functions applied to funny arguments.'' Some aha's of a concrete-minded Schemer: * map in Scheme is map, zipWith, zipWith2, ... in Haskell. These need to be different functions as they have different type signatures. * The S and K combinators are `ap' and `return' from the environment (reader) monad. The paper says that S & K are ``designed for this purpose''. That's the first time I hear this. But surely, looking at S indeed it applies proto-function and proto-argument to an environment, and applies the resulting function to the resulting argument. Now the concrete-minded mind needs to take a distance from looking at a functor as a data structure over which one maps a function piecewize, and instead see it as a computation. Best to start with monads, as each monad is a AF. What I don't quite get is this idea of "pure function & effects" where the <*> operator combines effects and the `pure' operator lifts a pure function into the effectful domain. Let's start with sequence :: (Monad m) => [m a] -> m [a] just as in the paper. This function takes a list of computations and produces a list of results, threading the monadic effect. sequence [] = return [] sequence (c:cs) = do x <- c xs <- sequence cs return (x:xs) Which can be written differently as: sequence [] = return [] sequence (c:cs) = pure (:) <*> c <*> sequence cs The paper then generalizes this to `traverse'. The key point being that the recursive call is _inside_ the effectful world. Now I can't bridge this explanation with the type signature of an applicative functor: pure :: x -> a x (<*>) :: a (x->y) -> a x -> a y Probably for the same reason that I couldn't see this for Monads in the beginning. My intuitive confusion there was that I was looking at `a' as a data constructor and a type constructor at the same time. To state the (now) obvious: The two lines above are part of a class definition `Applicative a', where `a' is a type variable of kind * -> *, I.e. a parameteric type with one parameter. The `Applicative' is (like) a predicate on type variables. [1] http://www.cs.nott.ac.uk/~ctm/IdiomLite.pdf Entry: The Actor model is not composable Date: Wed Mar 3 11:23:41 CET 2010 I recently ran into this in practice (in Java): a system with a lot of message passing suddenly needed a sequential behaviour. This then lead me to remove all synchronous message passing and use semaphores instead. It talks about something I've tried to hint at earlier: * FP allows for ``exponential'' expressivity: since everything can compose with everything else, the total number of expressable behaviours grows very fast. * Stateful programming allow for ``linear'' expressivity: you can add stuff, but in general it can't be combined as-is with other code. This is of course a black&white picture, but it seems to be true in spirit: it's harder to reach exponential expressivity in stateful languages exactly because of the inertia present in state -- hidden assumptions as mentioned in [1]. [1] http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html Entry: Arrows Date: Sun Mar 7 10:14:27 CET 2010 To understand arrows is to understand their basic combinators[1]. As a concrete example one could think of arrows as a generalization of functions. 
> instance Arrow (->) where > arr f = f > f >>> g = g . f > first f = \(x,y) -> (f x, y) -- for comparison's sake > second f = \(x,y) -> ( x, f y) -- like first > f *** g = \(x,y) -> (f x, g y) -- takes two arrows, and not just one > f &&& g = \x -> (f x, g x) -- feed the same input into both functions [1] http://en.wikibooks.org/wiki/Haskell/Understanding_arrows Entry: Simply Typed LC is Strongly Normalizing Date: Sun Mar 7 21:19:03 CET 2010 From [1]: Given the standard semantics, the simply typed lambda calculus is strongly normalizing: that is, well-typed terms always reduce to a value, i.e., a lambda abstraction. This is because recursion is not allowed by the typing rules. Recursion can be added to the language by either having a special operator of type (a->a)->a or adding general recursive types, though both eliminate strong normalization. Since it is strongly normalizing, it is decidable whether or not a simply typed lambda calculus program halts: it does! We can therefore conclude that the language is not Turing complete. In Haskell and OCaml the type inferencer doesn't like construction of infinite types: :t (\x -> x x) (\x -> x x) To express the Y combinator in this direct form requires it to be wrapped in a recursive type[3] (quotes from [2]): The problem with fix f = (\x -> f (x x))(\x -> f (x x)) is that one needs a solution to the type equation b = b -> a. Fortunately this can be done with Haskell’s data types. > newtype Mu a = Roll { unroll :: Mu a -> a } > fix f = (\x -> f ((unroll x) x)) (Roll (\x -> f ((unroll x) x))) Of course, this is just an academic exercise. To actually define a fixpoint combinator in Haskell, one would use recursive definitions. I.e. the Y combinator can be defined directly in its recursive form: > y f = f (y f) [1] http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus [2] http://r6.ca/blog/20060919T084800Z.html [3] http://en.wikipedia.org/wiki/Recursive_type [4] http://en.wikipedia.org/wiki/System_F Entry: System F vs. Hindley–Milner Date: Sun Mar 7 22:11:15 CET 2010 Hindley-Milner[2] is a restricted form of System F[2]. Type inference for HM is decidable while for System F it is not. [1] http://en.wikipedia.org/wiki/System_F [2] http://en.wikipedia.org/wiki/Hindley–Milner Entry: Writing compilers Date: Sun Mar 7 22:56:47 CET 2010 The basic idea is that a compiler is a function, i.e. implemented by a collection of rewrite rules, that relates one syntactic domain to another. In order to verify correctness of this ``arrow'' the language needs a formal model - another arrow - to compare it with. comp language -----> machine code | | | definitional | machine code | interpreter | interpreter v v abstract domain (i.e. Haskell functions) That's my idea. How does [1] do it? In each pass (14 passes and 8 intermediate languages) the semantics of the language is defined, and the transformation is proven to preserve the semantics. I am only interested in one pass: the reductions used in the PIC18 instruction selection. It looks like I need a Hoare logic[2] for the machine opcodes, i.e. some kind of typing rules for the state transitions, and how they can be composed. So, my question is: in the spirit of tagless interpreters, is it possible to write down the Staapl PIC18 reduction rules such that: * PIC18 Instruction semantics can be encoded in the type as much as possible. I.e. data dependencies. * Remaining semantics (i.e. add or sub) is encoded in behaviour, and can be verified by quickcheck. 
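Not the PIC18 rules, but to fix the idea, this is the general shape
of a tagless encoding (a toy expression language with my own names,
nothing Staapl-specific): the syntax is a type class, and each
semantics is an instance, so the same term can be run against a
reference interpreter or checked with QuickCheck.

-- Tagless-final sketch: syntax as a class, semantics as instances.
class Expr r where
  lit :: Int -> r
  add :: r -> r -> r
  neg :: r -> r

-- Semantics 1: evaluation.
newtype Eval = Eval { runEval :: Int }
instance Expr Eval where
  lit n   = Eval n
  add x y = Eval (runEval x + runEval y)
  neg x   = Eval (negate (runEval x))

-- Semantics 2: pretty printing.
newtype Pretty = Pretty { runPretty :: String }
instance Expr Pretty where
  lit n   = Pretty (show n)
  add x y = Pretty ("(" ++ runPretty x ++ " + " ++ runPretty y ++ ")")
  neg x   = Pretty ("-" ++ runPretty x)

-- One term, two interpretations:
example :: Expr r => r
example = add (lit 1) (neg (lit 2))
-- runEval example   == -1
-- runPretty example == "(1 + -2)"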
Doing this in Scheme as an alternative to encoding it in Haskell/OCaml type system is probably also possible, but I would need to make a checker/inferencer. [1] http://compcert.inria.fr/doc/index.html [2] http://en.wikipedia.org/wiki/Hoare_logic Entry: State breaks composition Date: Sat Mar 13 12:14:53 CET 2010 More correctly it breaks composition of _function_ and requires the less powerful composition of _instance_, which is a combination of function and state (i.e. an object). In Haskell, stateful computations are modeled as state _transitions_ which keeps the model purely functional. s -> (a,s) One builds a program as a (large!) state transition built from primitive state transitions, and then supplies it with an initial state to set it in motion. This keeps composition by abstracting out state completely. External (physical) state is captured in much the same way by the IO monad[3], which relates function composition and real-world state. This synchronizes external at key points with intermediate nodes in the evaluation of a function network. type IO a = RealWorld -> (a, RealWorld) Notes: * Also in the more theoretical work about complexity theory one uses this machine->network transition. In the first lecture in [2] the link is made between turing machines (TM) and combinatorial networks (CN) by ``unrolling'' the turing machine in time to yield a network. The main point being that by mechanical translation, a polynomial time TM algorithm can be converted into a polynomial size CN. The converse does not seem to be proved, but in practice one can usually find a TM for a CN. (``morally equivalent'') [1] http://en.wikipedia.org/wiki/Flip-flop_(electronics) [2] http://sms.cam.ac.uk/collection/545358 [3] http://www.haskell.org/haskellwiki/IO_inside Entry: Ziggurat Date: Sun Mar 14 14:49:41 CET 2010 The example from fig1+2 in [1]. ;;; Fig 1: Creating classes and objects ;; Real number objects are described by a pair of integers (m . e) ;; where the value x is determined by x = m * 10^e (define real-class (make-top-class)) ;; Integer objects are described by a single integer; to instantiate ;; as a real number, use an exponent of 1. (define int-class (make-class (lambda (x) (make-object real-class (cons x 0))))) ;;; Fig 2: Creating methods (declare-method (num->string n)) (method real-class num->string (lambda (n) (let ((data (view real-class n)) (mant (car data)) (exp (cdr data))) (format "~sE~s" mant exp)))) (method int-class num->string (lambda (n) (let ((snum (view int-class n))) (number->string snum)))) ;; Methods are functions that take the object as an argument. The ;; `view' form returns the internal representation of an object. I don't understand: why does int-class call (make-object real-class ...) while it still has access to the integer? I don't get the paper. I find no point to hook on. Maybe some code and interaction would help? [1] http://www.ccs.neu.edu/home/dfisher/icfp06-ziggurat.pdf Entry: Mathematical Logic and Compilation Date: Wed Mar 17 08:37:48 CET 2010 I'm trying to get some intuition straight about specification by compiler. I don't find what I'm looking for on the web, so I guess it's ``obvious'' ;) In formal mathematics and logic, you have syntax (s) and inference rules (s1,s2,... -| s) that allow the construction of new syntax. To alleviate the tedium of working with such a low-level substrate, one allows the construction of (informal, finitistic) mathematical structure that talks about manipulation of formula and proofs. I.e. 
that talks about the existence of proofs using constructive methods: algorithms to construct a proof. Now, in compiler construction one works the other way around: * One starts with a physical model (i.e. an electronic circuit). This can be abstracted by a mathematical (semantic) model. * The objective is then to derive a formal system (syntax and code transformation rules) and an interpretation such that the semantic model is preserved. I probably need to look at Model Theory[1] and Proof Theory[2]. ( I need to grow some more hair on my chest. ) [1] http://en.wikipedia.org/wiki/Model_theory [2] http://en.wikipedia.org/wiki/Proof_theory Entry: Hardware Mapping Date: Wed Mar 17 10:26:47 CET 2010 Let's start at the meta-level[1]: "What are the important problems in my field?" In an attempt to make things more explicit, what are the important problems, and why am I not working on them? What am I actually doing? What is my main goal? Decouple function from implementation in numerical processing. This is translated to the following problem: 1. express numerical processing in mathematics (the DSL) 2. find a way to express hardware mapping There are plenty of examples of 1, so not much re-invention is necessary. There are plenty of examples of 2 also, but this field is really quite broad, and there are many design decisions to make. All I've been doing in the last couple of years is to learn about languages and compilers, and while I did learn a lot, I'm still struggling with making the target explicit. My conclusion up to now is that the 1/2 distinction is better viewed as a continuum, or at least a sequence of steps: 2 2' 2'' 1 ---> 1' ---> 1'' ----> 1''' I knew for a longer time that I'm building a compiler, or more specifically, a method for building multiple compilers. What I'm starting to see now is that this is all about semantics and proof. When you move down the chain from specification to implementation, you want to preserve meaning, or at least, preserve meaning relative to a certain set of conditions that express approximation. It seems I've been looking through the wrong set of glasses. The aspect to focus in is really _correctness_ and not ease-of-use. Actually, building a tower of languages is easy once you know what problem to solve. Writing DSLs becomes second nature with a bit of practice. Scheme, (Meta)OCaml and Haskell are all quite suited to do the job. However, while providing a lot of structure to eliminate silly mistakes, these tools don't solve the correctness problem: ultimately you need to define the beef as low-level computation (i.e. by pattern matching). The hard part is making sure that you preserve the intended semantics facing a mountain of implementation details. The real problem is managing those details, and replace them with a structure that is ``obviously correct''. I see it in the Staapl PIC18 compiler. It consists of an ad-hoc set of transformation rules that define the semantics of a concatenative language in terms of generation and transformation of machine code. And that is the _only_ thing it does. There is nothing that is somewhat structured to actually describe what the compiler is supposed to do, and under what conditions it breaks. So I'm getting really interested in correctness proofs[2], and in adding static semantics to language towers[3]. Looks like I need to start reading again. 
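To make ``preserving the intended semantics'' concrete, this is
roughly the shape such a check takes for a toy language (hypothetical
names, nothing Staapl- or PIC18-specific): a direct evaluator as the
``obviously correct'' reference semantics, a compiler to a tiny stack
machine, and a QuickCheck property relating the two.

import Test.QuickCheck

-- Toy source language and its reference semantics.
data E = Lit Int | Add E E deriving Show

eval :: E -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b

-- A tiny stack machine as the target.
data Op = Push Int | AddOp deriving Show

exec :: [Op] -> [Int] -> [Int]
exec []             s          = s
exec (Push n : os)  s          = exec os (n : s)
exec (AddOp  : os) (a : b : s) = exec os (a + b : s)
exec (AddOp  : _)   _          = error "stack underflow"

compile :: E -> [Op]
compile (Lit n)   = [Push n]
compile (Add a b) = compile a ++ compile b ++ [AddOp]

-- The pass is correct if the compiled code agrees with the reference
-- evaluator on every generated input program.
prop_compile :: E -> Bool
prop_compile e = exec (compile e) [] == [eval e]

instance Arbitrary E where
  arbitrary = sized gen where
    gen 0 = Lit <$> arbitrary
    gen n = oneof [ Lit <$> arbitrary
                  , Add <$> gen (n `div` 2) <*> gen (n `div` 2) ]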
As for the Staapl PIC18 compiler, I'm trying to assess if testing the
compiler by providing an ``obviously correct'' semantics (a reference
implementation) is good enough. It is definitely more trustworthy to
have a correctness proof, but from a practical point of view, a test
suite with broad coverage might be sufficient.

[1] http://www.chris-lott.org/misc/kaiser.html
[2] http://compcert.inria.fr/doc/index.html
[3] http://lambda-the-ultimate.org/node/3179

Entry: The Arbiter Problem
Date: Fri Mar 19 23:53:06 CET 2010

I'm watching the interview with Leslie Lamport[1]. The recurring
subject is the arbiter problem. Essentially: "which came first" is
not solvable in finite time in general as time differences approach
zero.

Then there's some mention about discrete vs. continuous, and time
differences and frequencies (non-discrete entities) used for
information representation in the brain.

Now this makes an old itch surface. I'm far from being able to
express it, but it has to do with sigma-delta modulators (binary
representation of continuous signals) and cross-modulation of
near-equal square waves where arbitrary short pulses can arise.

[1] http://channel9.msdn.com/shows/Going+Deep/E2E-Erik-Meijer-and-Leslie-Lamport-Mathematical-Reasoning-and-Distributed-Systems/
[2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.4426&rep=rep1&type=pdf

Entry: Clojure
Date: Tue Mar 23 18:07:31 CET 2010

Enlightening video interview [1]. Clojure seems quite interesting for
the following reasons:

- Persistent (shared) immutable data structures implemented using
  tries. The tries make it technically O(log(n)) but the very high
  branching factor makes it close to constant-time in practice.

- Some CAS-based synchronization and transaction operations. I.e. the
  `atom' and `ref' constructs.

- Good interop with JVM libs

- Almost hygienic macros? No mention about this in the video though..

[1] http://channel9.msdn.com/shows/Going+Deep/Expert-to-Expert-Rich-Hickey-and-Brian-Beckman-Inside-Clojure/

Entry: Computation (pattern matching) vs. types
Date: Sat Mar 27 13:34:57 CET 2010

Some more non-obvious obvious stuff (NOOS ??). Ha! I have some NOOS
for you!

Types (in the Haskell sense) give finite, static meaning to your
program. All the other stuff to know about a program is about
1. loops (recursion) and 2. making decisions (conditionals) which are
necessary to get out of loops and produce terminating computations.

One of the things that really struck me when starting to use typed
programming languages is that types abstract away `if' statements.
You don't see their effect in a function type. A program looks a lot
simpler when you can abstract away diverging control. A bit less
obvious to me at that time was that they also abstract away
recursion/loops. You don't see in the types that a function recurses
and so possibly doesn't terminate.

In fact, that is what a type system is: what you can know about a
program (its structure) before running it. Knowing its full behaviour
is `undecidable'; it can't be captured before running:

- "infiniteness" comes from recursive functions acting on recursive
  data types = passing through the same control point more than once.

- "decisions" come from conditionals like pattern matching = having
  run-time state influence future control points.

So "what happens inside types" can in general not be determined by
the type system. Again, to turn this around, the type system says
some limited thing about what happens at run time (see the small
example below).
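A tiny illustration: these two functions have exactly the same type,
but one terminates and the other doesn't. The type is silent about it.

f :: Int -> Int
f x = x + 1        -- terminates

g :: Int -> Int
g x = g x          -- diverges; the recursion is invisible in the type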
This limited thing -- the program's static structure -- is the "type". Entry: Functional programs / stateful debugging Date: Sat Mar 27 20:55:05 CET 2010 After some time with Haskell, I'm thinking and writing Scheme code again. Some state re-appreciation maybe. One of the nice things to have is object pools in the form of weak hashes. When you have state (i.e. objects) it usually makes sense to keep track of them to look at program behaviour on the side. A simple approach is to always place objects of a certain kind in a weak hash table, to pay them a visit and see how their doing, or to inject some alternative states. Entry: Bottom up vs. Top down Date: Sun Mar 28 21:47:44 CEST 2010 As a programmer I am a bottom up person. I like to know the details, and build trustable and simple abstractions from the ground up. I slightly distrust top-down design. In bottom up design, high level design elements usually emerge spontaneously, and it is my impression that it is easier to "fix" a bottom up design by feeding patterns back from top to bottom after they have emerged, than it is to fix a top-down design by scraping together abstractions to hide the structure-less details that are pushed to the bottom. [1] http://reprog.wordpress.com/2010/03/28/what-is-simplicity-in-programming-redux/ Entry: A History of Haskell: Being Lazy with Class Date: Thu Apr 8 12:06:42 EDT 2010 about variable free programming: SPJ[1]: It's a bad idea; backus was wrong. I tried that and I found myself doing a lot of plumbing. Sometimes you really want to name that variable. I think Oege de Moor also left this track calling this too abstract (someone mentioned this on LtU). about specific computer architectures: SPJ[1]: It's a bad mistake: 1. why interpret if you can compile? 2. hardware industry moves so fast that it catches up easily to any specific optimizations. [1] http://research.microsoft.com/en-us/um/people/simonpj/papers/history-of-haskell/ Entry: GUIs and modules Date: Sun Apr 11 10:11:52 EDT 2010 Composability vs GUIs. Libraries are nice and composable, an end-user application isn't. Is there a way to bridge this? Pure Data comes close, though it lacks generic expressiveness. Entry: Databases and Normalization Date: Sun Apr 11 15:26:22 EDT 2010 Some DB questions from the complete noob. Suppose I have a relation where all the variables are strings. What do you call the operation that replaces unique strings by identifiers, and creates a new relation between the identifiers and the strings? The reason to do so would be to reduce storage requirements and reduce query computations, i.e. an identifier could be a 32bit or 64bit number, instead of a larger string key. In a functional store this would be object sharing by using pointers. Is this indexing? ORM? Entry: Functional Reactive Programming Date: Sat Apr 17 10:25:04 EDT 2010 I'm building a (naive implementation of) an FRP evaluator to implement the incremental update logic (compiler cache) of the ramblings formatter for the http://zwizwa.be website. Output events are server http requests, while input events are database (file) changes. Because file changes are infrequent compared to http requests, a "compiled" representation where intermediate data are retained in a cache works best. The amazing part is that, yes, it is really all about composition. And for composition, functions are king. Once you have all logic abstracted as a collection of functions, everything becomes a lot simpler to test individually and to string together. 
The reactive part then becomes a "toplevel" wrapped around a large
collection of pure functions that does the real work. I.e. FP allows
complete separation of the functional and the reactive part. This is
great.

The implementation uses lazy evaluation in the direction of
functional dependencies (data pull) combined with event-driven
invalidation in the reverse direction (data push). In Scheme this can
be implemented using weak hash tables; whenever you apply a function
to reactive values, notify each of the values that a computation
depends on it. This can be done by associating each reactive value
with a weak hash table of values that depend on them. Whenever a
value gets invalidated, it can propagate invalidation to all nodes
that depend on it. The weak table ensures that the GC still works for
reactive values.

The main abstraction then becomes function application, or more
specifically: application of pure functions to reactive values, in
zwizwa/plt/rv implemented as `rv-app'.

Entry: Lambda Calculus for Electrical Engineers
Date: Sat Apr 17 12:21:35 EDT 2010

If you look at the lambda calculus, it only ever talks about:

  * variable introduction or abstraction = make a socket
  * variable elimination or application  = this plug goes in that socket

The fact that it uses variables is really not that interesting and
largely a consequence of paper being a flat medium. I.e. the
"essence" of the LC needs to be embedded into something that can be
written on paper as a flattened graph. First flatten the graph into a
tree by introducing variable names, then flatten the tree to a
sequence of symbols in the usual sense by introducing parenthesis and
precedence rules. This horrible notation really makes it look bad and
hides its true simplicity.

Something clicked for me when I made this "paper serialization"
explicit in the way I looked at the LC, letting me see its intrinsic
beauty: an LC expression represents a directed acyclic graph of
computation modules. Now _that_ is something that should make a lot
of sense to an electrical engineer thinking of wires and amplifiers.

Moral: The idea of "connectedness" that can be expressed by
abstraction and application is tremendously powerful.

There is one problem however. Variables in the lambda calculus
represent computations, meaning the only values are other lambda
terms. In some sense data doesn't exist, as there are no primitive
objects that are not computations themselves. This defies intuition,
at least mine. Computation should map things to things, not
computations to computations, right?

Typed lambda calculi fix this by introducing primitive types, for the
simple reason that without them, terms cannot be annotated with
types. One can argue that due to the existence of primitives and
computations, the simply typed lambda calculus might be the best way
to introduce the lambda calculus in a more concrete way.

Entry: Abstract Machines and Semantics
Date: Sat Apr 17 12:22:28 EDT 2010

The LC is a formal system; there is nothing up the sleeves except for
formula on paper and ways to rewrite those formula. Abstract machines
with environments are a more concrete form used to get something that
behaves like one particular form of LC, the call-by-value lambda
calculus (CBV-LC). Environments mainly allow substitution to be
delayed to improve efficiency. I.e. you move from CBV-LC's "global"
substitution rule to a machine that implements its substitution
operation in a more low-level fashion, one step at a time.
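A minimal sketch of that delayed-substitution idea (a toy
call-by-value interpreter with environments; a real abstract machine
such as CEK would also make the continuation explicit, which is left
out here):

import qualified Data.Map as M

data Term  = Var String | Lam String Term | App Term Term
data Value = Closure String Term Env
type Env   = M.Map String Value

eval :: Env -> Term -> Value
eval env (Var x)   = maybe (error ("unbound: " ++ x)) id (M.lookup x env)
eval env (Lam x b) = Closure x b env
eval env (App f a) =
  case eval env f of
    Closure x b env' ->
      let v = eval env a                -- call-by-value: evaluate the operand,
      in  eval (M.insert x v env') b    -- then bind it in the environment
                                        -- instead of substituting into b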
The machine is still a formal system consisting of formula and ways to rewrite those formula. Once you have such a machine, you can start moving into different directions. One popular direction is to move away from the guiding light of the CBV-LC you started from, and represent side effects such as assignments and continuations because they are essentially right in front of you, as part of the machine representation. What is not so easy to understand is the other way around: how can a particular abstract machine that has been soiled by locally implemented side-effects be re-abstracted? The subject of denotational semantics seems to be mostly about this: how to re-abstract things globally as functions that were easy to express as machines locally. These higher abstraction can then yield some insights by making it possible to prove properties of a machine that are completely intractable in the local machine view. Entry: Stream Fusion Date: Sun Apr 18 11:09:09 EDT 2010 About [1]. Simplified, there is a duality in the way sequences can be approached: as lists data or as streams (an unfolding of the list, the list's co-structure). The natural operation over a list is a fold, while the natural operation over a stream is an unfold. A stream is represented as an initial state and a stepper function. Fusing co-structures: The key trick is that all stream producers are non-recursive. This is established by allowing a stream to produce `Skip' values and moving the recursion to the `fold' part of a pipeline. Converting list ops to stream ops doesn't really perform any fusion on its own. However, it transforms the code into a form that is more accessible to the Haskell deforestation optimizer as it has no unnecessary recursion that ``blocks the view''. [1] http://www.cse.unsw.edu.au/~dons/papers/CLS07.html Entry: Clock Calculus Date: Sun Apr 18 19:48:23 EDT 2010 In the SIGNAL paper[1] it is mentioned that a clock calculus is a projection on the field Z_3: 0 = absence, 1 = true = presence, 2 = false. Weird.. The clock calculus allows then to statically verify the temporal correctness of processes. [1] http://www.springerlink.com/index/Y32277G7L8T61748.pdf Entry: Polymorphy & Functors (lifting) Date: Sun Apr 25 11:04:02 EDT 2010 One of the amazing new views that opened up for me after studying Haskell is the ubiquitous presence of morphisms that take some computation from one domain into a richer domain. Often, this can be combined with type classes, making the lift operator automatic. A class - a collection of operations on constrained types - often is defined for a single concrete base type, with other instances of the class built from composite objects. I.e. number -> vector. Now, combine this with laziness and domains can become infinite (i.e. power series, derivative towers, ...) and a lot of the mathematical objects useful for numerics/DSP can be represented quite directly in an abstract way, to be ``instantiated into'' programs using abstract interpretation. Entry: Object identity in Haskell Date: Tue May 4 16:10:29 EDT 2010 One of the weird concepts in Haskell is that objects have no `intrinsic ID'. Pointer equality (an external map that relates memory addresses to language objects) is a side effect! More specifically: objects do not exist, only values. If values need to be compared for equality, this needs to be explicitly implemented as an Eq instance. 
On more than one occasion I've felt the need to think of values as
objects with a distinct identity, especially thinking about nodes in
a graph, and adding connections. Whenever I run into the problem of
needing node identity, what I really want is binding structure, or at
least some staging/macro step that can create binding structure from

What I learned is that even if it might not be trivial, it is usually
possible to keep nodes as variables and write function abstractions
in Haskell that do the same thing. Essentially, using something akin
to higher-order syntax. This is one of those "deep differences" of
functional programming that take some getting used to. I definitely
need some more practice.

Entry: Memoizing (==) in Haskell
Date: Thu May 6 14:24:14 EDT 2010

Necessity for memoization of (==) pops up when comparing recursively
defined datatypes. This is actually an interesting problem, as
memoization is usually based on (==) in the first place!

From [1] I'm pointed to Hughes' lazy memo functions[2]. The
introduction talks about memoization as a fix-up ingredient for very
high level programming, preserving modularity. The `unique' function
implemented in terms of memoized constructors (hash consing) is
probably what I'm looking for. What I didn't realize is that this can
be defined in terms of a generic `memo' function.

Haskell does seem to have a standard memo function[3].

[1] http://conal.net/blog/posts/memoizing-polymorphic-functions-part-one/
[2] http://www.cs.chalmers.se/~rjmh/Papers/hughes_85_lazy.pdf
[3] http://www.haskell.org/ghc/docs/5.00/set/memo-library.html
[4] http://www.haskell.org/haskellwiki/Memoization

Entry: Data Parallel Haskell
Date: Sat May 8 11:13:15 EDT 2010

Simon Peyton-Jones on Data Parallel Haskell[1]. See paper[2][3]:
Harnessing the Multicores: Nested Data Parallelism in Haskell, Simon
Peyton Jones, Roman Leshchinskiy, Gabriele Keller, Manuel MT
Chakravarty, Foundations of Software Technology and Theoretical
Computer Science (FSTTCS'08), Bangalore, December 2008.

Rationale:
- 1000s of processors: you need data parallelism
- flat datapar is not enough
- nested datapar covers a larger set of algorithms

Key problem: handling nested dataparallel algorithms. Key insight: if
you've got the lifted version f^ (a vectorized version of a function
f, f^ = map f), you can implement the doubly lifted version
f^^ = map f^ in terms of it. The basic idea is:

  f^^ = unconcat . f^ . concat

This is flattening. Practically, the `unconcat' and `concat'
functions do not generate intermediate structure; they can be
implemented in constant time without copying.

Representation of an array needs to depend on the types of its
elements: data families. This is where fast `concat' comes from, as
the _representation_ is concatenated.

For higher order functions defunctionalization is used, such that the
environments can be represented as tuples which reduces them to the
data case.

[1] http://www.youtube.com/watch?v=NWSZ4c9yqW8
[2] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/index.htm
[3] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/fsttcs2008.pdf

Entry: Map + state threading.
Date: Sat May 8 14:14:27 EDT 2010

One of the patterns I use a lot in Scheme is a structure-preserving
recursion (map) over a data structure where some context is updated
as a side effect. I.e. map over a list with threaded state:

  ((state, in) -> (state, out)) -> state -> [in] -> (state, [out])

What is this abstraction called? See also [1].
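It turns out the standard library already has exactly this shape:
Data.List.mapAccumL, modulo currying of the (state, in) pair. A small
sketch (threadMap is just a local name):

import Data.List (mapAccumL)
-- mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b])

-- The pattern above, with the (state, in) pair curried away:
threadMap :: ((s, a) -> (s, b)) -> s -> [a] -> (s, [b])
threadMap f = mapAccumL (\s a -> f (s, a))

-- Using the `integrate' filter from a few entries back:
--   threadMap (\(s, a) -> let s' = s + a in (s', s')) 0 [1,2,3]
--     == (6, [1,3,6])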
As mentioned in [1], it's really just a state monad which can use the fmap function. The important thing is to see the function not as (s,i) -> (s,o) but as i -> s -> (s,o) which is State when the i is partially applied. [1] entry://../meta/20100224-220400 Entry: State monad with unit output Date: Sat May 8 17:47:57 EDT 2010 What's the purpose of a State monad that doesn't produce output, like (s -> ((),s)) ? The way I use it is to use fmap to map (In -> State s) over [In] to get [State s] which can then be sequenced to State s and started with runState. The only thing I'm interested in is the end state. But without output, this is really just a left fold (accumulator) :: In -> s -> s. What's the benefit of wrapping a fold up into a state monad? Monad transformers? It pops up in the Flatten.hs code for graph -> SSA conversion. Maybe this is related: merging monads and folds[1]. It talks about the two schools: fold vs. monads. [1] http://www.springerlink.com/index/768043006044675P.pdf Entry: Left folds in Haskell Date: Sat May 8 19:29:20 EDT 2010 Performing a left fold in Haskell can lead to stack overflows. Therefore it is suggested to use the strict function foldl' from Data.List instead. Does the same problem happen with State monads? [1] http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Data-List.html#v:foldl Entry: Syntax directed vs. Semantics directed (fold vs. monad) Date: Mon May 10 14:24:05 EDT 2010 Merging monads and folds for functional programming[1]. For this Scheme nut & Haskell noob, the more interesting remarks are in the introduction which uses some terminology I wasn't familiar with: folds - syntax directed - organize on input types monads - semantics directed - organize on output types Here "syntax" refers to the structure of the data types. Generalized folds can be constructed systematically by replacing constructors with functions. (I.e. for the list constructor Nil this is a 0-argument function or a value). I briefly skimmed the paper; the basic idea seems to be that it's possible to define fold-like operators for monads and have the best of both worlds. Now I think I also understand why I have a natural affinity towards the fold approach, and have difficulty thinking in monads: * Scheme is strict and impure; monads are not really necessary. It's often useful to combine context/state with fold/map operations which are easy to express. * To a lesser extent, in a dynamically typed language, polymorphism can be used dispatching on inputs, while a statically typed language can dispatch on outputs. This is why types + monads go well together. One thing I'm interested in more is a monadic map, i.e. something that lifts (s,i)->(s,o) to (s,[i])->(s,[o]). How this this fit the bill? As a special case of monadic fold? [1] http://www.springerlink.com/index/768043006044675P.pdf [2] http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Data-Foldable.html Entry: Tree Grafting (Monads) Date: Mon May 10 19:03:40 EDT 2010 Dan Piponi's monad post-tutorial[1]. Quote from Oleg Kiselyov: ``Monads turn control flow into data flow where it can be constrained by the type system.'' [1] http://blog.sigfpe.com/2010/01/monads-are-trees-with-grafting.html Entry: S K combinators and the Reader monad Date: Tue May 11 08:50:46 EDT 2010 The S and K combinators form a complete set of combinators that can encode all lambda terms. The proof consists of a mechanised transformation T explained in [1]. The trick is in the process of abstraction elimination in rules 5 and 6. 5. 
T[λx.λy.E] => T[λx.T[λy.E]] (if x occurs free in E) 6. T[λx.(E₁ E₂)] => (S T[λx.E₁] T[λx.E₂]) Abstractions provide a means to access values passed in by applications in "outer shells" of a lambda term using their name. The S combinator works the other way around, it can be interpreted to operate on binary trees that correspond to applications, propagating values to branches of a binary tree. I.e. the S combinator passes some values "under water". Starting from lambda terms, this binary tree is created by the T transform, where each application creates a fork point. Rule 5 makes sure that each abstraction is directly followed by an application, and rule 6 eliminates the abstraction by representing it as an S combinator. Short: all abstractions can be represented by passing values down branches of a binary expression tree. From this perspective it is not surprising that S and K pop up as (<*>) and return from the Reader (environment) monad. newtype Reader e a = Reader { runReader :: (e -> a) } instance Monad (Reader e) where return a = Reader $ \e -> a (Reader r) >>= f = Reader $ \e -> (runReader $ f (r e)) e Compare these to S and K. The K combinator is a non-tagged version of Reader's return. k :: a -> e -> a k x e = x The S combinator is a non-tagged version of Reader's (<*>) s :: (e -> a -> b) -> (e -> a) -> e -> b s x y e = (x e) (y e) (<*>) :: (Applicative f) => f (a -> b) -> f a -> f b [1] http://en.wikipedia.org/wiki/Combinatory_logic#Completeness_of_the_S-K_basis [2] http://en.wikipedia.org/wiki/SKI_combinator_calculus Entry: What does that mean? -- Denotational semantics Date: Thu Aug 5 19:33:57 CEST 2010 Conal about meaning[1]: ``In software design, I always ask the same question: "what does it mean?". Denotational semantics gave me a precise framework for this question, and one that fits my aesthetics (unlike operational or axiomatic semantics, which leave me unsatisfied).'' He then mentions Christopher Strachey[2] and Dana Scott[3]: ``Beware that denotational semantics has two parts, from its two founders Christopher Strachey and Dana Scott: the easier & more useful Strachey part and the harder and less useful (for design) Scott part.'' It seems that the Scott part is Domain Theory[4]. What is the Strachey part? [1] http://stackoverflow.com/questions/1028250/what-is-functional-reactive-programming/1030631#1030631 [2] http://en.wikipedia.org/wiki/Christopher_Strachey [3] http://en.wikipedia.org/wiki/Dana_Scott [4] http://en.wikipedia.org/wiki/Domain_theory [5] http://en.wikibooks.org/wiki/Haskell/Denotational_semantics Entry: Models of dataflow Date: Sat Aug 7 08:54:11 CEST 2010 I know of 4 different ways of looking at dataflow: * Functional dataflow (FRP): nodes are functions of time, functionally related. * Channels and Processes (CSP): nodes are programs reading from and writing to channels. * Synchronous state space models (SSM): functions transfer (state,input) into (next_state,output). * The Observer pattern: objects get notified of state changes of other objects through a notify() method call. In my current problem, the big issue seems to be to define the meaning of an "event" and "state". In FRP an event seems to be more of an implementation issue; i.e. how to represent the functions that make up the meaning. In CSP, an event is very explict: a write operation that triggers the "unlocking" of its corresponding read. Each process can have local state. In SSMs there are no events, only data changes, but there is a concept of "current state". 
In the observer there are explicit states and explicit events, but not necessarily parallelism as in CSP. The observer pattern can get messy, as it has very little high level structure apart from message passing. My question seems to be: Is an "event" necessarily "operational?", meaning here: is it related to a state transition (or is it a state transition)? On to the practical: I'm implementing a Ractive Network[1] in an environment that includes stateful objects that follow the Observer pattern. The approach I use is invalidation + lazy evaluation (I/LE). * input: write causes all nodes that recursively depend on the written node to be invalidated. * output: read recursively (re-)computes all nodes that have been invalidated. The main question for my application is: how much do we gain (and loose) from using a reactive pattern vs. a more low-level and ad-hoc observer approach? As hinted in [2], the main issues are algorithm complexity and granularity. If the granularity is large, the algorithm complexity might not really matter. So.. Can we take the best of both worlds? How can this be tied into an Observer pattern in a correct (and efficient) way? By introducing strict evaluation: a read transaction initiated after invalidate caused by a write transaction. In the I/LE implementation we still have events in the true OO sense: node invalidation. The trick is to propagate them correctly from network inputs to network outputs. Inside the network we have the benefits of the I/LE model (dependency management + linear evaluation complexity), outside we have the benefit of Observer: a clear (strict not lazy) event semantics. Some more remarks.. * About Pure Data. The Pd design has hot/cold inlets to "be done" with synchronization problems. Using it is not always easy (the trigger object) but it does have a very simple meaning: a patch is a sequential program. * Strict semantics seems to mesh a lot better with OO design. In the I/LE model, it seems best to have every write trigger a read, such that there is a direct path from write -> event handler. The 2-phase algorithm is still useful to avoid exponential complexity, but the lazy semantics is too hard to keep right when it's used in an imperative environment, by people used to OO programming. [1] http://en.wikipedia.org/wiki/Reactive_programming [2] http://en.wikipedia.org/wiki/Reactive_programming#Similarities_with_Observer_pattern Entry: Condition variable vs. semaphore Date: Thu Aug 12 12:05:35 CEST 2010 In PDP I used condition variables to signal queue writes. This seems to be incorrect. Semaphores are actually a lot better for managing work queues. First, they are simpler to use, but second they also can ensure that no events are missed. I.e. during the handling of a changed condition, the condition might change multiple times, which is missed by the handling thread. Entry: Coroutines Date: Sat Aug 14 09:14:32 CEST 2010 Relation between coroutines and one-shot (partial) continuations. This comes up very naturally in the implementation of PF: the continuation is a linear data structure that is transformed and consumed at runtime, while non-linear code is "ROM", i.e. constant to the linear core. The PF compiler (meta-system) is non-linear for a good reason: entirely ephemeral. Code is linear to mesh better with hardware, which is a finite resource. 
[1] http://lambda-the-ultimate.org/node/2868
[2] http://lambda-the-ultimate.org/node/803
[3] http://lambda-the-ultimate.org/node/438#comment-3228
[4] http://lambda-the-ultimate.org/node/558

Entry: Graphs without mutation
Date: Thu Aug 26 11:24:28 CEST 2010

I'm facing the following problem:

* Construct a graph data structure (pointers in C structs) without
  using mutation in the sub-structures. This would need some kind of
  "magic" Y combinator-like operation.

* Represent a graph using a non-cyclic _constant_ data structure, so
  no zipper-like tree rewriting that requires new constructors to be
  called.

Entry: Treasure trove: Faré Rideau's pointers
Date: Sat Sep 18 14:52:38 CEST 2010

[1] http://fare.tunes.org/pointers.html

Entry: Always-on / Image-based computing
Date: Sun Sep 19 19:33:04 CEST 2010

What does it mean to take a snapshot of a memory image? (I.e. OS
hibernate). The bottom line is that memory structure is simple, but
if memory points to external (non-memory) resources, one gets a hairy
problem of re-initialization on bootup.

Is it possible to design a computer that does not have any external
references? I mean, start at the hardware: is it possible to design
hardware without hidden state? Meaning, all state is exposed as RAM.
Probably not realistically.. However, it should be possible to at
least isolate initialization and limit it to an absolute minimum.

What does this mean? Is initialized hardware maybe an "intermediate
state", and is the real, natural state of hardware the OFF position?
Is an interrupted machine an interrupted transaction that can simply
be restarted? Can hardware initialization be seen as a transaction
that never finishes?

What about this:

- An OS image (a "soft" object) is a graph structure represented in
  RAM which is transformed by an interpreter (CPU).

- Some of the leaves in this tree are opaque objects that represent
  connections to the outside world (stateful objects): ports.

- Opaque ports have a single method: they can be initialized.
  Necessary parameters for the initialization are present in the
  image and are transparent.

- To reflect real-world scenarios, ports can depend on each other.
  I.e. a port initializer can be parameterized by another, low-level
  initialized port.

The point is that instead of seeing the whole OS as an opaque object,
it might be beneficial to reduce the granularity of opaqueness.
Booting is "compilation", and hibernation and snapshotting is caching
of compiled results.

Moral: it takes a long time to restart a whole system. It takes
significantly less time to restart only the non-memory resources of a
system. Most of the beef in modern computer systems is in the in-RAM
data structures, not the hardware configuration. Rebooting mostly
rebuilds those data structures from more primitive (serialized,
non-linked) representations.

Entry: Mark-sweep GC
Date: Sun Sep 19 20:44:51 CEST 2010

Trade-off: time vs. space. Apparently it's quite a bit slower than a
copying collector. How does it compact? What about mark-compact?
[1][2][3].

[1] http://en.wikipedia.org/wiki/Mark-compact_algorithm
[2] http://comjnl.oxfordjournals.org/content/10/2/162.full.pdf
[3] md5://43bfb9905329b1cac86ec1391efe5e67

Entry: void events
Date: Sat Sep 25 11:25:37 CEST 2010

I'm building a reactive programming engine for a consulting project.
An interesting concept I keep running into is that of "void events".
Let's define an event as a (time stamp, value) pair. Reactive
programming can then be seen as functions defined on events.
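In types, just to pin down the vocabulary (a sketch only; the actual
engine is an OO implementation, not Haskell):

type Time    = Double
type Event a = (Time, a)      -- an event: a time stamp plus a value

-- Reactive behaviour as functions on events.
type Reaction a b = Event a -> Event b

-- A "void event" carries no payload besides the fact that it happened.
type VoidEvent = Event ()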
My implementation is strict (non-lazy) to allow integration of side
effects for coupling with the surrounding OO system. This definition
as pairs is a slightly more concrete definition than Conal's
"functions of time" definition[1][2]. The main reason is that I don't
see how to otherwise add strict side-effecting code.

One could see the pairs as a piecewise constant representation of
continuous functions. I.e. this helps reasoning about combinations of
events with different time stamps: just think "what would the
continuous function do?".

Now, think of the qualitative difference of these two entities:

1. A push-button measurement. 1 = pressed, 0 = released
2. A stream of push events.

The second one is an abstraction of the first one, but it is
radically different as there is no longer a piecewise continuous
function associated. The question is: is this thing "real" or an
artifact of some modeling confusion?

In essence, number 2 is a variant of the Dirac Impulse[3], a
generalized function[4] that allows the bridge between continuous
functions and discrete structures. A Dirac Impulse represents the
derivative of a step function, allowing for discrete state updates as
an integral of impulse.

In my system they seem to serve the same purpose: a void event
resembles a state change (a derivative) in the cases where it is not
necessary or possible to indicate "by how much" the state changes. It
thus seems to be an artifact of the side-effecting OO integration.

[1] http://conal.net/blog/posts/why-program-with-continuous-time/
[2] http://stackoverflow.com/questions/1028250/what-is-functional-reactive-programming/1030631#1030631
[3] http://en.wikipedia.org/wiki/Dirac_delta_function
[4] http://en.wikipedia.org/wiki/Generalized_function

Entry: Values and Transitions
Date: Mon Sep 27 11:00:50 CEST 2010

This places the previous post[1] in another light. If you want to
model a dynamical system (a system with internal state), you need to
have concepts of both state and transition.

For physical (Newtonian) dynamical systems this is very obvious: you
always need two state variables: position and velocity.

For discrete systems there is a clear analogy. To be able to detect
and compute in terms of value change, one needs to have access to the
_previous_ value, so there are also two state variables: current and
previous position, or current and difference.

[1] entry://20100925-112537

Entry: Pointer Reversal for Graph Traverse
Date: Fri Oct 8 00:52:21 CEST 2010

When traversing a graph (i.e. mark phase of a GC) it is possible to
encode the traversal stack in the graph data structure using "pointer
reversal". ( Related to zipper? )

[1] http://www.cs.arizona.edu/~collberg/Teaching/520/2005/Html/Html-39/index.html

Entry: Real Time GC
Date: Fri Oct 8 01:18:41 CEST 2010

Essentially, the problem with RT GC is that you need to guarantee
that the collector will always catch up with the mutator. If there is
enough memory to spare (buffer) then this can effectively work in
practice, though it doesn't seem that hard guarantees can be
obtained. For systems with little spare memory this poses a problem.

[1] http://www.cs.wustl.edu/~mdeters/doc/slides/rtgc-history.pdf

Entry: Refactorer: dependency graph visualisation
Date: Sun Oct 10 15:52:26 CEST 2010

Refactoring is mostly about changing dependencies between code. It
would be so cool to have a tool that can be used to _visualise_
dependency graphs of a whole project, such that changes can be made
_while walking in that virtual space_.
Entry: Pure functional programming and object identity
Date: Tue Oct 19 10:35:32 CEST 2010

One of the most difficult ideas to let go of, I find, is object
identity. It pops up in my expression to SSA conversion code that
needs to process code as a graph. The usual imperative approach (is
this node "x" ?) doesn't seem to work very well.

Semantically there is no problem: equality is well defined. Identity
isn't necessary for that. However, identity makes it easier to
_implement_ equality. There I think I still need some experience to
see how this would be handled correctly.

( Obscure maybe : ) This seems to be a representation problem.
Representing a graph as a list of named nodes makes implementation of
equality trivial and O(1). However, mapping an original expression to
a dictionary does require a full traversal. The problem is to keep
dictionaries in sync: node names are _centralized_ data. I.e.
combining 2 expressions needs a merge of dictionaries, and they might
have different names for the same nodes.

Entry: fexprs
Date: Tue Oct 26 00:22:37 CEST 2010

The ultimate tension between static and dynamic.

Looking from afar, the only argument that I distill from the static
"eval-is-bad" camp is that reasoning about code is complicated by
late-bound semantics. The argument from the dynamic "eval-is-good"
side is that late-bound semantics are the most flexible (and simple?)
starting point and thus preferable as a language base.

( Macros float a bit in the middle between the fully dynamic
Smalltalk approach where meaning is completely defined at run time,
and the fully static typed functional languages where a large part of
the meaning of code (types) can be used at compile time. I.e. in
Racket, macro bindings are always well defined (a macro name maps to
a precise function that is known at compile time) but the way it
transforms code does not preserve any other invariants. )

As Dave puts it [4]:

  Fexprs are bad for two reasons: they make the language hard to
  compile efficiently and they make programs hard to understand by
  subjecting the basic program definition to dynamic reinterpretation.

Thomas Lord's comment[5] is quite interesting though. Also check out
the Kernel[6] programming language.

[1] http://kazimirmajorinc.blogspot.com/2010/10/on-pitmans-special-forms-in-lisp.html
[2] http://en.wikipedia.org/wiki/Fexpr
[3] http://lambda-the-ultimate.org/node/3861
[4] http://calculist.blogspot.com/2009/01/fexprs-in-scheme.html
[5] http://lambda-the-ultimate.org/node/3861#comment-57967
[6] http://web.cs.wpi.edu/~jshutt/kernel.html
[7] http://lambda-the-ultimate.org/node/3861#comment-57972

Entry: Algebraic Datatypes
Date: Thu Dec 2 11:15:54 EST 2010

An ADT constructor application is a function application that can be
undone. This is a simple and powerful idea.

Entry: Representation of impedance/admittance duality
Date: Thu Dec 2 11:42:46 EST 2010

-- Instead of having to perform numerical inverse all the time, why
-- not represent numbers as (x, 1/x), where possibly one of the two is
-- not given?
-- EDIT: This turned out to be quite interesting: dual.hs
-----------------------------------------------------------------------
import Data.Complex

---- Dual representation of 2-terminals.

-- "Half" two-terminal: 2 primitive elements, one composition.
data HTT a = Resistive a              -- V = R I       ; I = G V
           | Reactive a               -- V = L dI/dt   ; I = C dV/dt
           | Composite (TT a) (TT a)  -- V1 + V2 (ser) ; I1 + I2 (par)
           deriving (Show, Eq)

-- A two-terminal is a dualized HTT.
data TT a = Primal (HTT a) -- keep duals | Dual (HTT a) -- flip duals (invert) deriving (Show, Eq) dual (Primal htt) = Dual htt dual (Dual htt) = Primal htt -- The main idea of the dual representation is generality. One -- consequence is that building a s-parameterized -- impedance/admittance/transfer function is straightforward. trans :: (RealFloat a) => (TT a) -> (Complex a) -> (Complex a) trans = flip a'' where a'' s = a' where a' (Dual htt) = 1 / (a htt) a' (Primal htt) = a htt a (Resistive r) = r :+ 0 a (Reactive r) = (r :+ 0) * s a (Composite x y) = (a' x) + (a' y) ---- Absolute representation of 2-terminals in terms of dual rep. -- To represent networks we need to tag the TT to indicate which one -- of Impedance or Admittance it represents. data ATT a = Impedance (TT a) | Admittance (TT a) deriving (Show, Eq) -- Primitive absolute elements are constructed as I/A tagged Primal TTs. -- Resistive: Impedance / Admittance independent of frequency. res r = Impedance (Primal (Resistive r)) -- resistor (Ohms) cnd g = Admittance (Primal (Resistive g)) -- conductor (Siemens,Mhos) -- Reactive: Impedance / Admittance proportional to frequency. ind l = Impedance (Primal (Reactive l)) -- inductor (Henry) cap c = Admittance (Primal (Reactive c)) -- capacitor (Farad) -- Project ATT to TT, interpreting it as impedance or admittance. imp (Impedance tt) = tt imp (Admittance tt) = dual tt adm = dual . imp -- Primitive absolute 2-terminal operations are constructed in terms -- of a Primal Composite operation. ser a b = Impedance (Primal (Composite (imp a) (imp b))) par a b = Admittance (Primal (Composite (adm a) (adm b))) j x = (0 :+ x) -- Voltage divider divider a b s = a' / (a' + b') where a' = trans (imp a) s b' = trans (imp b) s Entry: Understanding Referential Transparency Date: Fri Jan 14 14:52:28 EST 2011 If you look at the definition of Referential Transparency[1] (RT) which says that a computation can always be replaced by its value, it seems to be relatively straightforward. What I found really hard to understand is why this does away with object identity. The `eq?' function in Scheme which compares pointers has no place in Haskell, because it has a result that depends on whether its arguments come from the same evaluation (value copied) or not (value obtained by running the same computation twice). In essence this means that a value does not have an implicit, unique name. A value is just a value and nothing more. To name values, a function needs to be defined that associates a name to a value or the other way around. Of course, pointer comparison does show up in the Haskell trenches, because it is too useful for implementing some low-level forms of memoization that would otherwise lead to exponential complexity. See makeStableName[3]. It soils your code with the IO monad though, unless you go down the bumpy road of unsafePerformIO[4]. [1] http://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29 [2] http://stackoverflow.com/questions/1717553/pointer-equality-in-haskell [3] http://www.haskell.org/ghc/docs/7.0.1/html/libraries/base-4.3.0.0/System-Mem-StableName.html [4] http://www.mail-archive.com/haskell-cafe@haskell.org/msg52544.html Entry: Functional Programming is Fantastic Date: Sat Feb 12 10:45:23 EST 2011 It is great to be able to completely isolate parts of a program without _any_ worry of whether you missed a covert communication channel or side effect. On the other hand, this can be a bitch to program. 
It makes you think about how much you normally use covert channels
and side effects to fit a square peg into a round hole.

Entry: Merging
Date: Wed Mar 2 11:08:17 EST 2011

In the back of my head I've gotten interested in the merging problem.
Mostly because of conflicts in source control, but also as a general
idea of data updates and the `what is a change' question.

I ran into Pierce's bidirectional programming work[2] before. Seems
there is also a sync utility called Unison[3] based on similar ideas.

[1] http://apps.ycombinator.com/item?id=2266071
[2] http://lambda-the-ultimate.org/node/2828
[3] http://www.cis.upenn.edu/~bcpierce/unison/

Entry: Filling in gaps
Date: Wed Mar 30 14:27:33 EDT 2011

Programming is filling in unspecified gaps. Experience helps to not
make too many bad choices.

Entry: 0,1,2 stacks : 3 kinds of programming?
Date: Wed Apr 6 23:02:25 EDT 2011

These are not the same machine.

0 = FSM (regular languages)
1 = PDA (push-down automaton / context free grammars)
2 = Turing equivalent

[1] http://en.wikipedia.org/wiki/Pushdown_automaton

Entry: Parser-oriented programming
Date: Sat Apr 9 10:38:31 EDT 2011

The USB driver is moving forward. I found a way to use structs in
Forth, by turning them into streams, and writing parsers.

The golden rule seems to be: don't use data structures in Forth, use
streams, tasks and/or state machines. The point-free style works
better with ``parser-oriented'' programming. Or stream-oriented if
you want. Optimize the protocols to make use of this. I.e. a simple
trick to reduce memory usage is to always prefix the size of an array
instead of using a termination condition. (Pascal strings vs. C
strings)

If you think of it, data structures are only postponed execution.
This is very apparent in a language like Haskell: think of the
deforestation optimization where constructors and pattern matching
get combined to eliminate the data structure entirely[1]. On an
embedded platform this is even more true since you really are more
interested in process and IO than for a normal computer, which would
be data-storage central.

EDIT: Found something on LtU that advocates quite the opposite[2].
The thing is that time in computation is not of the same quality as
space: it doesn't support random access (until it's buffered, i.e.
"rotated" into space).

EDIT2: It's an implementation issue: what kind of high-level
description will produce a low-memory implementation?

[1] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html
[2] http://lambda-the-ultimate.org/node/925

Entry: Reforestation
Date: Fri Apr 15 14:35:59 PDT 2011

Instead of starting from a functional description and "hoping" all
constructors can be optimized out, is it possible to start from
guaranteed elimination, and see what subset of a high-level language
fits on top of that? This has always been the core of my quest. I
suppose this is the Hume project[1].

EDIT: Reminds me of some of the stuff I read around PEGs: a
parser-centric view instead of a language-centric view, because what
you care about is not so much properties of the language itself
(grammar, generation) but properties of implementation and efficient
information flow.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.1710&rep=rep1&type=pdf

Entry: Protocol-oriented programming
Date: Sat Apr 23 00:42:20 EDT 2011

Staapl, PIC18 USB driver: I got it into my head that I want to
program in Forth without using data structures (random access data)
but just using linear time-ordered streams or channels.
Forth allows very dense code construction. This is a direct consequence of its implicit data access: simply not having to store memory locations saves space. The price you pay is extra effort necessary to properly factor code. It becomes very important to have a procedure "do only one thing". This is a skill that can be learned, and in practice it works amazingly well. The trouble is, that approach doesn't work well with structured data (trees and graphs). The obvious reason is that such data representation requires the re-introduction of names or random access. Is it possible to do a similar "compression" for data? The idea is that if it's possible to use local access variables instead of random access variables in Forth code, it should be similarly possible to replace all data structures with serial, uni-directional protocols. The underlying idea is the operation of deforestation that is possible in pure functional programming. Deforestation eliminates constructor+match pairs and rewrites the code's control flow to turn such a pair into a function call or let-abstraction. The big idea here is that data structures are really just postponed function calls, or "buffers". Depending on the fan-out of the data structures (= the number of times a certain structure is used) it is often possible to not store intermediate data structures in RAM but push components to consumers directly. In practice this can often be done by introducing parallelism. I would assume this doesn't work for all kinds of code. I.e. code that has intrinsic "data storage" features, such as a database. However it should be possible to employ the principle in cases where data use is just buffers or temporary storage. In that case it should be possible to eliminate it with different control-flow factoring. One of the drawbacks of this approach seems to be that the protocol definition should be part of the design process. Just as function prototypes are really important to write good Forth code, serial data protocols should be optimized for ease of parsing. It seems better to call this approach "parser-oriented" programming: make sure that protocols between different components are defined in such a way that local resource requirements are minimized. In my current problem (USB driver for PIC18) I'm facing the problem that the input data structures are fixed, and clearly defined with a random-access approach in mind (think C structs). So to make this work in practice it is probably necessary to preprocess the input to turn it into a usable event stream. So what is the really big idea? Make data dependencies explicit. Entry: Low-level C : Exceptions or error codes? Date: Wed Apr 27 10:27:52 EDT 2011 The trouble with exceptions is memory management. If all data is stored in local variables on the call stack, exceptions implemented with setjmp are not a problem. If at any point there is non-local state manipulation, be it memory allocation or any other global state update, exceptions are very hard to get right, and incremental error passing that can undo any global changes is probably a better idea. Entry: Goldmine Date: Thu May 5 00:34:19 EDT 2011 [1] http://homepages.kcbbs.gen.nz/tonyg/projects/thing.html Entry: Object Identity Date: Thu May 5 13:16:47 EDT 2011 Baker on Object Identity[1]. 
I can't keep these two apart, so here's some examples: - extensional def[2]: structure fully specified, exhaustive - intensional def[3]: implementation hidden, only properties specified [1] http://home.pipeline.com/~hbaker1/ObjectIdentity.html [2] http://en.wikipedia.org/wiki/Extensional_definition [3] http://en.wikipedia.org/wiki/Intensional_definition Entry: Haskell: functions vs. structures Date: Mon May 9 09:32:38 EDT 2011 Something funny happens when data structures are immutable: data and code become more alike. The difference between a (immutable) structure and a (pure) function is that a structure is like a function where there is a time-disconnect and multiplicity-disconnect between call and function entry, meaning that data pasted in a structure will be interpreted later, and possibly multiple times. Data pasted in a function will be interpreted immediately and only once. If data structures are consumed only once, and the context of data interpretation is explicit, the difference almost disappears and it is often possible to re-arrange code such that constructor/deconstructor pairs can be simplified into function calls. This is called deforestation[1]. [1] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html Entry: Simple checksums Date: Sun May 22 17:31:14 CEST 2011 On a tiny uC, which simple checksums are most effective? The common ones are: - add all bytes together - perform XOR Without really thinking, I'd say that XOR is worse since it doesn't really "smear out" the errors: all 8 bits are just independent parity checkers, though there are 8 of them. Entry: Fault-tolerant, stateful code Date: Tue Jun 14 16:13:08 CEST 2011 There seems to be only one guiding principle: keep the invariants of the data structure as simple as possible. It seems to make sense to split the problem of fault recovery into two parts: - temporary (local) inconsistency due to transient faults These are quite easy to handle by simply retrying the operation. - permanent inconsistencies due to permanent state mutations These are really hard if there is no redundancy to bring the state back to consistency, or if the invariants are simply too complicated to "try to be smart". Entry: intension / extension Date: Thu Jun 16 11:50:59 CEST 2011 Seems the way this is used in CS is due to Church's lambda calculus[1]: In developing his theory of lambda calculus, the logician Alonzo Church (1941) distinguished equality by intension from equality by extension: It is possible, however, to allow two functions to be different on the ground that the rule of correspondence is different in meaning in the two cases although always yielding the same result when applied to any particular argument. When this is done, we shall say that we are dealing with functions in intension. The notion of difference in meaning between two rules of correspondence is a vague one, but in terms of some system of notation, it can be made exact in various ways. In the previous section[2] it is said that: The rule that defines a function f:AB as a mapping from a set A to a set B is called the intension of the function f. The extension of f is the set of ordered pairs determined by such a rule: [1] http://www.jfsowa.com/logic/math.htm#Lambda [2] http://www.jfsowa.com/logic/math.htm#Function Entry: Referential transparency and object identity Date: Sun Jul 10 14:48:29 CEST 2011 An expression e of a program p is referentially transparent[1] if and only if e can be replaced with its evaluated result without affecting the behavior of p. 
This is a very strong requirement and completely destroys the ability to use object identity. From [2]: Identity is a property that an object may contain aspects that are not visible in its interface. These "aspects" might simply be references by other objects, as the just the act of referencing already creates a "hidden" relationship between the pointing objects. This relationship cannot be expressed by any value that would substitute the object. An interesting application of this is that while it is possible to recover input->output dependency information for Haskell functions of the Num class through abstract interpretation, it is impossible to recover _internal_ sharing information from a Haskell program by observing just the input->output behaviour of that program. Here "sharing" refers to bindings defined by `let' and `where' forms that have more than one reference, thus producing a dependency graph instead of a tree. In short: values do not encode who they are used by. In the presence of sharing, the output of such an abstract interpretation will be a tree with duplicate nodes. With careful input prepration, external input nodes can be unified using equality in a straightforward way, but internal nodes need to be "scanned" based only on structural equivalence of their dependency graph that traces back to the input nodes. This is really just common subexpression elimination and has not much to do with the original structure. [1] http://en.wikipedia.org/wiki/Referential_transparency_(computer_science) [2] http://en.wikipedia.org/wiki/Identity_(object-oriented_programming) Entry: A letter to a C programmer Date: Sat Jul 16 11:09:59 CEST 2011 If you ever wonder where my tendency to write weird C preprocessor constructs comes from, it is most likely from spending too much time with Racket, a Scheme dialect. http://racket-lang.org/ That language contains the current state of the art of untyped macro systems, which integrates a very powerful and simple name scope management system (modules) with simple templates ("syntax-rules") and full multiple-stage code generation ("syntax-case"). It is an incredibly powerful system. Most of what it makes possible you can't do in CPP. What seems to have happened for me though is that working with Racket macros made it possible to point a finger at exactly what is wrong with CPP and how to hack around it in some cases. Entry: Getting used to Monads Date: Fri Jul 22 11:14:12 CEST 2011 I (re-)derived my first Monad implementation, peeking left and right but luckily making some mistakes in the process. It's hard to say what actually clicked in my mind, but it seems that exposing what `bind' and `return' actually do in some situations clears up a lot of magic dust. The real problem is that, for a "low-level" programmer like me, the Monad is too abstract to start with initially. The consequences of this high level abstract construct are vast and profound and make room for understanding that is hard to find in an impure language. But still, all that space it covers needs to somehow be part of your mental framework to make sense of the usefulness. In short: this knowledge is hard to bootstrap. Entry: Monads and evaluation order Date: Sat Jul 23 13:28:51 CEST 2011 The name Monad refers to "1 output"[1]. This output is referred to as a parametric type (M t). 
The interesting thing is that while bind takes functions that produce a single monadic (wrapped) _output_ from a single naked input, there is no reason those functions could not be partially applied functions. However, trying to do this immediately raises the issue of order, i.e. for partial application of a function of type: bind2 :: M a -> M b -> (a -> b -> M c) -> M c there are 2 natural implementations that have some symmetry but have a different behaviour, because one bind operator is "executed first": bind2 ma mb f = ma >>= \a -> mb >>= \b -> f a b bind2' ma mb f = mb >>= \b -> ma >>= \a -> f a b The same, but in do notation: bind2 ma mb f = do a <- ma b <- mb f a b bind2' ma mb f = do b <- mb a <- ma f a b From the significance of the order of lines in the do statement (or equivalently, the data dependency of the application of >>= operators) it seems plausible to accept that monads can be used for implementing behaviour that requires a certain order, i.e. state updates or CPS computations. Note that there are monads for which the order does not matter. These are called commutative monads[2]. [1] http://www.haskell.org/haskellwiki/Monad [2] http://www.haskell.org/haskellwiki/Monad#Commutative_monads Entry: Monad is a type class Date: Sat Jul 23 14:13:30 CEST 2011 Some important things to realize about monads, from my earlier misconceptions which missed the (awesome) generality of this concept: * `Monad' is a type class and is used to say something _about_ a parametric type. For a parametric type (M t), the expression (Monad M) declares that the parametric type M implements the operations: bind :: M a -> (a -> M b) -> M b return :: a -> M a an `instance' declaration makes this explicit. * The abstract type Monad M => (M t) can carry a lot of hidden information next to "something of type t". I.e. a typical declaration of a monadic type is: instance Monad (M a b c ...) where the types a b c ... are type parameters that do not take part in the monad interface. The full type of M would be: M a b c ... t where the parameter `t' is the one that takes part in the monadic interface. The type predicate `Monad' expects a parametric type with one parameter. The "type" of `Monad' is called a "kind", and is * -> *. [1] * The occurrence of type parameter t in a monadic type (M t) does not have to correspond to a naked data item in the implementation. It can just as well be the input or output type of a function or a parameter in any parametric type. I.e. values of the (Cont v) monad do not contain concrete values v; they are functions that take a continuation (a function of v) as input. In some sense it is not monads that are the difficult concept, it is type classes in general. The ladder of abstraction is the list: 1. basic types, not parameterized 2. parametric types, parameterized by basic types 3. type classes, parameterized by parametric types [1] http://www.haskell.org/haskellwiki/Kind Entry: Composing monads Date: Sat Jul 23 14:46:19 CEST 2011 Apparently that's not such a well-known subject[1]. What I can take from that post+comments is that the implementation of a monad is too low-level to make any general statements about, and that what would help is a more disciplined way to build monad implementations from composition of primitive monads/transformers that are better behaved.
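To make that concrete, here is a hedged sketch (my own example, not from the post): stacking the standard StateT transformer on top of Maybe composes a state effect with failure without writing the combined monad by hand. The names `Counter' and `tick' are mine.

import Control.Monad.State   -- from the mtl package

-- A counter that fails when it goes past a limit.
type Counter = StateT Int Maybe

tick :: Int -> Counter Int
tick limit = do
  n <- get
  if n >= limit
    then lift Nothing        -- failure comes from the inner Maybe monad
    else do put (n + 1)
            return n

-- runStateT (tick 3 >> tick 3 >> tick 3) 0  ==  Just (2,3)
-- runStateT (tick 1 >> tick 1)           0  ==  Nothing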
[1] http://www.randomhacks.net/articles/2007/03/05/three-things-i-dont-understand-about-monads Entry: Learning Haskell Date: Sun Jul 24 19:03:32 CEST 2011 In Haskell I find it quite difficult to guess whether I can do something or not, i.e. combining different abstraction mechanisms seems to not always work as expected. It's hard to make this more explicit but it's as if the abstraction can get so high that you completely lose all intuition. Maybe it's just my learning curve still, but I have had this going on for a while. Maybe it's also that I just don't write any really difficult code in C, and that in Scheme I resort a lot to "interpretation" or loose runtime typing because the real structure of the code isn't so clear. The good thing in Haskell seems to be that once you do manage to express your idea with a lot of static structure, the result is beautiful and likely correct and very general(izable). Entry: do notation algebra Date: Mon Aug 1 21:22:42 CEST 2011 let_ var body = do v <- var body $ return v Why is the above not equivalent to the one below? let var body = body var The monad I'm using is a CPS monad used to implement sharing. Entry: Awesome Prelude Date: Tue Aug 2 14:15:49 CEST 2011 Funny that in the "JavaScript types" example in [2] the same mechanism of language-specific type constructors using phantom types is used as in the tagless paper[4]. The general idea of replacing datatypes with type classes is to abstract what you want to do with it. For data types this is construction and destruction. For the BoolC class the constructors are `true' and `false' while `bool' is the destructor. data Bool class BoolC dsl where false :: dsl Bool true :: dsl Bool bool :: dsl a -> dsl a -> dsl Bool -> dsl a Here the `dsl' parameter is the parameterized type constructor for the DSL that implements the BoolC class. The cool thing is that the same strategy works for functions class FunC dsl where lam :: (dsl a -> dsl b) -> dsl (a -> b) app :: dsl (a -> b) -> dsl a -> dsl b Here `lam' takes a Haskell function and maps it to a function in the DSL representation, and `app' does the reverse. The downside is no syntactic support, which makes it difficult to use in practice. The best approach atm seems to be to write an explicit syntactic frontend when you're designing a language to sidestep these issues. [1] http://tom.lokhorst.eu/media/presentation-awesomeprelude-dhug-feb-2010.pdf [2] http://tom.lokhorst.eu/2010/02/awesomeprelude-presentation-video [3] https://github.com/tomlokhorst/AwesomePrelude [4] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf Entry: CPS vs SSA Date: Tue Aug 2 17:31:35 CEST 2011 CPS has well-defined binding structure and parallel assignment. In SSA[1] this seems to be somewhat looser. Is there a real difference here? (Context: for me the point is to make _really fast code_ that goes straight onto a DSP or FPGA.) The WP article on SSA[1] mentions that SSA has non-local control flow[2] while CPS has none. (With this is meant things like exceptions and continuations, so that's not relevant for me.) Let's look for something that compares CPS and SSA[3][4], and not to forget ANF[4]. The interesting bit about SSA is the Phi functions, which are placed at control-flow joins. Wingo cites the interpretation that each basic block is a function, and that a Phi function indicates that the basic block has an argument. Wingo goes on to say that SSA is really for first-order programs and aggressive optimization of loops, while CPS is for higher-order programs.
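A tiny sketch of that interpretation (my own illustration, not from the cited posts): in CPS the argument of the join-point continuation plays the role of the Phi function.

-- Both branches pass their result to the same continuation k;
-- k's parameter is the "phi" of the two branch values.
absCps :: Int -> (Int -> r) -> r
absCps x k = if x >= 0
             then k x           -- branch 1
             else k (negate x)  -- branch 2, joining at k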
[1] http://en.wikipedia.org/wiki/Static_single_assignment_form [2] http://en.wikipedia.org/wiki/Control_flow#Structured_non-local_control_flow [3] http://lambda-the-ultimate.org/node/3467 [4] http://wingolog.org/archives/2011/07/12/static-single-assignment-for-functional-programmers Entry: Lambda or struct? Date: Thu Aug 4 09:18:38 CEST 2011 One thing one could conclude from the embedding of typed languages as functions instead of data is that functions seem to be strictly "more magical" than data structures. Is this merely a restriction of algebraic data types in a typed setting? Hence the existence of GADTs. I never ran into this kind of "difference" between data and code in Scheme, i.e. following the idea that data is completely free-form and can always be interpreted. Entry: initial / final Date: Thu Aug 4 23:56:46 CEST 2011 I found a comment explaining the concepts "initial" and "final" in category theory. See section 1.4 example 1.4.4 in Pierce's Category Theory for Computer Scientists[1]. In a category, an initial object is an object FROM which there is a unique arrow to each object in the category. In Set there is only one, the empty set, where each arrow is an empty function. A final object is an object TO which there is a unique arrow from each object in the category. In Set every one-element set is a final object. [1] isbn://0262660717 Entry: Phantom types Date: Sat Aug 6 12:37:15 CEST 2011 Recently while styding haskell I ran into the trick of using phantom types to "tag" information at compile time. I.e. a data type data Str t = Str String is internally just a string, but it is possible to use the type parameter to specify operations like: data Blessed data Raw bless :: Str Raw -> Str Blessed The function `bless' could then maintain some kind of invariant on String. The presence of that invariant can be indicated at compile time by the type Str Blessed. Hiding the constructor `Str' will make it impossible to create a `Str Blessed' data type. A limited constructor could be exported to create non-blessed strings: str :: String -> Str Raw Entry: Generalized Algebraic Data Types Date: Sat Aug 6 14:18:45 CEST 2011 Soundbyte: an ADT constructor can't have types such as T x -> T y, which are useful (necessary?) to represent the typed lambda calculus. ( Note, it is possible to use type *classes* to do this: C repr => repr x -> repr y, but that's a different story. ) In what sense exactly is a GADT generalized? [1] http://en.wikipedia.org/wiki/Generalized_algebraic_data_type Entry: SSA vs CPS Date: Sun Aug 7 23:08:18 CEST 2011 This[1] looks like a nice starting place. [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.3.6773&rep=rep1&type=pdf [2] http://www.delicious.com/doelie/ssa Entry: Type-level computations / meta systems Date: Mon Aug 8 10:44:12 CEST 2011 Metaprogramming. I've seen it now from many sides, and it is fascinating how they are all subtly different but still quite similar. From logics that say something "about" code, to macro systems that generate code, either providing types (MetaOCaml) or not (Scheme/Racket). The main "force" seems to be the tension between simple logic systems that can be reasoned about, and full-blown programming systems that are limited in analysis by the halting problem (undecidability[3]). - Typed metaprogramming: * Type systems proper: Hindley-Milner[4] (just complex enough to make inference work) up to the "lambda cube"[2] which contains systems that can only be checked. Main benefit: prove things about programs. 
* Type level computations in Haskell using functional dependencies in type classes[1]. Benefit: allow limited form of computation without getting into undecidedness. * MetaOCaml[5]: proper multi-stage code generation in a typed setting. - Untyped metaprogramming: * Scheme's hygienic macros: generating code programmatically respecting binding structure. * Typed scheme & macros. Similar to MetaOCaml but approaching the problem from the untyped->typed side. (find some links). [1] http://hackage.haskell.org/package/type-level [2] http://en.wikipedia.org/wiki/Lambda_cube [3] http://en.wikipedia.org/wiki/Undecidable_problem [4] http://en.wikipedia.org/wiki/Type_inference [5] http://www.metaocaml.org/ Entry: Flattening expressions using liftM2 Date: Mon Aug 8 18:47:59 CEST 2011 One of the revelations of my recent Haskell study sprint is that it is possible to "serialize algebra" using liftM2. This might be idiosyncratic language for something that has a proper name, but what I mean is that the function liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c is the bridge between "parallel" computations that have a binary tree structure, where both legs of the tree (types a and b in the input function above) are independent, and "sequential" computations that have a fully specified order imposed by the monad structure. Entry: QuickCheck as an API design guide Date: Sun Aug 14 10:43:25 CEST 2011 Don Stewart mentions in an xmonad talk[1] that they've been using QuickCheck as a guide to designing good APIs. If the QC properties are very hard to write down, your API sucks. Another tip I've heard before: keep all your functionality pure. Only use thin layer of IO to interface with the outside world. [1] http://www.ludd.ltu.se/~pj/hw2007/xmonad.mov Entry: Applicative Transformers Date: Sun Aug 14 16:25:07 CEST 2011 Some observations to make precise: * A DSP language (of combinators) would benefit from connections that happen behind the scenes. Examples are state relations over time. * A recurrence relation / difference equation is essentially a state monad. * A Monad is also Applicative * Audio DSP is essentially a mix of State and List monads. * Monad transformers are a bit of a kludge. There is not a lot known about the algebra of monad transformers. * Applicative Transformers do not exist because applicatives are "naturally composable". [1] -> Section 4. * Is it true that the direction that is inherent in the State monad -- the function s -> (s,t) -- is what causes State to be specific enough to be a Monad? Is causality the essential element? * If a state space model can be run in reverse, would it stop being Monad? This reminds me of Steele's parallel language.. [1] http://www.haskell.org/haskellwiki/Applicative_functor Entry: Arrow = Applicative + Category Date: Sun Aug 14 18:21:43 CEST 2011 What does this mean? From my own experience trying to think about abstractions it seems that Applicative uses a curried interface, while Arrow uses tuples. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Category.html#t:Category [2] http://cdsmith.wordpress.com/2011/07/30/arrow-category-applicative-part-i/ [3] http://cdsmith.wordpress.com/2011/08/13/arrow-category-applicative-part-iia/ Entry: Existential types Date: Sun Aug 14 21:29:21 CEST 2011 When using existential quantification[1], you can't actually do anything with the values, because you don't know the type! i.e. the difference between these two: data Foo = forall a. 
Foo a data Bar a = Bar a is that for Foo we don't know what type a is, and because Foo doesn't have type parameters, we can't specify it elsewhere either. Practically that means that for Foo, we can't really do anything with the values it wraps because we just don't know the types. I see two ways around this. One is explained in [1] and it boils down to giving information about a, i.e. data FooShow = forall a. Show a => Foo a Here we still don't know what type a is, but we know that we can apply the operations from the Show class to values wrapped by the Foo constructor. ( It seems this can't be done using a field accessor, but it is possible with pattern matching. ) Another way is to use the quantified variable more than once. data FooApp = forall a. Foo a (a -> Int) Here we still don't know anything about the type, but we know that we can apply Foo's second argument to its first to obtain an Int. This is the trick I've used in the a state-space model (SSM) to record initial state and state transition together. The only operation that's ever performed on state is to pass it to the state transition function. I used existential types to make the following composition operation on state machines fit the Category class. This required that the state parameter is hidden. chainSSM :: ((a, s1) -> (b, s1)) -> ((b, s2) -> (c, s2)) -> (a, (s1,s2)) -> (c, (s1,s2)) chainSSM f g = fg where fg (a, (s1,s2)) = (c, (s1',s2')) where (b, s1') = f (a, s1) (c, s2') = g (b, s2) data SSM i o = forall s. SSM s ((i, s) -> (o, s)) -- If this would be (SSM s i o) then the composition operation would -- have been :: SSM s2 b c -> SSM s1 a b -> SSM (s1,s2) a c -- which doesn't fit :: SSM b c -> SSM a b -> SSM a c instance Category SSM where (.) (SSM f0 f) (SSM g0 g) = SSM (g0,f0) $ g `chainSSM` f id = SSM () $ id As is mentioned in a post in [2], universals give generics: we don't know what the type is, and we don't care since we only pass values around. Existentials give interfaces: we don't know what the exact type is but we do care that we can perform a number of operations from a given interface. [1] http://www.haskell.org/haskellwiki/Existential_type [2] http://stackoverflow.com/questions/292274/what-is-an-existential-type Entry: The Haskell Learning Curve Date: Mon Aug 15 10:38:39 CEST 2011 Haskell is a tremendous trip. As I keep telling my fellow C trench dwellers, it's really different. Even with a couple of years of Scheme (Racket) to get used to higher order functions, it's still a different world. The reason of course is types. Here's a list of things I've learned. - Existential types[1]. This can be useful for hiding type parameter when making class instances[6]. - Phantom types [2]. Used for typed language embedding and encoding data structure invariants in the type system. I.e. "blessed strings". - Parametric polymorphism [3]. A basic tool for building generic functions that operate on data (Algebraic data types) with type parameters, i.e. list. - Ad-hoc polymorphism [4] or overloading, implemented by type classes which abstract over collections of parameterized types. What is interesting is the "1-up" that is possible by moving from a set of operations over a parametric data types, to a set of operation over a collection of different data types (class instances)[9]. - Common abstractions implemented as classes: Monad, Applicative, Functor, Eq, Show, Num, ... - Thinking of monads as computations instead of containers. 
An early misconception of mine instilled by one of the many monad tutorials is that parametric types need to be data containers. It is quite possible for a type parameter of a parametric type to refer to the input and/or output type of a function. Obvious in retrospect, but a big revelation when I finally got it. This is used in the Cont (CPS) & State monads. The monad operations simply chain computations together. The end result is a function requiring a value (the initial state in State) or a higher order function requiring a function argument (the final continuation function in CPS). - Seeing Monad as a DSL with a custom variable binding structure, the ">>=" operator, which is reflected in the do notation's left arrow. - Finding out that Haskell has no meaningful object identity. Object identity is not referentially transparent[7][10]. - Related to the above, finding out that sharing structure (let) is not observable from the outside of a function definition. This is important in abstract interpretation when the intention is to recover sharing structure. Embedding a DSL where sharing structure is important (i.e. static single assignment SSA) then needs a specific binding form. This can be done by ">>=" in a Monad, or by embedding the language in a type class representing HOAS[8]. [1] http://www.haskell.org/haskellwiki/Existential_type [2] http://www.haskell.org/haskellwiki/Phantom_type [3] http://www.haskell.org/haskellwiki/Parametric_polymorphism [4] http://www.haskell.org/haskellwiki/Ad-hoc_polymorphism [5] http://www.haskell.org/haskellwiki/Algebraic_data_type [6] entry://20110814-212921 [7] http://en.wikipedia.org/wiki/Identity_(object-oriented_programming) [8] http://www.cs.rutgers.edu/~ccshan/tagless/aplas.pdf [9] entry://20110723-141330 [10] entry://20110710-144829 Entry: From Applicative to Monad Date: Mon Aug 15 19:50:05 CEST 2011 Each Monad gives rise to an Applicative where pure = return <*> = ap Where ap is: ap mf ma = do f <- mf a <- ma return $ f a or: ap = liftM2 ($) but not all Applicatives are Monads. See [1][2] for examples. So, does it make sense to say that I was not able to encode a certain behaviour as Applicative, but was able to do it as Monad? Yes it does, since requiring Monad is requiring more structure. What I tried to accomplish could be implemented as composition of Kleisli arrows (a -> M b), which is something an Applicative can't do. ( I'm implementing recurrence relations represented as data Sig s a = Sig { init :: s, next :: s -> (a, s) } and the corresponding signal operators. I settled on signals as monad values and operators as Kleisli arrows. Note that this isn't a true monad due to the dependence on the `s' parameter, which isn't constant for join ) So which primitives to implement to define both Monad and Applicative? It seems this only needs: (pure/return, join, ap) (pure/return, join, fmap) So Functor to Monad needs return+join, while Applicative to Monad only needs join since it already has return. The join operation is what implements the "monadness", the piercing of monad structure to "get stuff out" which is necessary to chain Kleisli arrows, while pure/return only put stuff inside the Monad. [1] http://haskell.1045720.n5.nabble.com/Applicative-but-not-Monad-td3142155.html [2] http://en.wikibooks.org/wiki/Haskell/Applicative_Functors#ZipLists Entry: Kleisli arrows Date: Tue Aug 16 14:47:50 CEST 2011 According to Dan Piponi, Kleisli arrows and their composition are the whole point of monads[1].
Dan's explanation is something like the following: If you want to chain (a -> M b) and (b -> M c), a straightforward but wrong thing to do is to use a function (M b -> b) that throws away all the extras. However, because M is a functor it is always possible to use fmap :: (a -> b) -> (M a -> M b) to convert (b -> M c) to (M b -> M (M c)) which composes nicely with (a -> M b). The result of this is that we end up with a double wrapping. Therefore a monad needs to have a function join :: M (M a) -> M a that restores the output of this chain to something that can be chained again. Summarized: (>=>) :: (Functor m, Monad m) => (a -> m b) -> (b -> m c) -> (a -> m c) (>=>) f g = f .> (fmap g) .> join where (.>) = flip (.) So how does this relate to the usual do notation? do a <- ma b <- mb return $ a + b or explicitly ma >>= \a -> mb >>= \b -> return $ a + b This is not simply chaining of arrows. The nesting here makes the arrows somewhat special. They are all arrows that go from some type to the result type of the do expression. The focus here is on the result of the expression, i.e. monadic values (M a) instead of arrows (a -> M b). Maybe the following makes sense: using (>=>) is pointfree or function-oriented programming, while using (>>=) or do is applicative or value-oriented programming. Anyways, in this light, comonads are straightforward to grasp, and as Dan mentions, it's not clear if there is a "codo" because comonads don't map well to the idea of binding structure. [1] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html [2] http://en.wikipedia.org/wiki/Kleisli_category [3] http://www.haskell.org/haskellwiki/Arrow_tutorial#Kleisli_Arrows Entry: Arrows Date: Tue Aug 16 16:11:55 CEST 2011 The prerequisites for an Arrow instance are (.) and id from Category, which give the basic composition mechanism, and the operations arr and first. The arr operation simply lifts functions. arr :: (b -> c) -> a b c While the first operation provides basic communication first :: a b c -> a (b, d) (c, d) I.e. it's like using a stack to temporarily stash something away, here the type d, in order to perform an operation and pop it back. This is essentially a disguised form of the basic "stack shuffling" mechanism behind concatenative languages such as Forth. Apparently the other operations can be derived from arr and first. The second operation is simply the mirror of first. second :: a b c -> a (d, b) (d, c) Parallel composition takes two cables and puts them in the same tube. (***) :: a b c -> a b' c' -> a (b, b') (c, c') Fanout takes two cables that come from the same point (&&&) :: a b c -> a b c' -> a b (c, c') Note that binary algebraic operations can be applied to arrow outputs by making tuples and applying lifted uncurried operations. I do see that some people think of this as a clumsy interface. I'm familiar with this kind of structure through working with graphical data flow languages. And indeed, it's not easy to put this inherently graphical construct in a textual form. While point-free style can be quite powerful, it can also require a lot of intricate plumbing that would be more straightforward to express in an applicative style using named intermediates. In my first state-space model implementation, I naturally came to the ssmSer and ssmPar operations, where ssmSer is (.) and ssmPar seems to be (***) in the Arrow class. I also had lifted functions from ssmPure. This looks like a complete set. The new GHC has Arrow notation[2].
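A small standard illustration of lifting a binary operation over two arrow outputs (my own addition, adapted from the Control.Arrow documentation, not part of the original entry):

import Control.Arrow

-- Run both arrows on the same input, tuple the results, add them.
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = f &&& g >>> arr (uncurry (+))

-- With the plain function arrow: addA (*2) (+10) 3 == 19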
TO READ: [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Arrow.html [2] http://www.haskell.org/ghc/docs/7.0.2/html/users_guide/arrow-notation.html [3] http://www.haskell.org/haskellwiki/Monad_Laws [4] http://lambda-the-ultimate.org/node/2799 [5] http://en.wikibooks.org/wiki/Haskell/Understanding_arrows [6] http://blog.downstairspeople.org/2010/06/14/a-brutal-introduction-to-arrows/ [7] http://cs.yale.edu/c2/images/uploads/AudioProc-TR.pdf Entry: Monad from Kleisli Arrow Date: Tue Aug 16 18:46:32 CEST 2011 So I have an Arrow implementation that I know is the Kleisli arrow (K a b) of some Monad (M a). Is this enough to derive the Monad instance from the Arrow instance? The idea is that to implement bind, we need to "cut off the head" of the arrow. Maybe this can be done by the correspondence M a <--> K () a If then wi can use a correspondance such as this we might be able to pull it off: a -> M b <--> K a b The catch is probably that the latter correspondence only exists if the arrow is a true Kleisli arrow. Actually, I rediscovered something (at least the conjecture ;) from [1]. It says that indeed Monads are equivalent to Arrows that satisfy the type isomorphism K a b <-> a -> K () b ( In my particular implementation I ran into a blocking problem that has to do with bad organization, so can't pull it of in its current form. I'm still struggling with the existentials problem though.. ) [1] http://homepages.inf.ed.ac.uk/wadler/papers/arrows-and-idioms/arrows-and-idioms.pdf Entry: Existential types Date: Thu Aug 18 00:10:54 CEST 2011 I was trying to understand why pattern matching followed by application works fine, but pattern matching and returning doesn't work. In other words: why don't existentials support record selectors, and why is passing them as arguments to functions not an issue? A typical example: a type with a hidden value type and a function with hidden input type. Both are the same so the value can be passed to the function. data Exst a = forall b. Exst b (b->a) The following doesn't type check, for the simple reason that it is completely unknown what type the function produces. It can be anything. value (Exst v f) = f It's not that the pattern matching itself doesn't work. The following works fine: eval1 :: Exst a -> a eval1 (Exst v f) = f v The reason is that we don't know the type of v and f, only how they are related: we do know that if we apply f to v we get the type that the Exst is parameterized by. It is also possible to pass the value to a function that expects a parametric type that is the same as the one that's specified: eval2 :: b -> (b -> a) -> a eval2 v f = f v eval3 (Exst v f) = eval2 v f * * * Now for the real problem I'm facing, given a data type that represents a Kleisli arrow data Kl i o = forall s. Kl (i -> s -> (s, o)) Construct the type isomorphism: iso :: (i -> Kl () o) -> Kl i o iso = undefined No matter what I try. Case statements or CPS, I still can't get the types to match: it always sees the type variable in the data declaration as different from the one in any other specification. Wait, I think I finally get it. There is simply no way of knowing that the s that is passed to kl is the same as the s of kl. data Kl i o = forall s. 
Kl (i -> s -> (s, o)) iso :: (i -> Kl () o) -> Kl i o iso f = Kl $ \i s -> (\(Kl kl) -> kl () s) (f i) It's possible to write that line with an implicit s, but the point is still the same: the type that was fixed when the Kl that's being unpacked was created could be completely different from the new instance we're creating here. The information on what that type was is no longer present in the type of f. iso f = Kl $ \i -> (\(Kl kl) -> kl ()) (f i) In a Monad, the join operation flattens two layers of wrapping into one. Doing this with an existentially qualified type doesn't work if this data needs to be combined in any way, because all information that they might be of compatible types has been deleted. The same goes for bind: it takes information from outside the monad and inserts it inside, crossing a border where type information has been deleted. What does work is to unpack, combine, repack. Stuff that comes out of a single wrapper can all be combined together. [1] http://www.haskell.org/pipermail/haskell-cafe/2011-August/094718.html Entry: Existential Monad problem: solved! Date: Sat Aug 20 22:14:28 CEST 2011 One last attempt at trying to understand why this can't work: (.>) v f = f v -- Patern binding doesn't work with existentials. type Kl i o = i -> Kl1 o data Kl1 o = forall s. Kl1 (s -> (s,o)) bind :: (Kl1 i) -> (i -> Kl1 o) -> (Kl1 o) bind mi f = Kl1 $ \(s1,s2) -> mi .> (\(Kl1 u1) -> (u1 s1) .> (\(s1', i) -> (f i) .> (\(Kl1 u2) -> (u2 s2) .> (\(s2', o) -> ((s1',s2'),o))))) The problem here is that the s1,s2 we feed into the unpacked Kl1 are not compatible. The type of u1 and u2 is completely unknown in the expression of bind. To understand this, let's try to take a look at Ryan's answer[1]. Paraphrased, look at what happens if we have an f doing something like: f i :: Bool -> Kl1 o f i = if i then kl1 else kl2 where kl1 and kl2 could have different state types. Because the above is possible, you really can't assume anything. The problem is really that the dependence on the i input is "too powerful". In the arrow approach for the type forall s. (s, (s, i) -> (s, o)) it seems to work because everything is neatly tucked in; no state change possible. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92609.html Entry: ArrowApply and ArrowMonad Date: Sun Aug 21 09:41:35 CEST 2011 As mentioned in the last post[1], it's not possible to use that approach because you can't express compatibility of states in bind. However, it is possible to make an Arrow instance which has composition. What I find here in Felipe's post[2] is that it is possible to create the associated Monad[3] when making an instance of ArrrowApply[4]. So it should not be possible to do so, or it's not really isomorphic, or my previous explanation in [1] is wrong. ArrowApply[4] generalizes ((a -> b), a) -> b, which is a curried form of (a -> b) -> a -> b, from functions (->) to Arrows (-->), as an arrow that applies another arrow to an input ((a --> b), a) --> b. I tried to write it down but the wrapping confuses me. First, what does this mean in terms of non-wrapped types? a --> b == a -> s -> (s, b) ((a --> b), a) == ((a -> s -> (s, b)), a) ((a --> b), a) --> b == ((a -> s -> (s, b)), a) -> s -> (s, b) For the wrapped types this gives a very straightforward definition data Kl i o = forall s. Kl (i -> s -> (s, o)) instance ArrowApply Kl where app = Kl $ \((Kl f), a) -> f a However, it doesn't type-check. The construction of app requires the hidden type to be fixed when app is defined. 
However, this type depends on the _behaviour_ of app just as in [1], so there is a dependency problem which is what the error message is trying to say: Couldn't match type `s0' with `s' because type variable `s' would escape its scope This (rigid, skolem) type variable is bound by a pattern with constructor Kl :: forall i o s. (i -> s -> (s, o)) -> Kl i o, in a lambda abstraction In the pattern: Kl f In the pattern: ((Kl f), a) In the second argument of `($)', namely `\ ((Kl f), a) -> f a' * * * The bottom line in this whole discussion seems to be that these 2 types are different: forall i o. (forall s. (i -> s -> (s, o))) forall i o. (i -> (forall s. s -> (s, o))) In the latter the type s can depend on the value of i while in the former it cannot. [1] entry://20110820-221428 [2] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92643.html [3] http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/Control-Arrow.html#t:ArrowMonad [4] http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/Control-Arrow.html#t:ArrowApply [5] entry://20110821-123701 Entry: Data.Typeable Date: Sun Aug 21 15:47:57 CEST 2011 According to Felipe[1] there is a way around the problem by using Data.Typeable[2]. I still need to read again to make sure I get it fully. The idea is to move some of the type checking to run-time. Of course this makes it possible to have run-time errors or "default behaviour" when the types do not match. Maybe it's possible to enforce well-behavedness using some other wrapper? The bad behaviour seems to come from control flow, i.e. pattern matching (case) or if .. then .. else. Assuming this is the case then one can say this works "if the user doesn't use control flow". I'm not so sure this is a good idea. It's a Monad, except when it's not. And you'll find out when you run the program. I think I'm sticking to Arrow unless I get a non-dynamic solution. It's a nice trick to know though. TODO: read again, test it and reply. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92649.html [2] http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Typeable.html Entry: From Applicative to Num Date: Sun Aug 21 18:55:42 CEST 2011 It's been a couple of times that I've written something similar to the following instance declarations for Num t, Applicative (X t), where X is some kind of "bigger t". Is there a way to abstract this lifting to Num (X t)? instance (Eq (SigOp i o)) where (==) _ _ = False instance Show (SigOp i v) where show _ = "#" instance (Num o, Show (SigOp i o), Eq (SigOp i o)) => Num (SigOp i o) where (+) = liftA2 (+) (*) = liftA2 (*) abs = fmap abs signum = fmap signum fromInteger = pure . fromInteger To do it generally it might be best to restrict this so it doesn't include all Applicative instances by defining a blessing class: class Applicative a => NumericApp a The rest is straightforward. The following is for NumericPrelude: instance (Algebra.Additive.C n, NumericApp a) => Algebra.Additive.C (a n) where (+) = liftA2 (+) zero = pure zero instance (Algebra.Ring.C n, NumericApp a) => Algebra.Ring.C (a n) where (*) = liftA2 (*) instance (Algebra.Field.C n, NumericApp a) => Algebra.Field.C (a n) where (/) = liftA2 (/) Entry: Numeric Prelude Date: Sun Aug 21 20:30:29 CEST 2011 Looks like an interesting project. Standard Num is indeed a bit hackish. "Numeric Prelude provides an alternative numeric type class hierarchy. ... The hierarchy of numerical type classes is revised and oriented at algebraic structures. 
Axiomatics for fundamental operations are given as QuickCheck properties." And more [1]. Orphaned? [3]. Maybe not, there's activity [4]. [1] http://www.haskell.org/haskellwiki/Numeric_Prelude [2] http://hackage.haskell.org/package/numeric-prelude [3] http://archlinux.2023198.n4.nabble.com/Please-orphan-haskell-numeric-prelude-td2967163.html [4] http://web.archiveorange.com/archive/v/uW8vzWzyzFGR2S6TubjS Entry: >>= vs. >=> Date: Mon Aug 22 21:29:59 CEST 2011 The monad laws in terms of >=> just say that >=> is associative and return is the identity: f >=> (g >=> h) = (f >=> g) >=> h return >=> f = f >=> return = f So, why is do-notation (nested >>=) more prevalent? Because it provides nested variables and sequentiality, the usual playing field for effectful computations. Anyways, coming back to my SigOp language, it seems clear now that it can't be a monad because I want the structure of the computation to be fixed, because I want to use it to generate static code. Monad is too powerful. ArrowChoice might be an interesting compromise: it allows processors to be switched into different modes, where they could exhibit different types. Though it doesn't seem like I really need different types, just different paths. [1] http://www.haskell.org/wikiupload/e/e9/Typeclassopedia.pdf Entry: NOT & CPS Date: Mon Aug 22 22:36:37 CEST 2011 With a' == not a and f == false, why is a' == (a => f)? Because a => b == a' v b, so a => f == a' v f == a'. Double negation is then a'' == ((a => f) => f). This is related to the type of a function that takes a continuation argument: ((a -> r) -> r) See Oleg's explanation[1]. [1] http://okmij.org/ftp/continuations/undelimited.html Entry: Concatenarrow Date: Mon Aug 22 23:41:57 CEST 2011 Instead of using Arrow notation to perform "tuple plumbing" when composing Arrow computations, it might also be useful to use a stack approach. Somebody has to have thought of that before... Essentially Arrows use binary trees for product (and ArrowChoice uses binary trees for sum using Either). Represent the empty stack by (). Applying an arrow to the top of the stack is "first". The question is then: what is the "default" representation for unary, binary, ... operations. Arrows are naturally unary and tupled (uncurried) binary. So maybe it's best to make lift/unlift from that rep to: (a -> b) -> (a, s) -> (b, s) ((a,b) -> c) -> (a, (b, s)) -> (c, s) The reason why Arrows are not curried is that there is no apply operator. This would give Monad power (ArrowApply), since arrows (whole computations) can depend on input values, which makes structure value-dependent. Anyway, it seems important to note that arrows with non-binary tuple inputs can't take inputs from other arrows, so it makes sense to standardize on a way to provide multiple arguments. The stack approach seems to be a good compromise. So let's start with that: tuple <-> stack conversion. liftStack1 f (a,s) = (f a, s) liftStack2 f (a,(b,s)) = (f a b, s) liftStack3 f (a,(b,(c,s))) = (f a b c, s) The problem with those is that they only work for functions. To make these compatible with Arrows we need to stick to something that's accessible through tupling. Wait... it's always possible to lift plumbing functions so this is really not a big deal. Things do need to be uncurried though, so these look better: Entry: Eliminating Existentials Date: Thu Aug 25 10:43:38 CEST 2011 I'm trying to represent a sequence as an initial value :: s and an update function :: s -> (o,s).
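For concreteness, a minimal example of that representation (my own, with names chosen here): a counter as an (initial value, update function) pair, and a function that unfolds such a pair into its output stream.

-- initial value and update function
counter :: (Int, Int -> (Int, Int))
counter = (0, \s -> (s, s + 1))   -- output the current state, then increment it

-- observe the pair as a lazy list of outputs
runSeq :: (s, s -> (o, s)) -> [o]
runSeq (s0, next) = go s0 where
  go s = let (o, s') = next s in o : go s'

-- take 5 (runSeq counter) == [0,1,2,3,4]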
According to Oleg's reply[2] pointing to [1] it's possible to use laziness to avoid these kinds of existentials, by applying them. Some points from [1] that might be useful: - Replace functions that operate on hidden types with type class constraints (bounded quantification) if functions are constant over types. - Replace other such functions as thunks: "apply away" the hidden parameters. In my case this would mean to repesent the type as a list [o], or a list function [i] -> [o]. The problem then is of course that the original function is not observable. Maybe my original point is completely wrong then: the function is not observable anyway, unless the state is somehow part of a class that can allow initial values, "and" a run function that produces the result. class InitState r where initFloat :: Float -> r Float initInt :: Int -> r Int run :: (s, (s, i) -> (s, o)) -> ??? However, in case there is no structure that depends on input (No Monad?) it should be possible to evaluate the update function abstractly on a singleton list, obtaining the (i,s) -> o relation. But this does not expose the state output. It's actually not so hard: if I want to observe the state at some point, I can't hide it completely. Placing the evaluator in a type class seems to be the thing to do. This should allow evaluation to list processors, and machine code separately. But the stuff mentioned in the HC thread is quite interesting. I thought I understood then I see this weird other approach.. See next post. [1] http://okmij.org/ftp/Computation/Existentials.html Entry: Streams and the Reader Monad Date: Thu Aug 25 14:30:05 CEST 2011 I'm having trouble following the different ways of formulating (input-dependent) streams. Oleg wrote down the following bind function and also mentioned this[1]. See also thread[2]. data Kl i o = forall s. Kl s (i -> s -> (s, o)) instance Monad (Kl i) where return x = Kl () (\_ s -> (s,x)) (Kl s m) >>= f = Kl s (\i s -> case f (snd (m i s)) of Kl s' m' -> (s,snd (m' i s'))) This is different from what I've been trying to accomplish. But let's try to follow the diagonal bind described here[1]. In [1] the join operation for streams is written as producing a stream from the diagonal of the stream of streams input. So what is going on here? Let's just try an example. Let's try to bind the sequence [0,1,2,..] to i -> [0,i,2*i, ..] according to the definition 0. [0] 0 0 0 .. 1. 0 [1] 2 3 .. 2. 0 2 [4] 6 .. 3. 0 3 6 [9] .. .. .. .. .. .. .. ( This cannot represent streams with the kind of iterated (s,i)->(s,o) dependence I'm looking for. This is really an i -> Stream dependence, where i determines the whole stream. The trouble is that all rows are independent of each other: there can be no history relation between elements of the output stream, only *inside* a single row. ) So what is this stream combination useful for? Let's dig further. As Oleg mentions in [2], it's the Reader Monad. Very curious. See also first comment in [1]. Why? Because streams can be represented as Nat -> a and the bind would then have type: (>>=) :: (Nat -> a) -> (a -> Nat -> b) -> (Nat -> b) a >>= f = \n -> f (a n) n The reader's bind is a function that takes the environment n and uses it to evaluate the computation a, using its result to obtain another computation through f, which will be passed the environment n again. Defining the stream monad on explicit streams[1] seems confusing. Formulating it as the Reader monad makes it simpler; te body of the bind function then looks quite straighforward. 
So let's try to put the graph above in this formula: a n = n -- 0, 1, 2, .. f i n = n * i -- 0, i, 2i, .. b n = f (a n) n = n * n So it's clear, this cannot represent the discrete integral (partial sum) operation, because b_n is independent of a_m, m != n, and for the integral it would be dependent on a_m, for m <= n. [1] http://patternsinfp.wordpress.com/2010/12/31/stream-monad/ [2] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92702.html Entry: Learning Haskell Date: Thu Aug 25 18:33:49 CEST 2011 Funny.. the stuff from last couple of posts makes me realize I'm not really reading functional programs. I'm verifying them against a mental model. If I don't have the model yet, I need to *decode* the program to build the mental model, then after that it's simpler to read because I know what "trick" to expect. Entry: Stream transformers Date: Thu Aug 25 18:36:51 CEST 2011 Most of the posts have been about these types representing input-dependent streams, where the stream is represented as a recurrence relation: (s, i -> s -> (s, o)) (1) i -> (s, s -> (s, o)) (2) (s, s -> (s, i -> o)) (3) The central question is, does i influence the the state transition function (1), with one i for each update, does a single i produce an entire stream (2), which needs some kind of stream-of-streams flattening operation, or is the state independent of the input (3). This is Arrow(1), Monad(2) and Applicative(3). These are VERY DIFFERENT. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92702.html Entry: Are recursive signal processors applicative? Date: Fri Aug 26 13:02:04 CEST 2011 They are arrows[1]. They do not seem to be monads due to structural limitations; monads allow data-dependent computation structure while recursive signal processors have a fixed structure[2]. However, they seem to be a generalization of monads that supports do notation when the type of the computation is properly fixed. Formulated as data Signal a = forall s. Signal s (s -> (s, a)) they also do not seem to be applicative due to a data dependency problem. It is possible to write an Applicative instance for the above type, but it is not powerful enough to encode something isomorphic to (s,i) -> (s,o) instance Functor Signal => Applicative Signal where pure v = Signal () $ \_ -> ((), v) (<*>) (Signal sf f) (Signal sa a) = Signal (sf, sa) (signalApp f a) signalApp :: (s1 -> (s1, (a -> b))) -> -- fi (s2 -> (s2, a)) -> -- ai (s1, s2) -> ((s1, s2), b) -- bi signalApp fi ai = bi where bi (s1, s2) = ((s1', s2'), b) where (s1', f) = fi s1 (s2', a) = ai s2 b = f a The type of Signal (i -> o) is isomorphic to: s -> (s, (i -> o)) The state :: s cannot be influenced by the input :: i. The type signature forbids any connection. To give a more intuitive explanation of why this is impossible, think about what happens when the recursion is unfolded. The resulting type is [i->o]. Can the functions in that list still depend on each other? The answer is a clear no. Those functions have to be pure. Suppose the list is [f0,f1,..]. If f1 depended on the input of f0 it would not be possible to evaluate f1 without evaluating f0 first. If they are in a list there is no reason why we could not just ignore f0. Funny how I still don't trust type signatures ;) Side channels are always visible! [1] entry://../meta/20110816-153448 [2] entry://20110821-094135 Entry: Streams with extra input. Date: Fri Aug 26 13:34:18 CEST 2011 Let's look at the stream with an extra input, as suggested by Oleg[1]. 
data Op i o = Op (i -> Op i o, o) The trouble with this is that for my purpose they are not explicit enough: I need an explicit description of the state transformation process between successive outputs to generate code. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92746.html Entry: Haskell tuples vs. lists Date: Fri Aug 26 16:06:33 CEST 2011 It's interesting to note the symmetry between: f [] = ... f (x:xs) = ... f xs ... and instance F () where ... instance (F a, F b) => F (a,b) where ... The f operates on lists of values while the F "operates" on types that can have a nested structure known at compile time, i.e. (Int,Double) or (Int,(Int,(Double,()))) Entry: mapM Date: Fri Aug 26 17:44:30 CEST 2011 I keep running into the mapfold operation but don't see why nobody ever mentions this function. foldr :: (a -> b -> b) -> b -> [a] -> b mapfold :: (a -> b -> (b, c)) -> b -> [a] -> (b, [c]) Maybe it's because this is a combination of two monads: state and list, and it's usually written as a map? :: (a -> m b) -> [a] -> m [b] Indeed, that's the signature of mapM[1]. It might be a good idea to learn how to use this. At first glance it seems that this needs a monad class definition, and that can be a bit verbose. [1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelude.html#v:mapM Entry: The List Monad - Generate and Test Date: Fri Aug 26 23:31:32 CEST 2011 When people talk about logic programming with the list Monad, what is meant are things like this: import Control.Monad f n = do x <- [1..n] y <- [1..n] z <- [1..n] guard $ (x^2 + y^2 == z^2) return (x,y,z) This is one of those super elegant Haskell tricks ;) Entry: The Essence of Functional Programming Date: Mon Aug 29 23:34:20 CEST 2011 I'm reading [1] again. What I see now when looking at the monad types is that essentially the beef is really hidden behind the type M. return :: a -> M a bind :: M a -> (a -> M b) -> M b I wrote about this before in different words[2], but the real click for me is that a Monad is 1. completely general wrt. the a and b in the types above, and 2. completely abstract in that all its magic is hidden behind the type M and the bind, return operations. So if it's completely abstract, how can you create a function of type a -> M b in the first place? To be useful, every monad needs to export some function to create values wrapped in M on top of the standard composition interface made up by bind and return. M a is always a value. However, it doesn't have to be a naked value of type a. It can be the output of a function as in the Reader monad :: e -> a, or the input of a function as in the Cont (CPS) monad :: (a -> r) -> r. Wadler mentions[1] that the basic idea when converting a pure function into a monadic one is to convert a function of type a -> b to one of type a -> M b. The return and bind operations can then compose these functions, or more intuitively[3], they can compose Kleisli arrows (a -> M b) -> (b -> M c) -> (a -> M c). What I've always found strange is that the do notation has such a peculiar form if you think about it. It has little to do with Kleisli composition where arrows are not nested. A do block has a single return type M r, but several nested functions with input a1, a2, a3, ... one for each arrow. The thing is that do notation is to let (applicative style) what Kleisli composition is to function composition (point-free style). It's not so much that do is strange, it's that nested let is strange!
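To make the parallel concrete, a toy example in the Maybe monad (my own, not from the paper): the same computation written with Kleisli composition and with do notation.

import Control.Monad ((>=>))

halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- point-free: compose the Kleisli arrows directly
quarter :: Int -> Maybe Int
quarter = halve >=> halve

-- applicative style: do notation names the intermediate value
quarter' :: Int -> Maybe Int
quarter' n = do
  h <- halve n
  halve h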
In some sense a point-free approach is more natural compared to
creating a context of many visible variables to allow random access.

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz
[2] entry://20110723-141330
[3] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html

Entry: Needing GHC extensions
Date: Tue Aug 30 19:50:49 CEST 2011

I often run into issues that don't fit well in the standard Haskell
type system.  Am I too used to ad-hoc data structures?  Does my code
usually have corner cases I don't recognize because I don't need to
cast it in types?

Entry: Cover Condition
Date: Fri Sep 2 09:38:10 CEST 2011

What does this mean?

  the Coverage Condition fails for one of the functional dependencies;
  Use -XUndecidableInstances to permit this

Entry: a -> M b vs. M a -> M b
Date: Sat Sep 10 08:44:47 EDT 2011

What is this about[1]:

  A key observation of Moggi's was that values and computations should
  be assigned different types: the value type a is distinct from the
  computation type M a.  In a call-by-value language, functions take
  values into computations (as in a -> M b); in a call-by-name
  language, functions take computations into computations (as in
  M a -> M b).

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz

Entry: State Space model vs. Mealy Machine
Date: Sat Sep 10 09:21:40 EDT 2011

Time to clean up some terminology.  A State Space Model (SSM) is more
general than a Mealy Machine (MM) in that its space of states might be
infinite, while a MM has a finite set of states.  When an SSM is
implemented in hardware it is necessarily a MM, because the state is
approximated with finite word lengths, which leaves only a finite set
of states.

However, as a structural description (intension?) the two are usually
very different.  An SSM's state update is usually parametric: a single
rule is expressed in terms of state coordinates, while a MM is usually
case-based: each distinct state point corresponds to a separate rule
expression.

[1] http://en.wikipedia.org/wiki/Mealy_machine
[2] http://en.wikipedia.org/wiki/State_space_(controls)

Entry: Logic
Date: Sat Sep 10 20:49:38 EDT 2011

I know very little of mathematical logic.  When I think of logic, I
mostly think of boolean algebra and logic gates.  The thing is that,
being educated as an engineer and not a mathematician, I'm very much
biased towards semantics (sets and functions), or maybe even more the
relationship between some mathematical structure and the measurable
reality it models.

Semantically, Boolean algebra talks about operations (functions)
B^n -> B, defined on the set of truth values B = {0,1} and its powers.
It is the model for logic gates.  Such a system is easy to work with
in a muddy intuitive way because it is feasible to exhaustively verify
correctness of expression manipulations.

However, the approach used in formal logic is axiomatic.  The point of
an axiomatic system is to provide a means to derive new expressions
(theorems) from a collection of initial expressions and rules of
inference/construction.  Such a system is called a "calculus".  A
derivation that leads from axioms to theorems is called a proof.

In [1] Wadler mentions that:

  In a single landmark paper Gentzen (1935) introduced the two
  formulations of logic most widely used today, natural deduction[3]
  and the sequent calculus[2], in both classical and intuitionistic
  variants.
Reading further, it seems that the importance of the sequent calculus
is mostly due to the cut-elimination theorem[4], which is about
"composition of proofs": if there exists a chain of proofs that can be
combined using the cut rule, there also exists a direct proof.

[1] http://homepages.inf.ed.ac.uk/wadler/papers/dual/dual.pdf
[2] http://en.wikipedia.org/wiki/Sequent_calculus
[3] http://en.wikipedia.org/wiki/Natural_deduction
[4] http://en.wikipedia.org/wiki/Cut-elimination_theorem
[5] http://en.wikipedia.org/wiki/Boolean_algebra_(logic)

Entry: Protocol-oriented programming (part 2)
Date: Tue Sep 20 17:22:06 EDT 2011

Started here[1].  Recently been thinking about this again because I
ran into some code that is gratuitously un-streamable, i.e. a byte
stream with non-causal data dependencies across large parts (packets)
that basically requires the use of large buffers.

Is there a way to turn the question around?  How to design a data
protocol such that small buffers can be used?  In absence of other
constraints (i.e. robustness), a good protocol is one that is easy to
print / parse.  How to formulate this in terms of languages and
automata?

[1] entry://20110423-004220

Entry: Minimal erase binary counter for Flash memory
Date: Fri Sep 23 15:31:29 EDT 2011

Problem with Flash memory: 0->1 is costly and needs to happen in bulk,
while 1->0 is free.  How to implement a counter in Flash that has a
good tradeoff between few erase cycles and little redundancy.

  - Full redundancy: one bit per increment, no erase.
  - No redundancy: erase on every increment.

Something in the middle could be an XOR mask, setting one bit at a
time, based on something like a gray code.

Entry: Embedded patterns and translation
Date: Sat Sep 24 14:53:43 EDT 2011

* Conversion between a sealed "foreach" API and a wide-open
  open/access/close.  This is the universal traversal API idea[1], but
  in practice it doesn't work that well in C due to lack of partial
  continuations.

* Conversion from task code to state machines, i.e. if recursion is
  finite and there are no arbitrary pre-emption points this should
  always be possible to automate.  State machines are a pain to write.

[1] http://okmij.org/ftp/papers/LL3-collections-enumerators.txt

Entry: Inductive inputs / outputs
Date: Sun Oct 2 18:44:03 EDT 2011

From the LLVM code:

  -- An alias for pairs to make structs look nicer
  infixr :&
  type (:&) a as = (a, as)

  infixr &
  (&) :: a -> as -> a :& as
  a & as = (a, as)

This is useful as it's possible to inductively build function types
with multiple inputs and outputs.  See below for a 4 in, 4 out
function.  Curried:

  (* -> (* -> (* -> (* -> (* :& (* :& (* :& (* :& ()))))))))

Does it also exist in the non-curried variant?  It seems that in that
case it's no longer a sequence, but a tree split by the `->' type
constructor:

  (* :& (* :& (* :& (* :& ())))) -> (* :& (* :& (* :& (* :& ()))))

Why is induction (linear list structure) useful?  It allows the
definition of an enumerable set of type classes, starting with the
base case and working up through induction, just like one would write
recursive functions on recursive types.

It doesn't seem that functionally the non-curried case is less
powerful, just that it's more of a hassle to deconstruct the type in
the inductive rule.

See next post for an example of how a function arity can be obtained
from a function type using a type class and 3 instances: one base case
() and one for each induction.
Entry: Function arity Date: Tue Oct 4 20:22:14 EDT 2011 {-# LANGUAGE TypeOperators, TypeSynonymInstances #-} -- Small test for parameterizing over types that represent multi in / -- multi out functions like: -- (* -> (* -> (* -> (* :& (* :& (* :& (* :& ()))))))) infixr :& type (:&) a as = (a, as) infixr & (&) :: a -> as -> a :& as a & as = (a, as) -- First test is to map the type signature to a pair of numbers -- representing the I/O arity. class (NbIO f) where nbIO :: f -> (Int, Int) instance NbIO () where nbIO _ = (0,0) instance NbIO os => NbIO (o :& os) where nbIO os = (ni, no + 1) where (ni, no) = nbIO (snd os) instance NbIO f => NbIO (i -> f) where nbIO f = (ni + 1, no) where (ni, no) = nbIO (f undefined) # nbIO (\a b c -> (a,(b,(c, ())))) # => (3,3) Entry: Learning Haskell Date: Fri Oct 7 20:09:53 EDT 2011 And the saga continues. I spend long times in total frustration not understanding what a cryptic type error means when I'm playing with type classes. Usually I just try to go about differently and succeed to my great surprise. In short: I often really don't get it yet. At those times I don't know where to actually look to understand why a particular construct will not work. The moral of the story is that just adding type signatures will often solve the problem, and if it really doesn't work, it's probably a conceptual error, i.e. try harder! Entry: Apply pure function in monad Date: Fri Oct 7 20:44:38 EDT 2011 Often I run into something like this: m >>= \x -> return $ f x Often with `f' being a data constructor. Does that have a name? Indeed it does: *Main> :t liftM liftM :: Monad m => (a1 -> r) -> m a1 -> m r Entry: Haskell overlapping type class instances Date: Sun Oct 9 11:50:39 EDT 2011 Sometimes it can be very useful to have overlapping instances when encoding data structures at type-level. Especially so when implementing embedded languages in Haskell. [1] http://www.haskell.org/haskellwiki/GHC/AdvancedOverlap Entry: CBN & CBV Date: Sun Oct 9 14:43:47 EDT 2011 Is it possible to use a simple interface to abstract over both these types of binary operators: a -> b -> M c M a -> M b -> M c This would make it possible to combine nested expressions (unnamed intermediates) and explicit bindings using the same interface. Something that might work is this: mi a -> mi b -> mo c Where we can have mi = mo or mi = 1, the identity monad. Let's play with this a bit. It seems that this would work, but there is a problem making it implicit. I.e. the identity monad would always need a wrapper. Overall it seems just simpler to work with (M a -> M b -> M c) functions and use an explicit return when binding nodes. See before. Entry: Lifting pure functions to dataflow functions Date: Thu Oct 13 13:55:44 EDT 2011 One of the problems that has been puzzling me for a while is to find a good representation of the transformation that maps a pure "applicative" function to a dataflow function. The simplest case is one input, one output: (i -> o) -> (i' -> o' -> m ()) Here i, o are value types, while i' and o' are reference types, and m is some kind of monad that keeps track of the reference one-time binding or multiple assignment. Generalizing this to multiple in/out is straightforward when the outputs are encoded in a recursive tuple type as mentioned before. 
  (i1 -> i2 -> o1 :& o2)  ->  (i1' -> i2' -> o1' -> o2' -> m ())

An ad-hoc way would be to implement this for assignable references,
though with some more effort it should be possible to use one-time
binding as in the data flow language in Concepts, Techniques, and
Models of Computer Programming[1].  Here the input and output types
could be constrained by type classes.

I wonder, isn't this just lifting functions to Arrows?

[1] http://www.info.ucl.ac.be/~pvr/book.html

Entry: Types
Date: Thu Oct 13 15:20:10 EDT 2011

There is something extraordinary about programming with strong typing.
I spend a lot of time trying to come up with correct types for doing
program transformations.  Sometimes it seems like a total waste, but
when things fall into place, usually the elegance shines.

At this point it feels as if I'm never really going to "get it".  As
if there is so much structure hidden behind the deceptively simple
constructs of type classes.

Entry: Functional Dependencies and Undecidable Instances
Date: Sat Oct 15 11:09:03 EDT 2011

Since I rely on them for some type-level hackery, let's look at what
these actually do:

  UndecidableInstances
  FunctionalDependencies
  ScopedTypeVariables
  FlexibleContexts
  FlexibleInstances

All except ScopedTypeVariables[2] are explained here[1].

[1] http://cvs.haskell.org/Hugs/pages/users_guide/class-extensions.html
[2] http://www.haskell.org/haskellwiki/Scoped_type_variables

Entry: Monads
Date: Mon Oct 24 17:38:58 EDT 2011

Learning Haskell I sometimes get the wrong intuition.  An example: it
was not at first clear to me that these two are not the same.

  (CBN)  f mx my

  (CBV)  do x <- mx
            y <- my
            f (return x) (return y)

Here I'm using the Call-By-Name (CBN) and Call-By-Value (CBV)
terminology from [1] to indicate the different meaning between the
two.  The first one passes two (named) computations to f, while the
second one passes two values (they are monadically wrapped, but they
are "pure" due to the return).  The main difference is that in the
second case, the sequencing has already happened before f is invoked.

Now why is this?  It's probably simpler to see with a single case.
Why are these two expressions not always the same?

  mx >>= \x -> f $ return x

  f mx

The answer is that there is absolutely no reason they should be, and I
don't understand why I had the idea in the first place.  From the
perspective of f the two cases are vastly different.  In the former f
will always get a value that's the result of a return, which means it
is "simple" or "pure", while in the latter f can receive a value that
could be more complex.

Here's a counterexample using the list monad that illustrates this
difference.  The first test passes a one element list to f and does
that 3 times, collecting the result of each evaluation in a list.  The
second test just passes the list.

  t1, t2 :: Monad m => (m a -> m b) -> m a -> m b
  t1 f mx = mx >>= (\x -> f (return x))
  t2 f mx = f mx

  f l = [length l]
  l = [1,2,3]

  (t1 f l, t2 f l)  -- ([1,1,1],[3])

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz

Entry: Monad transformers
Date: Mon Oct 24 22:30:34 EDT 2011

Two questions:

1. How can it be?

Following [1].  Dropping wrappers for clarity, we have:

  StateT s m a  =  s -> m (s, a)

which is a state monad parameterized by a monad.
If m = Id we get the ordinary state monad: s -> (s, a) if two of these are chained we get StateT t (StateT s Id) a = t -> (StateT s Id) (t, a) t -> s -> (s, (t, a)) So the deal is: a monad transformer is parameterized by a monad in such a way that simply substituting a different type will give a meaningful result. 2. How can it be used? Each transformer needs to define a method that provides access to the wrapped monad: lift :: (MonadTrans t, Monad m) => m a -> t m a It looks like the MTL has some type classes defined that prevent the use of multiple lift applications to dig into the onion[2]. [1] http://web.cecs.pdx.edu/~mpj/pubs/modinterp.html [2] http://www.haskell.org/haskellwiki/Monad_Transformers_Explained Entry: Zipper: data structure derivatives. Date: Tue Oct 25 13:22:11 EDT 2011 I need an inverted tree for the syntax representation of a simple flowchart language with lexical scope for primitive data bindings and mutually recursive functions (only tail recursion, no call stack). data Expr = Let Bind Expr | LetRec [Fun [Var] Expr] Expr | If Var Expr Expr | App Fun [Var] How to systematically derive? Each recursive Expr node needs to be turned around. Writing this as a polynomial with variable: x = Expr And coefficients: b = Bind f = Fun [Var] -- no recursion here, so can be combined in 1 term v = Var a = App -- same We get for the constructors in the order above: b x + (f x)^n x + v x^2 + a The derivative of the polynomial is: b + (n+1) (f x)^n + 2 v x The 2 numeric constants that appear distinguish between the different branches of the LetRec and If trees. I'm puzzled that Bind is just a value though. Let's try to reconstruct a type from the polynomial. Hmm.. there's too much information lost. The numeric constants refer to different constructors, and the x seem to refer both to original trees and inverted trees. But the general idea seems to work out: If has 2 choices, one for each branch, and LetRec has a couple for which of the branch we're at. Let has no selector, meaning there is only one constructor that refers to the inverted Let list. So let's do it manually. EDIT: It appears to not be necessary. The trick is to use delimited continuations (which are zippers implicitly). In [1] this takes the form of a state-continuation monad, which in my Haskell implementation looks like this: makeVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) makeVar term@(Code sterm) = SC c where c (CompState n) k = Let var sterm (k state' ref) where state' = CompState $ n+1 ref = Code $ Ref $ var var = Var typ nam 0 typ = typeOf term nam = regPrefix ++ (show $ n) [1] http://en.wikipedia.org/wiki/Zipper_(data_structure) [2] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: OverlappingInstances, IncoherentInstances Date: Wed Oct 26 16:33:32 EDT 2011 Notes: - GHC's default behaviour is that exactly one instance must match the constraint it is trying to resolve. - It is fine for there to be a potential of overlap; an error is only reported if a particular constraint matches more than one. - The -fallow-overlapping-instances flag instructs GHC to allow more than one instance to match, provided there is a most specific one [1] http://www.haskell.org/ghc/docs/6.6/html/users_guide/type-extensions.html Entry: Commuation Date: Thu Oct 27 10:05:16 EDT 2011 How to name this pattern? 
What happens a lot to me when programming with type classes in Haskell is that I run into commutation problems, meaning that I run into operations like a (b t) -> b (a t) and their inverse that encode a morphism between the two types with different nesting order, basically saying that a and b commute. In general this doesn't hold: such functions usually do something significant, and might overall not be invertible. However in other cases the morphisms might be bidirectional and somewhat trivial. Is there a way to represent such a morphism in an abstract way? I.e. is there a way to automatically derive "trivial morphisms" for cummutative type constructors? Sorry, no example as this is just a vague hunch.. Entry: Invertible Functor? Date: Thu Oct 27 11:09:02 EDT 2011 I recently ran into a] post about invertible functors. Can't find it now. This is the type of application and abstraction in the embedding of a typed lambda calculus: _app :: r (a -> b) -> r a -> r b _lam :: (r a -> r b) -> r (a -> b) Where _app would just be from an applicative functor, here _lam is the inverse. Entry: Sussman: We Really Dont Know How To Compute Date: Thu Oct 27 22:24:00 EDT 2011 - Brains are fast - Computing: limiting factor is programmers / programming - Generics and abstract evaluation - Autodiff - dynamic reconfigurability - trade provability for flexibility - propagators -> recent breakthrough: cell = info about value, not value -> cells merge insformation monotonically [1] http://www.infoq.com/presentations/We-Really-Dont-Know-How-To-Compute Entry: Functional Dependencies Date: Fri Oct 28 13:47:56 EDT 2011 Main idea is that multi-param type classes are relations = sets of tuples, and can be seen as relational databases. The basic problem with multi-param type classes is ambiguity. When composing operations, it might be that some type parameters that are in the constraints no longer appear in the right hand side. I.e. TypeRel a b c => a -> b Here 'c' is ambiguous. See [1] section 2.4 for examples. [1] http://www.reddit.com/r/haskell/comments/7oyg5/ask_haskellreddit_can_someone_explain_the_use_of/ Entry: Type Scoping Date: Fri Oct 28 15:04:35 EDT 2011 Why are these 2 not equivalent? # as type variables are not the same _lambda = lambda where lambda :: Args Value as ras => (ras -> Identity (Value t)) -> Value (as -> Identity t) lambda f = Value rf where rf as = do Value t <- f $ unpack $ Value (as :: as) return t # as types are the same _lambda = lambda where lambda :: forall as ras t. Args Value as ras => (ras -> Identity (Value t)) -> Value (as -> Identity t) lambda f = Value rf where rf as = do Value t <- f $ unpack $ Value (as :: as) return t Entry: Continuations Date: Sun Oct 30 01:22:10 EDT 2011 It is rather instructive to creative use of a continuation monad for Let insertion[1] implemented in a pure functional style; I'm writing one in Haskell. It uses a combination of CPS and direct style to be able to create nested expressions, effectively manipulating the toplevel continuation. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: Cross Stage Persistence Date: Mon Oct 31 10:59:05 EDT 2011 I was thinking about code generation, and didn't see what a true multi-stage language like MetaOCaml would bring over the typed-class-and-untyped-syntax approach that is easy to do in Haskell. Of course, as mentioned elsewhere (probably somewhere here[1]) the big deal is cross-stage persistence. For maximum flexibility, you want to be able to call library functions in generated code. 
For me that's not an issue, since I'm only using it for offshoring.
The target is limited in that there are mostly no libraries needed;
it's all low-level calculations.

[1] http://okmij.org/ftp/

Entry: DB normalization
Date: Mon Oct 31 17:18:46 EDT 2011

For abstracting my Apache logs I'd like to do something like this:

  (url, ip, client)

but represent it as this:

  (idu, idi, idc)
  (idu, url)
  (idi, ip)
  (idc, id-client)

because:

  1. many of the urls & ips & clients will be duplicated

  2. I'm interested in the set of unique url, ip, client, ...

  3. implementation: does it search faster, use less storage?

Is this non-normalized?  I'm not sure, though putting them back
together is a join[1]:

  If columns in an equijoin have the same name, SQL/92 provides an
  optional shorthand notation for expressing equi-joins, by way of the
  USING construct:

  SELECT * FROM employee INNER JOIN department USING (DepartmentID);

Thinking about this a bit, it makes sense to replace fields with IDs
in this way for the reasons above whenever the only useful operation
on that field is just equality.  Equality distributes over keys.
Other fields like date, number of bytes, ... that support other
operations (i.e. range) should stay in the main table.

[1] http://en.wikipedia.org/wiki/Join_(SQL)

Entry: Generalized Arrows
Date: Tue Nov 1 09:03:26 EDT 2011

I was intrigued by the use of the word "metaprogramming" in:

  Like Haskell Arrows, generalized arrows provide a platform for
  metaprogramming.  Unlike multi-stage languages and Haskell Arrows,
  generalized arrows allow for heterogeneous metaprogramming.  Arrows
  support metaprogramming only when the guest language is a superset
  of Haskell, because every Haskell function can be promoted to a
  guest language expression using arr.  Generalized arrows remove the
  assumption that this sort of promotion is possible.  This enables
  heterogeneous metaprogramming.

[1] http://www.cs.berkeley.edu/~megacz/garrows/

Entry: Edge filtering for Interrupt On Change
Date: Tue Nov 1 13:28:37 EDT 2011

Given CHANGE and STATE registers, and assuming that reading the CHANGE
register also clears it so it can detect new hardware events, how to
reliably filter out edges?

Suppose we have an ISR that is triggered by CHANGE becoming 1, reads
CHANGE, then reads STATE.  Then we filter an edge based on the value
of STATE.  For short pulses we filter on the trailing edge.

Can it reliably detect an arbitrarily short pulse that is long enough
to be handled by the hardware, but might be too short to be seen by a
port readout, or that is shorter than the interrupt to CHANGE readout?

In the case of a short pulse, these are the possible orderings of edge
and register read events, with l,t the leading and trailing edges and
C = CHANGE read and S = STATE read.

  (1) l C S t C S     both C S are in time to see first edge
  (2) l C t S C S     S is too late ...
  (3) l t C S         both C S are too late ..

In the first 2 cases, 2 interrupts are triggered, while in the last
case only one is triggered, as the two edges are merged into a single
CHANGE readout.

Using just the value of S is not enough to detect only one edge per
pulse, as this would trigger on both edges in case (2).

Remarks:

* The above seems to work for pulses that are long enough or short
  enough, but the middle case is puzzling.

* Is this a problem?  How to fix it?

* Why didn't I hear about this before?  Probably because it's solved
  by using long enough pulses such that interrupt -> C,S reads all
  happen before the trailing edge.
* For level triggered interrupts of course one triggers on the leading
  edge, because it is never going to be missed: the interrupt is only
  lifted after acknowledgement, which is necessarily after detection.

The scenario above is for an Atmel AT91SAM7.  After a bit of Googling
I find this in Microchip AN566[1], which talks about using the IOC
pins for handling external interrupts:

  An interrupt pulse with a small pulse width requires less overhead
  than a wide pulse width.  A small pulse width signal must be less
  than the minimum execution time of the interrupt service routine,
  while a wide pulse width must be greater than the maximum time
  through the interrupt service routine.

They mention these 2 good cases (1) and (3), but not the bad case (2).
Maybe for the PIC there is no problem because there is an atomic read?
I'm not sure exactly how the mechanism works there..  Indeed, it looks
like there is only a single interrupt flag, not a per-pin flag as on
the AT91, and this flag is cleared when the input pins are read.  So
here there are only two cases; the C and S operations are atomic:

  (p1) l CS t CS
  (p2) l t CS

Is there a way to fix the AT91 problem?  Does a 2nd read solve it?
Let's try C S C.

  l C S C  r C S C
  l C S r C
  l C r S C
  l r C S C

At first sight it looks like this makes it at least possible to
distinguish the cases, but it makes it harder to do in parallel for a
number of pins..  FIXME: check with clear head..

[1] http://ww1.microchip.com/downloads/en/AppNotes/00566b.pdf

Entry: Ad-hoc syntax design
Date: Tue Nov 1 16:14:19 EDT 2011

There are two ways to look at languages:

  - Properties of the grammar.
  - Properties of the parser.

For an ad-hoc language the latter is far more useful to focus on than
the former.  It's cool to be able to derive parsers from grammars
automatically, but the restrictions that are necessary to make this
work well require some getting used to.  On the other hand, if you
focus on keeping the parser simple so it can be done by hand using
recursive descent, there are a couple of ways to make language design
decisions that keep the parser simple.

In general this is: "avoid backtracking".  Usually, some amount of
backtracking is necessary, but it's probably best to keep it local and
bounded such that it can be implemented by a simple linear succession
of tries that doesn't need a large context store, i.e. read something
that's finite size and stick it in a buffer, then try to parse it in
any of a couple of ways.

Entry: Relational Databases
Date: Wed Nov 2 15:03:27 EDT 2011

I don't know much about relational databases[1].  However, I did read
the chapter on relational programming in CTM[2] again yesterday, and
was reminded that relational programming is essentially logic
programming.

CTM uses the Oz language to unify (!) a lot of different programming
concepts, essentially by separating variables and values, allowing to
separate:

  - variable creation (here's this variable)
  - variable binding (that variable is the same as this variable)
  - value creation (this variable has that value)

The interesting thing here is that you can have both directed
(functional, dataflow) but also undirected information flow, which is
essentially logic programming.  The big deal in programming with
relations (predicates) is exactly that: information can flow in many
directions.  This is the SQL "WHERE" clause.

[1] http://en.wikipedia.org/wiki/Relational_database

Entry: Lifting & subclassing
Date: Mon Nov 7 19:39:16 EST 2011

Is lifting the same as subclassing?  I.e.
in OO, a derived class can call it's superclass' methods, which is really similar to lifting an operation over a larger type, i.e. what 'return' ('pure') does. The difference is that in the OO case it happens automatically, while in the (typed) FP case it uses an explicit conversion. Entry: Integer Programming Date: Fri Nov 11 10:50:38 EST 2011 I got the coefficient data for an integer programming (IP) problem in a spreadsheet. I want to get it into Haskell. How to do this? I first tried to copy & paste them into emacs and use keyboard macros to beat them into shape so they can be directly represented in Haskell source code. This doesn't seem to be such a good plan. Is there a way to dump out a part of a spreadsheet in a format that's easy to get into Haskell? Maybe CSV is the simplest way, so I'm trying that first: copy and paste part of a table into a new spreadsheet, then save as CSV. Entry: Elementary function evaluation Date: Tue Nov 15 20:12:23 EST 2011 Probably for PIC/dsPIC: sin/cos/exp/... Entry: Condition Variables are not Semaphores Date: Sat Dec 10 15:28:21 EST 2011 A condition_signal() will only unblock a condition_wait(), but it will not cause semaphore-like behavior where condition_wait() will not block if there is a condition waiting. A condition variable is really just a queue of threads that are waiting to be woken up. ( Observed in some obscure thread-priority bug where one thread was not allowed to start up to the point it was actually waiting on a condition variable, so it missed a signal causing a deadlock. ) [1] http://en.wikipedia.org/wiki/Monitor_(synchronization)#Blocking_condition_variables Entry: Composite return values in C Date: Thu Dec 15 09:49:21 EST 2011 This is something which I never use because I thought it was not possible, but returning composite values is not a problem in C. What I wonder is how the ABI handles this. struct foo { int a; int b; int c; }; struct foo make_foo(void) { struct foo foo = {}; return foo; } As mentioned in [1], see the -fpcc-struct-return and -freg-struct-return options in [2]. It seems that this is in essence not a problem. [1] http://stackoverflow.com/questions/161788/are-there-any-downsides-to-passing-structs-by-value-in-c-rather-than-passing-a [2] http://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html Entry: Continuation Monad & compilation Date: Sun Dec 18 11:14:07 EST 2011 It looks like the continuation monad is a very important/useful abstraction for compiling tree structures with lexically scoped identifiers, especially when you want to have an idea of "current context" in which those identifiers are defined, i.e. the operation: Insert at current point in the subtree a definition of identifier ID and evalate the rest of the syntax generation in a deeper subtree that cannot escape the context of this definition. Essentially what a continuation monad can do, used in this way (as a partial continuation) is to make sure that subsequent continuation manipulations can't escape a subtree. I find this remarkable to the point of leading me to change my mental picture of a partial continuation as a "guaranteed consistent context". What I don't understand though is why partial continuations appear so naturally in Haskell's contunation monad. Maybe the way I'm using it in the meta/dspm compiler is just a bit ad-hoc special-cased for this to show up that way.. 
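As a minimal sketch of that let-insertion idea, stripped of the name
supply that the earlier makeVar code threads through (the toy Expr
type and the names bindLet and gen are made up for this example): the
continuation k is "the rest of the code generation", and wrapping it
in a Let means everything generated downstream stays inside the scope
of the new binding.

  import Control.Monad.Cont

  data Expr = Lit Int | Add Expr Expr | Var String | Let String Expr Expr
    deriving (Show)

  -- Insert a Let at the current point; the continuation k receives a
  -- reference to the new variable and produces the body of the Let,
  -- so the generated code cannot escape the binding's scope.
  bindLet :: String -> Expr -> Cont Expr Expr
  bindLet nam e = cont $ \k -> Let nam e (k (Var nam))

  gen :: Expr
  gen = runCont prog id where
    prog = do
      a <- bindLet "a" (Add (Lit 1) (Lit 2))
      b <- bindLet "b" (Add a a)
      return (Add a b)

  -- gen ==> Let "a" (Add (Lit 1) (Lit 2))
  --             (Let "b" (Add (Var "a") (Var "a"))
  --                  (Add (Var "a") (Var "b")))

The answer type of the continuation is Expr, so runCont prog id closes
the generation at the top level; fixing that answer type is what gives
the "delimited" behaviour.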
Entry: Functors
Date: Tue Dec 27 08:50:22 EST 2011

Writing a typed embedded language I've been running into a pattern
which I call "type commutation", which turns up when writing
representations of functions/structures in terms of
functions/structures of representations, i.e.:

  2x1D
  (1)  r (a -> b)  <->  (r a) -> (r b)
  (2)  r (a, b)    <->  (r a, r b)

  1x1D
  (3)  r (x a)     <->  x (r a)

Case (1) is almost the Functor type in Haskell, which expresses this
for functions.  To be exact it is a bidirectional functor[1].

What is this generic pattern called?  I.e. not just functions but
generic 1,2,3,... dimensional type constructors?

Also, these all seem to be commutations between * -> * (i.e. r in the
above), and * -> * or a multi-argument kind (i.e. x, (->), (,) in the
above), but never between multi-argument kinds.  Or maybe it does.
This is a relation between two * -> * -> * kinds:

  2x2D
  (4)  (a -> b, c -> d)  <->  ((a,c) -> (b,d))

Anyways, such a relation has many more degrees of freedom that don't
seem to make sense, like swapping c (input) and d (output) in the rhs
of (4).

There's a clear pattern here.  I'm missing some language to talk about
it.  Probably category theory.  Is this a natural transformation[2]?
As a diagram, representing (,) tupling as =>

  a -> b        a        b
    ||          ||  ->   ||
    \/          \/       \/
  c -> d        c        d

With lots of handwaving, I guess so.  A natural transformation maps
functors to functors.  In the left diagram, -> is an arrow and => is a
Functor; in the right diagram, => is an arrow and -> is a Functor.

Anyways... for later deconfusion.

[1] http://hackage.haskell.org/packages/archive/fclabels/0.4.2/doc/html/Data-Record-Label.html#3
[2] http://en.wikipedia.org/wiki/Natural_transformation

Entry: Mapping trees to integers
Date: Sat Dec 31 09:46:36 EST 2011

More specifically in Haskell: given a tree which is only terminated in
nodes that are themselves mappable to integers, how to map such a tree
to an integer (i.e. for exact hashing)?

Mapping (positive) integer sequences to integers is quite
straightforward.  As a base use the primes, and express the tuples as
prime powers.  Anything that can be mapped to positive integer
sequences can be encoded that way.

So how about trees?  Since part of the problem is solved, the question
remains: how to map an arbitrary tree into a sequence of positive
integers?  Let's start with this [1]

Actually..  It's a lot simpler maybe to just map everything to bits.
Here are the two implementations mapping the datatype Type to
integers.

  data TypeName = AFloat | AInt | ABool | AVoid  -- atomic
                | ATree TypeTree                 -- composite
                | AType Int                      -- indexed type (see PrettyC.hs)
                deriving (Eq,Show)

  data TypeTree = AAtom Type | ANil | ACons Type Type
                deriving (Eq,Show)

  data Type = Type TypeName TypeOrder
            deriving (Eq,Show)

  type TypeOrder = Int

Prime-encoded positive sequences:

  primes :: [Integer]
  primes = sieve [2..] where
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]

  hashPos :: [Integer] -> Integer
  hashPos is = hp is primes where
    hp [] _ = 1
    hp (i:is) (p:ps) = p ^ i * hp is ps

  typePos :: Type -> [Integer]
  typePos = typ where
    name AFloat     = [1]
    name AInt       = [2]
    name ABool      = [3]
    name AVoid      = [4]
    name (AType i)  = [5, 1 + toInteger i]
    name (ATree t)  = [6] ++ tree t
    tree ANil          = [1]
    tree (AAtom t)     = [2] ++ typ t
    tree (ACons t1 t2) = [3] ++ typ t1 ++ typ t2
    typ (Type n o) = name n ++ [1 + toInteger o]

Binary sequences:

  hashBin :: [Integer] -> Integer
  hashBin = hb where
    hb [] = 1
    hb (b:bs) = b + 2 * (hb bs)

  typeBin :: Type -> [Integer]
  typeBin = typ where
    -- One case, no prefix.
typ (Type n o) = (name n) ++ (num $ toInteger o) -- 6 Unique prefixes. name AFloat = [0,0,0] name AInt = [0,0,1] name ABool = [0,1,0] name AVoid = [0,1,1] name (AType n) = [1,0] ++ (num $ toInteger n) name (ATree t) = [1,1] ++ (tree t) -- 3 Unique prefixes tree ANil = [0] tree (AAtom t) = [1,0] ++ (typ t) tree (ACons t1 t2) = [1,1] ++ (typ t1) ++ (typ t2) -- Self-delimiting numbers. num 0 = [0] num n = [1, mod n 2] ++ (num $ div n 2) [1] http://stackoverflow.com/questions/3596502/lazy-list-of-prime-numbers Entry: Register Allocation Date: Wed Jan 11 11:47:31 EST 2012 It looks like both Staapl[1] and the DSPM language in meta[2] will eventually need some form of register allocation/reuse if I'm going to compile down to PIC code without relying on a C compiler to do that for me. (OTOH, I wonder how good AVR GCC is doing that optimization job.) [1] entry://../staapl [2] entry://../meta [3] http://en.wikipedia.org/wiki/Register_allocation Entry: State machines Date: Wed Jan 25 17:22:06 EST 2012 State machines have popped up a lot lately: - Current consulting project: how to make an exhaustive test for a relatively implicitly specified state machine. - Staapl: defining state machines (protocol-oriented programming) in a concatenative language: functional specification and instantiation (register / global variable allocation). - meta/dspm/Loop: SSA / CPS / ANF without procedure calls. For most embedded work they seem to be a good solution, but can sometimes be hard to test. Is there a good way to link the high level specification and low level implementation with a good testing strategy. Entry: State machines / parallellism and resource allocation Date: Thu Jan 26 06:59:00 EST 2012 And then.. I'm thinking that in this whole parallellism debate, shouldn't we go back to "writing" electronics instead of programs? So I wonder, is that really just a problem of resource allocation? SM's are finite, but most programming models are infinite (infinite memory for storage and execution stacks/continuations). Of course this model breaks down because this infinite general model has to be "small enough" to be implemented on a finite machine. Entry: Reader monad and order Date: Sat Feb 18 10:07:32 EST 2012 I'm trying to capture the idea of "context dependent state transformation". Something doesn't quite add up here... There seem to be a couple of ways to formulate this. Let c be context, s be state and R be a Reader monad, which is a context-dependent computation: R a b == a -> b ( Concretely: c is "world state" of an animation, i.e. read-only or "stiff" background info like current time, and s is "object state" of an animation, meaning the animation's dynamics = state of its equations of motion. ) A) c -> s -> s or R c (s -> s) B) s -> c -> s or s -> R c s C) (s,c) -> s Where R is a reader monad. The computations I want to fit in a framework have initial s and c available at the same time, so C) is the type that corresponds best to reality. Why is there an ambiguity when trying to write this as a Reader monad? Which of A) or B) is the correct/appropirate one? Is the Reader monad the appropriate model? Is this a Co-Monad? (EDIT: The answer seems to be Yes[1]). Something I ran into before while trying to capture state machines / state space models is that the following correspondences are not really bijective. How to make that "really" precise? (a,b) -> c <=> a -> b -> c (a,b) -> c <=> b -> a -> c EDIT: Above isn't expressed well. According to [1] these really are the same. 
I just changed the animation types from m (s -> s) to s -> m s It seems that while this doesn't make a difference for the Reader monad, for other monads it does. I.e. I could use a state monad to thread the RNG state without trouble. [1] http://comonad.com/reader/2008/kan-extensions-ii/ Entry: State Space Models: Arrow and generalized functors. Date: Sat Feb 18 10:28:05 EST 2012 I'm trying to find a good way to represent state-space models in the standard Haskell type classes (categories?). The basic form is the relationship between an update equation and (infinite) sequences. ((s,i) -> (s,o)) -> ((s,[i]) -> (s,[o])) This is a generalization of a haskell Functor in terms of Arrows instead of functions. The update equation is an Arrow: U s i o = (s,i) -> (s, o) Normal Haskell Functor F: fmap :: (i -> o) -> (F i -> F o) Generalized Haskell Functor in terms of arrow A instead of arrow/function (->): fmap' :: A i o -> A (F' i) (F' o) What I'm interested in is then the less general F' = [] (U s) i o -> (U s) [i] [o] Where A = U s is the Arrow parameterized in the threaded state object. So the final types are something like this: fmap :: Functor f => (i -> o) -> f i -> f o gfmap :: Arrow a, GFunctor f => a i o -> a (f i) (f o) What is GFunctor? It's a pattern I don't recognize. It pops up in less general form (GFunctor == []) in iterated functions: gfmap :: ((s,i) -> (s, o)) -> (s,[i]) -> (s,[o]) gfmap f (s,[]) = (s,[]) gfmap f (s, i:is) = (s'', o:os) where (s',o) = f (s,i) (s'',os) = gfmap f (s', is) See next post. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Applicative.html#t:WrappedArrow Entry: Functor in terms of Arrow Date: Sat Feb 18 11:23:55 EST 2012 Dear HC, Does AFunctor below have a standard name? It's a generalization of the Functor class in terms of Arrow instead of (->): fmap :: Functor f => (i -> o) -> f i -> f o afmap :: Arrow a, AFunctor f => a i o -> a (f i) (f o) It pops up in less general form (AFunctor = []) in iterated functions (difference equations / state space models), where the arrow is the update function parameterized by state type: data Iter s i o = Iter ((s,i) -> (s,o)) instance Arrow (Iter s) More concretely: > afmap :: ((s,i) -> (s, o)) -> (s,[i]) -> (s,[o]) > afmap f (s,[]) = (s,[]) > afmap f (s0, i:is) = (sn, o:os) where > (s1, o) = f (s0,i) > (sn, os) = afmap f (s1, is) > f (s,i) = (s', s') where s' = s + i > is = [1,1,1,1,1] > os = afmap f (0,is) -- (5,[1,2,3,4,5]) Cheers, Tom Entry: Forking a random number generator? Date: Sat Feb 18 13:39:19 EST 2012 For an animation framework I need random numbers in the "leaf nodes" of an animation tree. However, I don't want to introduce a serial dependency over the tree, i.e. through a state monad to track RNG state. Is it possible to "split" an RNG such that it has a tree-like (Reader monad / S-combinator) dependency graph, while keeping the sequences generated in the leaf nodes of this tree "random enough". There is a random number generator that is seeded by integers: mkStdGen :: Int -> StdGen so maybe the question is: how to fork integers? Or, how to fork them enough such that collisions are rare? A very un-informed way would be to just multiply the seed by a prime number. This will not reduce the configuration space and give a reasonable "decorrelation" if the prime number is large enough? The funny thing is that the decorrelation itself will be simpler to express as a serial operation when forking a random number of states. To parallellize this again a list of primes could be used.. 
Then when lists of primes arrive, it's probably also possible to just use binary trees: shift by one and fork +0, +1 though that seems to run out of states faster. Entry: Monad transformers Date: Sun Feb 19 09:58:44 EST 2012 Forget the creative forking of last post, I'm going to use Reader + State. I had to write a small example program to understand the wrapping / unwrapping mechanism. > f :: s -> M s > f = undefined which is wrapped in this monad onion: > type M = ReaderT String (StateT Int Identity) Given a value :: s and the function :: s -> M s, we can unwrap one layer at a time. First peel off the ReaderT, then the StateT and last the Identity. > run s = s' where > mStateT = runReaderT (f s) "Context" > mIdentity = runStateT mStateT 123 > Identity (s', _) = mIdentity ACCESS: > getInt :: M Int > getInt = lift get > getString :: M String > getString = ask (Check these later; timing out..) [1] http://hackage.haskell.org/packages/archive/mtl/2.0.1.0/doc/html/Control-Monad-Reader.html [2] http://hackage.haskell.org/packages/archive/mtl/2.0.1.0/doc/html/Control-Monad-State.html [3] http://cvs.haskell.org/Hugs/pages/libraries/mtl/Control-Monad-State.html [4] http://cvs.haskell.org/Hugs/pages/libraries/mtl/Control-Monad-Reader.html Entry: Constraint programming, layout & choreography Date: Fri Mar 23 20:47:42 CET 2012 The trick is to be able to non-causally describe convergence. It's easier to do this "backwards in time". Same goes for spacial parameters. Example: have two animiation meet in one point. It's simpler to directly specify that they meet at time t=t' than to figure out when to start them such that this property is met implicitly. Example: how to center an asymmetric animation based on its final configuration. I need a way to convert a constraint-based description to a linear sequence. This most likely requires a 2-step approach: 1) solve all unknowns and 2) perform a coordinate transformation or lookup. So how to do this? I'm probably going to be helped with just a local constraint propagation solver (equations as bi-directional functions). Entry: Relational, Logic, Constraint: CTM Date: Sat Mar 24 16:50:17 CET 2012 Logic is Relational w. inference (composite relations) Constraint is Relational w. fundeps + conversion to functional. Better way? See CTM. Entry: Invertibility through sparseness Date: Sat Mar 24 17:05:35 CET 2012 Thinking a bit more about constraint solving. If all constrains are linear, then GE is the way to go. What makes local propagation interesting is that it allows solution of nonlinear equations that remain unique (invertible) through sparseness. I.e. there's a difference between an equation like xy = 1 and x^2 - y = 0 The former is a bijection while the latter isn't. The interesting observation is that many practical systems are nonlinearly constrained but remain invertible, or at least locally invertable for a wide range around the solution. Entry: Invertible nodes constraint solver: limitations? Date: Sat Mar 24 17:16:31 CET 2012 So what are the limitations of an invertible node solver? I'd say: no dependency loops. If there is a loop, more powerful "primitives" are necessary. Each loop needs a "global" solver. Though, note that it's OK to have loops in the (undirected) network, but not in the derived DAG. So how would this be called? An N-in, M-out function that's multi-directional, i.e. any pair of in/out can be swapped. In case information is lost in one direction this will be related to bi-directional lenses[1]. See next post. 
Looks like I need bijective lenses. [1] entry://20120325-111325 Entry: Pierce's lenses - Bidrectional programming Date: Sun Mar 25 11:13:25 CEST 2012 Bi-directional is less strict than bijective: sometimes information is lost in one way, in which case the other direction is an update operation that takes into account some of the information present in the original source. get :: S -> T putback :: T x S -> S get (putback (t, s)) = t putback (get s, s) = s + putback (t2, putback (t1, s)) = putback (t2, s) The last one is forgetfulness and is optional. Has to do with delete vs. undo. If putback ignores the S argument it is bijective. Sometimes too strong but nice when it holds. [1] http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides.pdf Entry: Adding arrows to a network Date: Sun Mar 25 11:39:36 CEST 2012 A bijective constraint network (equation network) is a collection of nodes and relations, where each node takes part in a number of relations, but is output to only one relation. Solving the network is to determine by which relation a node is determined. The fact that this is local allows focus on compositon. Can this be done using a simple bitmask? I.e. this is a form of 3-state logic: 0 output 1 input x unknown Probably the I/O assignment and actual function "production" can be separated. - When a node is asserted, for each connected equation, determine if the degrees of freedom are reached and propagate for all relations not self. Using an imperative algorithm this seems straightforwad. Is there a functional way? Taking a break. This probably needs some more background processing.. The input is a network so I'm thinking of using some kind of spanning tree representation. The result is also a DAG, so a simple tree rep won't work for the result. (If this where possible, the network might have been represented by "a" solution, where changing of I/O configurations would simply transform this solution). Entry: The usefulness of local state / working with graphs in Haskell Date: Sun Mar 25 13:10:56 CEST 2012 Some handwaving to follow. I find it hard to do real work in Haskell. Many, many algorithms take advantage of in-place updates of data structures, moving from one consistent configuration to another one in a (conceptually) single step. Often this incremental update is key to the efficiency of the algorithm. Creating a full duplicate of the data structure at each algorithm step is often too expensive and probably also annoying, since it often needs more info than a local update only. It seems that most functional algorithms use some kind of inside-out representation using zippers, which allows an abstraction of the current edit point while keeping updates cheap: most of the deep structure is reused while the only the local edit point needs a new construction. So let's dig into this a bit deeper. Given a generic graph structure, how to practically represent it in Haskell? Google gives me this[1]. It describes how to work with graphs, focusing on the operations: cata/ana-morphisms. It's interesting how abstraction is kept fairly high, while the nitty gritty uses an imperative algorithm. How to load this in my head? Some take-away points: - knot tying requires node-equality to allow recursive traversal. since there is no pointer equality, this requires unique identifiers for each node. overall it seems more of a hassle than anything else; finite lists of nodes/edges may make more sense from an implementation pov. - imperative algos (i.e. in ST[2]) aren't necessarily evil. 
for graphs they are probably way too useful/efficient to dismiss (i.e. node marking vs recording node tags in a dictionary). [1] http://www.haskell.org/haskellwiki/The_Monad.Reader/Issue5/Practical_Graph_Handling [2] http://www.haskell.org/haskellwiki/Monad/ST Entry: Evaluation of equation network Date: Sun Mar 25 17:29:34 CEST 2012 Input: - set of nodes - set of relations that refer nodes - subset of nodes with initial values Output: - compute the corresponding output values or fail. It's important to not change the form of the algorithm when the subset of specified values changes. I.e. in the problem at hand, what is important is the value of all nodes. Which nodes that serve as input should be shielded from the code that uses the node values. However, it might be useful to perform the search in 2 steps: compile I/O config to function, then use function multiple times. Thinking about this, it seems that the most important part here is to actually specify the inputs. How to do this without loosing composition? I.e. if it's encoded in types then the types already specify the structure.. I lack experience to find a way to express this properly.. ( input spec ? ) -> X..XX.XX... partially completed -> XXXXXXXXXXX fully completed, after evaluating equations The structure is known at compile time. f x y = do [a,b,c] <- nodes 3 in a x sum [a, b, c] -- sum == 0 prod [a, b] -- product == 1 solve 'solve' returns a structure of all the nodes, currently a list (of floats) but this should probably be a heterogenous type 'nodes' creates a list of node variables 'in' initializes a node value 'sum' and 'prod' are multi-directional constraints. Trouble with this approach is that it doesn't compose: the whole network needs to be defined at once. Well, maybe not. Let's try to implement this with the ST[1] monad and see where it breaks. EDIT: it seems there is definitely some advantage here to separate content from structure, which in Haskell can often be done using type classes. It seems that the whole "network building" could be done as a compile-time computation. Would this make it easier, or is this just one of those neat side-tracks that doesn't add much to the end result? [1] http://www.haskell.org/haskellwiki/Monad/ST Entry: Compile time mutable operations Date: Mon Mar 26 12:00:11 CEST 2012 I want to perform some operations on a network data structure at compile time, i.e. using type classes and fundeps. Is it worth finding this out, or should I look for a better way to do compile time programming in Haskell? I sure do miss Scheme macros. The direct but incremental approach is a lot easier to navigate. Entry: The ST monad Date: Mon Mar 26 12:18:17 CEST 2012 I need some mutable arrays, i.e. Data.Array.ST and have no idea how it works. So, here's some Q&A. - How to create a mutable array of size 10 with all 0 elements? arr <- newArray (1,10) 0 - How to convert a mutable array to an immutable one? The ST Array lives in the ST monad, so use newArray and operations on the array to construct a monadic value ST (STArray). runSTArray can be used to convert this to Array. That solves construction and final output. - How to iterate over the elements of an array? Probably simplest is to use forM_, which is essentially a combination of map (construct a list of monadic computations) and sequence (weave a list of monadic computations into one computation). 
The function `indices' caused some confusion as this is only defined for Array, so I use the following approach: forA a f = do (a,b) <- getBounds a forM_ [a..b] f - Updating array elements: use writeArray. [1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Monad.html#v:forM-95- [2] http://www.haskell.org/haskellwiki/Arrays Entry: Indexing binomial combinations Date: Mon Mar 26 15:55:16 CEST 2012 What is the simplest way to "name" the different combinations in a select I from N problem? I.e. select 2 from 3: 1 -> .|| 2 -> |.| 3 -> ||. Select 2 from 4: 1 -> ..|| 2 -> .|.| 3 -> .||. 4 -> |..| 5 -> |.|. 6 -> ||.. What I used above is just binary counting (digit reversed) and skipping all entries that don't have the proper number of elements. This seems quite straightforward, but I prefer something that doesn't need a search. For my app I is usually a lot less than N (mostly 1 and in some cases 2), how do I get the sequence number in the list above starting from the coordinates of the empty spots? For I=1 this is already the desired encoding: 1 -> .||| | 2 -> |.|| | ... n -> |||| . Actually it's quite trivial when using a nested datastructure: - N lists, first element from N - N-1 lists, second element, each from N-1 possibilities Problem solved. EDIT: Sun Apr 1 12:02:11 CEST 2012 I made it work with the code below. Question is if this is really useful, and if an implicit table ref isn't a better way to go because: * final result needs to be a table * a single function acting on a table might be simpler to use than a bunch of functons + shuffler routines. EDIT: Actually this is not even correct. It indexes proper permutations while what I need is just selections (all permutations of outputs and inputs separately can use the same function) I'm moving to an implicit, imperative array-based implementation. See next post. -- A multi-directional equation node is represented by a tree of -- functions, one for each possible combination if inputs and outputs. -- Each level in the tree fixes one output. Canonical ordering of -- outputs is from left to right starting at index 0, and spanning all -- remaining nodes at a particular level (i.e. first level has N -- elements, second level N-1, ...) -- I.e. for a 5 node, 1 output network -- IIOII <-> [2] -- OIIII <-> [0] -- IIIIO <-> [4] -- -- for a 5 node, 2 output network: -- -- IIOOO <-> [0,0] -- OOOII <-> [4,3] -- IOOOI <-> [0,3] -- The ordering of the EquFun list follows a canonical list of -- permutations (defined elsewhere). See equFunIndex data EquImpl = EquImplFun EquFun | EquImplSelect [EquImpl] equImplRef' :: EquImpl -> [Int] -> EquFun equImplRef' = f where f (EquImplFun fn) [] = fn f (EquImplSelect fns) (i:is) = f (fns !! i) is -- Convert list of I/O to permutation tree index. data EquIO = EquIn | EquOut deriving (Eq, Show) equIO2Ref = f [] 0 where f c n [] = c f c n (EquOut:e) = f (c ++ [n]) n e f c n (EquIn:e) = f c (n+1) e -- For debugging: using 0,1, instead of EquOut/EquIn equIO = map f where f 0 = EquOut f 1 = EquIn Entry: Node binding (equation solving) : sequential approach Date: Sun Apr 1 12:11:48 CEST 2012 It seems too much hassle to find a "family of functions" approach, especially because the result just needs to be a table. Might be simpler to just do it in-place, and define a single "fill in solver" for each primitive equation type. Still, that makes it difficult to separate the structural compilation step (network -> function) from the evaluation step. Still very confused. 
So what about this: - structural compilation (abstract evaluation) produces indices for the shared data function. - function is an array -> array map, parameterized by 2 lists of indices for input and output. Let's make an example for a linear functional. I'd like to be able to work with mutable arrays inside the implementation, so which should be the array at the interface boundary? This is determined by the type of runSTArray :: Ix i => (forall s. ST s (STArray s i e)) -> Array i e which is Array[1], for which the simplest constructor is: listArray :: (IArray a e, Ix i) => (i, i) -> [e] -> a i e After playing with direct imperative access and looping indices for a bit, it turns out that there are simpler ways. I.e. the function below solves a sum functional given a vector of coordinates and the index to update. sumA = Data.Foldable.foldr1 (+) type Nodes = Array Int Double eqSum :: [Int] -> Nodes -> Nodes eqSum [o] ia = runSTArray $ do oa <- thaw ia writeArray oa o $ (ia ! o) - sumA ia return oa But.. this doesn't take into account that the equations should have references to nodes. It seems that it's best to put the lowlevel stuff inside an ST monad and work with references. ( The excursion to arrays was actually just a roundabout way of working with references.. maybe also to not have to specificy holes by keeping them implicit. ) type Node s = STRef s (Maybe Double) foldNodes f = foldM (\accu el -> do me <- readSTRef el return $ me >>= f accu) foldNodes1 f (x:xs) = foldNodes f x xs What about this as basic datastructure: - nb of nodes to satisfy - ordered list of references to nodes - op: fold over Just components Probably an array of STRefs is easier to work with since it allows for direct indexing. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/array/Data-Array-IArray.html#t:Array Entry: Imperative programming in Haskell Date: Tue Apr 3 12:52:25 CEST 2012 So I'm kind of fed up with this inability to express imperative algorithms in Haskell. Let's go for it. 1. collection of nodes -> Set STRef 2. collection of equations bound to nodes -> [STRef] For specification we don't need to use String as node names; all can be embedded in a monad such that lexical names can be used for nodes. Ze Monad: [n1,n2,...] <- makeNodes n eq1 <- newEq [n1,n2,...] eq2 <- newEq [n1,n2,...] input n1 v1 -- (*) input n2 v2 -- (*) solve [eq1,eq2] return $ values [n1,n2, ...] MAIN IDEA: The part marked (*) is what we'd like to change easily (in the code) without having to change all the other code. Entry: Product of State and ST? Date: Tue Apr 3 13:16:26 CEST 2012 Does it actually make sense to use a product of State and the ST monad? Cant the global state go in an STRef? Anyways, I couln't quite figure out how to do this (too abstract). But I'm starting to get confused again. Is it possible to build a structure and its contents separately? Structure can be reused. I planned to not think of that but maybe it's actually simpler to keep them separate. I.e. we don't need solvers that know how to scan for undefined variables: this can be done completely generic. The output of a network compilation step is a program (which solvers to run in what sequence connected to which nodes. Entry: Interesting Recursion Patterns Date: Thu Apr 5 22:44:36 CEST 2012 I ran into an interesting tree recursion pattern where there is both a globally threaded state and a "fan-out" environment. This doesn't seem to be such an exceptional pattern though. 
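A minimal sketch of that pattern (Tree, label and go are made-up names
for this example): the ReaderT layer is the fan-out environment,
extended independently per branch with local, while the State layer is
the single counter threaded across the whole traversal.

  import Control.Monad.Reader
  import Control.Monad.State

  data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Show)

  -- Annotate every leaf with (depth, unique number, original value).
  label :: Tree a -> Tree (Int, Int, a)
  label t = evalState (runReaderT (go t) 0) 0 where
    go (Leaf a) = do
      depth <- ask                 -- environment: fans out down the tree
      n <- lift get                -- state: threaded across the traversal
      lift $ put (n + 1)
      return $ Leaf (depth, n, a)
    go (Node l r) = do
      l' <- local (+ 1) (go l)     -- extend the environment for this branch only
      r' <- local (+ 1) (go r)
      return $ Node l' r'

  -- label (Node (Leaf 'x') (Node (Leaf 'y') (Leaf 'z')))
  --   ==> Node (Leaf (1,0,'x')) (Node (Leaf (2,1,'y')) (Leaf (2,2,'z')))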
Entry: Equations for box layout
Date: Fri Apr 6 01:16:06 CEST 2012

Turns out there is a specific kind of equation that doesn't seem to be easier to implement than with a dedicated algorithm that reflects its recursive structure directly:

- sum child box sizes into one box, propagate upward
- receive 1 location and scale info and propagate downward

The algorithm is a very elegant approach using continuations to chain everything together. More later!

Entry: Why are "stateful maps" not part of standard functions?
Date: Fri Apr 6 11:35:38 CEST 2012

Meaning:

  (s,a) -> (s,b) -> [a] -> (s, [b])

Maybe this is because it's more easily handled in the state monad, in combination with forM? I'm not convinced, compare:

  test = flip runState 0 $ do
    forM [1..10] $ (\i -> do
      s <- get
      put $ s + 1
      return $ i + s)

With:

  -- foldr with output stream
  foldo :: (a -> s -> (s, b)) -> s -> [a] -> (s, [b])
  foldo f = fld where
    fld s []     = (s, [])
    fld s (a:as) = (s'', b:bs) where
      (s',  b)  = f a s
      (s'', bs) = fld s' as

  test1 = foldo (\a s -> (s+1, a+s)) 0 [1..10]

Why is 'foldo' not a standard library function? Actually, it is! [1]

  mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])

[1] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-List.html

Entry: forM
Date: Fri Apr 6 19:59:52 CEST 2012

Maybe it's simpler to just provide a forM-like abstraction for each container datastructure instead of a fold.

  forM  :: [a] -> (a -> m b) -> m [b]
  forM' :: D a -> (a -> m b) -> m (D b)

It seems simpler to use than a fold. I wonder though if this is somehow equivalent. Probably it is, but in what way? Actually, this seems to be Data.Traversable[1].

[1] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Traversable.html

Entry: Data.Traversable vs. Data.Foldable
Date: Sat Apr 7 11:51:09 CEST 2012

Follow-up from last post. From [1], a Traversable is a Foldable that is also a Functor. This can be understood by observing that Foldable will just iterate and thread a state, while Traversable will iterate, thread state, and build a new data structure. If you cancel the output datastructure you get a fold, while if you cancel the threading you get a map.

What I find strange is that all of these can seemingly be defined in terms of forM. Let's try. The missing insight seems to be that the constraint on traverse is Applicative, and not Monad. Why is this? Below, t is the container type.

  traverse :: Applicative f => (a -> f b) -> t a -> f (t b)

The question is: why is a monad not necessary? Or, why is the "monad part" of a monad that's more powerful than the "applicative part" not used? Last time I was looking into this, the take-home argument was: monads have value-dependent control flow, while arrows do not. Not sure about applicative though. The important part for the Monad version of traverse is that it imposes sequential traversal, eventually imposed by the composition of calls to (>>=). Can the same be done with applicative? Maybe my main problem is that I don't see how applicative imposes sequencing? Let's explore that in a separate thread.

BING! The missing link is sequenceA, which bridges lists of computations and the sequencing operator of an applicative functor. This makes it sort of obvious why Traversable just needs sequenceA. However it seems simpler to define `for' directly, since this can be converted to fmap by inserting the identity Applicative.
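To make the "cancel" intuition concrete, a small sketch (mirroring fmapDefault and foldMapDefault from Data.Traversable; module locations assume a reasonably recent base): instantiating traverse at the Identity applicative cancels the effect and gives back fmap, while the Const applicative cancels the output structure and gives a fold.

  import Control.Applicative (Const(..))
  import Data.Functor.Identity (Identity(..))

  -- fmap recovered from traverse: Identity is the "no effect" Applicative.
  fmapT :: Traversable t => (a -> b) -> t a -> t b
  fmapT f = runIdentity . traverse (Identity . f)

  -- foldMap recovered from traverse: Const only accumulates the monoid
  -- and throws away the rebuilt structure.
  foldMapT :: (Traversable t, Monoid m) => (a -> m) -> t a -> m
  foldMapT f = getConst . traverse (Const . f)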
[1] http://www.haskell.org/haskellwiki/Foldable_and_Traversable

Entry: Applicative and sequential operations
Date: Sat Apr 7 13:21:05 CEST 2012

Monads impose sequential operation through bind (>>=, >>) or the Kleisli composition (>=>). This translates easily to "do" notation (let* for the schemers), which makes the sequential nature obvious. However, Applicative functors also supposedly abstract side effects, though I've never really understood what these "less powerful" side effects really are. In essence, what's the real difference between Applicative and Monad?

This SO article[1] says:

  Compared to monads, applicative functors cannot run its arguments
  selectively. The side effects of all the arguments will take place.

Obviously the difference is to be found in the API, so this would be the difference between <*> and >>=. The bridge between lists and sequencing seems to be this function:

  \a as -> (:) <$> a <*> as  ::  Applicative f => f a -> f [a] -> f [a]

which can be used in a foldr to turn a list of computations [f a] into a computation that produces a list f [a]. I think I just rediscovered sequenceA for lists:

  sequenceA :: Applicative f => [f a] -> f [a]
  sequenceA cs = c where
    c = foldr push (pure []) cs
    push c cs = (:) <$> c <*> cs

That fills a gaping hole in my understanding. Life will be different from now on ;) Combining sequenceA and map then gives traverse/for.

It's interesting how sequenceA is part of Traversable, i.e. that there is no less generic version in Prelude that only works on lists. Maybe that's a good thing actually - more general from the start.

Letting this sink in for a bit, it makes perfect sense: the sequential nature of Applicative functors should really be compatible with sequentially traversing a complex data structure.

[1] http://stackoverflow.com/questions/2104446/how-do-you-use-control-applicative-to-write-cleaner-haskell
[2] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Foldable.html
[3] http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Control-Applicative.html
[4] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-List.html#5
[5] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Traversable.html

Entry: Are state space models Applicative?
Date: Sat Apr 7 13:58:40 CEST 2012

Finally discovering sequenceA (which bridges lists and the 'sequencing operator' <*> of an Applicative functor) makes me think that Applicative is probably also a better abstraction for state space models, as it's a lot easier to use than Arrow.

Entry: Left/Right confusion in mapAccum
Date: Sat Apr 7 15:18:50 CEST 2012

Basically, I used mapAccumR and my datastructure was reversed. This is kind of strange, and it doesn't seem to be explained in the docs. Have to find the source later. But the basic idea is that "left" and "right" refer to the order in which the list is traversed. The result list is always in the same order.

  mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])

  *Main> mapAccumL (\acc x -> (acc, x)) 0 [1,2,3]
  (0,[1,2,3])
  *Main> mapAccumR (\acc x -> (acc, x)) 0 [1,2,3]
  (0,[1,2,3])
  *Main> mapAccumR (\l e -> ((e:l),e)) [] [1,2,3]
  ([1,2,3],[1,2,3])
  *Main> mapAccumL (\l e -> ((e:l),e)) [] [1,2,3]
  ([3,2,1],[1,2,3])

From the last example it's clear that L starts at the beginning while R starts at the end. So as long as the accumulation doesn't depend on traversal order (e.g. the operation is associative and commutative), the result shouldn't matter.
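Side note: mapAccumL is itself just a traversal with threaded state, which is where the next entry picks up. A sketch using mtl's State (Data.Tuple.swap only adapts the argument/result order to match Data.List):

  import Control.Monad.State
  import Data.Tuple (swap)

  mapAccumL' :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumL' f acc0 xs = swap $ runState (traverse step xs) acc0
    where step x = state $ \acc -> swap (f acc x)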
Entry: foldr from traverse + monad
Date: Sat Apr 7 16:53:35 CEST 2012

Defining foldl is simple using the State monad:

  foldl f s b = s' where
    (_, s') = runState (traverse f' b) s
    f' b = modify (flip f b)

Original question: instead of "updating" state we could build a nested computation with a hole in it. Is this somehow dual? Indeed, and the approach is quite straightforward.

  foldr f s b = k' s where
    (_, k') = runState (traverse f' b) (\s -> s)
    f' b = modify (\k -> (\s -> k $ f b s))

Instead of getting a new state value by applying a function to it, it works the other way around. It takes the hole in which the final value of the computation is inserted and replaces it with a different hole, observes what is put in that hole, modifies it and puts the result in the original hole. Both approaches are dual: one updates values, the other updates holes.

Once duality pops up you find it everywhere of course:

- foldl / foldr are dual in the order they traverse and "cons" a list
- the above also works with State and RState (reverse State)
- in Data.Foldable the dual of a monoid is used to implement foldr/foldl in terms of foldMap (library/base/Data/Foldable.hs[1])

[1] http://www.haskell.org/ghc/dist/7.0.4/ghc-7.0.4-src.tar.bz2

Entry: Reverse State monad
Date: Sat Apr 7 18:09:01 CEST 2012

It uses knot tying to construct a bi-directional data flow. From [1]:

  newtype RState s a = RState { runRState :: s -> (a,s) }
  evalRState s f = fst (runRState f s)

  instance Monad (RState s) where
      return x = RState $ (,) x
      RState sf >>= f = RState $ \s ->
         let (a,s'') = sf s'
             (b,s')  = runRState (f a) s
         in  (b,s'')

  get = RState $ \s -> (s,s)
  modify f = RState $ \s -> ((),f s)
  put = modify . const

Probably best to do this instead of the last 3 hardcoded methods:

  -- Is this allowed to be in here?
  instance MonadState s (RState s) where
      get = RState $ \s -> (s,s)
      put s = RState $ \_ -> ((),s)

And also:

  instance Functor (RState s) where
      fmap = liftM
  instance Applicative (RState s) where
      pure = return
      (<*>) = ap

I don't find this in the standard libraries. Is it removed? Maybe this is now a combination of ReverseT and State? I used this monad to implement the bridge between traverse and foldr. This works as long as the hardcoded "modify" is used. Making RState part of MonadState results in an infinite list of the same element. Maybe this is just bottom, a consequence of the knot-tying interfering with the default "modify" from MonadState.

  foldr f s b = s' where
    (_, s') = runRState (traverse f' b) s
    f' b = modify (f b)

[1] http://lukepalmer.wordpress.com/2008/08/10/mindfuck-the-reverse-state-monad/

Entry: Applicative and Functor in terms of Monad
Date: Sat Apr 7 18:26:18 CEST 2012

A pain in the ass, but can it be done automatically?

  -- If we have Monad M then:
  instance Functor M where
      fmap = liftM
  instance Applicative M where
      pure = return
      (<*>) = ap

Entry: Duality: values <-> continuations
Date: Sun Apr 8 01:15:54 CEST 2012

Values and continuations are supposedly dual (for an interesting example see [1]). From [2]:

  We can think of continuations as a lack of, or request for, values.

We can see this duality in another way: values are the present's view of past computations, while continuations are the present's view of future computations. So if a function takes a value to a value, what takes a continuation to a continuation?
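One concrete (if partial) answer, as a sketch: a plain function a -> b does not act on continuations directly, but it induces a map on continuations going the other way, by pre-composition. This is the usual CPS / contravariance story, and it matches the past/future flip above: composition on values shows up reversed on the continuation side.

  -- A function on values induces a function on continuations,
  -- in the opposite direction: pre-composition.
  mapK :: (a -> b) -> ((b -> r) -> (a -> r))
  mapK f k = k . f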
[1] entry://20120407-165335
[2] http://www.google.be/url?sa=t&rct=j&q=values%20and%20continuations%20are%20duals&source=web&cd=12&ved=0CC4QFjABOAo&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.48.4255%26rep%3Drep1%26type%3Dpdf&ei=HMqAT6HPHurJ0QXC9ej9Bg&usg=AFQjCNH4Tl0WmfN-p9Gb17FvGVV60N_UAA

Entry: Applicative Functors
Date: Sun Apr 8 10:29:43 CEST 2012

As is obvious in a sort of Zen way, just look at the API to see what an abstraction does. Look at what it really means without getting lost in the story around it.

  1. Needs to be a functor (i.e. container)  :: (a -> b) -> f a -> f b
  2. It needs to support application         :: f (a -> b) -> f a -> f b

What I missed before is sequenceA:

  sequenceA :: Applicative f => [f a] -> f [a]
  sequenceA cs = c where
    c = foldr push (pure []) cs
    push c cs = (:) <$> c <*> cs

Think of this as: convert a collection of element PRODUCERS into a PRODUCER of a collection of elements. What happens above is construction using (:), but what if construction is something like: tie the output state of the first to the input of the second.

Something didn't yet click... But one of the key elements is sequenceA. Trying the original paper again[1]. Some observations from [1]:

* These all start from pure functions in the examples (a pure function applied to funny arguments). However, after the first <$>, the result is no longer a pure function, but a collection of partially applied functions. The ability to store such a collection is just a property of a Functor, i.e. up to here we just used fmap = <$>. After that, once we have a *collection* of functions, to then further apply it to elements that are also in collections requires <*>.

* A reason to define traverse directly is that the usual definition as sequence . map will traverse the structure twice (if the compiler can't optimize this out, that is..)

* The Monoid / phantom Applicative relation seems interesting but doesn't quite snap into place for me.. Try later.

* Difference between Applicative and Monad: for a Monad the value of one computation can influence the second, while for Applicative the structure of a computation is fixed. Moral:

    If you've got Applicative, that's good
    If you've got Monad, that's even better!
    If you need Monad, that's good
    If you need only Applicative, that's even better!

* Applicative functors are composable.

* Monoidal = symmetric interface for Applicative:

    unit :: f ()
    <,>  :: f a -> f b -> f (a,b)

  This separates the "combination" from the "computation", which can be done by fmap, i.e. :: (a,b) -> c. Which illustrates an important point: it's the combination of two values into one that allows sequentiality to emerge.

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Experimenting with Applicative
Date: Sun Apr 8 11:18:39 CEST 2012

EDIT: Probably mostly meaningless..

Let's play with this a bit. Take the Functor (s ->). A functor value F o is then (s -> o), i.e. an output where state is abstracted. The <*> for the Applicative version of this would chain together

  (s -> (i -> o))   (s -> i)   to   (s -> o)

This is actually the environment Applicative from [1]. Can this implement state machines? Doesn't look like it: 's' can't depend on the values of the i->o or i types, because those types are completely general (container!) and thus not accessible to the implementation of <*>. What could happen though is a fixed update s->s that's applied on every <*>, i.e. the increment of a counter. Let's construct an example of "just" threading state.
  import Control.Applicative

  data Thread s a = Thread (s -> s) a
  runThread (Thread inc a) = (a, inc 0)

  appThread :: Thread s (a -> b) -> Thread s a -> Thread s b
  appThread (Thread cf f) (Thread ca a) = Thread cb b where
    b  = f a
    cb = ca . cf

  instance Applicative (Thread s) where
    (<*>)  = appThread
    pure a = Thread (\n -> n) a

  instance Functor (Thread s) where
    fmap f = (pure f <*>)

An example of this would be a counter, which counts the number of elements used in a computation.

  makeCount x = Thread (\n -> n + 1) x

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Monoidal
Date: Sun Apr 8 11:55:42 CEST 2012

Called Monoidal in [1] but doesn't seem to be used in this way in general. Basic point: define Applicative in a more orthogonal way by extending the properties of a Functor by:

  unit :: f ()
  <,>  :: f a -> f b -> f (a,b)

where f (a,b) -> f c can just be handled by fmap.

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Stateful iteration (for / forM) or unfold?
Date: Fri Apr 13 14:29:47 CEST 2012

In writing the layout engine, I run into a dilemma when writing loops. Either I use a classical approach (for / forM) where all side effects can be kept in a Monad or Applicative instance, or I work with a combination of map/zip and unfold, where the "stateful" part is constructed separately.

The conclusion I tend to draw is that it doesn't matter much. Sometimes it's just useful to perform unfold before map to have some decoupling. Other times it's simpler to combine iteration and state.. But if you can do one, you can of course do the other..

Entry: Just update state.
Date: Mon Apr 16 08:19:57 EDT 2012

What's the simplest way to have a collection of mutable variables? I'm writing an animation state update routine and the number of parameters is getting quite large, so it becomes hard to do something like:

  update (Params a b c ...) = Params a' b' c' where
    a' = ...
    b' = ...
    c' = ...

also because not all parameters are updated at each step. Maybe this is a symptom of bad factoring, but frankly I don't have the time right now to look into it too deeply, so I wonder how to just have "no fuss mutable state". From memory this seems to be the ST monad and STRef, which are like IO and IORef, except that they can be executed like this:

1. create a bunch of STRefs from immutable data (= thaw)
2. perform mutable computations on the STRefs
3. copy the contents of the STRefs back to immutable data (= freeze)

Entry: Monad transformers
Date: Mon Apr 16 11:24:20 EDT 2012

Question: I know (intuitively) I want to combine 2 monads. How do I know which order they go in? For some the order doesn't matter, for others it does. Practically, I want to combine state and reader. So I just used this:

  runSR m = evalState (runReaderT m r0) s0

and the nice surprise is that operations are automatically lifted: even if state is the inner monad, get/put/modify "just work". See also "Monad Transformers Step by Step" [1].

To find out whether monads commute, it's probably best to just write out the types (without wrappers) and verify manually. An example of a pair of transformers that does not commute is MaybeT and StateT[2].

[1] http://www.cs.virginia.edu/~wh5a/personal/Transformers.pdf
[2] http://en.wikipedia.org/wiki/Monad_transformer

Entry: Updateable state by name
Date: Wed Apr 18 15:38:12 EDT 2012

Sometimes mutable state combined with a hierarchical namespace is really what you want. How can the following be handled in Haskell?

  hierarchical.namespace.variable = 123

I guess the variable would be an STRef.
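A minimal sketch of what that could look like (all names made up), using a flat Map keyed by the dotted path, with one STRef per variable:

  import Control.Monad.ST
  import Data.STRef
  import qualified Data.Map as Map

  -- one STRef per variable, keyed by its path in the namespace
  type Namespace s = Map.Map [String] (STRef s Double)

  -- "hierarchical.namespace.variable = 123", then read it back.
  demo :: Double
  demo = runST $ do
    ref <- newSTRef 0
    let ns = Map.fromList [(["hierarchical","namespace","variable"], ref)]
    case Map.lookup ["hierarchical","namespace","variable"] ns of
      Just r  -> writeSTRef r 123
      Nothing -> return ()
    readSTRef ref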
The hierarchy itself can then be implemented with whatever data structure is suitable.

Entry: Maybe and monoid
Date: Wed Apr 18 16:14:07 EDT 2012

Is this function meaningful in general?

  f :: Monoid m => Maybe m -> m
  f Nothing  = mempty
  f (Just m) = m

( This is just fromMaybe mempty, or Data.Foldable's fold at the Maybe instance. )

Entry: Real-world programming in Haskell
Date: Sun Apr 22 09:38:40 EDT 2012

A beginner's critique on using Haskell for real-world programming (i.e. not just writing a compiler or other data -> data converter).

Haskell is great for refactoring code in small increments. However, introducing state is often a fairly large structural change. I found this to be a non-trivial learning step. Contrary to what is often advertised, adding state to a functional algorithm isn't easy: it's a large syntactical change touching a lot of code. Once the framework of state is in place (Monad or Applicative or Arrow ...), THEN it becomes easy to pick-and-choose the side effect. However, going from pure to side-effect isn't always straightforward.

The pattern I've seen is that state is usually necessary to make the leaf nodes of a tree processing algorithm share some information. In an imperative language this is a no-brainer. In a pure functional language this is a truly big change which has to be anticipated, i.e. better always think that state is going to be part of the picture.

As seen on HN[1]:

  "Functional programming" in the real world isn't about dogmatically
  eliminating mutable state. It's about managing it. It's about
  modularity. As always, abstraction is your friend.

The basic idea in pure functional programming seems to be that "recursion is a low-level operation". You should always abstract recursion behind an operator. This way, moving from pure to side-effects is a lot easier. One way I've found useful to do this is by using Traversable (or Foldable) instances. Traversable gives you a "map with side-effects", i.e. the good old for loop with local loop state. Once a for loop is present, ad-hoc, local monad contraptions can be used to perform any kind of side-effect during traversal. This is a nice approach, but requires some getting used to.

The big revelation while working with this kind of approach is that making all connections explicit shows quite clearly how imperative programming often relies on "arbitrary connections" hidden in access to global state, breaking modularity.

[1] http://news.ycombinator.com/item?id=3858698

Entry: Compiling FRP to state machines
Date: Mon Apr 23 13:22:07 EDT 2012

FRP (playing with functions of time) seems to be a nice way to construct any kind of "choreography", be it animation, music, user interface, ... State machines are a de-facto way of representing reactive systems in a low-level, low-resource form. Is there a way to connect both? Maybe it's time to read Conal's paper[1] again..

[1] http://www.delicious.com/redirect?url=http%3A//conal.net/papers/icfp97/

Entry: Resist structure
Date: Mon Apr 23 15:01:25 EDT 2012

Existing structure that can't reflect a new requirement is always the cause of development slowing down. Is there a way to avoid it altogether? Does that make sense in any way? Probably not. What's the closest point that actually makes sense? Make structure as general as possible? For every constraint (== structure) provide maximum flexibility == keep things orthogonal.

Entry: Database (push vs. pull)
Date: Mon Apr 23 15:28:45 EDT 2012

Whenever a program becomes highly parameterized (i.e. text layout) it seems best to structure the parameter store as a database, i.e.
instead of using explicit data structures and pushing information to the correct place, it seems simplest to push only a DB reference, and let the app query the necessary parameter.

Entry: Params with defaults
Date: Wed Apr 25 09:47:50 EDT 2012

Find a different approach to do this:

  alpha = execState (forM attrs scan) False where
    scan Alpha = put True
    scan _     = return ()

which sets a default and overwrites it with the last matching attribute in the list (if any). Trouble: this seems to require a default case (scan _). Is it possible to do it in a different way such that this default case can be abstracted? I.e. can mismatches be mapped to nothing instead of raising an error? ( One option: map each attribute to a Maybe and combine with the Last monoid, so the default only appears once, in a fromMaybe. )

Entry: Forth Direct vs. Indirect threading?
Date: Mon May 14 09:18:05 EDT 2012

I forget.. What is the difference again? IIRC, indirect threading is easier to implement in C... Starting out with just straight-line code, it might be simplest to forget about code fields altogether. So, what is pointed to by IP? It's essentially an opcode for a VM.

Again: what's the simplest threading mechanism to use to implement Forth in C? I think it's called switch threading, where the main loop is something like:

  switch(*ip++) {
  case CMD_EXIT: ... ;
  case CMD_LIT:  ... ;
  }

Disadvantage: this needs a "call" opcode for composite words. So it's essentially one layer on top of subroutine threading (ST for a stack VM). Probably, CALL can use half of the address space. Also, this needs both CALL and JUMP for tail calls + RETURN for primitive words. Is it really simpler?

What about making the instruction stream abstract? Calling a word == pushing a new instruction stream, and popping it at the end (EXIT).

Entry: Representing self-delimiting numbers
Date: Mon May 14 17:36:51 EDT 2012

Start out: one byte = 8 bits = 256 values. To make this self-delimiting, some of the values can be used as extensions. Using one bit is simple, i.e. MIDI-style, but 2^7 is an awkward number.

Entry: Model-View-Controller vs. functional GUI
Date: Mon Dec 24 21:11:31 EST 2012

I'm trying to find out why writing GUIs is such a pain, and whether it is possible to simplify the approach using some functional programming tricks. MVC[2] might make sense in the OO world, but why all this state? The problem is really simple: a GUI is an animation that responds to user input:

  frame0 -> input -> frame1 -> ...

Each frame represents a different interpreter. There's a loud voice in my head screaming FRP[1], but I don't see immediately how everything links up... There should be a way to relate the idea of a cursor into a model tree (zipper in a model) to a cursor into a gui tree. Can both be the same? Is there a way to relate model and gui in a more direct way such that the interface emerges somehow "automatically"?

I believe that MVC as used in the OO world often has a not-so-clear separation between concerns, but the basic idea is very simple (see figure in [2]):

  View is generated from Model
  Controller updates Model

Here, the model is usually fairly obvious. Generation of View from Model should be a no-brainer (projection to gui parameters followed by injection into the drawing system). Updates from the Controller to the model should also be simple. However, the tricky bit is how View and Controller are tied to each other. I.e. a mouse click on a canvas probably needs to be dispatched depending on how exactly it is rendered by the model. In my own use, I've always found the V/C distinction to be quite arbitrary.
Maybe I just don't understand, but there seems to be something missing: something that resembles the "physical". Maybe there should be an intermediate point? Abstract Model - Physical Model Where interactions to the physical model are simple physical events, like one would expect from a game engine, and the constraints of the physical model are directly mapped to the constraints of the abstract model. What about this: define 2 maps a -> p, p -> a, relating the abstract and physical models in such a way that changes to the physical model that are not consistent do not map back to the same representation. What I describe above is almost exactly this[4]: One of mind-opening ideas behind Functional Reactive Programming is to have an event handling function producing BOTH reaction to events AND the next event handling function. Thus an evolving system is represented as a sequence of event handling functions. Reactive Banana[5] sounds interesting. [1] http://en.wikipedia.org/wiki/Functional_reactive_programming [2] http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller [3] http://broadcast.oreilly.com/2008/10/mvc-as-anti-pattern.html [4] http://stackoverflow.com/questions/2672791/is-functional-gui-programming-possible [5] http://www.haskell.org/haskellwiki/Reactive-banana Entry: Problem with FP Date: Mon Dec 24 21:35:02 EST 2012 Is that it is too easy to fall into the state trap by just hiding it in "extended datastructures", i.e. a pure structure "with some metadata attached", which changes its meaning. Hard to explain.. Basically whay I want to say is that FP design is more about finding data representations that do not require encoding of exceptional or intermediate situations, or at least to make them more explicit so they stand out more. Maybe what I'm trying to say is that things that are solved in OO programs by setting a variable, are solved in FP programs by adding another constructor to create a non-orthogonal concept: "what I would like it to be + some ugly diff with reality". Entry: ping-pong layouter in C Date: Tue Dec 25 23:16:52 EST 2012 I have some Haskell code that performs a hierarchical box layout as: stacker: box1, how big are you box1: I'm .... (possibly recursively determined) stacker: box2, how big are you ... stacker: parent, I'm this big parent: ok, go here, and shrink/grow stacker: box1, go here and s/g... box1: here's my list of allocated children stacker: box2, go here and s/g ... box2: here's my list of allocated children stacker: parent, here's my list of allocated children ... See function boxLayout in [1]. This is essentially a 2-pass algorithm with a bottom-up and a top-down information flow. In Haskell this is implemented as something like this layout :: abstractbox -> (boundingbox, (location, stretchedbox) -> children) Basically, an abstractbox is mapped to a boundingbox (determined by calling this kind of function recursively on contained children) and a continuation. The continuation is passed the final location and dimension of the box as determined by the parent after gathering bounding box information. The call to this continuation then recurses down the chain a second time to place children and return the result upstream again. This is a lot more elegant that implementing a similar approach using mutable state, since it requires only local reasoning, and no manual flow control. I would like to do the same thing in C. I can use a GC so I could create closures manually. The GC is limited however: just a CONS cell memory. 
A closure in C could then be abstracted as: closure(env, arg1, arg2, ...) where env is a data structure built on top of CONS cells. The question is then: where does the magic come from? Is it merely in the GC - i.e. not having to keep track of structures that have been used up? So, is it worth using a 3rd party collector (I don't want the Boehm monster) or can the simple CELL collector in libprim be used without too much notation overhead? Alternatively, for a simple algo like this it might be good enough to keep track of the intermediate strucutres and free them once per iteration. The data should be fairly contained. That might be easier since proper C structs can be used directly. Also, it might be possible to do the layout part off-line and generate just the (static) event processor chain. [1] http://zwizwa.be/darcs/hgl/Box.hs [2] http://code.google.com/p/dpgc/ Entry: Ad-hoc polymorphism is higher order abstract syntax Date: Wed Jan 23 13:52:49 CET 2013 It seems that Haskell type classes, implemented at run time using extra "implementation" parameter passing, is sufficient to implement any kind of higher order syntax representation, to the point that they are somewhat equivalent. Entry: Promises Date: Sun Feb 17 20:46:58 CET 2013 Also mentioned in [1]. Looks like promises are means of constructing DAGs, i.e. result events in terms of (pure) operations on intermediate events, but they also have a sort of "Maybe monad" built in, in that breaking of promises can propagate, just like proper result values. The promises are based on the idea of Futures, which Racket has an implementation of[2]. [1] http://www.youtube.com/watch?v=b0EF0VTs9Dc [2] http://docs.racket-lang.org/reference/futures.html Entry: Love the Lambda Date: Wed Feb 20 15:12:22 CET 2013 ( From email correspondence. ) Teaching this stuff is not easy. I applaud the brave ones that attempt it. I remember struggling very hard to understand the one-argument function thing when I was learning OCaml (similar to F#). It's hard to track back what actually changed in my understanding, but it has something to do with changing "basic substrate" in how I think about programming. My understanding at this point is that *pure* functional programming is fundamentally different from imperative programming. I find the "you already know X" approach in a lot of teaching to be a bit misleading. The problem is in the reliance on weird ways of combining higher order functions. Because lambda in essence is actually quite simple once it becomes natural, all the difficulty is in building these intricate combination patterns out of different types of functions. I've been discussing with Antti a bit that "lambda is low-level". It is somewhat of a machine language in pure FP in that it is the only thing that can be used to build any kind of program structure. Anything else builds on top of it. For me it took several attempts spread over about 2 years before the a -> b -> c thing started to make sense. Then another 2 years to stop being afraid of higher order functions (functions taking function as input) and type constructors entering the picture, e.g. map :: ( a -> b ) -> ( List a -> List b ) What made me understand this, is to write programs, get deeply confused and suddenly end up with a new way of looking at things. I remember that starting with Scheme, I found "map" to be such a revelation. One way of looking at it is that everything else is sort of a generalization of "map". 
I'm thinking that teaching the use of, and then the implementation of, map might be a sort of optimal point in teaching functional programming. Nobody cares about dorky factorials, but lists are intuitively obvious. Discussing map ties into functions, recursive data structures, anonymous functions (lambda), higher order functions and recursion.

At this point I think that pure FP

1. is really not rocket science, the underlying ideas are fairly simple, but
2. is *very* different from thinking sequentially and it takes some getting used to.

The big change is to wrap your head around thinking in data-flow and "composable things" instead of state and sequential manipulation of state. Basically, I think that there is really no shortcut in learning this, but it is definitely learnable if you have a basic understanding of programming. Pure FP is a new "computer architecture" that requires a different way of thinking. But there are ways to make the transition from imperative to pure FP a bit less painful. My journey was

  C -> Perl -> Python -> Scheme -> OCaml -> Haskell

Each transition required a little shift in thinking. There are probably other ways tailored to one's specific experience.

Love the lambda ;)

Entry: Don't write a one-pass compiler
Date: Fri May 10 19:20:44 EDT 2013

One of those non-obvious obvious things.. It's often hard to do in one pass because of a global->local information flow. At least a pass "leaving holes" is necessary, i.e. a lazy approach. That is, unless you're using a lazy language, in which case some magic might be possible, performing in one pass what ordinary mortals need to do in two.

Entry: Computers for Cynics - Ted Nelson
Date: Wed May 29 18:43:08 EDT 2013

Very funny.

[0] http://www.youtube.com/watch?v=KdnGPQaICjk
[1] http://www.youtube.com/watch?v=Qfai5reVrck
[2] http://www.youtube.com/watch?v=c6SUOeAqOjU
[3] http://www.youtube.com/watch?v=bhzD2FKEEds
[4] http://www.youtube.com/watch?v=_xL19f48m9U
[5] http://www.youtube.com/watch?v=_9PmIkAYhI0
[6] http://www.youtube.com/watch?v=gWDPhEvKuRY

Entry: State Machines
Date: Fri Aug 2 11:36:00 EDT 2013

Writing a deeply embedded program as a collection of state machines removes the need for an RTOS. This can be good, because an RTOS by itself is a big requirement, and threads are usually not very efficient. Especially so if a proper factorization of a problem would contain many different threads.

So, I've been thinking about writing a (simple) Racket system that combines ideas from

- Antti's Bream
- Tom Hawkins' Atom
- MyHDL

Basically, tail-recursive Scheme code (without any real recursion or re-entry) is a relatively high-level way of writing state machines, as it solves the variable binding problem: Scheme's lexical scope avoids a global state structure with lingering "don't care" variables as would be the case in an explicit C implementation. Such a Scheme program can be turned "inside-out" to yield a C implementation with all blocking points turned into exits + state entry (condition==true is mapped to state change). An optimized implementation could use register re-use. So a simple, no-frills approach could work well. ( What I'm describing here is probably 90% of what is in an existing good C compiler for small microcontrollers. )

So what's the concrete problem I want to solve? Build a translator from shallow (non-properly-recursive) tail-recursive blocking Scheme code to non-blocking state machines with a condition abstraction.
Write it the way a microcontroller works: - allow for polling operation: do state transition when READY flag is set. - allow for interrupt: avoid polling, just run the update when an interrupt occurs, notifying a certain flag is set. Looking at VHDL's sensitivity lists: these seem to be there only for simulation, i.e. to update the state of a process whenever an input changes. Synthesized logic will do whatever its circuit does. Essentially this is the bridge between a physical system (the circuit) and a simplified model: an event-driven digital system. Entry: MyHDL Date: Sat Aug 3 16:45:00 EDT 2013 Looked at MyHDL and found out that my initial understanding was wrong: it does not use generators to implement state machines (FSMs). It uses generators to implement events, i.e. signal change wait points. Values returned by a generator are used by the MyHDL scheduler to wake up simulations. The reason has probably to do with synthesizability. I'm not sure exactly how that works in MyHDL, but it seems that an *implicit* FSM in the form of a generator is too opaque to recover state. This is an essential part of the abstract-syntax-oriented approach I'm proposing: access to syntax is essential because a syntacting transformation is necessary to map yield/wait syntax to state machines. Entry: State machine generator Date: Sun Aug 4 10:38:00 EDT 2013 * Convert MyHDL-style yield/wait syntax to explicit state machine + wakup list. * Implmenent MyHDL-style event simulator Entry: Suspend / Resume syntax translation Date: Tue Aug 6 19:56:00 EDT 2013 Essentially, a yield point in a piece of code needs to: - capture all variables visible at exit - re-instate them at re-entry Instead of re-instating variables (C implementation), it seems simpler to just replace variables with object references. However, the C optimizer can eliminate re-instantiation of variables that are not used. Still, re-using storage space is not something that happens this way. How to go about that? It requires some form of register-allocation algorithm on the variables visible from different states. The trade-off is that copying is wasteful, but indirect accesses are more expensive. Keeping all state in an indirect object makes task switching very fast, and maybe that is what we should optimize for when aiming at a design that can support high task count. A dumb direct approach is to prefix all variables with level names that unambiguously encode the position of the variable in a scope nesting. E.g. l_1_2_ is the variable in the second level 2 block nested in the first level 1 block. An additional optimization step could use this encoding to perform sharing, i.e. l_1_ and l_2 can be shared, since they are never visible at the same time. In a first iteration, we could stick to a single function body, i.e. no nesting. Later, nesting can be implemented using inlining. Is this all there is? - Functionality: flatten variable scope into a structure, using block coordinate name prefixing. - Optimize: - Leave "temp" variables on the stack, i.e. those that do not cross a yield point. - Share storage space between mutually exclusive block scopes. Entry: State machine translator - how to start? Date: Sun Aug 11 10:20:05 EDT 2013 This is an exercise in dealing with control flow. What are the necessary elements? - function inlining: any (higher order) function call that contains a yield note - variable binding analysis. - function body rewrite bridging yield statements. It's hard to get a good overview, so maybe best to start writing code. 
Approach: - Abstract interpretation of Haskell Language.C Entry: Tasks vs state machines Date: Fri Aug 16 21:22:10 EDT 2013 I wonder if it is really just about the switching mechanism: - task: switch stack pointer, registers and other global/CPU state - state machine: switch current object i.e. from a system's perspective, the difference is in task switching speed. If nothing has to be copied, this can be quite fast. However, the cost is amortized through slower indirect access. What non-preemptive tasks/state machines (let's consider them equivalent) have in common is a better handle on where the data actually is: no time is wasted in "caching" per-thread state. Entry: Salea Logic in Racket? Date: Sun Aug 25 12:27:40 EDT 2013 Maybe that's something to try out the design of the state machine generator. I.e. bootstrap it in Scheme macros first, then see what can be done at the C language level. An interesting avenue there is to make a translator to Staapl, i.e. work on a way to do abstract state machine specification that gets compiled down to fixed memory addresses. There are two interesting problems: - Folding code "inside out" at yield points. - Optimizing data / state allocation. Entry: Objects are names for things you don't control Date: Sun Sep 8 18:21:40 EDT 2013 Basic idea: when data is just data, don't turn it into an object. Work with "dumb data" as much as possible. The border between the two seems to be: " IS THE THING MADE ENTIRELY OF BITS IN MEMORY ? " Objects are things that encapsulate state, i.e. a printer is an object. It contains paper, not something you can influence as a programmer. However, a document is not an object, it's a data structure. Other ways f putting it: - self-contained pieces of information should be values, not objects. - using an object to representi a value sequence is just an optimization - replace objects with processes (another manifestation of sequences). Entry: Iterator Blocks Date: Sat Sep 21 23:57:45 EDT 2013 The task/SM translation I was thinking about is the same as "iterator blocks" mention here [1]: Iterator blocks allow to have both advantages at the same time: - Their code looks pretty much the same as with internal iteration - The compiler transforms this code into a class/object/type implementing the interface of an external iterator - The generated iterator object can often be implemented very efficiently via a simple FSM This[2] aso mentions "yield" in C# being converted to a state machine. Another one[3]. Looks like the C# semantics is same/similar to python's: only yield in main method. So looking again at python generators: a 'def' creates a generator when the body has a yield operation in it. A yield operation in a called function doesn't work as expected: control nesting can't be implemented in function calls: all control nesting needs to be local in the definition. [1] http://michaelwoerister.github.io/2013/07/26/Iterator-Blocks.html [2] http://www.marshut.com/nxyuu/the-future-of-iterators-in-rust.html [3] http://blogs.msdn.com/b/shawnhar/archive/2010/10/01/iterator-state-machines.aspx Entry: Loop transformation algebra Date: Fri Dec 13 14:37:47 EST 2013 Is there a way to separate a canonical representation of loop operations from its implementation as a management problem for intermediate storage? Entry: Good Tech Blogs Date: Sun Dec 29 20:26:23 EST 2013 https://gist.github.com/jvns/8172943 Entry: Data Direction and Control Flow: 2 x 2 = 4 Date: Sat Mar 1 00:51:13 CET 2014 A little arcane, but quite fun. 
Do you push or pull data? It depends. Some types, using dataflow parameters (as in Oz, easily poor-man-modeled using C++ references or C pointers):

- sink      : write(from x)
- source    : read(to x)
- operation : process(to y, from x)

These can be neatly composed:

  sink * op   = sink
  op * source = source
  op * op     = op

Now, what I often forget is that these have duals. There's a thing that "puts something in a sink", and a thing that "pulls something from a source". In practice, what are these co-objects (anti-objects)? If sink, source, and operations are models of data processing (push, pull and flow), the co-objects correspond somehow to physical ports, or the operating system transferring control flow to a program when an event occurs.

A co-sink is something that writes into a sink. Note that a co-sink is not a source! The asymmetry is the caller/callee relation.

         | caller    callee
  -------+--------------------
  sink   | sends     receives
  source | receives  sends

A process is caller for both send and receive. Then there's a missing 4th case: the buffer, which is callee for both send and receive.

It seems that "push" programming (sink-oriented programming) is the most natural, as it has physical time coinciding with execution on a CPU. So is "pull" programming (function evaluation) then only a model? Is the concept of evaluation just upside down?

Entry: Dynamic typing / eval and polymorphism
Date: Sun May 25 10:18:08 EDT 2014

Trying to fit a square peg into a round hole: designing a DB schema for work, for a model that is partly OO, i.e. it has some data-to-type mapping. If I understand correctly, this kind of value-to-type relation is not something that ordinarily works in an RDB, but it is possible to emulate it with conditional functions.

Entry: Qt Pyside layoutChanged recursion
Date: Tue Jun 17 13:15:43 EDT 2014

I can't help but think that cross-connecting view updates should just work. It doesn't. How does a human programmer solve this issue in Qt? It must pop up a lot. Simply redraw everything? Maybe have a look at QML[1].

[1] http://en.wikipedia.org/wiki/QML

Entry: A rust project: blog database!
Date: Thu Jun 19 00:59:10 EDT 2014

Something that needs speed to be useful is a word-based index into a body of text, and yes I do have a body of text!

Entry: State machine compiler
Date: Fri Jun 27 09:55:47 EDT 2014

Some ideas:

- In deeply embedded applications, there is no dynamic creation of state machines. This is important for architectures such as PIC where memory indirection is very expensive as compared to flat memory access. Optimize for static state machines.

- Language-wise, there are a couple of levels:
  - pure functions + recursion
  - blocking imperative procedures: explicit dynamic yield/suspend
  - non-blocking imperative event handlers (e.g. object or case statement)

- The most useful transformation is that from blocking procedures to non-blocking event-handler state machines. The essential operation is to capture the current environment into an object (see the sketch below).

- To tackle this problem: start with a scheme compiler, and perform the continuation capture operation at yield[6]. This needs:
  - lambda
  - apply
  - begin (imperative sequencing)

I started working on this before. Where's that code?
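A minimal sketch of the "capture the environment into an object" step (types and names are made up): a blocking procedure becomes a value that either finishes, or yields an output together with a resume function whose closure holds exactly the live variables.

  -- A suspended computation: Done, or an output plus a resume
  -- continuation.  The closure in Yield *is* the captured environment.
  data Task i o r = Done r
                  | Yield o (i -> Task i o r)

  -- Blocking-style code: ask for two inputs, then finish with their
  -- sum.  The first input survives the second yield because it is
  -- captured in the closure.
  addTwo :: Task Int String Int
  addTwo = Yield "first?"  (\a ->
           Yield "second?" (\b ->
           Done (a + b)))

  -- A trivial driver: feed a list of inputs, collect outputs.
  run :: [i] -> Task i o r -> ([o], Maybe r)
  run _      (Done r)    = ([],  Just r)
  run []     (Yield o _) = ([o], Nothing)
  run (i:is) (Yield o k) = let (os, r) = run is (k i) in (o:os, r)

Compiling this down to C is then a defunctionalization step: each Yield becomes a state number plus a struct holding the captured variables, which is exactly the non-blocking event-handler form.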
[1] entry://20130811-102005 [2] entry://20130804-103800 [3] entry://20130802-113600 [4] entry://20130816-212210 [5] entry://20130806-195600 Entry: CoArbitary Date: Thu Aug 14 19:39:15 EDT 2014 "the CoArbitrary class continues to confuse me"[1] To make an arbitrary function a -> b, make a generator for b based on a generator for b and some "shuffling" applied through the value of a. Why is there a 0 in the following list? *Main> shrink [1,2,3] [[],[2,3],[1,3],[1,2],[0,2,3],[1,0,3],[1,1,3],[1,2,0],[1,2,2]] because shrink 1 produces 0. [1] http://www.reddit.com/r/programming/comments/1mcu8/roll_your_own_window_manager_haskell_and/c1md04 Entry: The mess we're in Date: Sat Sep 20 16:43:53 CEST 2014 Joe Armstrong's condenser. - abolish names and places [1] https://www.youtube.com/watch?v=lKXe3HUG2l4 Entry: State machines Date: Sun Sep 28 18:56:37 CEST 2014 For testability, the important part is non-divergence, meaning that the effective state space / input space is rather small. For system design, the reason to pick state machines is synchronicity: i.e. design with *GEARS*. Somewhere in there is a simple formalism that allows reduction of complexity of state machines, making them verifiable, while at the same time providing a better language syntax to specity "gear" relations. Entry: LLVM haskell Date: Sun Sep 28 21:39:45 CEST 2014 [1] http://www.stephendiehl.com/llvm/ Entry: Mirage Date: Mon Oct 6 01:06:55 CEST 2014 This is truly amazing! [1] http://www.infoq.com/presentations/mirage-os Entry: Static actors Date: Fri Jan 2 17:18:53 EST 2015 So one would use actors to ensure robustness. According to Joe Armstrong, you need at least 2 machines to have robustness so concurrency is an essential element. Splitting tasks in supervisors and "happy path" application code allows one to not handle errors locally: just let it fail and let the supervisor restart = separation of concerns. Interesting, but a bit resource intensive. I wonder if it's possble to find subsets of this where the implementation is actually done by a static set of state machines executing with static scheduling, and statically known message queue sizes or getting rid of mailboxes altogether (i.e. size = 1: just a variable to pass to another state machine). So what are the sets of constraints that need to be verified to make actors reduce to static state machines with static or at least predictable scheduling? One particular element would be the need for a hidden "clock" property, meaning that there is a concept of logical time that would create equivalence between messages. Essentially, for each kind of event, there is a DAG that computes (input,state) -> (output, state). I.e. the computation is finite. For any kind of event, the response of all actors is to block waiting for a new message. All feedback should either be through internal state, or externally to the system (e.g. the real world). I actually have code for this. Maybe revive it? Entry: folds Date: Sat Jul 11 22:08:20 EDT 2015 If sequences are best exposed as folds, how do you reprsent a map over a fold? Basically, I want the fold itself as a variable, not a function call. Entry: Reactive Programming Date: Sun Jul 12 23:50:11 EDT 2015 Note related to [1], just primed.. A reactive program creates a dataflow graph. So focus on that graph and its evaluation. If a network is static, all code can be compiled to push mode only. For every event there is a static path of updates to be evaluated. 
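A minimal sketch of that idea (names made up): with a static two-node network the graph disappears at run time, and each event kind compiles to a fixed update path.

  -- One source value and one derived value.
  data Model = Model { slider :: Int, label :: String } deriving Show

  -- The dependency label <- slider is static, so the push path for a
  -- slider event is just a function; no dataflow graph is interpreted.
  pushSlider :: Int -> Model -> Model
  pushSlider x m = m { slider = x, label = "value: " ++ show x }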
[1] http://research.microsoft.com/apps/pubs/default.aspx?id=158828 Entry: The How and Why of Fitting Things Together - Joe Armstrong Date: Fri Jul 17 23:28:58 EDT 2015 Make everything look like an Erlang process: turn N^2 into N! [1] https://www.youtube.com/watch?v=ed7A7r6DBsM Entry: LING: Erlang on bare metal Date: Sat Jul 18 15:15:40 EDT 2015 "Linux is just a snowball of drivers." Basically, nobody cares about the OS; it's just that this particular one seems to have the right drivers. [1] https://www.youtube.com/watch?v=GIzTxuXvpxM Entry: Simplicity Date: Wed Aug 12 17:33:10 EDT 2015 There is another reason to "optimize" code: simplicity. Things are getting so ridiculously unreliable that it seems time to start over: toss out the OS. Entry: Two-object state? Date: Sat Sep 5 15:30:37 CEST 2015 Erlang's imperative part basically is message passing. I've noticed this pattern: A two-process combo where one implements the control flow, and the other implements memory access through RPC. What other side effects can be implemented that way? EDIT: Really? "Our alternative to a monad transformer stack is the single monad, for the coroutine-like communication of a client with its handler." [1] So this is the Eff Monad. I did run into that name before but don't recall the concept. Maybe it was the Eff language base on algebraic effects[2]. And also PureScript[3]. [1] http://okmij.org/ftp/Haskell/extensible/index.html#introduction [2] http://www.eff-lang.org/ [3] http://www.purescript.org/learn/eff/ Entry: CPS in Javascript Date: Sat Sep 5 21:08:59 CEST 2015 http://matt.might.net/articles/by-example-continuation-passing-style/ Entry: Ownership is Theft: Experiences Building an Embedded OS in Rust Date: Sat Oct 3 12:35:51 EDT 2015 However, embedded platforms are highly event-based, and Rust’s memory safety mechanisms largely presume threads http://amitlevy.com/papers/tock-plos2015.pdf Entry: Sequences as folds. Date: Fri Feb 12 11:45:16 EST 2016 Quick remark. Representing sequences as folds quickly gets into Arrow territory. e.g. trying to pass things "on the side" because the iteration scheme is fixed, and the list structure is gone. ( Formalize more? ) Entry: LL,LR polish,reverese polish Date: Tue Feb 23 18:14:34 EST 2016 http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html Entry: USB and acks Date: Fri Feb 26 10:27:06 EST 2016 Thinking about how to make a reliable packet transport mechanism over an unreliable but ordered transport, using something like usb: - packet + checksum - ack - 1-bit sequence number to drop duplicates Entry: Sequences vs. folds Date: Tue Mar 22 13:54:58 EDT 2016 For a project, I'm using abstract folds to represent iteration over possibly infinite sequences. It meshes well with the actor model: tail recursion, processing an ordered message stream. However, it is interesting to see (or re-discover) the distinction between the two views. With some hand waving, it seems that I am rarely interested in the sequence iteself. The point is almost always a reduction of the sequence into some form of object that represents a property of the sequence. 99% of the time, that is best represented as a left fold in a strict language. Entry: A state machine language Date: Thu Mar 24 11:27:57 EDT 2016 It would be interesting to build a front end around a state machine model I've been using recently. See zwizwa.be/git/sm The core element is an event queue. An ordered list. Lock-free structures are currently not needed. There are a lot of possible ways to implement this. 
What is the canonical one? A priority queue: - remove max (min) - insert This[1] suggests a binary heap implementation. [1] http://algs4.cs.princeton.edu/24pq/ Entry: CRDT: eventual consistency Date: Thu Apr 28 17:04:16 EDT 2016 Set-based updates that are associative, commutative and idempotent, can be consistently and automatically merged. https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type https://www.youtube.com/watch?v=bhYKrSUqSlo Entry: Distributed systems theory for the distributed systems engineer Date: Sat May 14 09:24:55 EDT 2016 http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/ Entry: unique identifiers Date: Sun Jun 5 22:20:02 EDT 2016 It's (likely?) not possible to generate unique IDs without a central authority. One more reason to be fault-tolerant. And an interesting illustration of how programming Erlang turns some age old assumptions on its head. Shared state is possible, but it requires a process to handle its sequentiality, so stands out like a sore thumb. Entry: why does printing code look so ugly? Date: Fri Jul 8 01:34:16 EDT 2016 same for graphics drawing code, and anything that produces a stream of instructions in a non-functional way. Entry: Futures in Rust Date: Sat Aug 20 15:47:30 EDT 2016 http://aturon.github.io/blog/2016/08/11/futures/ Might be interesting to use this instead of writing an ad-hoc state machine compiler. The above is already a sate machine compiler. Entry: Declarative distributed programming Date: Thu Aug 25 11:23:37 EDT 2016 Building (heterogenous) distributed applications is hard. Why? It seems inevitable to escape a design that relies on distributed state, and communicating stateful objects have very complex interaction patterns. I've enjoyed using Erlang to mainly make the problem known: - You will not understand the problem you're solving until you've seen your proposed solution fail - due to distributed interacting state. - The only hope you have to get anywhere near implementing your requirements is to iterate development. Updates are essential in this process. - It makes sense to focus on the happy path, and design for robustness through restarts: if something unexpected happens, give up and go back to a known state. Move design effort into designing supervisors. - There is probably no coincidence that the supervisor approach resembles the idea of the "germ line" in biological evolution: everything except the germ line is disposable, as long as it runs long enough to propagate the germ line. All this leaves the functional programming mumbo-jumbo out of the picture, or at least delegates it to the meta level. Entry: Blocking tasks to state machines Date: Mon Aug 29 12:07:03 EDT 2016 What I want is to be able to code something that is clearly a state machine, but would be easier expressed as a collection of communicating processes. Basically, stick to RPC / coroutines / generators. But implement it as synchronized state machines. Entry: Streams vs. coordinates (transposes) Date: Mon Aug 29 15:08:19 EDT 2016 Transposes are easy to express as operations on coordinates. Compositions of transposes then become compositions of functions. I wonder if it is possible to do the same kind of transformation on a stream of data? E.g. by caching the "block" coordinate. The trick: 1. Compute output stream as a stream of highlevel coordinates. 2. Convert these coordinates to physical coordinates 3. Memoize the reads 4. 
For monotonous input, output is monotonous and it can be implemented using a one-way reader. Clever, but nothing new -- links in to Feldspar's basic idea. This needs to sink in for a bit. Properties used to advantage: - monotinicity in -> monotonicity out (means random access is not needed, streaming + caching works) - composition of coordinate-processing functions is *much* simpler and easier to express easier than composition of functions containing loops that move bytes around. Entry: TDM closed formulas and derivatives Date: Mon Aug 29 17:39:32 EDT 2016 Additionally, is it possible to compute an update equation from an explicit formula? There must be a way if the translations are of a certain type! E.g. define "derivative", and implement it for the operations. The idea of "cycling through channels" should somehow correspond to exponential functions / roots of unity. Entry: Eliminate the OS Date: Fri Nov 25 12:32:28 EST 2016 Operating systems by themselves should not be too hard to eliminate : most code can be tucked away in a library. The main problem is still: what about the hardware? Currently the only way hardware is "cheap", is when it is adapted to an os through drivers. Can this be disentangled? Can devices just be devices? Loosely coupled things that send messages? What I've seen mostly from writing drivers, is that the problem is that on a chip, nothing is really independent: a lot of configuration needs to happend before something takes proper shape, and ties into somethnig else. Entry: device drivers Date: Fri Nov 25 12:50:03 EST 2016 So, what is the real problem? Writing device drivers. Talking to hardware, fixed things. Adapters. Translators. Basically, making new toys run old code. Device drivers are boring because they do not look glamorous, but they do pose a fundamental problem: nothing speaks the same language. And concerning device drivers, some problems are: - protocols that are very state-sensitive - tied into global infrastructure which has constraints (clock config, io config, ...): this makes disentanglement, abstraction hard - non-orthogonal configuration: only specific combinations supported - learning how to use hardware through datasheets is hard - reliance on sequential configuration changes: CPU writes to registers, with some constraints on order but some arbitrariness as well - not everything is idempotent Once you have a better way to writing device drivers, it should be easier to change hardware also, e.g. find more optimal representations. Also, check out rump kernels. Entry: Joe Armstrong & Alan Kay - Joe Armstrong interviews Alan Kay Date: Fri Nov 25 12:52:28 EST 2016 https://www.youtube.com/watch?v=fhOHn9TClXY - sketchpad - http://www.cba.mit.edu/events/03.11.ASE/docs/Minsky.pdf - question your beliefs (negate them) - monads are a kludge: why do you treat this as a religion? - inverse vandalism: making things because you can My beliefs: - FP is for writing compilers vs. do everything functional - device drivers are a necessary evil vs. optimize writing device drivers Entry: idempotency and desired state Date: Mon Nov 28 14:53:03 EST 2016 How to you intelligently bring something to a desired state? When the succession of state progressions is linear, steps could be performed conditionally. Entry: event-driven systems Date: Wed Dec 7 15:53:11 EST 2016 Why is there such a dichotomy between: 1. dispatch of multiple events from a single wait point 2. wait for one specific event, then proceed sequentially Likely this is artificial, e.g. 
these are two possible implementation forms of event-driven programming. When there is a clear ping-pong going on, the latter is more straightforward as it allows recursive decomposition, if the dispatcher handles more than one "client", the direct dispatch is better. There is a natural way to view these: they correspond respectively to the callee and the caller in an RPC call. So from there, maybe it is client systems that are better expressed sequentially, and server systems that are better expressed with dispatch. Entry: Joscha Bach 4rth C3 lecture Date: Sat Dec 31 01:23:13 EST 2016 DNA is not a blueprint, but an OS. At no point is there ever no cell. https://www.youtube.com/watch?v=K5nJ5l6dl2s Entry: Make illegal states unrepresentable Date: Fri Jan 20 11:08:27 CET 2017 What is the analogy for cases where it is not possible to do this, but it is possible to further constrain the data structure? The point is that the illegal states don't make it past any kind of machine interpretation. So whether this is a simple explicit constraint (the shape of the data structure), or some constraint that is expressed ad-hoc, it shouldn't matter. Maybe this really flies in the face of the original point? So let's refine: 1. try to express all constraints as structure, as types, or as proofs. (whatever your tool set allows). 2. if that doesn't work, express it as properties that are exercises by an automatic test generator such as quickcheck. https://vimeo.com/14313378 Entry: GUIs: constraint vs. reactive Date: Sat Feb 11 10:58:49 EST 2017 Maybe the correct paradigm for guis isn't reactive programming, it's constraint programming? The main problem in UI programming in OO-fashion, is to propagate changes. On input, the model changes, which should reflect other views. Ways to solve this: - manual notification spaghetti - recompute entire view once model updates - "directionalize" the constraint program that describes widget relations Entry: Sequences as Folds -> Fused Loops Date: Mon Feb 20 16:39:27 EST 2017 A great advantage of representing sequences as folds is that loop fusion is free. And even more general: arbitrary stream processing can be expressed like this where "chunk sizes" can vary between stream processors in a very straightforward way. Write this up, and turn it into a C or Rust code generator. It works well in practice because often it is not possible to pick chunks sizes, and not automating that step will always create a mess of ad-hoc for loops in C. Entry: Nested folds and intermediate results Date: Mon Feb 20 22:01:21 EST 2017 What I call "intermediate results", is a pattern in DSP processing, where the time-ordered data dependency is broken and where there is a "spatial" or "multi-pass" component. This is hard to formulate in a loop-folding based functional approach because there is no longer a single loop. Algorithms are essentially multipass if there is a global->local data dependency. Are nested folds as they appear in the "sequences as folds" approach a way to deal with this? Entry: Tree diffing Date: Tue Feb 21 16:51:39 EST 2017 http://stackoverflow.com/questions/5894879/detect-differences-between-tree-structures https://en.wikipedia.org/wiki/Graph_isomorphism_problem Entry: Pi Calculus Date: Mon Mar 6 12:04:41 EST 2017 Would it be useful to spend time learning the Pi Calculus? 
http://erlang.org/pipermail/erlang-questions/2003-November/010783.html

Entry: State machine notation
Date: Mon Mar 6 12:07:28 EST 2017

I need a good notation to represent a state machine as a set of equations in a way that allows some properties to be extracted, or at least quick-checked. I spend way too much time "hacking" machines using incomplete reasoning.

Events "act on" states. Events are operators, so let's represent them with capitalized identifiers, leaving lower case identifiers to represent states.

s1 A = s2

At this level of abstraction, there are no simultaneous events. Some possible extensions:
- To represent simultaneous events, compose two or more "proto events" into one event.
- Not sure how to implement dependency in events, such as timeouts.

Example: a debounced trigger.

events:
A - activating edge
R - releasing edge
T - timeout after A

states:
i  - idle
w1 - active edge seen, waiting for releasing edge or timeout
w2 - timeout expired
t  - releasing edge seen, fully triggered

not modeled: after the occurrence of A, a T is scheduled for a moment in the future. If multiple A happen, multiple T would be scheduled. (not a good way to model this).

i  A = w1
w1 T = w2  (passed, waiting for release)
w1 R = i   (filtered)
w2 R = t   (fully passed)
t  _ = t   (once triggered, ignore events)

Entry: State machines
Date: Mon Mar 6 12:43:32 EST 2017

The thing is really that often you don't want to write down a state machine's transition rules explicitly. Why? Because that is a very low-level description of the problem that often requires introduction of intermediate states that you really don't care about in your problem description. What you want is to write down some other model, one that talks about actual events -- often event filtering -- and leaves out the details. Some patterns I've run into:
- Sequentially perform a number of operations, possibly finitely nested in loops, and wait for a set of events. This is the most common one, and corresponds to what otherwise would be an execution thread in a typical multitasking OS.
- Perform resets / restarts when "exceptional cases" occur.

The debouncing problem in the previous thread is more easily expressed as a sequential process:

events:
A - activating edge
R - releasing edge
T - timeout after A

process:
1. Wait for A
2. with timeout_process {
3.   wait for T,R
     R -> reset to 1
     T -> continue }
4. output A
5. wait for R
6. output R
7. stop

Notes:
- The scope here indicates a resource that needs to be cleared on leaving the scope, in this case the timer process/state.
- The second timeout process is an _essential_ part of this. There is something on the outside of the main process that cannot be represented by something inside it (apart from emulating a scheduler).
- This can be implemented by CSP (channel-style) and Erlang-style multiprocessing.

Entry: Extending timer capture values
Date: Wed Mar 8 18:32:29 EST 2017

Code looks like this:

CR TR CR TR

CR = capture interrupt is checked, and if there is a value it is loaded. After this a new capture event can happen.
TR = current timer value is read.

The task is then to:
- extend the read capture value correctly
- update the extension counter on T rollover

The ambiguity is in the location of the actual capture event and the rollover event in the grid imposed by CR and TR.

TR CR C TR CR TR,  or
TR CR TR C CR TR
T0 T1 T2

It is straightforward to time-extend C, if we assume we have an extension counter E that represents the state of the extension after the "nearest" rollover in the count.
If C is near the
  small end -> extend with E
  large end -> extend with E - 1

The question is then, how to update E when a T rollover occurs? It seems this cannot be done without causing a race. I can't say exactly why, but I also can't answer the question about when to update E such that it can be trusted when it is read to extend the C value that is read at CR time. My intuition was to do the update "far away" from the point of use. The solution I had was to use two extension counters, using the assumption that a counter is not used for a very large margin around the time when it gets updated, effectively "double buffering" it.

counter   use     update
M(id)     Q2,Q3   Q4->Q1
W(rap)    Q4,Q1   Q2->Q3

Then based on whether the captured count is in one of these regions, the extension is easy to compute:

Q1      W
Q2,Q3   M
Q4      W-1

I currently believe it cannot be done by updating a single counter at or near the rollover, because the order of the events is not known.

EDIT: One possibility is to keep track of one extra bit in the extension to disambiguate whether the "nearest" rollover has been accounted for. This likely is equivalent to keeping two counters, because they always stay only one bit apart. Essentially, that one bit encodes the information "does the current extension account for the 'current' rollover or not?". This bit is the high bit of the last read timer:
0 : rollover has occurred
1 : rollover has not occurred
But then this still needs an extra bit to take into account _whether to look_ at that bit. And we wouldn't if we're in Q2,Q3, but we would if we're in Q4,Q1.

So an alternative way:
- roll over the main counter in the straightforward way. This gives a 32-bit value that is "near" the capture event.
- based on whether the capture event was before or after the rollover, extend it with the correct side.

EDIT2: "extending with the correct side" can be done more simply. Given the notation above, we know that
- C =< CB =< T  =>  C =< T
- The extension E:T can always be computed correctly
There are only two possibilities: (E-1):C or E:C
If E:C > T it must be (E-1):C
That's it. The condition that C happened at or before T is the property that allows disambiguation.

EDIT: So the fundamental event ambiguity is:
TR R C TR  vs  TR C R TR
I.e. did the rollover R happen before or after the capture event C. The second TR will catch the rollover, but the extension of C depends on the order of R and C. The disambiguation works because in the second case (C <= R), E:C will turn out to be past TR, which has to be wrong, so (E-1):C is the correct extension in that case.

Checks
TC C TR R TC TR
TC TR C R TC TR
The C<->TR swap doesn't seem to matter in this case.

Summary: The problem was finding the right way to look at this.

Another solution:
https://e2e.ti.com/support/microcontrollers/msp430/f/166/t/276588

Entry: Code is data because of right fold
Date: Sat Jul 1 12:18:12 EDT 2017

Any recursive data structure has a generalized right fold by making constructors abstract. This can remove the need for an explicit representation as a data structure. A fold is enough.
( More generally: data's only reason for existence is to be passed as input to code. The only thing that is important, really, is the code that produces real-world effect. )

Entry: recursive descent parsers
Date: Wed Jul 5 10:25:45 EDT 2017

About parsers. I don't really understand table parsers, and most parsers I've written are recursive descent parsers, as most languages I need to parse are very lisp-like, e.g. do not need backtracking.
I would like to get to a point where this "feedforward" nature is expressed in a more direct way. Take as example the GDB status language, which in a first approximation is a nested set of key-value bindings and is quite representative of a lot of configuration languages out there:

  <msg>     ::= <item> | ...                       ; not specified further
  <value>   ::= <atom> | <list>
  <list>    ::= "{" <items> "}"
  <items>   ::= "" | <item> | <item> "," <items>
  <item>    ::= <atom> | <binding>
  <binding> ::= <atom> "=" <value>

The property of the parser is to be able to pop a character and determine what to do with it without putting it back. This way the structure is a left fold. A natural way to represent this is to split the tokenizer and parser. The tokenizer will just collect non-control characters in a list, and upon encountering a control character, will push the atom somewhere. This "somewhere" is what this is all about.

Representation: the "current expression" is an inside-out term with a hole in it (a zipper or cursor). The tokenizer will fill this hole whenever an atom is parsed, and will create a new expression with a hole. The point is then to establish the meaning of the control characters as "hole transformers". E.g. each control character is a function that takes an atom and a hole, and produces a new hole. The "primitive hole" is then just an object. When the parser calls this it will terminate.

Now one by one, define the meaning of the control characters.
","  if the current hole is an atom, change it to a list hole; otherwise, append to the current list hole
"="  if the current hole is an atom, change it to a binding
"}"  delete the hole at the end of the list (or fill it with nil?)

This structure needs to know the type of the current hole to be able to transform it. That is not the same as a lisp parser. How to represent this? Because of continuation transformation it seems impossible to represent a continuation as a function. I can't really stabilize my thoughts about this. There is something about the idea of attaching concrete meaning to individual characters that is very appealing, but on the other hand it seems convoluted. There is no need for backtracking, but the "hole transformation" is definitely a form of going back and correcting a previous assumption.

Let's forget about "=", but do the list control characters first.
"}"  close the current list hole with a nil
"{"  insert a list hole in the current hole
","  push a pair into the current list hole

A hole is not a continuation, it is a continuation transformer: (Obj,K) -> K

Some context:
- parse starts with a continuation (a hole).
- a control character takes the current token, pushes it into the hole, and updates the hole

"}" is tricky as it has two meanings:
- empty atom, insert []
- non-empty atom, insert [a]

I lost it... it seems like a good idea but I can't get a hold of it. Maybe the reason that this is difficult is that I'm not separating out the tokens. Thinking a bit, it seems that what makes this difficult is exactly the postfix nature: "," and "=" change the meaning of the text that comes before. Maybe this can be solved in the tokenizer? E.g. the tokenizer should turn the input stream into a prefix-only stream, transforming [a = b] into [= a b]. Note that this is not possible for lists, as those are delimited, but that might not be such a problem. In any case this does start to look like a waste of time for the current task of parsing the GDB message format.
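To make the tokenizer/parser split above concrete, here is a minimal Haskell sketch (names and token set are hypothetical, chosen to match the informal grammar above; this is not the parser used later in this log). It collects non-control characters into atoms and emits control characters as their own tokens, so the parser can be a left fold over tokens that never pushes a character back.

import Data.Char (isSpace)

data Token = TAtom String | TOpen | TClose | TComma | TEq
  deriving (Show, Eq)

-- Control characters become their own tokens.
control :: Char -> Maybe Token
control '{' = Just TOpen
control '}' = Just TClose
control ',' = Just TComma
control '=' = Just TEq
control _   = Nothing

-- Accumulate non-control characters into an atom; flush the atom
-- whenever a control character (or end of input) is seen.
tokenize :: String -> [Token]
tokenize = go [] where
  flush acc ts = if null acc then ts else TAtom (reverse acc) : ts
  go acc []     = flush acc []
  go acc (c:cs) = case control c of
    Just t  -> flush acc (t : go [] cs)
    Nothing | isSpace c -> flush acc (go [] cs)
            | otherwise -> go (c:acc) cs

-- tokenize "a={b=1,c}"
--   == [TAtom "a",TEq,TOpen,TAtom "b",TEq,TAtom "1",TComma,TAtom "c",TClose]

Note that "{}" produces no empty atom, which is exactly the two-meaning problem of "}" mentioned above; that ambiguity is left for the parser to resolve.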
Entry: Parser combinators
Date: Wed Jul 5 12:21:59 EDT 2017

http://www.little-lisper.org/website/pc/index.html
http://www.goodmath.org/blog/2014/05/04/combinator-parsing-part-1/
http://eprints.nottingham.ac.uk/237/1/monparsing.pdf

Entry: The most general I/O processor?
Date: Wed Jul 5 15:33:04 EDT 2017

For the tokenizer, what worked was:
- input as an outer iterator
- output as a left fold

How to turn an inner iterator into a fold? You can't, because the control flow of the fold is one-shot. The way to do this is to block the fold when it's running, and that needs a task. Look at fold:gen(), it would be similar.

EDIT: Works, but... Converting a fold into a source is a leaky abstraction. This popped up only when writing parsers. I wonder if it makes sense to then turn the parser into a source, as sources are easy to convert to folds. Doesn't seem so. A parser is naturally written with individual (blocking) read/write calls. (Fold is write, while inner iterators are read).

Entry: 6 Iteration structures
Date: Wed Jul 5 15:58:59 EDT 2017

Some representations of sequences. https://github.com/zwizwa/erl_tools/tree/master/src
- fold.erl   : left fold
- pfold.erl  : left fold with early stop
- source.erl : inner iterator (stream)
- iseq.erl   : infinite sequences (almost special case of source.erl)
- sink.erl   : sink-parameterized generator
- igen.erl   : impure generators
- unfold.erl : pure sequences represented as (finite) unfolds

The difference is in which operations are explicit:
fold,pfold:  functional write (state update)
source,iseq: functional read
unfold:      functional read with explicit state
sink:        imperative write (abstract function or process send)
igen:        imperative read

Since these are a nice orthogonal mix of classes, there might be a more appropriate naming scheme. These are duals in the caller/callee sense. For the functional ones there are finite/infinite vs full/truncate.
(EDIT: this was edited to add unfold.erl)

Entry: Manual parser
Date: Wed Jul 5 22:25:34 EDT 2017

So I ended up writing a manual recursive parser. I had to resort to a hack to be able to handle the equal infix operator, which is patched in two places: close and atom.

%% I: input
%% Q: current queue
%% S: stack of queues
p([open    |I], Q,  S)               -> p(I, [], [Q|S]);
p([close   |I], Q1, [[{eq,K}|Q2]|S]) -> p(I, [{K,r(Q1)}|Q2], S);
p([close   |I], Q1, [Q2|S])          -> p(I, [r(Q1)|Q2], S);
p([{atom,V}|I], [{eq,K}|Q], S)       -> p(I, [{K,V}|Q], S);
p([{atom,A}|I], Q, S)                -> p(I, [A|Q], S);
p([equal   |I], [K|Q], S)            -> p(I, [{eq,K}|Q], S);
p([comma   |I], Q, S)                -> p(I, Q, S);  %% (1)
p([],           Q, [])               -> r(Q);

So equal quite literally changes the meaning of the last object parsed. It also changes the continuation: we're no longer putting the result in the queue, but in the second slot of the pair. {Q,S} is a representation of the continuation. S is always a stack of Qs, but there are two kinds of Qs:
- list    the hole at the end of a list
- {eq,K}  the hole in the second slot of the pair

The rules likely become simpler if the continuations are made abstract. Let's give that a try.
{Q,S} -> {[],[Q,S]}
I find CPS hard. Maybe it's just lack of training, but making that translation really doesn't come naturally.
Initial Q=[], S=[]
This is a push to a list, which as an ordinary function is:
fun(V) -> [A|V] end
What I miss is muscle memory.. Functions in CPS form look like this (e.g.
the Haskell do block):

(define (pyth x y)
  (sqrt (+ (* x x) (* y y))))

(define (pyth& x y k)
  (*& x x (lambda (x2)
    (*& y y (lambda (y2)
      (+& x2 y2 (lambda (x2py2)
        (sqrt& x2py2 k))))))))

Note that the only reason to use CPS is to be able to also pass the input as part of the state using just tail recursion. So here's a systematic approach:
- write a recursive parser in direct style
- convert it to CPS mechanically
- add the extra input argument

So in Erlang it is actually not necessary to write it in CPS, because the input can be abstracted away into a read() call by using another process. But it's nice to keep things pure of course..

EDIT: actually, this needs some form of back-patching for "=" unless that is changed to do it directly.

Entry: line assembler iteration pattern
Date: Mon Jul 10 15:47:29 EDT 2017

Currently: state machine. Push in a chunk, get a chunk in reply or not. This fits the map+filter operation, or fold + unfold.

Entry: definition control-dominates use
Date: Tue Jul 11 11:59:38 EDT 2017

Can this principle be used to ensure caches are coherent? The title is a quote from Olin Shivers from a talk on control flow and CPS, I believe while presenting his Scheme loop macro. The basic idea is that a variable can't be referenced before it is initialized, by construction.

Entry: Intersection between igen.erl and source.erl
Date: Sat Jul 15 10:58:06 EDT 2017

Problem: can't turn a list into an igen without creating a separate process. But it is possible to constrain the interface such that:
- the reader will only "pop once".
- the "next" thunk is explicitly updated

How to guarantee single use of the "next" thunk? This doesn't seem possible without storing state somewhere -- same problem as implementing proper lazy evaluation. So it's not possible to constrain this at run time. Can it be expressed in types?

EDIT: no generic solution, so solve it in the implementation. I.e. if a function uses the input source in a "pop once" fashion, it can manually convert an igen using igen:to_source_leaky/1 and guarantee proper usage such as performing igen:close/1.

Entry: Just put the constructors in a dictionary
Date: Mon Jul 17 16:51:52 EDT 2017

It's such a cool pattern to abstract a data structure's constructors as functions in a dictionary, e.g. to implement a generalized right fold.

Entry: foldable
Date: Mon Aug 7 14:25:15 EDT 2017

Is a monoid necessary? Associativity, but in a left fold there is a certain notion of "time" that seems to contradict associativity? Probably looking at this wrong.

Entry: CCC Conal
Date: Mon Aug 7 15:38:07 EDT 2017

http://conal.net/papers/compiling-to-categories/
https://www.youtube.com/watch?v=vzLK_xE9Zy8

Presented as an alternative to EDSLs. Generalize the standard lambda form to CCC representation, with overloadable operations for id, const, abstract, apply.

45:00 Interesting: if the derivative is taken to be the linear map (s.t. the derivative of a linear function is that linear function), it follows that the derivative distributes over composition: Df o Dg = D(f o g)

The remark about reuse is interesting in the autodiff example. In the beginning I thought this would not be the case. It's a problem I've run into many times. However, in the graphs it is the operations that are not reused, i.e. the adders, the multipliers, but I guess that is the whole idea: to parallelize it.

- Interval analysis.
- SMT Constraint solving

Bottom line: do not invent new vocabulary just to get different interpretations. Any monadic computation gives rise in a natural way to a CCC.
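As a sketch of the "overloadable vocabulary" idea (my own illustration with made-up names, not Conal's actual API): one small categorical vocabulary, two interpretations, one of which is a reified term that could be printed, analyzed, or compiled further.

{-# LANGUAGE GADTs #-}

class Cat k where
  idC  :: k a a
  (.:) :: k b c -> k a b -> k a c

class Cat k => NumCat k where
  addC :: k (Int, Int) Int
  dupC :: k Int (Int, Int)

-- Interpretation 1: plain functions.
newtype Fun a b = Fun { runFun :: a -> b }
instance Cat Fun where
  idC = Fun id
  Fun g .: Fun f = Fun (g . f)
instance NumCat Fun where
  addC = Fun (uncurry (+))
  dupC = Fun (\x -> (x, x))

-- Interpretation 2: a reified term (syntax), same vocabulary.
data Syn a b where
  IdS   :: Syn a a
  CompS :: Syn b c -> Syn a b -> Syn a c
  AddS  :: Syn (Int, Int) Int
  DupS  :: Syn Int (Int, Int)
instance Cat Syn where
  idC  = IdS
  (.:) = CompS
instance NumCat Syn where
  addC = AddS
  dupC = DupS

-- One term, two meanings: runFun double 21 == 42, or a Syn tree.
double :: NumCat k => k Int Int
double = addC .: dupC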
So it's clear, this is what RAI is going to be built on! It's really what I've been missing.

Entry: Declarative vs OO
Date: Fri Aug 25 16:16:02 EDT 2017

A declarative presentation model can act as an impedance match between a stream of incoming user edit events, and a stream of outgoing view update events.

Elements:
- (Event, PM, DM) -> (PM, DM)
- (PM, PM) -> Commands

EDIT: This is a powerful abstraction, and is currently reshaping the way I think about cache synchronization across high-latency links. Essentially, you "encode the setters" in the presentation model. More later... This can be made fully generic, where the "diff command interpreter" sends only one type to the view: update.

Entry: Paths: trees as flat key-value maps
Date: Sat Aug 26 10:26:13 EDT 2017

Updating leaf nodes in a hierarchical data structure can be done using paths, where each element in the path represents one layer of hierarchical wrapping. It can be beneficial to keep in mind that there is a bi-directional map between:
- The hierarchical data structure as embedded in a (functional) programming language.
- The "flattened" version of this structure, represented as a key-value map, where keys are (encoded versions of) data structure paths.

An example of the latter is:
- A database table or flat key-value store
- The 'id' attribute to DOM element association in a web browser

An added advantage is that the "path" representation is easily diff-encoded, bridging the "declarative" and "object-oriented" worlds.

Entry: Declarative vs. OO
Date: Sat Aug 26 11:00:55 EDT 2017

This generalizes quite a bit, all the way to any stateful API. It likely makes sense to build a declarative state model on top of such a stateful API, where the differences ARE the API. And for cases where it is difficult to do this, the approach could be seen as a way to structure a stateful API. Stateful APIs will likely always be necessary for efficiency reasons.

It could also be a good way to test state transitions using property-based testing: generate different states and have the system transition between them using the generated commands.

Entry: differentiating constructors
Date: Tue Aug 29 03:54:31 EDT 2017

Differentiate constructors. Basically this is about updating web views from small changes in the algebraic data types representing the view model. React is a nice idea, but it is too much focused on the DOM. I believe there is a simpler way by focusing on re-interpreting the rendering function.

Rendering a web page is a function from VM -> DOC. The VM can be "diffed" into path ins/del/set operations: (VM0,VM1) -> [DVM]. It should be possible to programmatically derive those operations from the original VM -> DOC function.

( VM0, VM1, VM -> DOC ) -> [ DDOC ]

Probably based on some ( VM -> DOC ) -> [ DVM -> DDOC ] function. Conal hinted at this. Maybe look it up. There must be some information about this in the Haskell world.

Entry: Trees vs. Paths
Date: Sat Sep 2 13:33:47 EDT 2017

This correspondence has been on my mind. It is actually quite trivial, but has far-reaching consequences for organizing data in a "functional way", e.g. representing data structures as functions, databases. Maybe this is what Kmett's lens library is about.

Entry: Trees vs. Paths: derivatives?
Date: Thu Sep 7 16:22:51 EDT 2017

For finite differences it is possible to define a derivative operator that behaves as the ordinary derivative. Can this be done for finite differences of trees as well? The "zipper" is more like an analog derivative in that it is centered at a single point.
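A small sketch of the two flavors side by side (standard Haskell containers; the Change type is hypothetical): the forward difference operator on a sequence, and a key-wise diff on finite maps, which is the trees-as-paths analogue of a difference.

import qualified Data.Map as Map
import Data.Map (Map)

-- Forward difference on a sequence: delta [1,4,9,16] == [3,5,7].
delta :: Num a => [a] -> [a]
delta xs = zipWith (-) (drop 1 xs) xs

-- A per-key change, the flat-path analogue of an edit at a leaf.
data Change v = Insert v | Delete | Set v v   -- Set old new
  deriving (Show, Eq)

-- Key-wise difference of two finite maps.
diffMap :: (Ord k, Eq v) => Map k v -> Map k v -> Map k (Change v)
diffMap old new = Map.unions
  [ Map.map Insert         (Map.difference new old)
  , Map.map (const Delete) (Map.difference old new)
  , Map.mapMaybe id (Map.intersectionWith
      (\o n -> if o == n then Nothing else Just (Set o n)) old new)
  ]

Applying the changes of diffMap old new to old reconstructs new; the zipper, by contrast, localizes attention at a single position rather than describing a change.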
Entry: Twitter thread on "derivatives"
Date: Fri Sep 8 21:12:39 EDT 2017

https://twitter.com/tom_zwizwa/status/906316251559022592
Phil mentions "incremental lambda calculus".
https://github.com/paf31/purescript-incremental
http://www.informatik.uni-marburg.de/~pgiarrusso/ILC/
http://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf

The rules in the paper are:
- free:   D(x) = dx
- abs:    D(\x.t) = \x.\dx.D(t)
- app:    D(s t) = D(s) t D(t)
- closed: D(c) = []

The interesting thing is that both abstraction and application "double up". It is interesting that this is the curried version of the "dual number" form, in the same way this can be done for numerical automatic differentiation:
- abs: D( \x -> t ) -> \(x,dx) -> D(t)
- app: D( s t )     -> D(s) (t, D(t))

So it's probably possible to write the entire thing bottom-up, building trees from primitive trees by composing their "dual number" representation. So how would this be represented exactly?

So this is lambda calculus. How to make this practical? All primitives need to be translated as well. For data structures this boils down to constructors and accessors. For maps, this is:
fun(K,V) -> #{ K => V } end
maps:merge/2
maps:get/2
These need to be curried, then transformed in the normal way. So the problem boils down to:
- write a tree transformation function as the composition of the 2 constructor functions and 1 destructor function.
- combine it with other Erlang constructs expressed as functions
- perform the transformation

In the paper, Bags are used as an example of a primitive data structure. What I used in diff.erl are "bags with tags", i.e. finite functions. Can the fact that these are _actual_ functions be used? E.g. the getter is definitely function application, the constructors are some form of function abstraction.

So where does the computational saving come from? From not re-evaluating parts of the expression if the input is constant. The paper mentions something about "nil detection".

In ILC, the "plugins" define the primitive operations.
v0 (+) d  = v1
v1 (-) v0 = d
Derivative definition:
f (a (+) da) = f a (+) f' a da

So the part I don't get is the need for the "Nil changes are derivatives" detour. It is clear though that the way forward is to really understand how the function changes work: it contains the entire point of "pushing this through" abstraction and application. Also, trying Conal's Haskell plugin would probably be a good idea to get a more grounded understanding of all of this.

EDIT: Reading again the part about change structures on functions. The important thing to note is that df is a function of two arguments: df a da. I still don't get how this is introduced, defined... Go back to section two and read it again. The nil change is important.

Entry: Why is cache invalidation so hard?
Date: Sun Sep 24 00:21:48 EDT 2017

https://martinfowler.com/bliki/TwoHardThings.html
( Otoh, naming is hard because it is arbitrary but still constrained by common cultural denominators. )

Is it the same problem as garbage collection? How to predict if a piece of data will be needed in the future? It would require an oracle -- something that uses inaccessible information. GC is a lower bound to that -- we're sure something won't be used if it can't be reached by traversing a program's data graph.
Entry: Re-inventing computing: no I/O Date: Mon Oct 9 14:37:38 EDT 2017 http://www.haskellforall.com/2017/10/why-do-our-programs-need-to-read-input.html http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm Thinking about this, the "rube goldberg machine" is a reality. For me, it mostly manifests as a build and test system. And a collection of code that depends on infrastructure. On a deeper level: I've been writing code that is "actor-heavy", where this explicit I/O pattern is very apparent. It is important to realize in the context of Conal's post, that actors are an implementation mechanism. This is exactly what I've felt in implementing erl_tools library functions: the composable parts are almost inescapably pure. The need for multiple processes almost feels like a failure (e.g. when using intermediate tasks or partial continuations to perform between iteration representations). It's interesting that Conal takes the input to brains, and the output from brains as just boundaries of a system as a whole. A good point is: What looks like imperative output can be just what you observe at the boundary between two subsystems. I/O is what's on a wire between two systems. It is an implementation detail. Can actor systems be designed in this way? That would be my main question at this point: looking at the Erlang system I've just built, is there a way to look at the protocols and see them as an implementation of something more abstract? Thinking about this, I really want to revisit the idea of re-defining what a computer is: an augmentation of the human brain/body system. Anything else will ensure you get stuck into a local optimum created by current technology. Too stuck in the obvious to discover new truths. So let's keep this in mind: explicit I/O is a consequence of how systems are implemented, and can likely be eliminated, and turned into function application, or shorter: TURN I/O INTO FUNCTION APPLICATION Looking at the application, I see the following uses of message sending: - Server RPC, essentially abstracting some state, and exposing it to otherwise isolated processes. - Tracing / notifications: allowing some other task (or tasks) to check progress of a particular sequential process. RPC already looks like function application. It just misses referential transparency. But this can always be traced back to the necessity to keep state in a single place. The notifications are essentially used to also modify state. I.e. once a particular (physical) operation has occured, some other operations become possible on the modified state. To map this into a more functional approach, one has to do away with the essential sources of state: the hardware, and the hardware implementation constraints. Who is to blame for this state, really? The need for explicit state seems to always be rooted in physicality. I/O can be eliminated as implementation detail upto that phusicality. At some point this needs to be made explicit. The question is then: can you avoid exposing the physicality to the user? In my case: no. The idea that data is in a certain location is a key element of the device as it presents to the user. The only thing anyone is interested in is the data, but the stateful interface is a limiting factor. One pattern: if the RPC is 1-1, i.e. one client and one server, it disappears in the model: it can be abstracted away almost completely. Only in the error handling and time delays it is apparent. 
If the RPC is N-1, there is an essential part that cannot be eliminated. And this is not I/O (RPC already looks like normal function calls), but it is state. In my case, this is state that in some sense is essential (exposed to the user), or very hard to eliminate due to implementation constraints.

It seems the core idea is still to get rid of explicit state, much more than to get rid of explicit I/O, which seems to already happen if there is an incentive to write functional-style code. ( I'm counting environment-parameterized code as functional "enough". Read-only environments are still much less restrictive than shared state. )

Entry: source.erl vs unfold.erl
Date: Mon Oct 23 15:01:46 CEST 2017

( Also added unfold.erl to the previous post. )

To make this work, I defined unfold.erl which has a type {S,F} with
F :: S -> {E,S}
This is roughly equivalent to pure lazy sequences in source.erl
L :: () -> eof | {E, L}
The difference is that S and F are explicit in the former, but hidden in the latter, which likely makes it a better candidate. Odd that I was thinking about explicit unfold as something better than a lazy sequence. Just an arbitrary fluke?

Entry: hans followup
Date: Tue Oct 24 16:54:08 CEST 2017

- look at Agda
- check out courseware (ask link)
- fully abstract compilation

Entry: Dependent types and proofs
Date: Sat Nov 4 13:39:50 EDT 2017

Why are dependent types and formal methods linked? Agda and Coq have them. They are used to encode quantification for predicate logic.

Entry: Stream processing patterns
Date: Fri Nov 10 08:44:01 EST 2017

I'd like to generalize the lessons learned from fold.erl and source.erl into a Rust-style library -- i.e. not dependent on GC. There is something about sequences and state machine traces, and how they merge and transform, that is very interesting to me. There is some structure that I cannot yet fully exhibit.

Entry: Treat things that are isomorphic as the same
Date: Thu Nov 16 12:17:03 EST 2017

Basically, it seems more important to not break interfaces than it is to "clean up" naming schemes. As long as renames are "total" it should be fine, but for many things, especially when long-lingering state is involved, this cannot be guaranteed. Main problem: maintaining some form of consistency as systems evolve over time is very hard without "binding" being well defined (i.e. collapsing the decoupling name to an actual reference).

Entry: Unknowingly making the wrong assumptions
Date: Thu Nov 16 13:52:25 EST 2017

Those damn unknown unknowns. Therefore it helps to make all decision points explicit. Expose the design as a decision tree.

Entry: Functional vs. Idempotent
Date: Mon Nov 20 15:59:10 EST 2017

Is idempotent just another word for improperly optimized lazy evaluation?

Entry: Fibers are not Threads
Date: Fri Nov 24 23:28:58 EST 2017

http://lambda-the-ultimate.org/node/5478#comment-95163
https://en.wikipedia.org/wiki/Fiber_(computer_science)
Basic ideas: fibers are a sequential programming construct with indeterminism, where concurrency is an optimization.

Entry: the real problem is integration
Date: Sun Nov 26 16:52:14 EST 2017

And that's where we're all stuck. That's where non-orthogonality shows up.

Entry: Are monads just sequential execution?
Date: Wed Nov 29 12:02:21 EST 2017

( a -> m b )
( b -> m c )
( a -> m c )

They guarantee that something (implemented inside the 'join') is always interleaved between two independent subcomputations. It replaces ordinary composition with something customized.
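To make "something customized" concrete, a tiny standard-Haskell illustration (using the mtl Writer monad, nothing project-specific): Kleisli composition is ordinary composition with a 'join' spliced in between the two steps, and the Writer log makes the interleaving visible.

import Control.Monad (join)
import Control.Monad.Writer (Writer, tell, runWriter)

-- ( a -> m b ) and ( b -> m c ) composed into ( a -> m c ):
-- ordinary composition plus a 'join' in the middle.
-- (This is (>=>) from Control.Monad.)
kleisli :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisli f g = \a -> join (fmap g (f a))

step1 :: Int -> Writer [String] Int
step1 x = tell ["step1"] >> return (x + 1)

step2 :: Int -> Writer [String] Int
step2 x = tell ["step2"] >> return (x * 2)

-- runWriter (kleisli step1 step2 3) == (8, ["step1","step2"])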
Similar to arrows, but for arrows allegedly the structure of the computation is fixed, while for monads it is value-dependent. How so?

Entry: parsing
Date: Thu Nov 30 01:15:57 EST 2017

https://softwareengineering.stackexchange.com/questions/338665/when-to-use-a-parser-combinator-when-to-use-a-parser-generator

Entry: Broadcast events are late binding
Date: Sat Dec 2 13:22:40 EST 2017

They are an inversion of a/multiple RPC call/s.

Entry: LALR parsers
Date: Sat Dec 2 18:19:37 EST 2017

Time to understand this: how to use it, and what the normal form means wrt. how it is implemented. But first, implement some parser combinators. The problem, really, is to write down the grammar.

Entry: Transitioning
Date: Tue Dec 19 09:59:08 EST 2017

Here's a pattern I run into. I have a collection of state machines that operate independently, and I want to create a new state machine that is the abstraction of a parallel bundle, but conceptually behaves as one. What I'm arriving at is the need for the concept of "transitioning" in the abstraction, i.e. the state diagrams cannot be the same. For each lower level state transition:
A_n -> B_n
the higher level bundle needs
HA -> HAB -> HB
where HAB happens before each A_n -> B_n transition, and HB happens after. E.g.
(HA, A, A) -> (HAB, A, A) -> (HAB, B, A) -> (HAB, B, B) -> (HB, B, B)
It might even be better to not make up new states but indicate that
HA = (A,A)
HB = (B,B)
and have several explicit intermediate states:
HAB = (A,B)
HBA = (B,A)
where it doesn't matter which of these two traces happens:
(A,A) -> (A,B) -> (B,B)
(A,A) -> (B,A) -> (B,B)
So what am I actually looking for? A way to name these intermediate states. Conclusion: it seems simplest to use the Cartesian product and derive the higher level states from the lower level states.

Entry: contained vs. association
Date: Tue Dec 26 11:13:34 EST 2017

This is an implementation detail. Works for code and data:
- extend a record vs. create an associated table
- extend a program vs. subscribe to events
Is this the expression problem?
https://en.wikipedia.org/wiki/Expression_problem
- add cases to data
- add functions over data without recompilation.
Solved in Racket through mixins.

Entry: category theory is type theory
Date: Tue Dec 26 15:46:52 EST 2017

https://cs.stackexchange.com/questions/3028/is-category-theory-useful-for-learning-functional-programming
https://cs.stackexchange.com/a/3256

The basic topics you would want to learn are:
- definition of categories, and some examples of categories
- functors, and examples of them
- natural transformations, and examples of them
- definitions of products, coproducts and exponents (function spaces), initial and terminal objects
- adjunctions
- monads, algebras and Kleisli categories

Following http://events.cs.bham.ac.uk/mgs2012/lectures/ReddyNotes.pdf
- elements can be emulated by morphisms: 1->A.
- some categories are not "well-pointed", so in general a category is more general than a set
- not all denotational models for programming languages are well-pointed, even for simply typed LC.

It's really in those small remarks: Recalling that categories are graphs with certain closure properties, we would expect that maps between categories would be first of all maps between graphs.

Entry: merging forks
Date: Thu Dec 28 11:27:11 EST 2017

A lot of parallel processing seems to be about merging modifications to an initial fork point. How to look at this?
https://en.wikipedia.org/wiki/Merge_algorithm
A "pure" merge is a set/relation union, and is in that sense trivial.
In practice, there are impure merges, where nodes (key->value maps) are replaced based on some other order relation. So how does a 3-way merge work in this context?
https://en.wikipedia.org/wiki/Merge_(version_control)
It is a hack. Given a diamond graph:
P     = parent
B1,B2 = branches
M     = merge

B1 != P  = B2  ->  M=B1, and conversely
B1  = B2 != P  ->  M=B1=B2
B1 != B2 != P  ->  conflict

Entry: julia
Date: Sat Dec 30 01:12:32 EST 2017

https://docs.julialang.org/en/stable/manual/parallel-computing/
https://devblogs.nvidia.com/parallelforall/gpu-computing-julia-programming-language/
https://github.com/JuliaGPU/OpenCL.jl

Entry: haskell dsp
Date: Sat Dec 30 01:19:14 EST 2017

https://idontgetoutmuch.wordpress.com/2017/06/02/1090/
https://hackage.haskell.org/package/accelerate

Entry: bloom filters
Date: Sat Jan 6 01:20:05 EST 2018

https://en.wikipedia.org/wiki/Bloom_filter
Negative outcomes are always correct; positives can be false. This works for applications where individual false positives are not a problem, e.g. caching, optimizations.

Entry: Hans meeting
Date: Mon Feb 19 11:11:42 CET 2018

- Markus Voelter, DSL Engineering
- Engineering consulting vs. product development
- You have to be able to explain it

Entry: Foldee?
Date: Thu Feb 22 12:08:20 CET 2018

What do you call the function that is folded? (e,s) -> s ? Candidates:
- update (left fold)
- concat (right fold)
It's not exactly a monoid operator, because the types can be different.

Entry: Embedding Erlang in Haskell
Date: Wed Feb 28 16:38:39 CET 2018

In general, what is missed when doing so is that doing this for the language isn't so much of a problem, but doing it in a way that can "mock" the standard library is a very different problem. This is a lot of work either way:
- rewrite primitives
- write mock functions, possibly introducing errors
This only really makes sense for very low-level code, e.g. uC code without any kind of OS. For anything else, it is probably best to just write the code in Haskell, and create some kind of API. It's probably possible to make an RPC API that is typed on the Haskell side, to be able to test code.

Entry: Loops: MIMO
Date: Sun Mar 4 11:29:01 CET 2018

A pattern that comes back:
- loop over multi in/out chunked buffers
- suspension on in or out
Use this as the base abstraction, then implement it based on the substrate.

Entry: Equality is relative
Date: Mon Apr 9 13:30:24 EDT 2018

Equality is for two things to be indistinguishable relative to an API, i.e. the API is not able to tell them apart by manipulating them in any way to produce different results. Where of course "difference" is defined in terms of some other, more primitive notion of equality.

Entry: Presentation model from #{}
Date: Sun Apr 15 13:34:47 EDT 2018

(gwtest_tom@panda.zoo)13> diff:diff(#{}, #{a => #{b => #{ c => 123}}}).
[{insert,[a],#{b => #{c => 123}}}]

So it will return a tree as a node, which can then be diffed recursively to produce multiple insert commands. I'd like to make this canonical.

Basic idea: I currently do not have the machinery to derive updates from a presentation model from scratch, but it might be possible to make the "insert" canonical, such that only a single mental model needs to be used. Try this with the thermostat.

[{insert,[a],#{b => #{c => 123}}}]

would then become

[{insert,[a]}, {insert,[a,b]}, {insert,[a,b,c], 123}].

This requires the notion of an empty container. Note that 'insert' is a full construction (all bells and whistles), while 'update' might be a small mutation inside that node. So there is still some duplication.
What I really want is: [{insert,[a]}, {insert,[a,b]}, {insert,[a,b,c],_} {update,[a,b,c],_,123}] I.e. separate the creation of the hole, and the filling of the hole. It could even use the '_' atom to indicate this. This is best rephrased in the language of (nested) environments, variables and bindings: (exo@10.1.3.2)13> [diff:split(C) || C <- diff:diff(#{},#{z => #{a => #{b0 => 0, b1 => 1}}})]. [[{env,[z]}, {env,[z,a]}, {var,[z,a,b0]}, {bind,[z,a,b0],0}, {var,[z,a,b1]}, {bind,[z,a,b1],1}]] As a side effect, this also cleans up cluttered nested layout definitions. And it would be easy to implement, because all the commands will be generated from an empty diff. As a slight variation, if the structure is known, the creation of the environment can already create all the structure for the variables such that {var,_} can be ignored. Same for nested env. Summary: 3 concepts are important: - nesting, which maps to layout structuring - leaf nodes that contain a value - values EDIT: Tried this. Not 100% perfect, but it's a start. Entry: Transactions, diffs, editable views, lenses Date: Sat May 5 07:15:45 EDT 2018 Maybe time to revisit Kmett's tutorial again? Although I do have the feeling the hyper-abstraction done there is overkill. What I need at this point, is a practical way to just edit the damn database tables. So what _is_ an editable view? And how to make them less ad-hoc? An edit is a state to state map: s->s. An editable view v is something that can take a user event e, and map it to a model edit. v = e -> (s->s). There is no way around specifying the enumeration of edits. However, once a primitive set of edits is completed, they can likely be composed and their compositions transacted. Entry: Tree to DB Date: Thu May 31 18:01:10 EDT 2018 1. Iterate, creating (path,val) pairs 2. For any substructure where path has the same shape, map it to some coordinate vecor 3. Insert coordinate ++ val into database Et voila. Finite functions. Entry: Pattern matching netlists Date: Wed Jun 27 08:16:19 EDT 2018 Problem: given a flattened circuit netlist. Replace a given sub network structure by another. On schematics this is easy: draw a circle around the subcircuit. Cut it out, and replace it with an n-port. How to automate the circle drawing part? What is a subcircuit? An algorithm for finding its boundary nets and internal components. An example: opto coupler network: - remove 4-port IC U231 (type HMHA2801) - remove 2-port resistor (U231:4 , 3.3V) - remove 2-port diode (U231:1, U231:2) - remove 2-port resistor (U231:1, U231:2) - remove 2-port resistor (U231:1, input) Now this is a heuristic. If there are any other components connected to those nets that look the same, the heuristic will not be able to choose. Unlikely that this will happen but this is not an automated algorithm. It could be used to generate a representation that needs human intervention. I see no other way than to explicitly name the components and the nets. Instead of replacing networks, a simpler approach might be to model individual components directly. Then, by manually assigning semantics such that the overall semantics is the desired one, it would be possible to simulate. In particular, the HMHA2801 optocoupler network is current driven, while the simulation I'm interested in is only logic level. This means that some resistors will need to be modeled as shorts, some as open circuits and the optocoupler itself as a logic gate (inverter in this case). That should work, leaving the problem of directionality. 
Either model a component as a relation, or as a function, explicitly identifying inputs and outputs. The latter might be more appropriate, but it does not allow for bi-directional signals such as I2C. How to cope? An example is the I2C bridge TCA9509. The model needs to be a relation. How to express that a relation has a funcional dependency? The trick is likely to represent it as a set of functions, one for each I/O configuration. However it might be possible to side-step the problem by introducing "tests" on the circuit as a whole. E.g. assert both ends with random inputs and see whether it is part of the relation or not. This is maybe what logic validation is: assign a random value to all nodes of interests, and compute whether it is part of the circuit or not. E.g. a resistor and diode can be modeled as open, short: short: \(p1,p2) -> p1 == p2 open: \(p1,p2) -> True And the opto coupler is: short_opto \(p1,_,_,p4) -> p1 == p4 But it misses the constraint that p1 is an input. How to model the idea that: - if p1 driven -> p4 driven - if p1 not driven -> p4 undefined Without encoding it as a function? This requires multi-valued logic: Just $ Just True Just $ Just False Just $ Nothing (driven but invalid,undefined) Nothing (not driven) By taking a circuit and assigning some inputs (Just $ Just x), all relations can be evaluated and either lead to a contradiction, or a solution. The solution can then still have Nothing or Just Nothing. In this representation, short: [(a,a) | a <- [Just (Just True), Just (Just False), Just Nothing, Nothing]] open: [] But instead of reinventing, it might be better to map everything to a logic program encoded in Haskell. The example here uses the List Monad to filter out a solution based on exhaustive search and the 'guard' operation from Control.Monad https://wiki.haskell.org/Logic_programming_example I'm all for simple solutions. How to start this out? Another question: how about using a logic program to turn the circuit into a function? Or a collection of functions parameterized on directionality of some signals? EDIT: After a walk, I know what I want. I want functions. Not all subcircuits are functions, so they can be parameterized somehow, giving a family of functions. Representation can be untyped, just an endofunction on netlist partitions. Relations are too cumbersome to work with. How to turn a netlist into a function? The necessary information at each node is to identify the driver node. There can be only one driver node. All nodes that do not have a driver will be circuit inputs. EDIT: Lost train of thought after getting interrupted again. EDIT: Other problem solved, brain free again. So how to tackle this. The thing to do with a function is to evaluate it. That already determines which nets are driven. From the nets that are driven, a depth-first evaluation can be propagated using a decision procedure. To drive a pin with a value: - Find net associated with pin - Find all other components in the net - For each component that has all its inputs driven, propagate The entire circuit will then be modeled as a function from input connector pins to output connector pins, parameterized by some directionality configuration. There will only be 3 kinds of nets: - inputs (provided) - outputs (provided) - internal nodes (rest) I've been here before, but the structure is somehow different. Make a generic evaluator? EDIT: Once representation is there, algorithm seems straightforward. I do wonder what the core problem is. Directionalize a graph? 
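A rough sketch of the propagation just described (Haskell, with made-up types -- not the actual implementation): drive a pin, copy the value onto its whole net, then fire every component whose inputs have just become fully available, recursing on the outputs it drives. It assumes an acyclic (combinational) circuit.

import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
import Data.Set (Set)

type Pin = (String, String)             -- (component name, pin name)
type Net = Set Pin                      -- one element of the netlist partition

data Comp = Comp
  { inputs :: [Pin]                     -- pins that must be driven before firing
  , fire   :: [Bool] -> [(Pin, Bool)]   -- input values -> driven output pins
  }

-- The net a pin belongs to (falling back to a singleton net).
netOf :: [Net] -> Pin -> Net
netOf nets p = head ([n | n <- nets, Set.member p n] ++ [Set.singleton p])

-- Drive one pin: propagate the value over its net, then fire any
-- component that has just become fully driven.  Assumes no loops.
drive :: [Net] -> [Comp] -> Map Pin Bool -> (Pin, Bool) -> Map Pin Bool
drive nets comps vals (p, v) = foldl (drive nets comps) vals' outs where
  vals' = foldr (\q m -> Map.insert q v m) vals (Set.toList (netOf nets p))
  newlyReady c = all (`Map.member` vals') (inputs c)
              && not (all (`Map.member` vals) (inputs c))
  outs  = concat [ fire c (map (vals' Map.!) (inputs c))
                 | c <- comps, newlyReady c ]

-- Evaluating a subcircuit: fold 'drive' over the externally provided
-- input pins; the resulting map contains every net value reached.

The "input wait list" mentioned in the EDIT below is essentially an index that avoids rescanning all components for the newlyReady check.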
EDIT: Took a long refactoring stretch to finally express it in a simple way. It all boils down to data structures, again. EDIT: Indeed, been there before. The key data structure is the input wait list, updated in response to pins being asserted, in turn propagating changes. The rest is to make it easy to translate net values to input port values. Entry: Representing I/O Date: Wed Jun 27 13:59:29 EDT 2018 Abstract it as a tristate triplet? Entry: MyHDL Date: Wed Jun 27 17:10:08 EDT 2018 So instead of going through building a network evaluator, what about generating MyHDL code and have it do the lifting? Maybe not such a problem since the evaluator isn't very difficult. Entry: Haskell graphs Date: Wed Jun 27 17:39:25 EDT 2018 https://hackage.haskell.org/package/algebraic-graphs Entry: A theorem in a context Date: Thu Jun 28 10:56:17 EDT 2018 Something mentioned in Robert Harper's Homotopy Type Theory (HoTT) lectures is the idea that theorems can be local. I do not understand why this would not be possible in classical logic, but it is clear that "locality" is something that is desirable in programming. Local theorems show up in contexts all the time in programming, as functions defined in some lexical context. Entry: Circuit netlists: structural and semantic operations Date: Sat Jun 30 12:05:43 EDT 2018 See also haskell.txt The main ideas: - A netlist is a partition (a set of disjoint sets). This realization provides some guidance about how to represent netlist operations as set and partition operations. See Partition.hs - A short is a relation and I/O behavior is a function. These are fundamentally different concepts. - Shorts are netlists. Implementing shorts is computing the union of two partitions, based on a "coalesce" operation. See Partition.hs - The set of named nets is the quotient set of the netlist partition. The key insight here is that a net name is just another element of one of the disjoint sets, where "component" is the PCB, and "pin" is the net name. - Evaluating functions on a netlist is best done wrt. the quotient set. - If there is a strict order on pins, this can be used to pick a representative, e.g. as the minimum of a partition element. - Evaluating a net can be done by introducing semantics to each component, represented by the set of inputs, and the function that computes outputs from inputs. Evaluation is then depth-first recursion for all components that have all inputs available. - A practical consequence is that it is possible to evaluate only a subset of a netlist, making it clear what is the fanout. Entry: Idris Date: Sat Jun 30 22:05:26 EDT 2018 It might be time for dependent types. After reading a bit (Agda, Idris or Coq?), I'm inclined to go for Idris for its focus on being a programming language over a proof assistant. http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html Entry: Monoids Date: Sun Jul 1 23:45:34 EDT 2018 It clicks.. https://bartoszmilewski.com/2014/12/05/categories-great-and-small/ Monoid elements are morphisms of a 1-object category, with monoid operator mapping to composition. "... we can always recover a set monoid from a category monoid. For all intents and purposes they are one and the same." "A lot of interesting phenomena in category theory have their root in the fact that elements of a hom-set can be seen both as morphisms, which follow the rules of composition, and as points in a set." 
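A concrete way to see the morphism/point duality in plain Haskell: every monoid element can be read as a morphism of the one-object category (an endofunction on the carrier), composition of those morphisms is mappend, and the element is recovered by applying the morphism to mempty.

import Data.Monoid (Sum(..))

-- Element as morphism: x becomes the arrow (x <>).
toMorphism :: Monoid m => m -> (m -> m)
toMorphism = mappend

-- Morphism back to element ("point"): apply it to the unit.
fromMorphism :: Monoid m => (m -> m) -> m
fromMorphism f = f mempty

-- By associativity and the unit law:
--   toMorphism a . toMorphism b == toMorphism (a <> b)
--   fromMorphism (toMorphism a) == a
--
-- e.g. fromMorphism (toMorphism (Sum 2) . toMorphism (Sum 3)) == Sum (5 :: Int)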
Entry: Incremental Relational Lenses
Date: Thu Jul 12 10:07:06 EDT 2018

https://arxiv.org/abs/1807.01948
Extends the original work on Relational Lenses by allowing small changes to cause possibly small changes.

Entry: optimizer generators
Date: Sun Aug 5 22:25:33 EDT 2018

1. run a "compiler compiler" for a long time, possibly incrementally improving the "compiler".
2. snapshot the process in 1. to "release" the current compiler.

Entry: Effects
Date: Thu Aug 23 13:17:17 EDT 2018

https://arxiv.org/abs/1710.10385
"We show that, in any language with exceptions and state, you can implement any monadic effect in direct style."

Entry: Glue code isn't "dirty"
Date: Tue Sep 4 11:14:11 EDT 2018

It's actually quite an involved process of defining semantics through protocol translation. What is "dirty" is the hoop-jumping that is often involved to find a place where both ends can meet. ( Cognitive reframing :)

Entry: transforming blocking code into state machines: Rust Async transform
Date: Sun Dec 9 16:07:22 EST 2018

https://news.ycombinator.com/item?id=18641796
https://blag.nemo157.com/2018/12/09/inside-rusts-async-transform.html

Entry: difference structures
Date: Sat Feb 9 11:46:21 EST 2019

There are a lot of places where difference structures are useful:
- incremental build systems
- user interfaces
However, it seems almost impossible to create these without proper language support. Especially for build systems, which are usually quite ad-hoc, it is almost impossible to fit them with a good incremental system.

EDIT: This is very similar to multi-pass structures, where a first pass can record some index information that makes subsequent passes easier to perform. The index information doesn't contain anything new, i.e. it is a cache.

Entry: Processor vs. generated state machines
Date: Wed Feb 20 07:31:11 EST 2019

The reason I built a processor is to be able to write nested sequential programs as opposed to flat state machines. But I already have a structure that does just that (sm.h), so maybe figure out a way to turn a monad into a way to nest state machines?

Everything for which you can imagine an interpreter (e.g. the CPU in this case) can be turned into a compiler. What I want is a Forth language that is then statically analyzed and compiled into a state machine. It doesn't have to be Forth, just something that can capture the finite nesting structure of a call sequence and reify it into state machines. Maybe do it in two steps: in C first, then try to map it onto Seq. So the basic idea is to generate sm.h code.

EDIT: The main bit is to split a function into suspend points. Let's try to solve that first. This is mostly CPS transformation. Partially, because any calls that do not block can be kept as ordinary calls. That part is easy. The hard part is to represent the continuation as nested C structs. So an explicit representation of a stack would probably be a good idea. E.g. ANF.

EDIT: Another important part is to be able to trade off between doing things in multiple steps, and all at once.

Summary: why use a CPU vs. a state machine?
- sequential programming
- function decomposition

Entry: glue code
Date: Thu Feb 28 14:10:27 EST 2019

If glue is just mapping between things, this usually means there are not a whole lot of cases to be handled. Once the data types are mapped, that's that. Code like that doesn't need to change a whole lot, so it can be dynamically typed.
Reasons for dynamic typing: - fast compilation and late binding allows for instantaneous deployment - if there are protocols, there is going to be a dynamic matching step anyway To make dynamic types work, the tooling to reduce feedback is absolutely essential. Once it gets complex, or when computation and algorithms are necessary, it's probably time to switch to static types to structure things in a way the compiler can understand and verify. Entry: representing algebraic data types as folds Date: Wed Mar 13 12:12:33 EDT 2019 It is the formal version of what I've been trying to do at a low level as well, to avoid the need for actually constructing data. https://en.wikipedia.org/wiki/Mogensen%E2%80%93Scott_encoding https://stackoverflow.com/questions/16426463/what-constitutes-a-fold-for-types-other-than-list Inspired by some posts by Brian McKenna @puffnfresh "I don't wish for pattern matching when writing Java, Kotlin, TypeScript, etc. I just represent data structures using their folds. Scott-encoding. Super useful!" So the idea is straightforward. However for "protocol oriented programming", these are not functions that return results, but functions that perform a side effect. Maybe construction can be thought of that anyway: perform a side effect in a store, and return a pointer to the constructed data. Entry: Fold with cons into queue Date: Mon Mar 25 10:32:19 EDT 2019 EDIT: This takes a detour, leading up to the core idea There is no way I can think about this that makes the approach actually useful, except what I already have, which is to conceptually thranslate the nested form into a path list, and then applying a fold. No matter how I turn this: the easiest approach is going to be to present the data in the way that the client will actually want it. Which probably means nested C structs, where pointers can be easily translated. Practically: - receiver has a flat queue and can receive flat messages. - flat messages have a default encoding for nested structures To implement this without copying: each "flat" message should be transformed into a nested message at the point it enters the queue. If the granularity of the constructors is small, the queue implementation will be efficient. The client only ever sees nested data structures. So what is the difficulty here? Representing pointers. Both sender and receiver need to know where they are located. So it seems best to avoid representing pointers directly, but represent a structure that can be reCONStructured. So implement a fold anyway, but the fold will actually perform allocation in a queue! That's it. The key insight is to "bring your own constructors". So indeed, folds. When combined with a queue, the alloc/free problem is solved: alloc can do one constructor at a time, filling data structures with raw machine pointers, and dealloc skips the entire message. Then, to simplify, use only a handful of types. Or generate the ser/deser code? Summary: fold with cons into queue. Remark: What about overflow? This needs to be handled using backpressure. I.e. assume that it is ok to just return an error code if a message doesn't fit, so it can be handled at the sender side (which will be a more complex system : this is for leaf nodes ). Entry: What are services? Date: Mon Mar 25 12:20:44 EDT 2019 Many things, but practically they are objects that - grant restricted mutually exclusive access to state, and - perform actions parameterized by this state Entry: Is all state just cache? 
Entry: What are services?
Date: Mon Mar 25 12:20:44 EDT 2019

Many things, but practically they are objects that
- grant restricted, mutually exclusive access to state, and
- perform actions parameterized by this state

Entry: Is all state just cache?
Date: Mon Mar 25 14:38:04 EDT 2019

No, but when thinking about it, a lot of what we treat as unique state is
actually cache, i.e. a compiled form of some other state.  A good example is
a running OS, "compiled" from the state on disk.

Entry: Metamorphic testing
Date: Tue Mar 26 17:31:12 EDT 2019

Generate tests by picking a random input, and for perturbations of the
input, test how the different outputs relate based on the known
perturbations (the Metamorphic Relation).

https://www.hillelwayne.com/post/metamorphic-testing/
https://lobste.rs/s/lp14cm/metamorphic_testing

I think a mathematician would state the underlying observation as:

(A) If we know the output of a function f is invariant under the action of a
    group G, we can generate tests for f by selecting a single element x and
    comparing the values of f on the orbit elements g.x for g in G.

(B) We can further simplify our job by applying some hashing function to
    each f(g.x) to avoid the construction and caching of expensive elements
    (in the example in the article, the hashing function is the audio
    transcription).

It's a very nice observation.  I'm not shocked that the practice isn't more
well known, because coming up with properties of functions or systems that
are invariant under some large, easily generated group is hard.

Entry: Statecharts
Date: Wed Mar 27 09:32:38 EDT 2019

Harel Statecharts (HSC) solve the problem of representing "common"
transitions through composition.

- A hierarchy of states is defined (parent->child).
- The parent states are not actual states; they are abstract and point
  (recursively) to a default actual child.
- Child states inherit all parent transitions.
- A transition can point to any state.

See also Decision Tables below.

https://www.hillelwayne.com/post/formally-specifying-uis/
http://gameprogrammingpatterns.com/state.html
https://statecharts.github.io/
http://www.inf.ed.ac.uk/teaching/courses/seoc/2005_2006/resources/statecharts.pdf

Entry: State machines, groups and geometry
Date: Wed Mar 27 10:19:36 EDT 2019

Geometry is intuitive: we have a way to build a mental model of something
that has a configuration space.  Is there a way to represent a state machine
as a group action?  It doesn't seem so, because not all elements compose: in
a given configuration only a limited set of transitions is available.

Entry: Decision Tables
Date: Wed Mar 27 10:36:21 EDT 2019

I'm not quite sure what the big deal is here, apart from being similar to
flattening down nested pattern matches.

https://www.hillelwayne.com/post/decision-tables/

Entry: Lock free programming
Date: Mon Apr 29 16:28:09 EDT 2019

https://preshing.com/20120612/an-introduction-to-lock-free-programming/

Entry: Path indexing vs. nested dictionaries
Date: Fri May 10 07:53:52 EDT 2019

Do you nest dictionaries, or compose keys?  This is really just currying:

  a -> b -> c   vs   (a, b) -> c
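A quick Haskell sketch with Data.Map to make the analogy concrete (Flat,
Nested, nest and flatten are made-up names):

  import qualified Data.Map as M

  -- Composed (path) keys:   (a, b) -> c
  type Flat a b c   = M.Map (a, b) c
  -- Nested dictionaries:    a -> (b -> c)
  type Nested a b c = M.Map a (M.Map b c)

  -- Converting between the two is the dictionary version of curry/uncurry.
  nest :: (Ord a, Ord b) => Flat a b c -> Nested a b c
  nest = M.foldrWithKey
           (\(a, b) c -> M.insertWith M.union a (M.singleton b c))
           M.empty

  flatten :: (Ord a, Ord b) => Nested a b c -> Flat a b c
  flatten = M.foldrWithKey
              (\a inner acc -> M.union acc (M.mapKeys (\b -> (a, b)) inner))
              M.empty

The practical difference shows up in the operations: the nested form makes
"everything under prefix a" a single lookup, while the flat form keeps the
key space one-dimensional.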
Entry: Folds for mutually recursive types
Date: Wed May 15 09:58:09 EDT 2019

Creating a fold for a type is straightforward: replace all the constructors
with functions mapping to the result type.  But what if the type is mutually
recursive with another (recursive) type?  Something that happens often is
trees with a list of nodes.

  data Tree t = Leaf t | Node [Tree t]

What is the canonical way to write this?  I would think that inlining would
work, but that only works with finite types.  E.g.

  data Tree t = Leaf t | NodeNil | NodeCons (Tree t) (Tree t)

could represent the same information, but it doesn't have the same
structure.

It seems this is representative of a long-lasting confusion of mine: many
data structures take this mutually recursive form, and that means there is
always a mutual recursion between a function processing a single element,
and a function processing a list of elements.

So how do you write a fold in the mutually recursive form?  It's quite
straightforward.  The trick is to see that there are 2 "accumulator types".
Below, a is the return type accumulator, and a' is an intermediate
accumulator used in the list foldr.

  data Tree t = Leaf t | Node [Tree t]

  foldTree leaf node cons nil = tree where
    tree (Leaf t)  = leaf t
    tree (Node ts) = node $ foldr cons nil ts

  foldTree :: (t -> a)              -- leaf
           -> (a' -> a)             -- node
           -> (Tree t -> a' -> a')  -- cons
           -> a'                    -- nil
           -> (Tree t -> a)

Now, to reflect the composition of the data types, the fold could also be
parameterized.  E.g. instead of passing in 'cons' and 'nil', the list foldr
can be abstracted away:

  foldTree foldr' leaf node = tree where
    tree (Leaf t)  = leaf t
    tree (Node ts) = node $ foldr' ts

  foldTree :: ([Tree t] -> a')  -- foldr'
           -> (t -> a)          -- leaf
           -> (a' -> a)         -- node
           -> (Tree t -> a)

This is the flip side of thinking of the data type as:

  data Tree a t = Leaf t | Node (a t)

where the container type 'a' has its own associated fold.

Summary: a parameterized container type is associated to a parameterized
fold.  Types and folds are essentially the same thing.

( Note that to make a full fold, this needs to be done mutually recursively,
with a foldr that recurses into foldTree again. )

Entry: Futamura projection
Date: Fri Jun 14 14:27:54 CEST 2019

So is it safe to say that the Futamura projection in a pure functional
setting is rather trivial?  Keep performing reductions until some normal
form is reached.

https://www.cs.purdue.edu/homes/rompf/papers/wei-preprint201811.pdf

The introduction says that partial evaluation is in general hard because of
binding time analysis, and that in practice manual annotations are used
(multi-stage programming).

Entry: Hierarchical state machines
Date: Thu Aug 15 18:57:00 EDT 2019

Machines whose states can be other state machines.

https://link.springer.com/content/pdf/10.1007/3-540-44929-9_24.pdf

Entry: Reinventing lenses
Date: Sun Nov 17 11:56:47 EST 2019

Let's try to do this from first principles, covering the very practical case
that I have in mind: mapping a database to a text representation of that
database.

There are two data representations A and B.  These are isomorphic, in that
there are information-preserving maps A->B and B->A.  In addition, there are
information-preserving maps that relate edits: e.g. dA -> dB and dB -> dA.

Note that the maps between the edits do not need to be isomorphisms, but
their effect does: e.g. if applying dA takes A to A' and applying the
corresponding dB takes B to B', then A' and B' are again related by the maps
above.

The reason to look at this differentially is that the data itself might be
large, and applying an edit might be much simpler.  E.g. it is then
sufficient to download a local copy just once, and from that point on just
operate on edits.

I am interested mostly in these lenses:
- operations on views that map to operations on tables (editable views)
- operations on s-expressions in an editor, that map to operations on db
  tables/views.
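Here is a rough Haskell sketch of that structure, just to pin down the types
(DLens and its field names are made up; this is not an existing lens
library's API):

  -- Full maps plus edit translation, for the (assumed) isomorphic case.
  data DLens a da b db = DLens
    { get     :: a -> b    -- A -> B
    , put     :: b -> a    -- B -> A
    , getEdit :: da -> db  -- translate an edit on A into an edit on B
    , putEdit :: db -> da  -- and back
    }

  -- The coherence requirement from above: translating an edit and applying
  -- it on the B side agrees with applying it on the A side and re-mapping.
  coherent :: Eq b
           => DLens a da b db
           -> (da -> a -> a)   -- apply an A edit
           -> (db -> b -> b)   -- apply a B edit
           -> da -> a -> Bool
  coherent l applyA applyB e a =
    applyB (getEdit l e) (get l a) == get l (applyA e a)

The database/text case would instantiate a to the set of tables, b to their
text rendering, and the edit types to row updates and text patches
respectively.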
Entry: OpenComRTOS
Date: Sun Dec 22 14:53:34 CET 2019

Reading this one: isbn://1441997350
https://www.amazon.com/Formal-Development-Network-Centric-RTOS-Engineering/dp/1441997350

In my own work I've been more inclined towards dumb schedulers and
non-preemptive state machines.  However, this might give some new insights,
as it makes the case that pre-emption is really necessary for some more
complex applications.

Entry: Erlang, CSP
Date: Mon Dec 23 01:32:11 CET 2019

https://www.youtube.com/watch?v=3gXWA6WEvOM

- CSP is synchronous.
- Interact with channels, not processes.

So someone explain to me: why does the CSP I see described only talk about
composition of events, and not about sending and receiving?  Are these two
levels of the theory?

https://en.wikipedia.org/wiki/Communicating_sequential_processes

Ok, the Wikipedia page explains: earlier versions looked more like a
programming language with send/receive.  Later versions were formulated as a
process algebra.  I still don't quite see how they relate, other than that
input and output are events.

See chapter 4 in the CSP book.  In short: a send is a very specific event,
and a receive is a more general specification of a set of events.  I.e.
saying that a (generic) message on a (specific) channel is part of the
alphabet of a process means that it can receive messages on that channel.
Communication is a synchronization: the send event and the receive event are
the same event.  The avoidance of causality is one of the key
simplifications of the algebra.

EDIT: Continuing to read the CSP book.  After introducing some of the
notation and laws, the conclusion is made in 1.4 that a process can be
represented as a function that accepts zero, one or more (choice) inputs and
produces another process.

EDIT: I don't think I quite understood the point of the algebra and how it
relates to the language aspect on my last read.  It makes more sense now.

Entry: Revisit: Twitter thread on "derivatives"
Date: Mon Dec 23 17:42:53 CET 2019

See 20170908.

https://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf

Maybe time to look at Phil's "purescript incremental" library?

https://github.com/paf31/purescript-incremental-functions/issues/8
https://github.com/paf31/purescript-incremental-functions/blob/master/src/Data/Incremental.purs

Entry: Eventually consistent
Date: Wed Jan 1 19:53:43 CET 2020

So with Raft I did run into a consistency vs. availability problem.  Raft
aims for consistency.  What I need for a current application is partition
tolerance and availability.  So what is the essence of "merging" when
partitions join again?

Entry: Theory of events
Date: Thu Jan 2 20:32:15 CET 2020

I need a theory of events as state differences.  I.e. each event is a delta
in a certain direction.  It seems that delta-to-delta mapping is really what
occurs in practice: some change in one model leads to a change in a related
but different model.

Entry: Incremental lambda calculus
Date: Thu Jan 2 22:05:47 CET 2020

See previous entry.  The rules in the paper are:

- free:   D(x)     = dx
- abs:    D(\x.t)  = \x.\dx.D(t)
- app:    D(s t)   = D(s) t D(t)
- closed: D(c)     = []

Also ran into this:
https://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.html

Maybe it's good to do a little survey of what happened in this field?

EDIT: The comment in that post is that (-) is only used to define the zero
change.

Let's give it a try.  Define a change structure for a tree.  Changes are
expressed as sets of add, remove, update.  I already have a difference
operation.  Differences are not unique.  What is then missing to derive a
tree-processing Erlang function?
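As a warm-up, a change structure sketch in Haskell for a finite map instead
of a tree (Change, apply and dTotal are made-up names, and the derivative is
hand-written, not the output of the ILC transform):

  import qualified Data.Map as M

  data Change k v = Add k v | Remove k | Update k v

  apply :: Ord k => M.Map k v -> Change k v -> M.Map k v
  apply m (Add k v)    = M.insert k v m
  apply m (Remove k)   = M.delete k m
  apply m (Update k v) = M.insert k v m

  -- A function on the base structure...
  total :: M.Map k Int -> Int
  total = sum . M.elems

  -- ...and its derivative: it maps a change on the input to a change on the
  -- output (here just an Int delta), given the old input.
  dTotal :: Ord k => M.Map k Int -> Change k Int -> Int
  dTotal m (Add k v)    = v - M.findWithDefault 0 k m
  dTotal m (Remove k)   = negate (M.findWithDefault 0 k m)
  dTotal m (Update k v) = v - M.findWithDefault 0 k m

  -- Invariant: total (apply m c) == total m + dTotal m c

The tree version has the same shape, with Change enumerating add/remove/
update edits on paths, and the Erlang function would follow mechanically
from the per-constructor cases.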
Entry: Wingo on CML
Date: Sun Jan 5 19:25:47 CET 2020

https://www.youtube.com/watch?v=7IcI6sl5oBc&t=305s

CML is the way to go.  Can I just do channels?  No: CML.

What does he mean here with the difference between channels and CSP?  The
rendez-vous property is very important: async is very different.

Not a really good talker though.  I didn't gather a whole lot apart from:
- for rendezvous, the second wakes up the first.
- abstraction over channels?
- design with CSP, implement with CML

https://en.wikipedia.org/wiki/Concurrent_ML

Following up
https://wingolog.org/archives/2017/06/29/a-new-concurrent-ml

  "You'd think this is a fine detail, but meeting-place channels are
   strictly more expressive than buffered channels."

EDIT: The example of using a non-buffered channel for RPC is a good one: it
is not well-defined for buffered channels.

https://wingolog.org/archives/2016/09/21/is-go-an-acceptable-cml
https://wingolog.org/archives/2016/10/12/an-incomplete-history-of-language-facilities-for-concurrency

Some interesting remarks:

- Callbacks (manual inversion) have a lot of mental overhead.  This work
  should be done by the compiler.
- Promises (async/await) lift some of the burden, but this style "infects"
  the entire program.
- Kernel threads don't have good information to know what to schedule next.
  Applications might have more information.
- On when to buffer: not buffering enables "select".  (how?)
- Select is ok, but not compositional.  Typical example: post-process
  channel output.  This is what CML events enable: use events to create new
  events.

Entry: Sperber on CML
Date: Sun Jan 5 21:06:12 CET 2020

Mentioned in the Wingo talk.  CML compared to the actor model: actors don't
compose.  As a functional programmer you want composition.

https://www.youtube.com/watch?v=pf4VbP5q3P0

The idea of a rendez-vous as a value seems to be important.  Then operations
on rvs can be used to create new rvs, and 'select' is the thing that maps
rvs to values.  In Racket these are called 'events'.  Also on the Wikipedia
page.

https://en.wikipedia.org/wiki/Concurrent_ML

That's what Wingo means with a lambda instead of a value.

Entry: Having an Effect by Oleg Kiselyov
Date: Fri Jan 10 18:30:43 CET 2020

https://www.youtube.com/watch?v=GhERMBT7u4w&feature=youtu.be&t=1680
http://okmij.org/ftp/Computation/having-effect.html

About Bluespec: "something slightly better than Verilog".

Denotational semantics: a compositional mapping from expressions to some
domain, i.e. "eval".  Compositional means that there is no dependence on the
structure of subexpressions, only on their meaning.

Effects as interactions.  In process calculi monads are totally natural.
They were invented there but nobody paid attention to it.  Monads are just
as interesting and important as parentheses.  Monads arise naturally in the
interaction view, by factoring out effect propagation.

So the thing here is to define a single monad, and change the effect
handler.

Entry: Freer monads and extensible effects
Date: Fri Jan 10 20:40:29 CET 2020

http://okmij.org/ftp/Haskell/extensible/index.html
https://legacy.cs.indiana.edu/~sabry/papers/exteff.pdf
https://mail.haskell.org/pipermail/haskell-cafe/2018-September/129992.html
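To keep the idea at hand, a minimal single-effect freer monad sketch (no
open union of effects as in the real extensible-effects library; State,
get', put', runState and tick are illustration names):

  {-# LANGUAGE GADTs #-}

  -- The effect request is stored unreflected (f x) with its continuation.
  data Freer f a where
    Pure   :: a -> Freer f a
    Impure :: f x -> (x -> Freer f a) -> Freer f a

  instance Functor (Freer f) where
    fmap f (Pure a)      = Pure (f a)
    fmap f (Impure fx k) = Impure fx (fmap f . k)

  instance Applicative (Freer f) where
    pure = Pure
    Pure f      <*> m = fmap f m
    Impure fx k <*> m = Impure fx (\x -> k x <*> m)

  instance Monad (Freer f) where
    Pure a      >>= f = f a
    Impure fx k >>= f = Impure fx (\x -> k x >>= f)

  -- One effect signature: a single Int cell.
  data State x where
    Get :: State Int
    Put :: Int -> State ()

  get' :: Freer State Int
  get' = Impure Get Pure

  put' :: Int -> Freer State ()
  put' n = Impure (Put n) Pure

  -- The handler gives the effect its meaning, separately from the program.
  runState :: Freer State a -> Int -> (a, Int)
  runState (Pure a)            s = (a, s)
  runState (Impure Get k)      s = runState (k s) s
  runState (Impure (Put s') k) _ = runState (k ()) s'

  tick :: Freer State Int
  tick = do n <- get'; put' (n + 1); return n
  -- runState tick 41 == (41, 42)

Swapping the handler (e.g. logging every Put) changes the effect's semantics
without touching tick, which is the "define a single monad, change the
effect handler" point from the previous entry.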
Entry: cbuf.h
Date: Sat Jan 25 15:14:56 EST 2020

A review.  Basic question: is it safe to use this from a pre-empting ISR?  I
think it is, but can I prove it?

I'm not really sure how to make this formal, but I know that these
principles are important:

- read and write pointers are non-decreasing
- reads and writes to the pointers are atomic
- each pointer is written by only one thread
- the write pointer is written after the data is written
- the read pointer is written after the data is read

This gives the guarantee that when the read and write pointers are read,
there is no way to get an invalid range.  I.e. there can never be less room
actually available than "room" indicates for the writer, nor fewer bytes
actually available than "bytes" indicates for the reader.

The only things that can go "wrong" are that:

- a reader reads an old value of the write pointer and thus underestimates
  "bytes".
- a writer reads an old value of the read pointer and thus underestimates
  "room".

It is however not portable.  It relies on reads and writes being atomic,
which is the case on ARM CM3 for a 32-bit aligned 32-bit access.

( Still, I don't find any reference to an implementation like this.  Is
there an error somewhere? )

Entry: cache invalidation
Date: Wed Feb 19 10:31:12 EST 2020

http://zwizwa.be/-/compsci/20170711-115938

  definition control-dominates use

Can this principle be used to ensure caches are coherent?  YES!

Entry: Chunking coroutines
Date: Mon Apr 6 15:46:47 EDT 2020

( not sure where to put this )

Context: I'm writing a blocking-call API on top of libuv and coroutines.  It
is based around a read() call that takes an exact number of bytes from a
stream.  That is the API the caller inside a task can see.

Now the question is: where should the buffer + chunking code go?  At the
push end, or at the pull end?  It seems easiest to do this at the read end.

Entry: Staging a flat description
Date: Sun Jul 12 11:18:39 EDT 2020

Start with a flat table.  Make sure it makes sense that way.  Then gradually
introduce structure and "staging", encoding bits either as language elements
(compile time) or as data.

Entry: Futamura projections
Date: Mon Aug 24 11:58:33 EDT 2020

I would like to understand why this is not feasible in practice, and, if
possible, how to simplify language semantics to make it feasible.
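A toy Haskell illustration of why the first projection looks trivial in a
functional setting, and where the real difficulty hides (Expr, interp and
compile are made-up names):

  data Expr = Lit Int | Add Expr Expr | Var   -- Var is the single input

  interp :: Expr -> Int -> Int
  interp (Lit n)   _ = n
  interp (Add a b) x = interp a x + interp b x
  interp Var       x = x

  -- "Specializing the interpreter to a program" is just partial application:
  compile :: Expr -> (Int -> Int)
  compile prog = interp prog
  -- compile (Add Var (Lit 1)) behaves like (+ 1), but it still walks the
  -- Expr tree on every call.

The hard part is not producing this closure but residualizing it into
first-order code with the interpretive overhead reduced away, which is where
binding-time analysis and staging annotations come in.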