Computer Science.  This is a collection of blog articles about misc
computer science topics.  Practically though, this is 50% about
re-wiring the brain to learn functional programming (Haskell).

Entry: Parsing and Automata
Date: Mon Feb 16 11:21:54 CET 2009

Before storming into a dark room writing yet another ad-hoc recursive
descent parser, it might be a good idea to turn on the lights and
look at languages first.

AUTOMATA
--------

Automata are classified by the class of formal languages they can
recognize.

Types of Finite Automata[1]:

  DFA: deterministic finite: each state has exactly one transition
       for every symbol in the alphabet.

  NFA: nondeterministic finite: ... zero or more transitions ...
       (symbol with no transition: input is _rejected_)

It can be shown that DFAs and NFAs accept the same languages.  These
languages are called _regular_.

Extensions to Finite Automata:

  PDA: push-down: FA + stack.  NPDAs accept the _context free_
       languages.

  LBA: linear bounded: a Turing machine with a tape length
       proportional to the input string.  LBAs accept the _context
       sensitive_ languages.

  TM:  Turing machine: equivalent to algorithms.  Turing machines
       decide/accept the recursive languages and recognize the
       recursively enumerable languages.

FORMAL LANGUAGES
----------------

Let's stick to formal languages defined by a formal grammar[3].  A
grammar is a set of rules for transforming strings.  These rules are
called _productions_.  All the strings in the language are generated
by applying the grammar rules to a collection of _start symbols_.  If
there are multiple ways of generating the same string, the grammar is
said to be _ambiguous_.

Note that from a practical point of view (parsing) this is upside
down: parsing consists of two steps:

  - validate a string (is it part of the language?)

  - as a side effect: find the exact production rule(s) to build a
    parse tree representation of the input (convert concrete syntax
    to abstract syntax)

Ambiguity is problematic for the latter part: you really want a
single parse tree to which to attribute semantics later.

Note: The production rules approach is very different from
recognition-based PEG - parsing expression grammars, where the
language is the set of inputs recognized by the parser expression (a
formal representation of a recursive descent parser).

A context-free language can be recognized in O(n^3) time, but there
are a couple of subclasses for which linear algorithms exist:

  LR     Left-to-right, Rightmost derivation
  LALR   Lookahead LR
  LL     Left-to-right, Leftmost derivation
  LL(k)  LL with k symbols of lookahead, without backtracking

LL(k) can be implemented with recursive descent parsers.  Lisp is
LL(1).

A derivation[4] is a convenient way to express how a particular input
string can be produced: fix a replacement strategy (i.e. always
replace the leftmost or rightmost non-terminal first) and list the
rules applied using that strategy.  This is not unique for an
ambiguous grammar.

[1] http://en.wikipedia.org/wiki/Automata_theory
[2] http://en.wikipedia.org/wiki/Context-free_languages
[3] http://en.wikipedia.org/wiki/Formal_grammar
[4] http://en.wikipedia.org/wiki/Context-free_grammar#Derivations_and_syntax_trees

Entry: YACC
Date: Fri Feb 27 11:07:08 CET 2009

"Why Bison is Becoming Extinct"
http://www.acm.org/crossroads/xrds7-5/bison.html

Generic parser references:

  http://www.meta-environment.org/
  http://accent.compilertools.net/   (works with LEX)

The first one seems quite interesting.
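For contrast with the generated parsers above: the hand-written
recursive descent approach from the parsing entry, for a toy LL(1)
grammar.  A sketch in PLT Scheme; the grammar and token names are
made up for illustration.

  ;; Toy LL(1) grammar:   E -> NUM | '(' E '+' E ')'
  ;; One token of lookahead picks the production; no backtracking needed.
  ;; Tokens are a flat list like (lp 1 + 2 rp); checks on '+'/'rp' omitted.
  (define (parse tokens)                     ; -> (values ast remaining-tokens)
    (cond
      ((number? (car tokens))
       (values (car tokens) (cdr tokens)))
      ((eq? (car tokens) 'lp)
       (let*-values (((a rest1) (parse (cdr tokens)))
                     ((b rest2) (parse (cdr rest1))))  ; (cdr rest1) skips '+'
         (values (list '+ a b) (cdr rest2))))          ; (cdr rest2) skips 'rp'
      (else (error "unexpected token" (car tokens)))))

  (parse '(lp 1 + lp 2 + 3 rp rp))   ;; => (+ 1 (+ 2 3)) and ()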
Entry: coroutines and "join"
Date: Tue Mar 10 14:47:23 CET 2009

Start with a simple 2-coroutine network: a process coupled to a
controller:

   |   | -->  |   |
   | C |      | P |
   |   | <--  |   |

The controller's outputs are the process' inputs, and vice versa.
This can be executed using synchronized channels (a read is woken up
by the write on the other end).

1. This can be implemented using globally named variables and
   globally accessible synchronization events (signal/wait).  A
   disadvantage here is that naming such a variable "input" or
   "output" is ambiguous.

2. The ambiguity can be removed by introducing "positive" and
   "negative" coroutines.  Negative coroutines read from "output" and
   write to "input".  It is natural to call these negative coroutines
   "busses".

3. Coroutines with multiple inputs that are read at the same time
   need a synchronization mechanism.  This is usually called "join":
   a process that continues when all its inputs are available (aka a
   "barrier").

It is possible to avoid "join" by buffering all "output" registers
(bus task read channels) in one direction and adding explicit clocks
that only occur when all data is guaranteed ready.  This basically
clocks the "input" -> "output" state machines.

Another question: given a "signed" network of coroutines (every
coroutine is connected only to coroutines of opposite polarity), is
it sufficient to start the even ones in output and the odd ones in
input (or vice versa) to avoid deadlocks?

Entry: data - codata
Date: Sun Mar 22 22:13:57 CET 2009

http://blog.sigfpe.com/2007/07/data-and-codata.html
http://en.wikipedia.org/wiki/Corecursion

Well-behaved recursion for data and codata:

  - structural recursion: recurse on strict subparts only
  - guarded recursion: recurse only inside constructors

Entry: CTM
Date: Fri Apr 3 16:51:13 CEST 2009

Partial values <-> Complete values.

The single-assignment store is remarkable (2.2 p. 42).  Especially
the use of assignment to both construct data structures and take them
apart.  See p80 2.6: the binding operation performs [unification],
which is a symmetric operation.

Maybe I should try to implement it?  CTM 2.8.2.2 p. 101 has the
algorithm.  In the context of an evaluator, the nontrivial part is
the implementation of the data structure (i.e. functional with
sharing vs. imperative/hash).

[unification] http://en.wikipedia.org/wiki/Unification

Entry: Evaluation Strategies or Lambda Calculi?
Date: Wed Apr 8 15:17:26 CEST 2009

Been reading a bit on Dave's blog, the posts surrounding this[1].
The main idea is that it makes little sense to talk about a single
lambda calculus (LC) with different reduction strategies like
applicative order or normal order.  It's better to make a
special-purpose calculus to model call-by-value and call-by-name
languages (the CBV-LC and CBN-LC).  This generalizes to other
language extensions.

This one[2] was informative:

  In call-by-name lambda calculus, the definition of a reducible
  expression is an application whose operator is a value.  The
  operand can be any arbitrary expression.  As a result, non-values
  end up being passed as arguments to functions.  In call-by-value,
  we likewise only perform reduction when the operator is a value,
  but we also require that the operand be a value as well.
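In Scheme terms (a small illustration of my own, not from [2]):
Scheme itself is call-by-value, but call-by-name can be simulated by
passing the operand as a thunk.

  ;; Call-by-value: the operand is reduced to a value before the
  ;; application, so the error shows even though the function ignores
  ;; its argument.
  ((lambda (x) 'done) (error "boom"))              ;; => error: boom

  ;; Simulated call-by-name: pass the (non-value) operand as a thunk
  ;; and only force it where it is used -- here it never is.
  ((lambda (x) 'done) (lambda () (error "boom")))  ;; => done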
[1] http://calculist.blogspot.com/2009/02/history-lesson.html
[2] http://calculist.blogspot.com/2006/03/observable-differences-between.html

Entry: invertible data structure pack/unpack
Date: Mon May 4 12:51:53 CEST 2009

Instead of using a zipper, use polymorphic accessors that will
automatically perform the correct pack/unpack when modifying a data
structure.  I.e. an operation becomes:

  (unpack dosomething pack)

but instead it will be left at

  (unpack dosomething lazy-unpack)

such that the next 'pack will cancel the lazy-unpack.  Now explain
this a bit better..

Entry: Futamura Projections
Date: Mon May 4 21:03:55 CEST 2009

sigfpe is talking[1] about the Futamura projections.  I find it a bit
hard to follow, so let's try to explain it here again.  (or not?)

I'm actually more interested in determining why partial evaluation is
not a trivial problem.  If it is just about folding constants, it
really shouldn't be too hard.  The problem, as I understand it, is
recursion: you can't unfold general recursion as it will lead to
infinite code structures.  At some point, run-time recursion needs to
be introduced to break the loop.  This is mentioned in Wadler's
paper[2] on deforestation (blazed treeless form).  As far as I
understand, doing this in general can be quite involved.  Using
transformations of higher order functions instead of raw recursion
seems a better approach.

[1] http://blog.sigfpe.com/2009/05/three-projections-of-doctor-futamura.html
[2] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html
[3] http://www.itu.dk/people/sestoft/pebook/

Entry: The 2 x 2 of functional programming.
Date: Mon May 4 21:24:54 CEST 2009

Pick any two and see what they have in common.

     (bind)       (reference)
   abstraction  |  application    (code)
   -------------|-------------
   destruction  |  construction   (data)

There is also a diagonal dimension: creation and consumption of the
abstracted object itself.

Entry: Stacks
Date: Fri Jun 5 12:04:02 CEST 2009

I'm trying to figure out some abstract structure of 2-stack Forth
systems, combined with parsing (input and output stream).

The structure of Forth is quite remarkable.  It is implemented very
compactly by using different stacks and streams.  However, its
(sparse) use of direct mutation to break cycles during compilation
makes it difficult to analyse and cast into a stream / stack / tree /
directed-graph / graph structure.

The idea I'd like to develop serves to answer the question: "Why are
2 stacks different from 1 stack?"  From programming in Forth I can
answer this in vague terms as: "It's possible to save work to the 2nd
stack when you need to perform a subtask using the 1st one."  This
can probably be generalized to: "It's possible to save work to the
Nth stack when you need to perform a subtask using the other N-1
stacks."  This doesn't need to be ordered, so you could see all
stacks as equivalent and pick one to save work while the others are
used in a subproblem.

The question then is: can this be formalized a bit more?  Can Forth,
its (self-hosted) compilation algorithm and 2-stack computation and
composition model be expressed as some structure with morphisms, and
are there relations transforming N-stack machines to N-1 stack
machines?  Pardon my rambling..

A stack might be seen as a prototypical complexity reducer: the most
basic data structure to encode a composition mechanism in a way that
is simple to express in hardware.  A stack is the "mother of
locality".
And locality is an essential concept in (physical) computation: if
the data ain't in your hand, you can't do anything with it.  Stacks
connect "now" with "later" in a flexible way.

The mutation that's present in Forth actually only happens as part of
a mechanism to compute forward and backward loops.  It is what turns
a flat code stream into a directed (control flow) graph.  So, in
order to simplify Forth, this mechanism needs to be replaced by
something that can be _reduced_ to it, i.e. higher order functions
and their partial evaluation (jumps are PE of applications).

Entry: Cached Continuations and Synchronized VMs
Date: Mon Jun 15 13:19:05 CEST 2009

I've been thinking a bit about continuations and how they can be used
in situations where you have two parties that communicate without
shared state, with the following constraints:

  * There is a possibility of one side giving up without notice.
  * The link is expensive.

This is the typical interaction scenario of a web server talking to a
web client, but there is no need to make it asymmetrical.

Communicating parties are adequately modeled using continuation
passing style.  In general, if the two communicating parties just
exchange the whole continuation (placing the ball in the other
court), the first constraint isn't a problem: if one of them dies,
the conversation simply stops without any side-effect on the other
party.

However, because the link is expensive and because non-trivial
continuations tend to be quite large, one tends to keep the
continuation stored on one of the parties, call this the "server",
and exchange references over the channel.  The problem here is that
if the other party (the "client") dies, this creates garbage on the
server.

If you separate this problem in two it might be easier to manage:

  * Logically, continuations are always exchanged fully between the
    two parties.  There is never any local state.

  * A caching channel with knowledge of the protocol can use an
    aggressive expiry strategy to manage the communication.

An effective caching channel requires analysis of the continuations
by the caching mechanism: i.e. two subsequent continuations usually
share a lot of common data.  Exploiting this redundancy is the task
of the cache.  Also, the expiration strategy probably needs to be
based on common usage.  This needs to be tuned in the field.

So I wonder, can't this be solved by "synchronizing" two virtual
machines, say two CEK machines, one on the server and one on the
client?  This would be quite similar to the way two humans talk: we
don't really have a mechanism to transfer our continuations, but we
can try to model the other's state to get by with very little
information traveling across.

TODO:

  * Look at some work on continuations in web programming
    (i.e. Shriram Krishnamurthi[1] or Jay McCarthy[2]) to see if this
    idea is already there.

  * Look at prog@vub work on this[3][4].

[1] http://www.cs.brown.edu/~sk
[2] http://faculty.cs.byu.edu/~jay/home/
[3] http://prog.vub.ac.be/amop/research/dgc
[4] http://prog.vub.ac.be/amop/research/ambientrefs

Entry: Filesystems as Graphs
Date: Mon Jun 15 13:48:49 CEST 2009

I ran into an interesting pattern trying to solve .tex -> .dvi ->
.png conversion.  It is a way to manage temporary files used in
orchestrating the invocation of external programs.

Classical file systems, by the way they are _used_ in unix-like
utilities, are quite low-level data structures.  They cannot support
garbage collection because _references_ to files are not explicit.
A filesystem is a finite function (hash table).  Since it cannot be
guaranteed that this function won't be evaluated at some arbitrary
point, it has to be kept around in its entirety.  This kind of late
binding makes garbage collection impossible.

By replacing this data structure with a graph (a Scheme code/data
structure), files can be managed using the graph memory manager.
Wrap temporary files in graph nodes, and ensure a 1-1 correspondence
between these (meta)objects and the file's content, either in memory
or on disk.

Practically:

  * This is essentially independent of the data storage / caching
    strategy.  It is possible to perform operations on objects by
    temporarily serializing them to disk, running external programs
    that produce more files, and bringing those back into memory.
    The most elegant solution would be a filesystem interface towards
    external programs, but simply save+execute+load is good enough as
    a first attempt to implement the essential logic.

  * The effect of external programs can be localized.  Filesystem
    operations (unix utilities) still work in this view.  What is
    better though is that effects can be managed locally: create a
    temp directory with files, perform some external processing on
    it, and map the relevant results back into the graph store.

Entry: Goodbye Smalltalk?
Date: Mon Jun 15 14:07:43 CEST 2009

The previous post[1] advocates the explicitness of all references.
Instead of using this just for temporary file management, can't we
use it for _all_ file management?  Can we view all files as
temporary?

This is a shift in paradigm about how to think about data: instead of
looking at data as a dumb collection of bits, implicitly connected to
a program that uses it (an interpreter), you never disconnect it from
its use (semantics).  More generally, it sort of advocates the
abolishment of the principle of late binding.  Goodbye Smalltalk?

Actually, it seems to put late binding into a clear perspective: it
acknowledges that data use cannot be anticipated.  However, it seems
to be exactly that which makes Turing machines so hard to understand.
This is like making the leap from Turing machines to (static) boolean
circuits, as is done in complexity theory to make the subject more
manageable.

[1] entry://20090615-134849

Entry: GC and Cache
Date: Thu Jun 18 13:50:28 CEST 2009

They are related but not quite the same.  What if it were possible to
use non-GCd data as a cache?  I.e. define a collection of objects
that are essentially stored on disk, but which sometimes need to be
loaded into memory.  The object in memory is GC-able, but is it
possible to provide some special mechanism to see if it got GC'd?

I guess I'm looking for weak references.

[1] http://docs.plt-scheme.org/reference/weakbox.html

Entry: 64bit
Date: Thu Jul 9 16:52:40 CEST 2009

Oh yeah, I forgot.  This is a mess[1].  The different ways of being
incompatible:

  Data Type          LP32  ILP32  ILP64  LLP64  LP64
  ----------------------------------------------------
  char                  8      8      8      8     8
  short                16     16     16     16    16
  int32                              32
  int                  16     32     64     32    32
  long                 32     32     64     32    64
  long long (int64)                         64
  pointer              32     32     64     64    64

  In 1995, a number of major UNIX vendors agreed to standardize on
  the LP64 data model for a number of reasons...

So, basically you only need to care about ILP32 and LP64, where long
is the size of a pointer.

[1] http://www.unix.org/whitepapers/64bit.html

Entry: Partial Evaluation
Date: Wed Jul 15 09:37:11 CEST 2009

Program analysis (PA) is undecidable because of state dependent
control flow.
The culprit is the "IF" statement, or anything that boils down to
turning a piece of data into an execution branch target.

Partial evaluation (PE) is a form of PA.  Effective partial
evaluation is essentially about figuring out which "IF" statements
can be decided at specialization time and which have to be left to
run time.  Unfolding the wrong one can lead to infinite code size.
An interesting approach is to mix laziness with PE to keep infinite
code structures under control.

--

The problem with PE is time/space resource analysis.  It is not
always possible to assess how much time or space recursive/looping
code will take.  Given the right representation I think it becomes
more practical to get close to some reasonable definition of optimal
behaviour.  You need to `dodge the recursion' by concentrating on
combinators with easy-to-manage time and space properties.

Entry: Small-step vs. Big-step operational semantics
Date: Thu Jul 16 14:20:11 CEST 2009

Wikipedia[1] isn't very clear here:

  In computer science, small step semantics formally describe how the
  individual steps of a computation take place in a computer-based
  system.  By opposition big step semantics describe how the overall
  results of the executions are obtained.

From Pierce[2] 3.4 p.32 (reformulated): the small-step style of
operational semantics is sometimes called "structural operational
semantics" and specifies reduction by means of a transition function
for an abstract machine.  The meaning of a term is the halting state
of iterative application of the transition function.  The big-step
style evaluates a term to the end result in a single transition.

[1] http://en.wikipedia.org/wiki/Small_step_semantics
[2] isbn://0262162091

Entry: Constraint Programming
Date: Thu Jul 16 15:48:19 CEST 2009

Constraint programming is roughly based on replacing functions as a
primitive building block with equations (relations or
multi-directional functions).  To make things feasible, constraints
are supposed to be _locally enforced_, with some global issues
handled using backtracking.

It looks like in first approximation it's best to start with [2], as
it contains virtually the same as the introduction of [1].

[1] http://www.ai.mit.edu/publications/pubsDB/pubs.doit?search=AITR-595
[2] http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-22.html#%_sec_3.3.5

Entry: Managing External Resources
Date: Sun Aug 2 09:28:46 CEST 2009

I'm wondering if it is possible to replace external resource
management (open/close) with garbage collection.  Is this remote
resource management a `real' problem or is it merely an
implementation artifact?

It seems to me that the reason is that "reachability" isn't always
observable[4].  A good example is webserver continuations.  Will the
client come back, or did it lose interest?  We can simulate
reachability using timeouts.  I've talked about this before: it seems
to be essentially a caching[5] problem.

I wonder what the link is with linear languages (possibly extended
with lazy copy), where open/close resource management is quite
natural.

[1] http://okmij.org/ftp/Haskell/misc.html#fp-arrays-assembly
[2] http://okmij.org/ftp/Scheme/enumerators-callcc.html#Finalization
[3] http://okmij.org/ftp/papers/LL3-collections-enumerators.txt
[4] http://prog.vub.ac.be/amop/research/dgc
[5] entry://20090615-131905

Entry: Connected Ideas
Date: Sun Aug 2 09:54:52 CEST 2009

Explain how these are related:

  Linear / stack-based memory management.
  Deforestation.
  Task scheduling and dependency analysis.
  Enumerator inversion and finalization.
  Distributed GC and Remote continuation/context cache.
  Concurrency-oriented programming.
  Message passing concurrency.

Entry: Engine vs. Coroutines
Date: Mon Aug 3 11:12:34 CEST 2009

The difference between an engine[1] and a coroutine is that an engine
uses timed preemption, while a coroutine only uses voluntary
preemption.  ``An engine runs until its fuel runs out''[3].

It seems to me that this is somewhere between full nondeterministic
preemption and cooperative multitasking: the preempt points happen
only at control points with consistent state -- i.e. not in the
middle of some low-level routine that uses a shared resource.

[1] http://en.wikipedia.org/wiki/Engine_%28computer_science%29
[2] http://list.cs.brown.edu/pipermail/plt-scheme/2002-September/000620.html
[3] http://www.scheme.com/tspl2d/examples.html#g2433

Entry: Task-based C interface
Date: Mon Aug 3 14:36:10 CEST 2009

Here's the basic idea I'm trying out for writing reusable C
primitives for different kinds of scripting languages (operating
systems, in essence Scheme and PF, a linear concatenative language).

C code shouldn't CONS.  C code should only communicate with the
outside world using _channels_ which have a limited number of
primitive types, but do not allow for aggregates.  All aggregate data
types should be transferred using _protocols_: explicit sequencing of
primitive types to represent data structure.

Doing it this way makes it possible to write C code that doesn't
perform any memory management, except for allocation of local
variables.  This makes automatic wrapping very simple, and allows a
single abstract object: the task (zipper).  Moreover, the code itself
can be incorporated in a static scheduling policy (whole program
optimization: compile time weaving).  This technique is essentially
premature deforestation: eliminating intermediate data structures
using compile-time transformation.  The slogan is something like
this:

  Replace data structures with protocols.

From the outside, it doesn't matter that C code is stateful, as long
as _all_ the state is contained in the continuation.  The essential
insight is that all memory and control management can be abstracted.

Funny how I got here.  I've rediscovered concurrency-oriented
programming by looking for the simplest way to interface C code with
Scheme.

Entry: EWD concurrency
Date: Mon Aug 3 23:58:57 CEST 2009

I believe these are the original semaphore papers.

[1] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD51.html
[2] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD54.html
[3] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD57.html
[4] http://www.cs.utexas.edu/~EWD/transcriptions/EWD00xx/EWD74.html

Entry: Finalizers
Date: Sun Aug 16 16:37:50 CEST 2009

It's generally agreed that automatic, non-synchronous GC and
finalizers interact badly.  Where does the problem lie?  That depends
on how you look at it.

  - GC should be synchronous.
  - Represent resources as ``pooled-with-spare'' to behave more like
    memory.

Which one is best?  Ultimately I believe that representing all
resources as pooled-with-spare is not realistic: essentially we're
bounded by _physical_ resources, and if there's only one, you better
have synchronous GC.

But, if these resources _are_ modeled as memory, it might work: any
pooled resource that runs out of candidates to hand over will issue a
_global_ GC to determine if it is still reachable.  The global nature
of GC makes it rather hard to manage.
This makes me think that RC based management really isn't going to go
anywhere, unless you can propagate the ``out of'' event all the way
up from device drivers to the toplevel memory GC.

It is probably possible to remove some of the arbitrary RC managed
resources (i.e. Unix file handles) by ``peeling them open'' to reveal
the real resources (hardware signalling the device driver), and have
them propagate these signals all the way to the top level GC whenever
they occur.

Entry: Synchronization
Date: Mon Aug 17 17:44:46 CEST 2009

So what is a monitor[1] exactly?

[1] http://en.wikipedia.org/wiki/Monitor_(synchronization)

Entry: Concurrency
Date: Mon Aug 17 17:44:58 CEST 2009

Explain the difference between these control structures:

  - Pre-emptive unix tasks and threads
  - Cooperative tasks (yield)
  - CSP processes
  - symmetric coroutines
  - asymmetric coroutines
  - partial and full continuations
  - one-shot continuations
  - Icon's generators[1] and goal-directed evaluation[2]
  - Iterators[3]

[1] http://lambda-the-ultimate.org/classic/message1851.html
[2] http://www.cs.arizona.edu/icon/books.htm
[3] http://home.pipeline.com/~hbaker1/Iterator.html

Entry: Tagless Interpreters
Date: Sun Aug 23 12:16:37 CEST 2009

I am going to try to understand this[1].  It is about building a
``tagless staged definitional interpreter''.  It touches on some
ideas that I've seen vague hints of while writing the PF and SC
interpreters and trying to see the tradeoffs in
compiling/interpreting LC-based languages.

First, some terminology:

Initial algebra[2]: ``In mathematics, an initial algebra is an
initial object in the category of F-algebras for a given endofunctor
F.  The initiality provides a general framework for induction and
recursion.''  It seems to be used in relation to recursive types,
which are the yin of a yang: recursive functions operating on the
types.  I guess the coalgebraic[3][4] structure is that of recursive
functions?

HOAS: higher order abstract syntax.

COGEN: code generator.

1.1 TAGS

The paper starts by explaining the use of a universal type `u' (a
tagged union) to represent a dynamic type, to be able to write
something like:

  eval : u list -> exp -> u

where `u list' is a De Bruijn environment (variables are then De
Bruijn indices).  The disadvantage is that in this representation,
`eval' is a partial function: i.e. it needs to handle cases where it
is passed invalid input, i.e. non-closed terms or ill-typed ones.  In
practice however, when a term is closed and well-typed these cases do
not occur.  Essentially, the algebraic types fail to express in the
meta language that an object expression is closed and well-typed.

1.2 TAGLESS

Current approaches to solve this use complex data types like GADTs or
dependent types.  The paper presents an approach that doesn't require
this, by representing object programs using ordinary functions
instead of data structures.  This approach turns evaluation of open
object terms into _ill-typed_ terms in the meta language.  Neat!

REMARKS

There is a link between this kind of representation and Staapl's Coma
abstraction: representing target code as procedures operating on a
stack machine code stack.

[1] http://okmij.org/ftp/tagless-final/APLAS.pdf
[2] http://en.wikipedia.org/wiki/Initial_algebra#Use_in_programming_theory
[3] http://en.wikipedia.org/wiki/Coalgebra
[4] http://en.wikipedia.org/wiki/F-coalgebra
[5] http://okmij.org/ftp/Computation/tagless-typed.html
[6] http://lambda-the-ultimate.org/node/2438
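To get a feel for the ``ordinary functions instead of data
structures'' idea above, stripped of the typed machinery (which is
really the paper's point): an untyped final-style encoding in Scheme,
my own illustration, not the paper's.  A term is represented directly
as its evaluator, a function from a De Bruijn environment to a value.

  (define (lit n)   (lambda (env) n))
  (define vz        (lambda (env) (car env)))     ; De Bruijn index 0
  (define (vs t)    (lambda (env) (t (cdr env)))) ; shift an index up by one
  (define (lam b)   (lambda (env) (lambda (x) (b (cons x env)))))
  (define (app f a) (lambda (env) ((f env) (a env))))
  (define (add a b) (lambda (env) (+ (a env) (b env))))

  ;; ((lambda (x) (+ x 1)) 41)
  (define ex (app (lam (add vz (lit 1))) (lit 41)))
  (ex '())   ;; => 42

A second interpretation (say, a pretty printer) is just another set
of definitions for the same combinator names, which is where the
final approach starts to pay off.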
Entry: Eager rewriting vs. ``something more general''
Date: Mon Aug 24 19:59:35 CEST 2009

In the light of peephole optimizations and the Joy machine in [1]:
how can you keep a rewriting system manageable if you don't pin it
down manually by performing only eager substitutions?

I guess what I'm looking for is confluence[2].  And I have this hunch
that _interesting_ rewriting systems for optimizations are _not_
going to be confluent: they probably lead to a large number of
possible irreducible forms which need extra constraints to isolate a
single solution.

I read in Muchnick[3] (Chapter 6: Producing Code Generators
Automatically, 6.2.1 p.140) that the standard peephole rewriting
techniques are essentially SLR(1) parsers.  Does this have anything
to do with that?

[1] http://zwizwa.be/darcs/libprim/pf/joy.ml
[2] http://en.wikipedia.org/wiki/Confluence_%28abstract_rewriting%29
[3] isbn://1558603204

Entry: Monad transformers / Arrows
Date: Thu Aug 27 19:12:49 CEST 2009

I'm interested to find out the link between:

  - State threading using Monads and Arrows.
  - Monad / Arrow transformation (building new such constructs by
    composing others).

More specifically, I've run into the case in concatenative
programming where you want to thread things other than the stack, and
where you also want to combine different threading mechanisms
(i.e. a data stack and a compilation `writer' monad).  However, the
way (I understand) that monad transformers work is that you always
need to pick an _order_ of wrapping: it isn't a side-by-side thing
like i.e. linear operators and vector subspaces.  My hunch is that
the latter is about Arrows.

Entry: Parser Combinators
Date: Fri Aug 28 14:49:21 CEST 2009

[1] http://en.wikipedia.org/wiki/Parser_combinator
[2] http://shaurz.wordpress.com/2008/03/11/haskell-style-parser-combinators-in-scheme/

Entry: Partial Continuations
Date: Fri Aug 28 14:49:55 CEST 2009

From the bit-twiddler's perspective, a partial continuation is not
much more than a segment of the call stack represented as a function,
where the full continuation is the whole call stack, represented as a
``function that doesn't return''.  The nuances come from how you
specify the marking that enables you to isolate the segment between
the current point and the one marked, and how you execute code in the
continuation that remains when this segment is removed.

From [3]: ``The operators shift, control, control0, shift0 are the
members of a single parameterized family, and the standard CPS is
sufficient to express their denotational semantics.''

According to Oleg in [3], ``Shift to Control''[4] is the one to read.
This paper is about CPS transforms, and not directly of interest to
me atm.  I'm already quite happy with an understanding of how to
implement it dynamically.

Anyways, the gang of 4:

  +F+  shift
  +F-  control
  -F+  shift0
  -F-  control0

Here eFk denotes whether the shifted expression will be run inside a
delimited context (e=+) or not (e=-), and whether the continuation
will be delimited (k=+) or not (k=-).

[1] http://okmij.org/ftp/Computation/Continuations.html#generic-control
[2] http://lambda-the-ultimate.org/node/966
[3] http://lambda-the-ultimate.org/node/606
[4] http://www.cs.rutgers.edu/~ccshan/recur/recur.pdf
[5] http://docs.plt-scheme.org/reference/cont.html

Entry: Monads and shift/reset
Date: Fri Aug 28 17:23:03 CEST 2009

Two directions: Representing Monads[1] by Filinski, and a paper by
Wadler[2] about CPS / Monads / Delimited control / ...
From a very high level of intuitive understanding: partial
continuations are abstractions around segments of execution context
and jumps -- Monads are abstractions around threaded state and
operation sequencing.  Put in those words, a link seems plausible.

EDIT: see a Scheme example in [4].

[1] http://eprints.kfupm.edu.sa/62283/1/62283.pdf
[2] http://www.brics.dk/~hosc/local/LaSC-7-1-pp39-56.pdf
[3] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz
[4] entry://20090906-105507

Entry: Linear Lisp and Uniqueness Types
Date: Sun Aug 30 10:59:41 CEST 2009

Look at Clean[1] and Uniqueness Types[2] and try to see the relation
with libprim's linear PF machine.

[1] http://en.wikipedia.org/wiki/Clean_Language
[2] http://en.wikipedia.org/wiki/Uniqueness_type

Entry: Linear Type Systems
Date: Sun Aug 30 11:24:57 CEST 2009

Main idea: variables disappear from scope once referenced.

From the logic point of view: linear logic is concerned with the
_transformation_ of resources as opposed to an ever growing
accumulation of facts (classical logic).  In a nutshell, a logic is
built of a set of axioms and a set of combination rules.  In
classical logic, you can start from a set of propositions and
construct larger sets of propositions by chaining combination rules
(inference rules).  In linear logic, to add a new proposition to a
``context'' (multiset), you need to _consume_ one you already have,
preventing you from ever using it again.

It is mentioned in the wikipedia article[2] that linear logic can be
seen as ``the interpretation of classical logic by replacing boolean
algebras by C*-algebras''.  This vaguely rings a bell, conforming to
my intuition of computation as an endomap of a constant-size space
instead of an ``explosion'' of an initial seed.  Conservation of
resources in linear logic works a bit like conservation of energy in
a classical hamiltonian system[4].

Up to now, I've mostly been talking about ``linear memory
management'' as ``conservation of memory'', mostly in the form of
binary tree nodes (CONS cells).

[1] http://en.wikipedia.org/wiki/Linear_type_system
[2] http://en.wikipedia.org/wiki/Linear_logic
[3] http://en.wikipedia.org/wiki/C*-algebras
[4] http://en.wikipedia.org/wiki/Hamiltonian_mechanics

Entry: Proof Calculus
Date: Sun Aug 30 15:44:55 CEST 2009

A proof calculus[2] corresponds to a family of formal systems that
use a common style of formal inference for its inference rules.  A
rule of inference[3] (also called a transformation rule) is a
syntactic rule used in a formal system to produce valid statements
within that system.

Examples of proof systems:

  * Hilbert-style deduction system[4], where a formal deduction is a
    finite sequence of formulas in which each formula is either an
    axiom or is obtained from previous formulas by a rule of
    inference.

  * Natural deduction[5] is an approach to proof theory that attempts
    to provide a deductive system which is a formal model of logical
    reasoning as it "naturally" occurs.  This approach is in contrast
    to axiomatic systems, which use axioms.

The sequent calculus[1] is a widely known proof calculus for
first-order logic.  The term "sequent calculus" applies both to a
family of formal systems sharing a certain style of formal inference,
and to its individual members, of which the first, and best known, is
known under the name LK, distinguishing it from other systems in the
family.  The sequent calculus LK was introduced by Gerhard Gentzen as
a tool for studying natural deduction.
In the sequent calculus all inference rules have a purely bottom-up
reading.  In natural deduction the flow of information is
bi-directional: elimination rules flow information downwards by
deconstruction, and introduction rules flow information upwards by
assembly.  Thus, a natural deduction proof does not have a purely
bottom-up or top-down reading, making it unsuitable for automation in
proof search, or even for proof checking (or type-checking in type
theory). [5]

In Kleene's Metamathematics, Chapter V, p.86[6]:

  ``The labour required to establish the formal provability of
  formulas can be greatly lessened by using metamathematical theorems
  concerning the existence of formal proofs.''

-- remarks

The way I understand this is that natural deduction has no axioms (no
primitive assumptions) but contains all its structure in the
inference rules: the way propositions can be combined to create new
propositions.  In contrast, a Hilbert-style deduction system has both
axioms and inference rules (i.e. Modus Ponens).

Reading more in [6].  It seems that metamathematics is quite similar
to lisp macros: the meta math serves to abstract over the
_construction_ of formal proofs = meaningless pure syntax.  Because
formal logic's composition mechanisms at the foundations of
mathematics are so barbaric, a macro system makes sense to lift the
tedium.

What I find surprising however is that this meta system itself is
_not_ formal: proofs in the meta system are not formal proofs, but
are intuitive justifications of correctness.  I'd say this is because
of the chicken-and-egg problem: you can't use a formal system to do
this, because you don't _have_ one yet (remember, you're using this
meta-math to _build_ a formal system).  This sounds like compiler
bootstrapping to me.

[1] http://en.wikipedia.org/wiki/Sequent_calculus
[2] http://en.wikipedia.org/wiki/Proof_calculus
[3] http://en.wikipedia.org/wiki/Inference_rules
[4] http://en.wikipedia.org/wiki/Hilbert_system
[5] http://en.wikipedia.org/wiki/Natural_deduction
[6] isbn://0720421039

Entry: Dependent Types
Date: Sun Aug 30 19:13:49 CEST 2009

In [1] I read:

  ``Dependent type theory in full generality is very powerful: it is
  able to express almost any conceivable property of programs
  directly in the types of the program.  This generality comes at a
  steep price -- checking that a given program is of a given type is
  undecidable.  For this reason, dependent type theories in practice
  do not allow quantification over arbitrary programs, but rather
  restrict to programs of a given decidable index domain, for example
  integers, strings, or linear programs.''

In [2] I read:

  ``If the user can supply a constructive proof that a type is
  inhabited (i.e., that a value of that type exists) then a compiler
  can then check the proof and convert it into executable computer
  code that computes the value by carrying out the construction.  The
  proof checking feature makes dependently typed languages closely
  related to proof assistants.  The code-generation aspect provides a
  powerful approach to formal program verification and proof-carrying
  code, since the code is derived directly from a mechanically
  verified mathematical proof.''

[1] http://en.wikipedia.org/wiki/Natural_deduction
[2] http://en.wikipedia.org/wiki/Dependent_types

Entry: Dynamic vs. Static Types
Date: Sun Aug 30 19:18:57 CEST 2009

I've been reading up a bit about logic and type systems.  This is
seriously complex stuff, in the sense that there are a great number
of different ways to approach static structure.
If you compare this to the simplicity of dynamic typing (predicates /
set membership without static structure), I sometimes wonder whether
people that use this heavy machinery get anything done at all.  I do
see that there can be a lot of payback if your applications are
complex, but exhibit some arbitrary but well-defined static
structure.  Heavy types allow some of the correctness burden to be
carried by the compiler, at the expense of 5+ years of graduate
studies for the programmers :)

It seems that typing is about defying undecidability.  What you
really want is the machine to write your program for you.  Since
that's quite difficult, maybe you scale down expectations and ask the
machine to at least tell you that what you just did is not what you
really intended.  When this ``intention'' can be codified in a
structure that doesn't lead to undecidable problems when trying to
interpret it, you can offload some of the thinking to the machine.

For dependent types this can be taken quite far: as long as you (the
programmer) can help the verification system to solve the undecidable
part of the problem (provide a proof for certain parts), then it can
use this to check the rest.

Entry: Recent discussions with Dominikus about Concatenative Languages
Date: Mon Aug 31 14:14:50 CEST 2009

A _lot_ of topics have been covered.  I think this counts as the most
exciting discussion I've ever had with anyone on the topic of
concatenative languages.  Such intensive discussions tend to drive
you right to the center of your ignorance.  I'm going to try to list
what I've learned, relative to my own endeavors:

  * It pays to distinguish 3 kinds of Joy machine variants: linear
    with intensional quotations, nonlinear with extensional
    quotations, and a staged linear/nonlinear version with
    extensional quotations and linear run-time code structures
    (continuations, compositions, partial applications, ...)

  * Continuations and tasks deserve to be treated as different things
    in a concatenative stack language.  The former doesn't include
    the parameter stack (partial continuations are stack->stack
    functions) and the latter does.

  * Phase separation is important: intensional code quotation is
    difficult to specify other than as a VM with late binding.
    Static / early binding helps here.

  * The pattern matching approach in Staapl isn't so bad.  It would
    be nice to find a more elegant rewriting _syntax_ for it, but the
    semantics seems to be just what I need to get the desired compile
    time reductions.  Generalizing the rewrite semantics seems to
    open up a can of worms.  However, it might be beneficial to do
    this for _optimizations_, since the interesting ones tend to be
    non-confluent.

Then, about what I don't understand (warning: buzzwords):

  * The link between stack languages and other state threading
    mechanisms.  The key ignorance seems to concentrate on Monads,
    Monad transformers and Arrows.

  * How to use the above pure functional description to build a lazy,
    typed, partially evaluated system.

Entry: Stackless Extensional Joy
Date: Mon Aug 31 16:36:23 CEST 2009

I'm involved in a discussion with Dominikus about writing a
specification of Joy in terms of rewrite rules.  It is his opinion
(and I believe also Manfred's) that it is necessary to explicitly use
a data stack in the description.

EDIT:

  * The discussion was resolved by making a clear distinction between
    language semantics (a map from syntax to some representation
    domain, i.e. unary functions) and purely syntactic (meaningless)
    manipulations.
    The original MvT paper[2] doesn't make this distinction either
    (it talks about rewriting to specify _semantics_, not to specify
    legal syntactic operations in a formal system without involving
    semantics).  In the rules below, a purely syntactic rewriting
    system only gives you the ability to state that two concatenative
    forms are equivalent.  Then later you can attach a meaning to
    that (they represent the same program).

  * Proper treatment of quotations is important as a language design
    issue (early / late binding), but when you're specifying an
    operational semantics, of course any one would work.  I.e. in the
    discussion it was irrelevant and this put me on a side track.

  * In my understanding, the discussion needed a clear definition of
    value (irreducible expression) and redex (reducible expression).
    Using a stack to specify this makes things simpler, but is not
    necessary.  However, in the specification I produce, a stack can
    be immediately seen to emerge.  Conclusion: if (purely syntactic)
    reduction rules are of the form

      a .. m A -> n .. z

    with `a' .. `z' values and `A' a non-value, then the values on
    the left of `A' can be _interpreted_ as a stack, as can the
    values on the right of '->'.

In that light, here's what I wrote (the first section about
quotations is irrelevant).

* * *

My position is that this is not necessary.  What _is_ necessary is a
clear treatment of quotations, respecting the non-isomorphic map from
syntax to meaning.  Thus far this has been a series of hunches.  I'd
like to see why I really believe this (me being wrong is OK too..)

The problem appears to lie with the fact that the homomorphism S
which attributes meaning (a function) to syntax (a concatenation of
primitive symbols) is not one-to-one.  I.e.:

  S( [ 1 dup ] ) = S( [ 1 1 ] )

Is this related to ``never reduce under lambda''?  I.e. there is only
one _expansion_ operator: `i'.  All the others are combinators that
_re-arrange_ and _contract_ things.  It is not allowed to perform
_any_ substitutions _inside_ quotations.  The only place where this
is legal is in the current expansion of the program: a flat list of
tokens and quotations.

Does this help to eliminate the data stack?  I think I'm on the wrong
track: to eliminate the data stack it probably doesn't matter much
whether code is intensional or extensional.  What matters is that it
needs to be really well-defined what a _value_ is.  I.e. what does
`swap' actually do?  Can you explain that _without_ referring to the
stack?  Syntactically, the quotation "a b swap" can be reduced to
"b a", but _only_ if both `a' and `b' are values.  Isn't this really
just LR(1) parsing?

In [2] Manfred alludes to all this..  However, going on about
stackless semantics, he says:

  ``This is the key for a semantics without a stack: Joy programs
  denote unary functions taking one program as arguments and giving
  one program as value.''

This seems to be the ``staged'' interpretation.  A program denotes a
function, and evaluation is finding the simplest syntactic form of
this function.

* * *

1. values

It seems that what really matters is to distinguish values from
non-values, not necessarily using a stack.  Applicability of rewrite
rules representing `stack words' depends on leading symbols being
_values_.

  a b swap == b a

only if `a' and `b' are values.

If reduction order doesn't matter (confluent rewrite rules), then
picking an order that leads to a simple algorithm seems like a
reasonable default approach.  However, my position is still that the
stack is purely an artifact of the way you sequence the reductions.
In MvT's words[2]:

  ``It is clear that such a semantics without a stack is possible and
  that it is merely a rephrasing of the semantics with a stack.
  Purists would probably prefer a system with such a lean ontology in
  which there are essentially just programs operating on other
  programs.''

So I guess I'm wearing the purist hat for a change..  The reason I do
is that there might be a benefit in generalizing this to
non-confluent systems that will give you a set of possible
reductions.

2. left to right order

One way to do the reductions is from left to right.  Values are
skipped while redexes recombine with the values.  This will then
effectively yield a stack of values as a result.  I.e. in the
following, everything on the left of '|' is fully reduced (the stack)
and the rest are possible redexes (the code).

        | 2 5 + 3 *
      2 | 5 + 3 *
    2 5 | + 3 *
      7 | 3 *
    7 3 | *
     21 |

This left-to-right approach guarantees that a rule that needs values
on its left side will always be applicable.  The general rewrite
system that starts at any place in the code can't do this: some
symbols that need values on the left side won't have them (yet).

3. other orders?

So, what is the benefit of general rewrite rules, without specifying
an explicit sequencing order?  It seems that Forth's parsing words
(i.e. `variable') use the _other_ direction: here symbols can modify
the semantics of symbols to the right.  Also, non-confluent rewriting
systems that describe optimizations and not language semantics seem
to need a non-local approach.

Another useful point is to use global pattern matching to find
isolated words: if the reduction order doesn't matter, then (in a
pure functional setting) these segments are value-isolated and can be
run in parallel.

* * *

So, words are not values, but values are words.

DEFINITION: A value is a word that does not produce a reducible
expression when it is right-appended to a sequence of values.

It's a bit anti-climactic maybe, but I think that's the essence..
Dominikus calls these ``constructor functions'', but that requires a
definition that depends on a data stack: a domain that is used to
chain functions of the semantic space together.

[1] http://www.latrobe.edu.au/philosophy/phimvt/joy/j00ovr.html
[2] http://www.latrobe.edu.au/philosophy/phimvt/joy/j07rrs.html

Entry: Syntax vs. Semantics
Date: Mon Aug 31 20:46:22 CEST 2009

What I get: ``running'' a program is different from ``compiling'' it.

What I don't get: where is the bottom-line semantics?  Is it really
just the physical processes that implement the interaction of a
concrete machine with the world?  Is it the model of physics?  Is it
a simplified model of combinatory logic and memory?  It looks like at
some point you have to stop this silliness and attach a semantics,
associating mathematical functions with the syntax.  The higher up
the chain you can do this, the more structure it will probably have
and the easier it becomes to reason about the meaning of programs.

EDIT: I found this on [1]:

  ``... critics [of operational semantics] counter that the problem
  of semantics has just been delayed.  (who defines the semantics of
  the simpler model?)''

It looks like the confusion is about operational semantics being
_relative_, while at some point you do need something tangible to
have any meaning at all.  However, it is possible to allow for syntax
transformation based on preservation of relative semantics.

What I still can't make precise: how can rewriting (a syntactic
operation) be the specification of the semantics of a formal
language?
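One way to make this concrete for the previous entry's toy example is
to implement the left-to-right reduction directly.  A sketch of my
own (tiny word set, numbers as the only values, PLT Scheme); the
fully-reduced prefix `done' is exactly the emerging stack.

  (define (reduce done code)
    (cond
      ((null? code) done)                  ; reduced: `done' is the stack
      ((number? (car code))                ; values are skipped over ...
       (reduce (cons (car code) done) (cdr code)))
      ((eq? (car code) '+)                 ; ... redexes recombine with them
       (reduce (cons (+ (cadr done) (car done)) (cddr done)) (cdr code)))
      ((eq? (car code) '*)
       (reduce (cons (* (cadr done) (car done)) (cddr done)) (cdr code)))
      (else (error "unknown word" (car code)))))

  (reduce '() '(2 5 + 3 *))   ;; => (21)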
[1] http://en.wikipedia.org/wiki/Formal_method

Entry: Types vs. Staging vs. Abstract Interpretation
Date: Thu Sep 3 08:43:56 CEST 2009

I'd like to know more about the formal relation between types and
abstract interpretation[1][3], and types and staging[2].  On an
intuitive plane it does seem reasonable to blur the 3 concepts: type
checkers / inference engines interpret programs in the compilation
stage.

Following some links I ended up here[4].  Frank Atanassow:

  ``Personally, I think the hits of the near future (ten or fifteen
  years) in the statically typed realm will be staged/reflective
  programming languages like MetaML, generic programming as in
  Generic Haskell and languages which support substructural logics
  like linear logic.  I see opportunities for combining these in a
  powerful way to support lawful datatypes, i.e., datatypes which
  satisfy invariants in a decidable way.  The underlying ideas will
  probably also pave the way for decidable behavioral subtyping in
  OO-like languages.

  A unified foundation for functional, procedural, OO and logic
  languages is also something I predict we will see soon.  The
  aspect-oriented stuff will be probably sorted out and mapped onto
  existing concepts in current paradigms.  (I don't think there is
  anything really new there.)

  Also, I think that in twenty years there will no longer be any
  debate about static vs. dynamic typing, and that instead most
  languages will instead provide a smooth continuum between the two,
  as people realize that there is no fundamental dichotomy there.
  Also languages will have more than one level of types: the types
  themselves will have types, and so on.  But we will be able to
  treat types in a similar way to the way in which we treat values:
  we will be doing computations with them.

  There will be an increased emphasis on efficiency in the future,
  and I think we will see fewer aggressively optimizing compilers;
  instead it will be the programmer's job to write his programs in
  such a way that they are pretty much guaranteed to be efficient
  with any reasonable compiler.  This sounds like a step backward,
  but it won't be because we will be better able to separate concerns
  of specification from implementation.  (The reason I see more
  emphasis on efficiency is partly because I think that wearable
  computers and other small computers will become ubiquitous, partly
  because I think we will have the technology to do it when
  substructural logics invade static typing, and partly because we
  are starting to understand how to assign static type systems to
  low-level languages like assembly language.)''

[1] http://www.di.ens.fr/~cousot/COUSOTpapers/POPL97.shtml
[2] http://lambda-the-ultimate.org/node/2575
[3] http://lambda-the-ultimate.org/node/220#comment-1695
[4] http://lambda-the-ultimate.org/classic/message6475.html#6506

Entry: Metaprogramming Patterns
Date: Thu Sep 3 11:27:47 CEST 2009

Two topics:

  - I am focussing on building stagable abstractions on top of
    combinatory circuits for real-time DSP applications.  This should
    yield a series of small DSLs and DSL -> C compilers.

  - Jacques Carette's approach to finding a list of metaprogramming
    patterns[1].

The latter paper talks about 1. the need for CPS-style programming to
assure proper name generation (``let insertion'') for storing
intermediate results, and 2. a way to solve the notational problems
using a monad (which can then accommodate other effects).
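(To make ``let insertion'' from point 1 concrete: a sketch of my own
in PLT Scheme using shift/reset on s-expression code, not the paper's
MetaOCaml.  The generator names an intermediate result once and wraps
the rest of the generated code in the binding.)

  (require scheme/control)   ; reset / shift

  ;; Generate a let binding for code fragment `e', naming it with a fresh
  ;; variable; the rest of the code generation (the continuation k) runs
  ;; inside the binding's scope.
  (define (genlet e)
    (shift k
      (let ((v (gensym 't)))
        `(let ((,v ,e)) ,(k v)))))

  ;; Generate code for (x*x) + (x*x) that evaluates x*x only once.
  (define (gen-square-sum x)
    (reset
      (let ((t (genlet `(* ,x ,x))))
        `(+ ,t ,t))))

  (gen-square-sum 'a)   ;; => (let ((t1234 (* a a))) (+ t1234 t1234))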
I currently don't see how this can be used to bring these techniques
``to the people'', for the simple reason that it takes me quite some
effort to follow the notation, and I already spent considerable
effort reading about the field.  Types bring security, but complicate
matters quite a bit.  The payoff might be large, but the investment
isn't negligible: sometimes it takes a whole lot of maneuvering to
express the static structure you want in the type system.  The
monadic style can be relaxed by using control operators[2].

For practical purposes it seems that untyped
abstract-evaluation-based approaches are a better way to gently add
this to the toolbox of nuts&bolts embedded software engineering, with
the typed approach currently limited to the construction of software
tools by experts in both the domain _and_ typed functional
programming.  What matters in practical / simple DSLs is to provide a
good abstraction (semantics) and notation (syntax), and to allow for
static analysis.  Whether the generators _themselves_ are statically
verified is an added safety I see only paying off in very specialized
and error-prone generator applications, unless the notational and
conceptual overhead can be reduced (as seems to be the idea of [2]).

[1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf
[2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf

Entry: Local consistency
Date: Thu Sep 3 13:47:40 CEST 2009

Context: I'm looking at the jump from data flow programming to local
consistency rules[1].  More specifically, in my application at hand,
are local constraints enough (as opposed to global constraints like
simultaneous sets of equations)?  If so, then (staged) constraint
propagation is a good solution.

Let's look at the ``trivial constraint language'' described in
chapter 2 of [2].  It focusses on local propagation only.  Apart from
simple arithmetic constraints, it is useful for allowing constraints
that can't be directionalized fully, i.e. those written in terms of
inequalities, like

  m = max(a,b)

The remaining chapters then talk about removing some of the
deficiencies: data types, abstraction mechanisms, multiple use
(tracking changing data), global issues and dependencies of
constraints (why is there a contradiction?).

In [1], a constraint satisfaction problem is defined as a set of
variables, a set of domains, and a set of constraints.  Variables and
domains are associated: the domain of a variable contains all values
the variable can take.  A constraint is composed of a sequence of
variables, called its scope, and a set of their evaluations, which
are the evaluations satisfying the constraint.

[1] http://en.wikipedia.org/wiki/Local_consistency
[2] http://www.ai.mit.edu/publications/pubsDB/pubs.doit?search=AITR-595

Entry: Monads vs. delimited control
Date: Sun Sep 6 10:55:07 CEST 2009

So, practically: I have a need to turn a function that uses explicit
mutation of a dynamic variable into a pure function.  How to do this
without adding explicit state threading?

Simplified: you want the dynamic parameter to be included in between
the prompt and the shift / control operator that reifies the dynamic
context into a function.  I.e.
something like this:

  #lang scheme/base
  (require scheme/control)
  (provide make-counter state)

  (define state (make-parameter #f))

  (define (make-counter)
    (reset
     (parameterize ((state 0)) ;; param sandwiched between `reset' and `shift'
       (let loop ()
         (printf "state = ~s\n" (state))
         (shift k k)
         (state (add1 (state))) ;; mutate
         (loop)))))

The problem is that it doesn't give you referential transparency: the
mutation is still visible when a certain continuation gets invoked
multiple times, as the parameter's storage location is a shared
value.  Every continuation should somehow include its own value of
the parameter.  This can be assured by saving the parameter's value
whenever a continuation is created, and resetting it whenever it is
resumed:

  #lang scheme/base
  (require scheme/control)
  (provide make-counter state invoke)

  (define state (make-parameter #f))

  (define (make-counter)
    (reset
     (parameterize ((state 0))
       (let loop ()
         (printf "state = ~s\n" (state))
         ;; capture the continuation paired with the current parameter
         ;; value; `invoke' restores that value on re-entry
         (state (let ((s (state)))
                  (shift k (cons k s))))
         (state (add1 (state))) ;; mutate
         (loop)))))

  (define (invoke ctx) ((car ctx) (cdr ctx)))

Here the continuation is extended with a context value `s' which is
passed to the continuation by `invoke' and used to reset the `state'
parameter upon context entry.

It looks like this is related to the ``continuations + storage cell''
approach in [1], though a bit too dense for me atm.  Some related
work: [2] investigates the interaction between DC and `dynamic wind',
while [3] focusses on DC and dynamic binding.  I must admit I fail to
understand the subtleties, as it all seems ``obvious'' to me from the
pov. of continuation marks.  I'll come back to this after using it in
practice.

To summarize: partial continuations capture control, while parameters
are useful for ``locally global'' threaded state in case it's not
_practical_ to implement that lexically.  You locally give up
referential transparency to increase modularity and simplicity of
function interfaces.

[1] http://eprints.kfupm.edu.sa/62283/1/62283.pdf
    md5://e60a51d38011e8dca44540f590643001
[2] http://people.cs.uchicago.edu/~robby/pubs/papers/icfp2007-fyff.pdf
[3] http://okmij.org/ftp/Computation/dynamic-binding.html#DDBinding

Entry: APL & J
Date: Mon Sep 7 22:51:52 CEST 2009

Time for something different.  Hmm.. there's no open source
implementation?

[1] http://en.wikipedia.org/wiki/J_programming_language

Entry: Towards the best collection traversal interface
Date: Wed Sep 9 09:50:14 CEST 2009

I find this idea quite intriguing, as it seems to be central to a lot
of things I'm trying to understand.  I'm putting the idea to the test
by trying to avoid lists wherever possible, and use left fold
instead.  Enumerators are easily bridged to SRFI-41 lazy lists, eager
lists and PLT Scheme sequences.  The `enum->stream' operation uses
reset/shift to invert control, as in:

  (define (enum->stream enum)
    (reset
     (enum (lambda (el)
             (shift k (stream-cons el (k #t)))))
     stream-null))

Example: the `choice' operator[3] in Staapl, which uses enumerators
for representing choices and results, making it easier to compose
searches.  (Internally, the choice enumerator is translated into a
lazy stack of resume points.)

[1] http://lambda-the-ultimate.org/node/1224
[2] http://okmij.org/ftp/Streams.html#enumerator-stream
[3] http://zwizwa.be/darcs/staapl/staapl/machine/choice.ss
[4] http://www.eros-os.org/pipermail/e-lang/2004-March/009643.html
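A small usage sketch of my own (it assumes an SRFI-41 style stream
library providing stream-car / stream-cdr next to the stream-cons /
stream-null used above): an enumerator is just a procedure that
applies a visitor to each element.

  ;; An enumerator over a list: apply `visit' to each element in turn.
  (define (list-enum lst)
    (lambda (visit) (for-each visit lst)))

  (define s (enum->stream (list-enum '(1 2 3))))
  (stream-car s)               ;; => 1
  (stream-car (stream-cdr s))  ;; => 2   (the enumerator is resumed lazily)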
Param stack / Return stack Date: Wed Sep 9 13:49:24 CEST 2009 The analogy doesn't hold, but there is some relation.. Can this be made more precise? The explanation could be centered around the following observation. In a concatenative language, there is no `lambda' to introduce new functional abstractions closed over a surrounding lexical environment. However, in a concatenative language it is possible to do something similar using the quotation-equivalent of `cons', `list' and `append'. More precisely, in a concatenative language one can take two quotations and `compose' them, and one can take a data item and turn it into a quotation that will reproduce it. In some sense, `lambda' performs a `cons' of code and environment. With a bit of a stretch of imagination, one could then see a data stack as an environment, and a quotation as a code body containing free variables. Entry: Bananas, Lenses, Envelopes and Barbed Wire Date: Fri Sep 11 12:55:25 CEST 2009 It's probably not very smart to try to do anything with ``algebra of programs'' without this work[1]. It's interesting to also look at Fokkinga's introduction to category theory[2]. From my particular perspective (emphasis on vectors) recursive data types are probably too general. However, the theory here could serve as some guideline, as recursive types will probably re-emerge as an implementation issue. Ok, the paper[1]. operator recursion pattern for example ------------------------------------------------------ bananas catamorphism fold, map lenses anamorphism unfold, map envelopes hylomorphism[3] (ana->cata) factorial barbed wire paramorphism Translated to the talk of mere mortals: a catamorphism is a data consumer, an anamorphism is a data producer, and a hylomorphism is a producer feeding into a consumer without an intermediate representation of the data structure that decouples them. A paramorphism is a catamorphism that ``eats its argument and keeps it too''. Functor: map types to types, and functions to functions. Arrow: ``wrap'' a function to operate on a different space. ... (etc.. glossing over details for a moment) The point in section 4. is to program in terms of the morphisms instead of using explicit recursion on data types. For each cata-, ana- and paramorphism W, 3 rules are provided: - evaluation rule - uniqueness property (induction proof to be a W) - fusion law The fusion laws are based on the concept of ``fixed point fusion'': f ( u g ) = u h <= f strict ^ f . g = h . f here u : (A -> A) -> A is the fixed point operator. In expanded form the LHS is quite obvious: f . ( u g ) = f . g . g . g . ... = h . f . g . g . ... = h . h . f . g . ... = u h The take-home argument (ignoring strictness issues related to the fixed point combinator) is that fusion for ana- and catamorphisms is left/right symmetric: ana: |( x )| . f = |( y )| <= x . f = f_L . y cata: f . (| x |) = (| y |) <= f . x = y . f_L Where f_L is obtained from f as some fixed point operation.. (???) (... I'm going to give this up for now, but this paper gives enough ideas to construct a less abstract version.)
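To make the mere-mortals reading concrete, here is a minimal Haskell sketch (my own, using plain lists instead of the paper's generic functors; cata and ana are just foldr and unfoldr): the hylomorphism is the producer feeding the consumer with the intermediate list fused away, and factorial is the paper's example of it.

cata :: (a -> b -> b) -> b -> [a] -> b          -- consumer (foldr)
cata _ z []     = z
cata f z (x:xs) = f x (cata f z xs)

ana :: (b -> Maybe (a, b)) -> b -> [a]          -- producer (unfoldr)
ana g b = case g b of
            Nothing      -> []
            Just (a, b') -> a : ana g b'

hylo :: (a -> c -> c) -> c -> (b -> Maybe (a, b)) -> b -> c
hylo f z g b = case g b of                      -- ana feeding cata, fused:
                 Nothing      -> z              -- no intermediate list is built
                 Just (a, b') -> f a (hylo f z g b')

fact :: Integer -> Integer                      -- factorial as a hylomorphism
fact = hylo (*) 1 (\n -> if n <= 0 then Nothing else Just (n, n - 1))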
[1] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.41.125 [2] http://www.cs.utwente.nl/~fokkinga/mmf92b.pdf [3] http://en.wikipedia.org/wiki/Hylomorphism_%28computer_science%29 Entry: Staging Control Flow Date: Fri Sep 11 14:20:53 CEST 2009 It looks like all the things I'd like to do (making DSP/Control prototyping and finding correct implementations two orthogonal problems that do not involve duplication of effort) have to do with staging control flow: how high can you make the level of abstraction while still guaranteeing that the eventual product is a bounded-time combinatorial circuit / state machine. Most of the DSP/Control applications have a very functional, parallel data flow character. What makes them difficult to implement is that they need to pass through the von Neumann bottleneck. There are a lot of choices to be made turning 1. equations into directed functions and 2. sequencing operations (control) and managing intermediate results (memory). A key paper is going to be this one[1]. Both from the perspective of making it possible to express the original algorithm in a straightforward way, and from the perspective of making all the design decisions explicit. Above I'm talking about moving from higher level languages down to some ideal low-level machine architecture. Mapping to _real_ hardware is then another problem that might involve ``lobotomizing'' a compiler and bringing decisions to the surface so they can take part in a global optimization process. [1] http://www.cas.mcmaster.ca/~carette/publications/scp_metamonads.pdf Entry: Control-Flow Analysis of Higher-Order Languages (Shivers) Date: Sun Sep 13 10:22:16 CEST 2009 Based on the techniques of CPS and non-standard abstract semantic interpretation (NSAS). The basic problem is described as an interdependency of two analysis phases: flow analysis needs a control flow graph, but because code can be bound to variables, the construction of the control flow graph needs flow analysis. In CPS Scheme, the problem is reduced to determining which call sites call which lambdas. This is because all control transfers are represented by procedure calls. This problem is represented as the search for a function L(c) which maps a call site c to a minimal set of lambda expressions that are possibly called at c. NSAS is described as a technique to construct a computable analysis for a certain property X. - Start with a denotational semantics S - Construct a _non-standard_ semantics S_X derived from S, that precisely expresses X. - Construct an _abstract_ version of S_X that trades accuracy for compile-time computability. Note that this is one of (the main?) practical reasons why denotational semantics are important. The denotational semantics for CPS Scheme is presented as an ``interpreter written in a functional language''. [1] http://www.ccs.neu.edu/home/shivers/citations.html#diss Entry: CPS vs. A-normal form Date: Sun Sep 13 10:39:18 CEST 2009 From [1] (print (* (+ x y) (- z w))) in CPS, where `k' is the continuation of the expression: (+ x y (lambda (xy) (- z w (lambda (zw) (* xy zw (lambda (prod) (print prod k))))))) in A-normal[2] form, which I've called ``nested let'' before: (let* ((xy (+ x y)) (zw (- z w)) (prod (* xy zw))) (print prod)) [1] http://www.ccs.neu.edu/home/shivers/citations.html#diss [2] http://en.wikipedia.org/wiki/Administrative_normal_form Entry: Delimited Continuations and Staging Date: Mon Sep 14 11:27:48 CEST 2009 Avoiding monads using delimited continuations[1][2].
In the latter, ``scope extrusion'' (the possibility of bringing variables outside of their scope using assignments) is illustrated using the following MetaOCaml example: let r = ref .<1>. in .<fun y -> .~(r := .<y>.; .<()>.)>.; !r => .<y_1>. Inside the escape .~( ) a code value is assigned to the reference r and referenced outside of the quotation .< >. to be returned as the value of the whole expression. This code value is ill-formed: the variable y_1 (y renamed) is no longer bound. Arbitrary shift/reset will give similar problems: variables can be transported outside of their scope. The paper suggests an approach where shift/reset is still used, but escapes are limited up to the binding site. So the thing to figure out is how this translates to scheme: is it worth limiting control effects (since we have no typing, but do want to have correct scoping). [1] http://lambda-the-ultimate.org/node/3112 [2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf Entry: Proof Assistants / Constructive Analysis Date: Tue Sep 15 21:05:47 CEST 2009 Proof assistants have a small ``trusted core'' that's manually verified, which is then used to bootstrap other theories. Correctness can be proven _mostly_ in the system itself (but of course not fully, due to Goedel's incompleteness). An interesting remark about types vs. set theory: in set theory one can ask the question: is the set \pi an element of the set \sin? (De Bruijn) The answer depends on the encoding. You need a layer of type theory over the set theory. Modern proof assistants are based on type theory directly: it serves both as a foundation of mathematics and as a programming language. Exact reals scream for constructive logic because there is no zero-test. You want to avoid taking the exact decision P or not-P while programming. However for naturals, rationals or other decidable structures you do have this decision. Programming in type theory used for real programs: Leroy built a machine-verified compiler for Cminor (subset of C) to machine language. The idea is to extend this to Coq (built on Ocaml (built on C)). [1] http://videolectures.net/aug09_spitters_oconnor_cvia/ Entry: Sparse Conditional Constant Propagation Date: Fri Sep 18 08:45:58 CEST 2009 These 3 are very related: - constant folding (1 + 2 -> 3) - constant propagation (A = 3; B = 4; A + B -> 3 + 4 -> 7) - function inlining (f x y = x + y; f a b -> a + b) Abstract interpretation can do constant prop+fold and function inline all at the same time, as long as all operations / functions have a staged behaviour. In a functional (SSA) setting, this isn't such a fuss. Mutation however complicates matters. The Wikipedia page about constant folding[1] talks about reaching definition[2] analysis, which in SSA is of course trivial. So, sparse conditional constant propagation[3] uses abstract evaluation of SSA form: ``The crux of the algorithm comes in how it handles the interpretation of branch instructions.'' Basically, conditional branches depending on known data can be picked at compile time. [1] http://en.wikipedia.org/wiki/Constant_folding [2] http://en.wikipedia.org/wiki/Reaching_definition [3] http://en.wikipedia.org/wiki/Sparse_conditional_constant_propagation Entry: Controlling Effects Date: Fri Sep 18 15:25:29 CEST 2009 Filinski PhD: representing monads w. delimited control[1]. [1] http://www.diku.dk/hjemmesider/ansatte/andrzej/papers/CE.ps.gz Entry: Algebra of Programming Date: Sun Sep 20 10:27:49 CEST 2009 Oege De Moor[1] and Richard Bird[2], their book[3] and a LtU thread[4].
That thread contains some interesting links. Looks like this is the place to start for getting some more information on the subject. I also asked John Nowak what he's up to, since it seems to be related to [3]. The thread[4] mentions that Oege stopped pursuing this line of research because it is too abstract. In any case, a bit more knowledge of category theory would help. See Maarten Fokkinga[5]'s [6]. [1] http://www.comlab.ox.ac.uk/people/oege.demoor/ [2] http://www.comlab.ox.ac.uk/people/Richard.Bird/index.html [3] isbn://013507245 [4] http://lambda-the-ultimate.org/node/1117 [5] http://wwwhome.cs.utwente.nl/~fokkinga/ [6] http://www.cs.utwente.nl/~fokkinga/mmf92b.pdf Entry: Galois Connection Date: Sun Sep 20 10:51:08 CEST 2009 A Galois connection is a kind of morphism between posets. Starting with Lecture 10b [1] from Cousot's course at MIT[2]. The basic property is expressed as: a(x) [= y iff x <= g(y) Galois connections are interesting when they are used to relate sets of different ``sizes''. Let's take a to be the abstraction operation which maps the larger poset P,<= to the smaller poset Q,[=. Page 112 in the slides, (page 28 in the pdf/4) has a picture of a Galois connection. From this, the point is that while a roundtrip g . a can shift around elements in P, it will not change the order relation <= in P. I.e. some elements in P are promoted (or dually demoted), but never in a way that they switch order. In other words, g . a is extensive. This is captured in the following theorem: From the properties: a,g monotone, a . g reductive, g . a extensive follows that a, g form a Galois connection. f extensive: x <= f(x) f reductive: f(x) <= x Notes: * as with most things related to posets, each GC has a dual which reverses the order relations in P and Q and exchanges the roles of a and g. * GCs compose using ordinary function composition of the a's and g's (as opposed to Galois correspondences) * GCs can be combined as sums (linear, disjoint, smashed), products and powers. [1] http://web.mit.edu/16.399/www/lecture_10-maps1/Cousot_MIT_2005_Course_10b_4-1.pdf [2] http://web.mit.edu/16.399/www/ Entry: Software pipelining: An effective scheduling technique for VLIW machines Date: Sun Sep 20 12:08:39 CEST 2009 From the abstract-interpretation based ``flattening'' of matrix operations (Gauss-Jordan elimination) to data flow graphs it dawned on me that ``sorting'' this network might be an interesting optimization for VLIW processors (such as the TI DaVinci / C64x DSP, or NXP TriMedia). Of course, it turns out I have it all backwards: the VLIW architecture was introduced to take advantage of this pattern. From [1] (which I found in a collection[6] of must-read CS papers): ``The key to generating efficient code for the VLIW machine is global code compaction. In fact, the VLIW architecture is developed from the study of the global code compaction technique, trace scheduling''. On this subject Ellis' PhD[4] seems to be the definitive reference, however I can't find an electronic copy. Looks like this book[5] by Fisher et al. might have some answers too. This one[2] is on Trace-Scheduling-2, a non-linear extension. The basic idea in trace scheduling is to optimize the most frequently executed traces by turning them into straight-line code. This in turn allows for plenty of opportunities to find parallelism. So, trace scheduling attempts to _create_ the conditions I mention in the first paragraph, by picking traces that are highly probable. However, it has its problems.
The most important one is exponential code explosion. So, [1] talks about software pipelining, which is an alternative to trace scheduling. It uses hierarchical reduction to straighten branches: a conditional branch is split into streams of conditional code, where the shorter branch is padded to fit the size of the larger one. The rationale is that because jumps are particularly expensive in VLIW (i.e. #FU x pipeline-depth), this approach (disabling units from the unused side of the branch) is often better than performing a jump. [1] http://reference.kfupm.edu.sa/content/s/o/software_pipelining__an_effective_schedu_8310.pdf [2] http://www.hpl.hp.com/techreports/93/HPL-93-43.pdf [3] http://courses.ece.illinois.edu/ece512/Papers/trace.pdf [4] http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2 [5] isbn://1558607668 [6] http://www.cs.utexas.edu/users/mckinley/20-years.html Entry: DSLs as compiler hints Date: Sun Sep 20 16:05:47 CEST 2009 ( In the context of previous post[1] about VLIW optimizations. ) Coming back to the combinator approach: you loose some, because not all programs can be expressed, and you win some, because specification can be separated from implementation. An essential part is that you fix the specification of the solution on a higher abstraction level such that the compiler doesn't need to _infer_ properties of your solution (to choose a different implementation). By using a high-level description, properties can be made explicit, independent of the meaning (correctness) of the program (i.e. as ``aspects''). These could then be used by a compiler to optimize over: it can concentrate on searching instead of spending time on inference. The real problem however is to find a good collection of combinators (the ``DSL''), and the specification of an escape hatch to lower, more general purpose levels. Here _good_ means that it can express most of the solutions, and provides a good parametrization of possible implementations. [1] entry://20090920-120839 Entry: Theorems for free! Date: Sun Sep 20 20:07:47 CEST 2009 Wadler's free theorems[1] and the algebra of programs. A remark that caught my attention: ``in general the laws derived from types are of a form useful for algebraic manipulation''. (I.e. push `map' through a function). These theorems depend on _parametricity_ : because of the _hole_ in the specification, it can't do much else than be about _structure_. The theorems then reflect structural theorems (i.e. commutation laws). [1] http://homepages.inf.ed.ac.uk/wadler/papers/free/free.ps Entry: The Design and Implementation of Typed Scheme Date: Mon Sep 21 09:45:47 CEST 2009 Based on occurence typing: assigning distinct subtypes based on the control flow of the program. This is based on the observation that Scheme programmers often use flow-oriented reasoning: distinguishing types based on prior operations. [1] http://www.ccs.neu.edu/scheme/pubs/popl08-thf.pdf Entry: Understanding Expression Simplification Date: Mon Sep 21 12:03:33 CEST 2009 .. in the light of Minimum Description Length. [1] http://www.cas.mcmaster.ca/~carette/publications/simplification.pdf Entry: Dynamic Programming Date: Mon Sep 21 12:54:33 CEST 2009 Subdivide and memoize to avoid exponential explosion. What I never realized is that Recursive Least Squares (RLS) falls in this category, and in general all ``update'' stream-based algorithms. In [2] a technique is mentioned that centralizes memoization in a y-combinator. For more about this see [3]. 
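To make the memoizing-fixpoint idea concrete, here is a small Haskell sketch of my own (it is not the code from [2]): the recursion is written ``open'', so the fixpoint combinator alone decides whether subproblems are shared.

-- Open-recursive Fibonacci: the recursive call is a parameter.
fibOpen :: (Integer -> Integer) -> Integer -> Integer
fibOpen _    0 = 0
fibOpen _    1 = 1
fibOpen self n = self (n - 1) + self (n - 2)

-- Plain fixpoint: every subproblem is recomputed, exponential time.
fixPlain :: ((a -> b) -> a -> b) -> a -> b
fixPlain f = f (fixPlain f)

-- Memoizing fixpoint: results are shared through a lazy table, so each
-- subproblem is computed only once.
fixMemo :: ((Integer -> Integer) -> Integer -> Integer) -> Integer -> Integer
fixMemo f = get
  where get n = table !! fromInteger n
        table = map (f get) [0 ..]

-- fixMemo fibOpen 50 returns quickly; fixPlain fibOpen 50 does not.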
[1] http://en.wikipedia.org/wiki/Dynamic_programming [2] http://okmij.org/ftp/Computation/staging/circle-shift.pdf [3] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.495&rep=rep1&type=pdf Entry: Partial Evaluation for the Lambda Calculus Date: Mon Sep 21 14:03:08 CEST 2009 [1] http://eprints.kfupm.edu.sa/20847/1/20847.pdf Entry: Constraint Programming from CTM Chap. 12 Date: Tue Sep 22 12:29:12 CEST 2009 Combination of propagation and search. Try local deductions first, and solve the rest with search. Choice can be introduced by transforming a constraint program P into P^C and P^(~C). Local deductions are implemented as propagators. ( Equations, i.e. sets of linear equations, are propagators that can be turned into functions, but there is a more general relational framework. ) Choice is inserted implicitly using a heuristic ``distribution strategy''. Entry: Staging & Typing Date: Wed Sep 23 11:51:03 CEST 2009 This one[1] should be an eye-opener, next to the first paper on tagless interpreters[3]. [1] http://okmij.org/ftp/Computation/staging/metafx.pdf [2] http://lambda-the-ultimate.org/node/2575 [3] http://lambda-the-ultimate.org/node/2438 Entry: FISh & Squigol Date: Wed Sep 23 10:20:53 CEST 2009 Functional = Imperative + Shape[1][2]. Another one of those little gems to read before attempting the DSP / vector extension to Staapl. And a tutorial on squigol[3]. [1] http://www-staff.it.uts.edu.au/~cbj/FISh/index.html [2] http://linus.socs.uts.edu.au/~cbj/Publications/latest_fish.ps.gz [3] http://ti.arc.nasa.gov/m/profile/ttp/squigol.pdf Entry: Recent Scheme papers from NU Date: Thu Sep 24 14:43:15 CEST 2009 [1] http://www.ccs.neu.edu/scheme/pubs/ Entry: Scheme implementation Date: Thu Sep 24 15:06:09 CEST 2009 Following advice from [1], here is Kranz's PhD about Orbit[2][4] and Dybvig's PhD about implementing Scheme[3]. For the run-time side: David Gudeman about representing dynamic typing [5]. [1] http://news.ycombinator.com/item?id=835020 [2] http://repository.readscheme.org/ftp/papers/orbit-thesis.pdf [3] http://www.cs.indiana.edu/~dyb/papers/3imp.pdf [4] md5://3a9e0bba8f636d5a9fcdd3d19fc09216 [5] ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/typeinfo.ps.gz Entry: Pico Date: Fri Sep 25 16:48:38 CEST 2009 I'm looking back at Pico[1], a language developed at prog@vub by De Meuter and d'Hondt. A ``lisp for mere mortals''. [1] ftp://prog.vub.ac.be/Pico/Docs/LispWS.pdf Entry: Logic Date: Sun Sep 27 17:28:14 CEST 2009 If first order logic can be used to create structured domains (the space where the predicate parameters live), propositional logic represents just structure, and isn't `about' anything. What does higher order logic represent? It allows quantification over predicates and higher order types. [1] http://en.wikipedia.org/wiki/First_order_logic Entry: Formal methods Date: Sat Oct 3 20:33:50 CEST 2009 SPIN/Promela and TLC/TLA+ by Jose Faria. The paper[1] compares two formal specification tools based on a case study: an algorithm to deal with non-blocking linked-lists based on the compare-and-swap (CAS addr old new) instruction. CAS atomically compares the contents of a location addr with an old value and replaces it with a new value in case of a match, returning a boolean to indicate whether the substitution took place or not.
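As a way to pin down that specification, a Haskell sketch of my own (modelling CAS on an IORef; the real thing is a single atomic machine instruction, and the paper works with Promela/TLA+ models of it):

import Data.IORef

-- (cas addr old new): atomically compare the contents of addr with old;
-- on a match store new; report whether the substitution took place.
cas :: Eq a => IORef a -> a -> a -> IO Bool
cas addr old new =
  atomicModifyIORef' addr $ \cur ->
    if cur == old then (new, True)
                  else (cur, False)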
This primitive makes it possible to detect significant conflicts in the case of list operations: some other process might have modified the list in such a way that it is incompatible with the intermediate state of a list operation, in which case the whole operation can simply be restarted based on the result of the CAS. TLA+[4] is a specification language for describing and reasoning about asynchronous nondeterministic concurrent systems. TLC is an explicit-state model checker for specs written in TLA+. A spec in TLA+ is summarized in a single formula that describes a state machine in terms of an initial condition, a next state relation and possibly some liveness conditions. Promela is the specifically developed input language of SPIN, an on-the-fly explicit-state model checker (as is TLC; TLC, in contrast, was designed after TLA+). The Wikipedia page on linear temporal logic (LTL)[5] summarizes the useful properties as the ability to express safety (something bad never happens) and liveness (something good keeps happening). [1] http://www.openlicensesociety.org/docs/FMethodsReport_ComparisonTLASpin.pdf [2] http://en.wikipedia.org/wiki/Promela [3] http://en.wikipedia.org/wiki/Temporal_logic [4] http://en.wikipedia.org/wiki/Temporal_logic_of_actions [5] http://en.wikipedia.org/wiki/Linear_temporal_logic Entry: Funmath Date: Sun Oct 4 10:16:46 CEST 2009 The previous article led me to funmath[1]. Funmath stands for Functional mathematics (for other uses of the name, look here). The underlying principle consists in defining mathematical concepts as functions (hence the name) whenever doing so is appropriate. This turns out to be especially convenient where it has not yet become common practice. The idea is to build a defect-free notation for mathematics. It reminds me of what Sussman is up to with his recent work on Scheme + differential geometry. By formalism we mean a framework for reasoning comprising two elements: (a) a symbolic language or notation, (b) rules for symbolic manipulation. (a) The language is usually characterized by its form, typically specified by a formal syntax, and its meaning, typically specified by a (denotational) semantics. (b) The rules are typically specified by a formal system, which can be seen as the axiomatic semantics of the language if we borrow the term from programming languages. Gries advocates calculational reasoning. This means that logical arguments are presented as symbolic calculations, stepping from one equation to the next using appropriate rules, and linking them by (in)equalities. [1] http://www.funmath.be/ [2] http://www.funmath.be/LRRL.pdf [3] http://www.cs.utexas.edu/~EWD/ewd10xx/EWD1073.PDF Entry: The Two Towers. Date: Sun Oct 4 11:54:03 CEST 2009 Two important things happened to me in the course of the last 10 years. Rising from the puddle of asm and C, I discovered Forth and its algebraic nature, and Scheme, lambda calculus, macros, types, ... The practical pillar was compilation. Through manipulation of language as just another data object, a different kind of math was brought within my reach - quite different from the isolated world of linear algebra and calculus useful for numerical applications. Instead of math being something that's done on paper, it came to life in my hands through the manipulation of code objects. This took a couple of years to really sink in.
Even though the principle is simple - formal languages, axioms and inference rules, semantics represented by functions mapping language objects to other objects - there is an incredible range of possible ways to bring mathematical structures to physical computing machinery. So is logic the ultimate programming language? I guess it depends on what your goals are. Currently I see really only two camps in software development: the mathematical / logical camp which limits power by introducing formal abstractions that have provable or verifiable properties, and the biological camp which uses evolutionary techniques to approach correctness (and specification!) by removing constraints and looking only at observable behaviour. I guess the point is something like this: since computers themselves behave mathematically (within certain bounds), do you want to propagate this kind of exactness to very high level claims (proofs in logic), or are you more interested in using the computer to give you a system that implements limitless power (reflection / programmable semantics)? Maybe, stretching it a bit, the former could be called the tree / directed graph approach (proof / transport of truth), while the other is the full graph approach (connected objects, the internet model). Entry: Signal Processing Functions, Algorithms and Smurfs: The Need for Declarativity Date: Sun Oct 4 13:06:26 CEST 2009 Following the links on [2] brings me to a paper[1] by Boute ``Signal Processing Functions, Algorithms and Smurfs: The Need for Declarativity''. This is brilliant stuff. ``The main cause [of decline of declarative thinking] in DSP is a shift from essentially declarative mature engineering formalisms to ``algorithmic thinking'' induced by computer implementation, ignoring the declarative mathematical methods for software.'' Boute mentions SILAGE[3], a DSP dataflow language. Our own research over the past 15 years is also aimed at unifying EE and CS, starting with mathematical modeling and reasoning. I have a new hero. Looks like I need to shut up and read for a while. Most of my plans for Staapl's DSP language are probably best put in this framework. [1] http://www.funmath.be/SmurFinl.pdf [2] http://www.funmath.be/LRRL.pdf [3] http://www.cosic.esat.kuleuven.be/publications/article-756.pdf Entry: Coq & Dependent types Date: Wed Oct 7 09:53:16 CEST 2009 Let's pick up again at Coq and dependent types. Entry: Linear types Date: Wed Oct 7 10:04:35 CEST 2009 I'm having a look at this survey[1] about linear types, regions and capabilities. At any point in time, the heap consists of a linear upper structure (a forest), whose leaves form the boundary with a nonlinear lower structure (an arbitrary graph). It is possible to lift this requirement using _focus_. Wadler noted that this is extremely restrictive. ... This leads to an explicitly threaded programming style (i.e. `uncons' and a linear stack of values) which is heavy and over-sequentialised. Temporary aliasing is possible, provided there is only one remaining pointer when a variable recovers its original linear type. (Wadler's ad-hoc `let!' is essentially the style in which the PF primitives are implemented.) Apparently there is a cleaner idea hidden. Only state has to be linear, a state transformer can safely be nonlinear. Monads are a language design in which state is implicit and only state transformers are first-class values.
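To make that last remark concrete, here is the usual state monad written out by hand (standard Haskell, not the survey's formalism): client code only ever manipulates first-class state transformers of type State s a, while the state itself stays inside the implementation, which is the only part a linear type system would need to check.

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  State g >>= k = State $ \s -> let (a, s') = g s in runState (k a) s'

get :: State s s                  -- the only access to the hidden state
get = State $ \s -> (s, s)

put :: s -> State s ()
put s' = State $ \_ -> ((), s')

tick :: State Int Int             -- a client: composes transformers only
tick = do n <- get; put (n + 1); return n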
In principle one could type-check a monad's implementation using a linear type system (it allows in-place updates) and type-check its clients using a standard type system. Linearity is meant to enforce the absence of aliasing. Regions are intended to control aliasing. Then, Baker's regions provide some annotation that can be used to perform 1) collection at function exit and 2) in-place update for intermediate data. Then there is some material about `letregion': regions that coincide with lexical scope, which with proper escape analysis can be implemented as a stack. This then leads to type-and-effect[2] systems. The calculus of capabilities is a low-level type-and-effect system. Then it becomes a bit too abstract. I follow the general idea though: distinguish pointers (shared environment) from the right to deallocate or dereference (linear capabilities). Skipping to the practical: Cyclone, a ``safe C'', provides fine grained control over allocation/deallocation without sacrificing safety. Cyclone is a complex programming language. Simplified calculi describe its foundations and discuss interesting connections between linearity, regions and monads. [1] http://lambda-the-ultimate.org/node/3581#comment [2] http://en.wikipedia.org/wiki/Effect_system Entry: Component Based Software Engineering Date: Fri Oct 9 16:44:58 CEST 2009 I'm trying to reconnect to mainstream OO-world terminology. The discipline of component-based software engineering[1] talks about interfaces and dependencies. Translating to PLT Scheme's large scale compositional structure (ignoring the small-scale class / inheritance / mixin functionality) this reflects two parts. Both require and provide interfaces and represent compilation units - units: cyclic, no macros - modules: acyclic w. macros Presence of macros makes module compilation follow the dependency graph (functionality defined in one module might influence language semantics of another). Module dependencies follow a directed acyclic graph. Units are more like the components in CBSE. [1] http://en.wikipedia.org/wiki/Component-based_software_engineering Entry: OpenComRTOS Date: Mon Oct 12 11:49:06 CEST 2009 Best point of entry is the white paper[1]. The kernel is communication based, and has several semantic levels: L0 Priority based pre-emptive multitasking (packets & ports). L1 Higher level RTOS: semaphores, events, queues, resources. L2 Dynamic features: code mobility, ... The system is modular with kernel & drivers implemented as tasks. [1] http://www.altreonic.com/sites/default/files/Whitepaper_OpenComRTOS.pdf Entry: Java vs. Other Date: Wed Oct 14 14:14:57 CET 2009 * primitive types Primitive types are not objects, however they can be wrapped as such. int -> Integer float -> Float double -> Double * Java and functional programming Maybe Java generics[1] are the best place to link FP and Java. Generics are FP-style parametric polymorphism (PP). This is different from inheritance based polymorphism. I.e. the most general container without PP needs the least common denominator: the universal reference type `Object'. This however does not enforce that all elements in the container are of a more specific type. Generics allow a solution to that by providing a generic template that can then be specialized to a more specific type. This is akin to C++ templates, except that Java generics type-check in _parametric_ form, and all their specializations are well-typed, in contrast with C++ templates which type-check in expanded form using a heuristic type checker.
* OO vs FP in context of Java FP: - algebraic data types somewhat analogous to class hierarchies - parametric polymorphism abstract collections - higher order functions parameterize computational processes The main difference is that in class-based, single dispatch OO, methods are _tied_ to classes, while in FP this coupling is less strict. [1] http://www.javaworld.com/jw-02-2000/jw-02-jsr.html?page=2 Entry: Linearizability Date: Sat Oct 24 10:28:13 CEST 2009 From [1] Linearizability provides the illusion that each operation applied by concurrent processes takes effect instantaneously at some point between its invocation and its response, implying that the meaning of a concurrent object's operations can be give by pre- and post-conditions. Maybe better is The art of multiprocessor programming[2], chapter 3 about Concurrent objects. Quiescent Consistency: any time an object becomes quiescent, then execution so far is equivalent to some sequential execution of the completed calls. I.e. A--- B---- C-- D-- .. . .. This can mean all possible permutations of A,B,C followed by D, because there is no quiescent time among the invocations of A,B,C but all of them are separated from D. QC doesn't necessarily respect program order (= single thread's sequential order) Sequential Consistency: In any concurrent execution, there is a way to order the method calls sequentially such that they (1) are consistent with program order and (2) meet the object's sequential specification. SC is _not_ composable. QC and SC conditions are incomparable. Linearizability: each method call should appear to take effect instantaneously at some moment between its invocation and response. L => SC. [1] http://www.cs.brown.edu/~mph/HerlihyW90/p463-herlihy.pdf [2] isbn://0123705916 Entry: Language oriented development Date: Tue Oct 27 11:16:52 CET 2009 As expresses in [1]: ``As a result of my academic and professional training i have come to rely heavily on types as a development discipline. In fact, if i cannot devise a sensible type "algebra" for a given (application) domain then i feel i don't really have a good understanding of the domain. One way of seeing this from the Schemer point if view is that the deep sensibility embodied in the Sussman and Abelson book of designing a DSL to solve a problem is further refined by types. Types express an essential part of the grammar of that language.'' A strange pattern emerges when you think of it in this way: 1. ML and its algebraic data types (ADT) designed as a meta language, a system to represent another formal language. 2. Haskell: Types and lambda calculus as the only vehicle for writing any kind of computer program. 3. Thinking about programming as writing a programming language for describing a problem. It's so simple and straightforward. [1] http://www.haskell.org/pipermail/haskell-cafe/2008-April/041239.html Entry: LL and LR parsing vs. binary protocol design Date: Mon Nov 9 08:21:38 CET 2009 Both LL(k) and LR(k) have finite lookahead and no backtracking. This means that the parsing decisions need to be made based only on the next k input symbols. LL is top-down (recursive descent) and LR is bottom-up (recursive ascent). In their simplest forms (only one state?) they correspond to a prefix and a postfix language. This is useful for building serialization protocols optimized for minimal parser complexity, i.e. to run in hardware or small 8-bit uCs. I've determined 4 important design decisions[1] for a protocol: - delimited vs. 
prefixed token stream - representing quotation + construction tokens - bottom-up (postfix) or top-down (prefix) structure - constructor arity tagging [1] entry://../libprim/20091107-113002 Entry: Peter Landin Date: Mon Dec 21 22:17:56 CET 2009 - functional programming languages - domain-specific languages - syntactic sugar - SECD machine - function closures - program closures (continuations) - streams - connection between streams and coroutines - delayed evaluation - partial evaluation - circularity to implement recursion (tying the knot) - graph reduction - sharing - strictness analysis - where expressions - disentangling nested applications into where expressions [1] http://www.vimeo.com/6638882 Entry: Two kinds of optimizations Date: Sun Jan 3 13:01:13 CET 2010 Let's see if I can find the quote again: There are only two kinds of optimizations: * Not performing the work (yet), i.e. performing it lazily at run-time, or eliminating it at compile-time. * Performing the work only once and reusing the result. I.e. run-time memoization and compile-time evaluation. I think this was attributed to Mich Wand by Dave Herman, but I can't find the reference. Entry: Java and CPS vs callbacks. Date: Tue Jan 5 10:45:52 CET 2010 I'm working on Android lately, which has a lot of asynchronous message passing going on. Using this without anonymous classes is a pain: the alternative is to extend the calling class with callbacks implementing a particular callback interface. It's much easier to use anonymous objects. This is essentially CPS: call a function, and provide a context it needs to invoke whenever it sends it reply. What this really shows me is the arbitraryness of designing with objects and classes. I think I understand why ``patterns'' are so big in OO: they are essentially an informally specified set of rules to adhere to to not get bogged down in mind numbing low-level decisions. However, the patterns are in the design doc, not in the source code, and the programmer is supposed to recognize them, looking past the boilerplate code. In Functional programming this is less so. It seems that it's easier to abstract away boiler plate code: just add yet another higher order function. Entry: Avi Bryant: Don't build stuff for developers Date: Sat Jan 9 00:27:33 CET 2010 If you want to use all the cool stuff, don't build stuff for developers, because they will get in your face about it. [1] http://2010.cusec.net/01-08/from-cusec-2009-avi-bryant-bad-hackers-copy-great-hackers-steal/ Entry: Applicative Functor Date: Sun Jan 10 11:40:33 CET 2010 To make things more intuitive I'm calling the parameterized data types that are members of the type class Functor and Applicative ``collections''. (I find a fixed-size array most intuitive.) Functor: A functor is a collection that supports an operation `fmap' which maps A SINGLE transformation of elements to a transformation of collections. class Functor f where fmap :: (a -> b) -> f a -> f b Applicative: An applicative functor is a collection that supports the operation `<*>' which maps A COLLECTION of transformations to a transformation of collections. class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b In addition a function `pure' is required that wraps an element into a collection. The operations need to satisfy some laws: pure id <*> v = v -- Identity pure (.) 
<*> u <*> v <*> w = u <*> (v <*> w) -- Composition pure f <*> pure x = pure (f x) -- Homomorphism u <*> pure y = pure ($ y) <*> u -- Interchange In [3] it is mentioned that this can be used for side-effects -- hence the name `pure'. I don't quite get that. [1] http://en.wikibooks.org/wiki/Haskell/Applicative_Functors [2] http://learnyouahaskell.com/functors-applicative-functors-and-monoids [3] http://www.soi.city.ac.uk/~ross/papers/Applicative.html Entry: Recursive make Considered Harmful Date: Tue Jan 12 08:38:55 CET 2010 The general idea: don't (artificially) break up the dependency graph in separate components. If there are _any_ dependencies between different components of a project, a Makefile best describes these in a central way. What I've learned: some key tricks are based on the fact that make is _string based_, and that := and = have different meaning: the first is strict expansion - it is evaluated immediately, and the second is deferred expansion - it saves the string literally, expanding only when triggered trough strict expansion or evaluation of a rule. From [2]: Basically, what's needed to tackle these issues is a variable that tracks the 'current directory' while the source tree is traversed and makefile fragments are included. This variable can then be used in describing dependency relations in a relative fashion, and in the include path for the compiler in build recipes. As far as I understand: using strict assignment (:=) you create some mutables state during inclusion of rule files. Any rules that use these variables get immediately expanded to strings and stored to build the dependency graph. [1] http://miller.emu.id.au/pmiller/books/rmch/ [2] http://www.xs4all.nl/~evbergen/nonrecursive-make.html [3] http://aegis.sourceforge.net/auug97.pdf Entry: Beautiful differentiation Date: Wed Jan 13 08:37:20 CET 2010 I'm trying to understand this[1] magnificent paper by Conal Elliott. Click trough for a video presentation from ICFP 2009. The basic idea is that by using an abstract, recursive definition of the chain rule and some clever function overloading, it is possible to implement AD using very general high level code. The code that can be found here[2] doesn't seem to have the most general version of the chain rule (multiplication replaced by composition with linear map). Or do I miss something? From Dif.hs: -- The chain rule infix 0 >-< (>-<) :: (Num a) => (a -> a) -> (Dif a -> Dif a) -> Dif a -> Dif a f >-< d = \ p@(D u u') -> D (f u) (d p * u') [1] http://conal.net/papers/beautiful-differentiation/ [2] http://conal.net/blog/posts/beautiful-differentiation/ Entry: Can functional programming be liberated from the von Neumann paradigm? Date: Fri Jan 15 08:02:42 CET 2010 I like some of Conal Elliott's ideas, especially his critique on IO in Haskell[1]. The post is about composability. I.e. a "program" with IO is an artificial notion. Really, the unit of composition on that level should be the module (a bunch of functions and data structures augmented with static meaning). Can the "program" be eliminated? I love this: Roly Perera noted that ... you never really need to reach `the end'. (It really is all about composition.) To extend an example in the previous post, after numbers, then strings, and then pixels, are phosphors the end? (Sorry for the obsolete technology.) After phosphors come photons. After photons comes retina & optic nerve. Then cognitive processing (and emotional and who-knows-what-else). Then, via motor control, to mouse, keyboard, joystick etc. 
Then machine operating system, and software again. Round & round. More importantly, our interactions with other wetware organisms and with our planet and cosmos, and so on. Roly added: What looks like imperative output can be just what you observe at the boundary between two subsystems. Which is exactly how I look at imperative input/output. [1] http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm/ Entry: Functional programming with GNU Make Date: Tue Jan 19 08:24:53 CET 2010 GNU make's "call" is Scheme's "apply". However, I don't see if lexically nested procedures are possible. I guess this would require proper quoting/unquoting (functions are represented as strings). More specifically VAR = value <-> (define VAR (lambda () value)) VAR := value <-> (define VAR value) The former is a recursively expanded, while the latter is a simply expanded variable. The analogy with a lambda thunk isn't completely correct, as it depends on how is applied. I.e. VAR = ... $(1) ... $(2) ... <-> (define VAR (lambda (v1 v2) ... v1 ... v2 ...)) To invoke : $(call VAR,arg1,arg2) [1] http://lambda-the-ultimate.org/node/85 [2] http://okmij.org/ftp/Computation/Make-functional.txt [3] entry://20100112-083855 Entry: Common expression elimination Date: Fri Jan 22 14:48:58 CET 2010 Here is a pattern I ran into today. It is somewhat related to loop exchange and memoization. Translated to a Scheme program transformation problem it is: (begin (let ((a 1) (b 2) (c 3)) ...) (let ((a 1) (b 7) (c 19)) ...) (let ((a 1) (b 5) (c 3)) ...)) -> (let ((a 1)) (begin (let ((b 2) (c 3)) ...) (let ((b 7) (c 19)) ...) (let ((b 5) (c 3)) ...))) I.e.: if all the bindings of `a' are the same, pull them out. This is useful for presenting a hierarchical view of a database table. The solution is straightforward, especially if this needs to be done over only one level: transpose the nesting and pull out rows that have the same values. Now, the interesting thing is that the table view comes from a different hierarchical nesting that's been flattened as the variables visible in the deepest nesting level. So then this gives a way to invert certain nested namespaces into another nesting. Entry: Relational lenses and partial evaluation (generating VMs) Date: Sat Jan 23 11:45:17 CET 2010 Is it possible or feasible to formulate the specification of a machine such that different optimizations can be described and implemented by an automated procedure? I.e. A straightforward example is to use a cons list as a value rib during the evaluation of the expressions in a `let' form. This has the advantage of maximal sharing in the case a continuation is captured during the evaluation of one of the expressions, i.e. in <*> below: (let ((a e_a) (b e_b) ;; <*> (c e_c)) (...)) Using a vector to represent a value rib imperatively is more efficient, but requires a copying operation on let/cc to avoid the mutation to have observable side-effects. The value rib is conceptually a part of the environment. The same argument goes for the activation stack. I think I ran into this before, and the pattern is called "lazy stacks". I guess the advantage of a CPS representation is then that there is only one kind of stack: the environment stack. 
Entry: Convert a static library to a shared library Date: Sun Jan 24 13:18:52 CET 2010 Suppose foo.a is made of bar.o baz.o I keep on running into this problem: gcc -shared foo.a -o libfoo.so is not the same as gcc -shared bar.o baz.o -o libfoo.so In the first example, all the objects in foo.a are ignored because nothing depends on them! Entry: Representing control: a study of the CPS transformation. Date: Mon Jan 25 08:22:17 CET 2010 By Danvy and Filinsky[1]. This seems to be an important work to understand where ANF and `shift' and `reset' come from. Main property of CPS term = independence of evaluation order. I.e. it is a sequential program. [1] http://www.cs.tufts.edu/~nr/cs257/archive/olivier-danvy/danvy92representing.ps.gz Entry: Higher order functions in Java Date: Mon Jan 25 10:17:12 CET 2010 The simplest approach seems to be to use a 'forEach' function for each class that lifts a function. I.e. partially applied map/fold. Currently, my main concern is to abstract database queries in Android. There are essentially 3 main strategies to do this: - Eagerly convert to concrete lists/arrays/... - Iterator - universal traversal function (left fold with termination) For database traversal the Iterator abstraction isn't very good as it doesn't have a close() method, which might leak resources. The other two are fine. Inversion of control ala `shift' and `reset' doesn't seem to be straightforward to emulate, so this leaves forEach and lists, which can be generated from forEach. Entry: Type aliases for Java Generics Date: Mon Jan 25 10:33:33 CET 2010 [1] http://stackoverflow.com/questions/683533/type-aliases-for-java-generics Entry: Indulge yourself: Scheme literature Date: Mon Jan 25 16:04:13 CET 2010 From [1]: Indulge yourself: http://library.readscheme.org/ The must reads are Keny Dybvig's thesis, "Three Implementations of Scheme". The original lambda papers can wait until you read the Orbit paper, an optimizing compiler for T by Kranz, Rees et al. The Lisp implementation bibliography pretty much runs through PL research like a vein. Some of the stuff you must read for Lisp are typically in "books"; Christian Quinnec's Lisp in Small Pieces is the most important work, but you will need a good foundation in denotational semantics (you can get by with the one chapter in the little book by Nielson and Nielson, "Semantics with Applications: A Formal Introduction". Somewhere in there you will brush against various compilation methods and IRs for the lambda calculus, most importantly continuation passing style. Most semantics text introduce lambda calculus and its three rules, but none go in depth into this like the tall green book by Andrew Appel, "Compiling with Continuations", a good chunk of which can be read in Appel's other papers. Appel's work is MLish in nature, but don't let that stop you; most optimizing Lisp compilers are MLish down underneath anyway. CMUCL does very good type inference but gets short of implementing a full Hindley-Milner. Felleisen et al's "The Essence of Compiling with Continuations" might also come handy, though it's heavy on the theory. Andrew Kennedy continues the saga with "Compiling with Continuations Continued", this time CPS gives way to A-Normal Form, another IR. He describes the techniques used by a compiler targeting .NET. Most compiling "meat" can be found in the bits-and-bytes type papers. 
Wilson's GC bibliography "Uniprocessor Garbage Collection Techniques" is a must, it should have been called "What Every Programmer Should Know About Garbage Collection". Not to be confused with Richard Jones' "the Garbage Collection Bibliography". Boehm's "Garbage Collection in an Uncooperative Environment" is sheer hacking bravado, perhaps second only to "Pointer Swizzling at Page Fault Time", which should introduce you to memory management for disk-based heaps (i.e. object stores) among other things. Your start in hacking runtimes will probably be David Gudeman's "Representing Type Information in Dynamically Typed Languages"; this is where you learn how stuff looks inside the computer when you no longer need to malloc and free. A previous hacking of a Pascal dialect prepared me for this wonderful paper. Implementations of runtimes are documented by Appel, for SML/NJ, Robert MacLaclahn's "Design of CMU Common Lisp" (also perhaps Scott Fahlman's CMU report on CMUCL's precursor, "Internal Design of Spice Lisp", but that confused the crap out of me as I don't know the machine architecture they're talking about.) You will also enjoy the Smalltalk research starting with L. Peter Deutsch's first optimizing Smalltalk compiler, documented in "Efficient Implementation of Smalltalk-80", follow the Smalltalk lineage btw, all they way up to David Ungar's "The Design and Evaluation of a High Performance Smalltalk System" making sure NOT to ignore Self and its literature, also spearheaded by Ungar (Start your Smalltalk hacking career with Timothy Budd's "A Little Smalltalk", should take you about a weekend and will absolutely prepare you for dynamic languages; a similar system is described by Griswold and Griswold, compiler, intermediate representation and VM, but that one is for ICON.) Dynamic type inference and type-checking (TYPEP and SUBTYPEP, CLASS-OF, INSTANCE-OF, etc) you can learn a good chunk of how CLOS should look like to the runtime system from Justin Graver's "Type-Checking and Type-Inference for Object Oriented Programming Languages". He scratches the surface, and you should supplement this with a selection from Smalltalk and Self, though neither will prepare you for multiple-dispatch, for that peer into Stanley Lippman's "Inside the C++ Object System". I have deliberately avoided "classics" on Lisp, compiler construction, optimization, and other stuff. None of the books and papers I have recommended are as popular as SICP, PAIP, or AMOP. Or even the popular PL books, like EoPL, van Roy and Haridi, both of which you should read by the way, but they're stuff that you need to read and understand to be able to implement a practical Lisp implementation, or at least satisfy your curiosity. More here: http://www.reddit.com/r/programming/comments/9220o/ask_proggit_recommender_a_compsci_paper_for_me_to/ [1] http://news.ycombinator.com/item?id=835020 [2] http://news.ycombinator.com/item?id=834175 Entry: Scheme compilers Date: Mon Jan 25 16:44:09 CET 2010 Slava Pestov[1]: If you compare performance on benchmarks, then Gambit-C and Ikarus are closer to the performance of C, whereas PLT Scheme is a bit faster or equal to Python. I prefer the design of Ikarus over Gambit-C. Compiling to C seems like a big hack on the other hand. Ikarus reminds me of SBCL in a lot of ways, and SBCL's compiler is one of the best dynamic language compilers of all time. Another nice Scheme compiler is Larceny[2]. 
The source is very easy to read, and if you haven't seen a compiler that uses ANF as intermediate representation it's worth checking out. [1] http://www.reddit.com/r/programming/comments/9tek5/were_learning_scheme_in_our_introduction_to/ [2] http://www.ccs.neu.edu/home/will/Larceny/overview.html [3] http://ikarus-scheme.org Entry: Open, extensible object models Date: Sat Jan 30 11:43:26 CET 2010 Everything dynamic. A very nice video presentation here[3], slides and other info here[4]. [1] http://piumarta.com/software/cola/objmodel2.pdf [2] http://piumarta.com/software/cola/ [3] http://www.youtube.com/watch?v=cn7kTPbW6QQ [4] http://www.stanford.edu/class/ee380/Abstracts/070214.html Entry: Name my recursion pattern Date: Sat Feb 13 14:37:50 CET 2010 What is this called: 1. start with a list: [a] and a context c 2. for each a <- [a], map (a,c) -> (a',c') 3. collect [a'] and c' Functor? Applicative functor? Monad? Arrow? a0 a1 ... | | v v s0 -> s1 ... | | v v b0 b1 ... Does it fit in one of the following? class Functor f where fmap :: (a -> b) -> f a -> f b class (Functor f) => Applicative f where pure :: a -> f a (<*>) :: f (a -> b) -> f a -> f b Entry: Recursion and Co-recursion for filters (s,a) -> (s,b) on a list [a] Date: Sat Feb 13 17:54:06 CET 2010 -- CORECURSIVE -- The intermediate/end state s is never observed, so [a] can be infinite. iimap :: ((s,a) -> (s,b)) -> s -> [a] -> [b] iimap fn = f where f s [] = [] f s (a:as) = let (s',b) = fn (s,a) in b:(f s' as) -- RECURSIVE -- State s can be observed, so [a] needs to be finite. This can't be -- written as co-recursion, so we write it as a fold where the results -- are accumulated in reverse. iifold :: ((s,a) -> (s,b)) -> (s,[a]) -> (s,[b]) iifold fn (s,as) = f s as [] where f s [] bs = (s, bs) f s (a:as) bs = let (s',b) = fn (s,a) in f s' as (b:bs) integrate (s,a) = let s' = a+s in (s',s') -- In the RECURSIVE pattern, the fact that the `bs' accumulator is a -- list is irrelevant. In the CORECURSIVE pattern, the fact that the -- result is a list is essential: it is the recursion inside the list -- constructor that makes it possible to present iimap with infinite -- data. Entry: Haskell pointer equality Date: Sun Feb 14 11:29:56 CET 2010 import System.IO.Unsafe import System.Mem.StableName ptrEqual :: a -> a -> IO Bool ptrEqual a b = do a' <- makeStableName a b' <- makeStableName b return (a' == b') termRefEq :: (Eq a) => (Term a) -> (Term a) -> Bool termRefEq x y = unsafePerformIO $ ptrEqual x y Entry: Typeful symbolic differentiation of compiled functions Date: Wed Feb 17 11:20:43 CET 2010 The interesting part about this[1] paper is the `reflect' function: > class Term t a | t -> a where > reflect :: t -> a -> a > > newtype Const a = Const a deriving Show > data Var a = Var deriving Show > data Add x y = Add x y deriving Show > newtype Sin x = Sin x deriving Show > > instance Term (Const a) a where reflect (Const a) = const a > > instance Term (Var a) a where reflect _ = id > > instance (D a, Term x a, Term y a) => Term (Add x y) a > where > reflect (Add x y) = \a -> (reflect x a) + (reflect y a) > > instance (D a, Term x a) => Term (Sin x) a > where > reflect (Sin x) = sin . reflect x ... This is the straightforward emulation of GADT. The function `reflect' removes the `tags' after the symbolic differentiation. Actually, `Sin' is a newtype constructor, so there is no run-time tag to eliminate in this case. Note that the different kinds of terms are types collected in a class, not constructors collected in a type.
This is compile-time type dispatching vs. run-time tag dispatching. This is an interesting trick, and also seems to be at the basis of a lot of later work. So let's have a look at this GADT trick in isolation [2][3]. [1] http://okmij.org/ftp/Haskell/differentiation/differentiation.lhs [2] http://okmij.org/ftp/ML/#GADT [3] http://lambda-the-ultimate.org/node/1293 Entry: Generalized Algebraic Data Type (GADT) Date: Thu Feb 18 08:10:18 CET 2010 From Wikipedia[2]: ... the parameters of the return type of a data constructor can be freely chosen when declaring the constructor, while for algebraic data types in Haskell 98, the type parameter of the return value is inferred from data types of parameters; From HaskellWiki[3]: Generalised Algebraic Datatypes (GADTs) are datatypes for which a constructor has a non standard type. and [4] explains it in mortal terms: why Haskell don't yet supports full-featured type functions? Hold your breath... Haskell already contains them and at least GHC implements all the mentioned abilities more than 10 years ago! They just was named... TYPE CLASSES! Together with the multiparam typeclass extension this gives a way to represent quite powerful type functions. (for "data" constructs) Lack of pattern matching means that left side can contain only free type variables, that in turn means that left sides of all "data" statements for one type will be essentially the same. Therefore, repeated left sides in multi-statement "data" definitions are omitted and instead of data Either a b = Left a Either a b = Right b we write just data Either a b = Left a | Right b ... And here finally comes the GADTs! It's just a way to define data types using pattern matching and constants on the left side of "data" statements! How about this: data T String = D1 Int T Bool = D2 T [a] = D3 (a,a) Amazed? After all, GADTs seems really very simple and obvious extension to data type definition facilities. It seems the main trick that GADTs facilitate is to replace constructor tag matching for sum types with type-level pattern matching. This moves information from run-time to compile time and so can provide more safety. [1] http://lambda-the-ultimate.org/node/1293 [2] http://en.wikipedia.org/wiki/Generalized_algebraic_data_type [3] http://www.haskell.org/haskellwiki/GADT [4] http://www.haskell.org/haskellwiki/GADTs_for_dummies Entry: How to explain Monads Date: Mon Feb 22 15:47:33 CET 2010 Best one until now[1]. Given a couple of functions a -> m b where m is a type constructor, how can one construct a composition of these functions given that the wrapped types b correspond to the input type a of the next function? Solution: Impose two requirements: - m is a Functor, i.e. it has a map function that can lift a -> m b to m a -> m (m b) - there is a function join : m (m a) -> m a that can combine layers The function that computes a Kleisli composition is: (>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c In this picture a Comonad is also simple to understand. Given a couple of functions w a -> b where w is a type constructor, how can one construct a composition of these functions given that the wrapped types a correspond to the output type b of the previous function? Solution: Impose two requirements: - w is a Functor, i.e. it has a map function that can lift w a -> b to w (w a) -> w b - there is a function duplicate : w a -> w (w a) that can add a layer So why does Haskell use bind (>>=) and not join? My guess is that bind allows CPS-style code to look like assignments or list comprehensions. So from a _usage_ point of view, bind seems more natural than join.
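A small sketch of that picture in plain Haskell (standard definitions, just spelling out what [1] describes): Kleisli composition and bind both fall out of fmap plus join, and with bind the chained code reads like a sequence of assignments.

import Control.Monad (join)   -- join :: Monad m => m (m a) -> m a

-- Kleisli composition from the two requirements above: lift with fmap,
-- then flatten the two layers with join.
kleisli :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisli f g = \a -> join (fmap g (f a))

-- bind expressed the same way; this is what (>>=) does.
bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = join (fmap f m)

-- With bind, the chain reads like assignments (here in the Maybe monad):
example :: Maybe Int
example = Just 3 `bind` \x ->
          Just 4 `bind` \y ->
          Just (x + y)        -- Just 7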
So from a _usage_ point of view, bind seems more natural than join. [1] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html Entry: Applicative programming with effects. Date: Wed Feb 24 21:16:14 CET 2010 I'm reading [1] again. It's about ``pure functions applied to funny arguments.'' Some aha's of a concrete-minded Schemer: * map in Scheme is map, zipWith, zipWith2, ... in Haskell. These need to be different functions as they have different type signatures. * The S and K combinators are `ap' and `return' from the environment (reader) monad. The paper says that S & K are ``designed for this purpose''. That's the first time I hear this. But surely, looking at S indeed it applies proto-function and proto-argument to an environment, and applies the resulting function to the resulting argument. Now the concrete-minded mind needs to take a distance from looking at a functor as a data structure over which one maps a function piecewize, and instead see it as a computation. Best to start with monads, as each monad is a AF. What I don't quite get is this idea of "pure function & effects" where the <*> operator combines effects and the `pure' operator lifts a pure function into the effectful domain. Let's start with sequence :: (Monad m) => [m a] -> m [a] just as in the paper. This function takes a list of computations and produces a list of results, threading the monadic effect. sequence [] = return [] sequence (c:cs) = do x <- c xs <- sequence cs return (x:xs) Which can be written differently as: sequence [] = return [] sequence (c:cs) = pure (:) <*> c <*> sequence cs The paper then generalizes this to `traverse'. The key point being that the recursive call is _inside_ the effectful world. Now I can't bridge this explanation with the type signature of an applicative functor: pure :: x -> a x (<*>) :: a (x->y) -> a x -> a y Probably for the same reason that I couldn't see this for Monads in the beginning. My intuitive confusion there was that I was looking at `a' as a data constructor and a type constructor at the same time. To state the (now) obvious: The two lines above are part of a class definition `Applicative a', where `a' is a type variable of kind * -> *, I.e. a parameteric type with one parameter. The `Applicative' is (like) a predicate on type variables. [1] http://www.cs.nott.ac.uk/~ctm/IdiomLite.pdf Entry: The Actor model is not composable Date: Wed Mar 3 11:23:41 CET 2010 I recently ran into this in practice (in Java): a system with a lot of message passing suddenly needed a sequential behaviour. This then lead me to remove all synchronous message passing and use semaphores instead. It talks about something I've tried to hint at earlier: * FP allows for ``exponential'' expressivity: since everything can compose with everything else, the total number of expressable behaviours grows very fast. * Stateful programming allow for ``linear'' expressivity: you can add stuff, but in general it can't be combined as-is with other code. This is of course a black&white picture, but it seems to be true in spirit: it's harder to reach exponential expressivity in stateful languages exactly because of the inertia present in state -- hidden assumptions as mentioned in [1]. [1] http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html Entry: Arrows Date: Sun Mar 7 10:14:27 CET 2010 To understand arrows is to understand their basic combinators[1]. As a concrete example one could think of arrows as a generalization of functions. 
> instance Arrow (->) where > arr f = f > f >>> g = g . f > first f = \(x,y) -> (f x, y) -- for comparison's sake > second f = \(x,y) -> ( x, f y) -- like first > f *** g = \(x,y) -> (f x, g y) -- takes two arrows, and not just one > f &&& g = \x -> (f x, g x) -- feed the same input into both functions [1] http://en.wikibooks.org/wiki/Haskell/Understanding_arrows Entry: Simply Typed LC is Strongly Normalizing Date: Sun Mar 7 21:19:03 CET 2010 From [1]: Given the standard semantics, the simply typed lambda calculus is strongly normalizing: that is, well-typed terms always reduce to a value, i.e., a lambda abstraction. This is because recursion is not allowed by the typing rules. Recursion can be added to the language by either having a special operator of type (a->a)->a or adding general recursive types, though both eliminate strong normalization. Since it is strongly normalizing, it is decidable whether or not a simply typed lambda calculus program halts: it does! We can therefore conclude that the language is not Turing complete. In Haskell and OCaml the type inferencer doesn't like construction of infinite types: :t (\x -> x x) (\x -> x x) To express the Y combinator in this direct form requires it to be wrapped in a recursive type[3] (quotes from [2]): The problem with fix f = (\x -> f (x x))(\x -> f (x x)) is that one needs a solution to the type equation b = b -> a. Fortunately this can be done with Haskell’s data types. > newtype Mu a = Roll { unroll :: Mu a -> a } > fix f = (\x -> f ((unroll x) x)) (Roll (\x -> f ((unroll x) x))) Of course, this is just an academic exercise. To actually define a fixpoint combinator in Haskell, one would use recursive definitions. I.e. the Y combinator can be defined directly in its recursive form: > y f = f (y f) [1] http://en.wikipedia.org/wiki/Simply_typed_lambda_calculus [2] http://r6.ca/blog/20060919T084800Z.html [3] http://en.wikipedia.org/wiki/Recursive_type [4] http://en.wikipedia.org/wiki/System_F Entry: System F vs. Hindley–Milner Date: Sun Mar 7 22:11:15 CET 2010 Hindley-Milner[2] is a restricted form of System F[2]. Type inference for HM is decidable while for System F it is not. [1] http://en.wikipedia.org/wiki/System_F [2] http://en.wikipedia.org/wiki/Hindley–Milner Entry: Writing compilers Date: Sun Mar 7 22:56:47 CET 2010 The basic idea is that a compiler is a function, i.e. implemented by a collection of rewrite rules, that relates one syntactic domain to another. In order to verify correctness of this ``arrow'' the language needs a formal model - another arrow - to compare it with. comp language -----> machine code | | | definitional | machine code | interpreter | interpreter v v abstract domain (i.e. Haskell functions) That's my idea. How does [1] do it? In each pass (14 passes and 8 intermediate languages) the semantics of the language is defined, and the transformation is proven to preserve the semantics. I am only interested in one pass: the reductions used in the PIC18 instruction selection. It looks like I need a Hoare logic[2] for the machine opcodes, i.e. some kind of typing rules for the state transitions, and how they can be composed. So, my question is: in the spirit of tagless interpreters, is it possible to write down the Staapl PIC18 reduction rules such that: * PIC18 Instruction semantics can be encoded in the type as much as possible. I.e. data dependencies. * Remaining semantics (i.e. add or sub) is encoded in behaviour, and can be verified by quickcheck. 
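Not the PIC18 rules, but to fix the idea, this is the general shape
of a tagless encoding (a toy expression language with my own names,
nothing Staapl-specific): the syntax is a type class, and each
semantics is an instance, so the same term can be run against a
reference interpreter or checked with QuickCheck.

-- Tagless-final sketch: syntax as a class, semantics as instances.
class Expr r where
  lit :: Int -> r
  add :: r -> r -> r
  neg :: r -> r

-- Semantics 1: evaluation.
newtype Eval = Eval { runEval :: Int }
instance Expr Eval where
  lit n   = Eval n
  add x y = Eval (runEval x + runEval y)
  neg x   = Eval (negate (runEval x))

-- Semantics 2: pretty printing.
newtype Pretty = Pretty { runPretty :: String }
instance Expr Pretty where
  lit n   = Pretty (show n)
  add x y = Pretty ("(" ++ runPretty x ++ " + " ++ runPretty y ++ ")")
  neg x   = Pretty ("-" ++ runPretty x)

-- One term, two interpretations:
example :: Expr r => r
example = add (lit 1) (neg (lit 2))
-- runEval example   == -1
-- runPretty example == "(1 + -2)"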
Doing this in Scheme as an alternative to encoding it in Haskell/OCaml type system is probably also possible, but I would need to make a checker/inferencer. [1] http://compcert.inria.fr/doc/index.html [2] http://en.wikipedia.org/wiki/Hoare_logic Entry: State breaks composition Date: Sat Mar 13 12:14:53 CET 2010 More correctly it breaks composition of _function_ and requires the less powerful composition of _instance_, which is a combination of function and state (i.e. an object). In Haskell, stateful computations are modeled as state _transitions_ which keeps the model purely functional. s -> (a,s) One builds a program as a (large!) state transition built from primitive state transitions, and then supplies it with an initial state to set it in motion. This keeps composition by abstracting out state completely. External (physical) state is captured in much the same way by the IO monad[3], which relates function composition and real-world state. This synchronizes external at key points with intermediate nodes in the evaluation of a function network. type IO a = RealWorld -> (a, RealWorld) Notes: * Also in the more theoretical work about complexity theory one uses this machine->network transition. In the first lecture in [2] the link is made between turing machines (TM) and combinatorial networks (CN) by ``unrolling'' the turing machine in time to yield a network. The main point being that by mechanical translation, a polynomial time TM algorithm can be converted into a polynomial size CN. The converse does not seem to be proved, but in practice one can usually find a TM for a CN. (``morally equivalent'') [1] http://en.wikipedia.org/wiki/Flip-flop_(electronics) [2] http://sms.cam.ac.uk/collection/545358 [3] http://www.haskell.org/haskellwiki/IO_inside Entry: Ziggurat Date: Sun Mar 14 14:49:41 CET 2010 The example from fig1+2 in [1]. ;;; Fig 1: Creating classes and objects ;; Real number objects are described by a pair of integers (m . e) ;; where the value x is determined by x = m * 10^e (define real-class (make-top-class)) ;; Integer objects are described by a single integer; to instantiate ;; as a real number, use an exponent of 1. (define int-class (make-class (lambda (x) (make-object real-class (cons x 0))))) ;;; Fig 2: Creating methods (declare-method (num->string n)) (method real-class num->string (lambda (n) (let ((data (view real-class n)) (mant (car data)) (exp (cdr data))) (format "~sE~s" mant exp)))) (method int-class num->string (lambda (n) (let ((snum (view int-class n))) (number->string snum)))) ;; Methods are functions that take the object as an argument. The ;; `view' form returns the internal representation of an object. I don't understand: why does int-class call (make-object real-class ...) while it still has access to the integer? I don't get the paper. I find no point to hook on. Maybe some code and interaction would help? [1] http://www.ccs.neu.edu/home/dfisher/icfp06-ziggurat.pdf Entry: Mathematical Logic and Compilation Date: Wed Mar 17 08:37:48 CET 2010 I'm trying to get some intuition straight about specification by compiler. I don't find what I'm looking for on the web, so I guess it's ``obvious'' ;) In formal mathematics and logic, you have syntax (s) and inference rules (s1,s2,... -| s) that allow the construction of new syntax. To alleviate the tedium of working with such a low-level substrate, one allows the construction of (informal, finitistic) mathematical structure that talks about manipulation of formula and proofs. I.e. 
that talks about the existence of proofs using constructive methods: algorithms to construct a proof. Now, in compiler construction one works the other way around: * One starts with a physical model (i.e. an electronic circuit). This can be abstracted by a mathematical (semantic) model. * The objective is then to derive a formal system (syntax and code transformation rules) and an interpretation such that the semantic model is preserved. I probably need to look at Model Theory[1] and Proof Theory[2]. ( I need to grow some more hair on my chest. ) [1] http://en.wikipedia.org/wiki/Model_theory [2] http://en.wikipedia.org/wiki/Proof_theory Entry: Hardware Mapping Date: Wed Mar 17 10:26:47 CET 2010 Let's start at the meta-level[1]: "What are the important problems in my field?" In an attempt to make things more explicit, what are the important problems, and why am I not working on them? What am I actually doing? What is my main goal? Decouple function from implementation in numerical processing. This is translated to the following problem: 1. express numerical processing in mathematics (the DSL) 2. find a way to express hardware mapping There are plenty of examples of 1, so not much re-invention is necessary. There are plenty of examples of 2 also, but this field is really quite broad, and there are many design decisions to make. All I've been doing in the last couple of years is to learn about languages and compilers, and while I did learn a lot, I'm still struggling with making the target explicit. My conclusion up to now is that the 1/2 distinction is better viewed as a continuum, or at least a sequence of steps: 2 2' 2'' 1 ---> 1' ---> 1'' ----> 1''' I knew for a longer time that I'm building a compiler, or more specifically, a method for building multiple compilers. What I'm starting to see now is that this is all about semantics and proof. When you move down the chain from specification to implementation, you want to preserve meaning, or at least, preserve meaning relative to a certain set of conditions that express approximation. It seems I've been looking through the wrong set of glasses. The aspect to focus in is really _correctness_ and not ease-of-use. Actually, building a tower of languages is easy once you know what problem to solve. Writing DSLs becomes second nature with a bit of practice. Scheme, (Meta)OCaml and Haskell are all quite suited to do the job. However, while providing a lot of structure to eliminate silly mistakes, these tools don't solve the correctness problem: ultimately you need to define the beef as low-level computation (i.e. by pattern matching). The hard part is making sure that you preserve the intended semantics facing a mountain of implementation details. The real problem is managing those details, and replace them with a structure that is ``obviously correct''. I see it in the Staapl PIC18 compiler. It consists of an ad-hoc set of transformation rules that define the semantics of a concatenative language in terms of generation and transformation of machine code. And that is the _only_ thing it does. There is nothing that is somewhat structured to actually describe what the compiler is supposed to do, and under what conditions it breaks. So I'm getting really interested in correctness proofs[2], and in adding static semantics to language towers[3]. Looks like I need to start reading again. 
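To make ``preserving the intended semantics'' concrete, this is
roughly the shape such a check takes for a toy language (hypothetical
names, nothing Staapl- or PIC18-specific): a direct evaluator as the
``obviously correct'' reference semantics, a compiler to a tiny stack
machine, and a QuickCheck property relating the two.

import Test.QuickCheck

-- Toy source language and its reference semantics.
data E = Lit Int | Add E E deriving Show

eval :: E -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b

-- A tiny stack machine as the target.
data Op = Push Int | AddOp deriving Show

exec :: [Op] -> [Int] -> [Int]
exec []             s          = s
exec (Push n : os)  s          = exec os (n : s)
exec (AddOp  : os) (a : b : s) = exec os (a + b : s)
exec (AddOp  : _)   _          = error "stack underflow"

compile :: E -> [Op]
compile (Lit n)   = [Push n]
compile (Add a b) = compile a ++ compile b ++ [AddOp]

-- The pass is correct if the compiled code agrees with the reference
-- evaluator on every generated input program.
prop_compile :: E -> Bool
prop_compile e = exec (compile e) [] == [eval e]

instance Arbitrary E where
  arbitrary = sized gen where
    gen 0 = Lit <$> arbitrary
    gen n = oneof [ Lit <$> arbitrary
                  , Add <$> gen (n `div` 2) <*> gen (n `div` 2) ]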
As for the Staapl PIC18 compiler, I'm trying to assess if testing the
compiler by providing an ``obviously correct'' semantics (a reference
implementation) is good enough. It is definitely more trustworthy to
have a correctness proof, but from a practical point of view, a test
suite with broad coverage might be sufficient.

[1] http://www.chris-lott.org/misc/kaiser.html
[2] http://compcert.inria.fr/doc/index.html
[3] http://lambda-the-ultimate.org/node/3179

Entry: The Arbiter Problem
Date: Fri Mar 19 23:53:06 CET 2010

I'm watching the interview with Leslie Lamport[1]. The recurring
subject is the arbiter problem. Essentially: "which came first" is
not solvable in finite time in general as time differences approach
zero.

Then there's some mention about discrete vs. continuous, and time
differences and frequencies (non-discrete entities) used for
information representation in the brain.

Now this makes an old itch surface. I'm far from being able to
express it, but it has to do with sigma-delta modulators (binary
representation of continuous signals) and cross-modulation of
near-equal square waves where arbitrary short pulses can arise.

[1] http://channel9.msdn.com/shows/Going+Deep/E2E-Erik-Meijer-and-Leslie-Lamport-Mathematical-Reasoning-and-Distributed-Systems/
[2] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.88.4426&rep=rep1&type=pdf

Entry: Clojure
Date: Tue Mar 23 18:07:31 CET 2010

Enlightening video interview [1]. Clojure seems quite interesting for
the following reasons:

- Persistent (shared) immutable data structures implemented using
  tries. The tries make it technically O(log(n)) but the very high
  branching factor makes it close to constant-time in practice.

- Some CAS-based synchronization and transaction operations. I.e. the
  `atom' and `ref' constructs.

- Good interop with JVM libs

- Almost hygienic macros? No mention about this in the video though..

[1] http://channel9.msdn.com/shows/Going+Deep/Expert-to-Expert-Rich-Hickey-and-Brian-Beckman-Inside-Clojure/

Entry: Computation (pattern matching) vs. types
Date: Sat Mar 27 13:34:57 CET 2010

Some more non-obvious obvious stuff (NOOS ??). Ha! I have some NOOS
for you!

Types (in the Haskell sense) give finite, static meaning to your
program. All the other stuff to know about a program is about
1. loops (recursion) and 2. making decisions (conditionals) which are
necessary to get out of loops and produce terminating computations.

One of the things that really struck me when starting to use typed
programming languages is that types abstract away `if' statements.
You don't see their effect in a function type. A program looks a lot
simpler when you can abstract away diverging control. A bit less
obvious to me at that time was that they also abstract away
recursion/loops. You don't see in the types that a function recurses
and so possibly doesn't terminate.

In fact, that is what a type system is: what you can know about a
program (its structure) before running it. Knowing its full behaviour
is `undecidable'; it can't be captured before running:

- "infiniteness" comes from recursive functions acting on recursive
  data types = passing through the same control point more than once.

- "decisions" come from conditionals like pattern matching = having
  run-time state influence future control points.

So "what happens inside types" can in general not be determined by
the type system. Again, to turn this around, the type system says
some limited thing about what happens at run time (see the small
example below).
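A tiny illustration: these two functions have exactly the same type,
but one terminates and the other doesn't. The type is silent about it.

f :: Int -> Int
f x = x + 1        -- terminates

g :: Int -> Int
g x = g x          -- diverges; the recursion is invisible in the type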
This limited thing -- the program's static structure -- is the "type". Entry: Functional programs / stateful debugging Date: Sat Mar 27 20:55:05 CET 2010 After some time with Haskell, I'm thinking and writing Scheme code again. Some state re-appreciation maybe. One of the nice things to have is object pools in the form of weak hashes. When you have state (i.e. objects) it usually makes sense to keep track of them to look at program behaviour on the side. A simple approach is to always place objects of a certain kind in a weak hash table, to pay them a visit and see how their doing, or to inject some alternative states. Entry: Bottom up vs. Top down Date: Sun Mar 28 21:47:44 CEST 2010 As a programmer I am a bottom up person. I like to know the details, and build trustable and simple abstractions from the ground up. I slightly distrust top-down design. In bottom up design, high level design elements usually emerge spontaneously, and it is my impression that it is easier to "fix" a bottom up design by feeding patterns back from top to bottom after they have emerged, than it is to fix a top-down design by scraping together abstractions to hide the structure-less details that are pushed to the bottom. [1] http://reprog.wordpress.com/2010/03/28/what-is-simplicity-in-programming-redux/ Entry: A History of Haskell: Being Lazy with Class Date: Thu Apr 8 12:06:42 EDT 2010 about variable free programming: SPJ[1]: It's a bad idea; backus was wrong. I tried that and I found myself doing a lot of plumbing. Sometimes you really want to name that variable. I think Oege de Moor also left this track calling this too abstract (someone mentioned this on LtU). about specific computer architectures: SPJ[1]: It's a bad mistake: 1. why interpret if you can compile? 2. hardware industry moves so fast that it catches up easily to any specific optimizations. [1] http://research.microsoft.com/en-us/um/people/simonpj/papers/history-of-haskell/ Entry: GUIs and modules Date: Sun Apr 11 10:11:52 EDT 2010 Composability vs GUIs. Libraries are nice and composable, an end-user application isn't. Is there a way to bridge this? Pure Data comes close, though it lacks generic expressiveness. Entry: Databases and Normalization Date: Sun Apr 11 15:26:22 EDT 2010 Some DB questions from the complete noob. Suppose I have a relation where all the variables are strings. What do you call the operation that replaces unique strings by identifiers, and creates a new relation between the identifiers and the strings? The reason to do so would be to reduce storage requirements and reduce query computations, i.e. an identifier could be a 32bit or 64bit number, instead of a larger string key. In a functional store this would be object sharing by using pointers. Is this indexing? ORM? Entry: Functional Reactive Programming Date: Sat Apr 17 10:25:04 EDT 2010 I'm building a (naive implementation of) an FRP evaluator to implement the incremental update logic (compiler cache) of the ramblings formatter for the http://zwizwa.be website. Output events are server http requests, while input events are database (file) changes. Because file changes are infrequent compared to http requests, a "compiled" representation where intermediate data are retained in a cache works best. The amazing part is that, yes, it is really all about composition. And for composition, functions are king. Once you have all logic abstracted as a collection of functions, everything becomes a lot simpler to test individually and to string together. 
The reactive part then becomes a "toplevel" wrapped around a large
collection of pure functions that does the real work. I.e. FP allows
complete separation of the functional and the reactive part. This is
great.

The implementation uses lazy evaluation in the direction of
functional dependencies (data pull) combined with event-driven
invalidation in the reverse direction (data push). In Scheme this can
be implemented using weak hash tables; whenever you apply a function
to reactive values, notify each of the values that a computation
depends on it. This can be done by associating each reactive value
with a weak hash table of values that depend on them. Whenever a
value gets invalidated, it can propagate invalidation to all nodes
that depend on it. The weak table ensures that the GC still works for
reactive values.

The main abstraction then becomes function application, or more
specifically: application of pure functions to reactive values, in
zwizwa/plt/rv implemented as `rv-app'.

Entry: Lambda Calculus for Electrical Engineers
Date: Sat Apr 17 12:21:35 EDT 2010

If you look at the lambda calculus, it only ever talks about:

  * variable introduction or abstraction = make a socket
  * variable elimination or application  = this plug goes in that socket

The fact that it uses variables is really not that interesting and
largely a consequence of paper being a flat medium. I.e. the
"essence" of the LC needs to be embedded into something that can be
written on paper as a flattened graph. First flatten the graph into a
tree by introducing variable names, then flatten the tree to a
sequence of symbols in the usual sense by introducing parenthesis and
precedence rules. This horrible notation really makes it look bad and
hides its true simplicity.

Something clicked for me when I made this "paper serialization"
explicit in the way I looked at the LC, letting me see its intrinsic
beauty: an LC expression represents a directed acyclic graph of
computation modules. Now _that_ is something that should make a lot
of sense to an electrical engineer thinking of wires and amplifiers.

Moral: The idea of "connectedness" that can be expressed by
abstraction and application is tremendously powerful.

There is one problem however. Variables in the lambda calculus
represent computations, meaning the only values are other lambda
terms. In some sense data doesn't exist, as there are no primitive
objects that are not computations themselves. This defies intuition,
at least mine. Computation should map things to things, not
computations to computations, right?

Typed lambda calculi fix this by introducing primitive types, for the
simple reason that without them, terms cannot be annotated with
types. One can argue that due to the existence of primitives and
computations, the simply typed lambda calculus might be the best way
to introduce the lambda calculus in a more concrete way.

Entry: Abstract Machines and Semantics
Date: Sat Apr 17 12:22:28 EDT 2010

The LC is a formal system; there is nothing up the sleeves except for
formula on paper and ways to rewrite those formula. Abstract machines
with environments are a more concrete form used to get something that
behaves like one particular form of LC, the call-by-value lambda
calculus (CBV-LC). Environments mainly allow substitution to be
delayed to improve efficiency. I.e. you move from CBV-LC's "global"
substitution rule to a machine that implements its substitution
operation in a more low-level fashion, one step at a time.
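A minimal sketch of that delayed-substitution idea (a toy
call-by-value interpreter with environments; a real abstract machine
such as CEK would also make the continuation explicit, which is left
out here):

import qualified Data.Map as M

data Term  = Var String | Lam String Term | App Term Term
data Value = Closure String Term Env
type Env   = M.Map String Value

eval :: Env -> Term -> Value
eval env (Var x)   = maybe (error ("unbound: " ++ x)) id (M.lookup x env)
eval env (Lam x b) = Closure x b env
eval env (App f a) =
  case eval env f of
    Closure x b env' ->
      let v = eval env a                -- call-by-value: evaluate the operand,
      in  eval (M.insert x v env') b    -- then bind it in the environment
                                        -- instead of substituting into b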
The machine is still a formal system consisting of formula and ways to rewrite those formula. Once you have such a machine, you can start moving into different directions. One popular direction is to move away from the guiding light of the CBV-LC you started from, and represent side effects such as assignments and continuations because they are essentially right in front of you, as part of the machine representation. What is not so easy to understand is the other way around: how can a particular abstract machine that has been soiled by locally implemented side-effects be re-abstracted? The subject of denotational semantics seems to be mostly about this: how to re-abstract things globally as functions that were easy to express as machines locally. These higher abstraction can then yield some insights by making it possible to prove properties of a machine that are completely intractable in the local machine view. Entry: Stream Fusion Date: Sun Apr 18 11:09:09 EDT 2010 About [1]. Simplified, there is a duality in the way sequences can be approached: as lists data or as streams (an unfolding of the list, the list's co-structure). The natural operation over a list is a fold, while the natural operation over a stream is an unfold. A stream is represented as an initial state and a stepper function. Fusing co-structures: The key trick is that all stream producers are non-recursive. This is established by allowing a stream to produce `Skip' values and moving the recursion to the `fold' part of a pipeline. Converting list ops to stream ops doesn't really perform any fusion on its own. However, it transforms the code into a form that is more accessible to the Haskell deforestation optimizer as it has no unnecessary recursion that ``blocks the view''. [1] http://www.cse.unsw.edu.au/~dons/papers/CLS07.html Entry: Clock Calculus Date: Sun Apr 18 19:48:23 EDT 2010 In the SIGNAL paper[1] it is mentioned that a clock calculus is a projection on the field Z_3: 0 = absence, 1 = true = presence, 2 = false. Weird.. The clock calculus allows then to statically verify the temporal correctness of processes. [1] http://www.springerlink.com/index/Y32277G7L8T61748.pdf Entry: Polymorphy & Functors (lifting) Date: Sun Apr 25 11:04:02 EDT 2010 One of the amazing new views that opened up for me after studying Haskell is the ubiquitous presence of morphisms that take some computation from one domain into a richer domain. Often, this can be combined with type classes, making the lift operator automatic. A class - a collection of operations on constrained types - often is defined for a single concrete base type, with other instances of the class built from composite objects. I.e. number -> vector. Now, combine this with laziness and domains can become infinite (i.e. power series, derivative towers, ...) and a lot of the mathematical objects useful for numerics/DSP can be represented quite directly in an abstract way, to be ``instantiated into'' programs using abstract interpretation. Entry: Object identity in Haskell Date: Tue May 4 16:10:29 EDT 2010 One of the weird concepts in Haskell is that objects have no `intrinsic ID'. Pointer equality (an external map that relates memory addresses to language objects) is a side effect! More specifically: objects do not exist, only values. If values need to be compared for equality, this needs to be explicitly implemented as an Eq instance. 
On more than one occasion I've felt the need to think of values as
objects with a distinct identity, especially thinking about nodes in
a graph, and adding connections. Whenever I run into the problem of
needing node identity, what I really want is binding structure, or at
least some staging/macro step that can create binding structure from

What I learned is that even if it might not be trivial, it is usually
possible to keep nodes as variables and write function abstractions
in Haskell that do the same thing. Essentially, using something akin
to higher-order syntax. This is one of those "deep differences" of
functional programming that take some getting used to. I definitely
need some more practice.

Entry: Memoizing (==) in Haskell
Date: Thu May 6 14:24:14 EDT 2010

Necessity for memoization of (==) pops up when comparing recursively
defined datatypes. This is actually an interesting problem, as
memoization is usually based on (==) in the first place!

From [1] I'm pointed to Hughes' lazy memo functions[2]. The
introduction talks about memoization as a fix-up ingredient for very
high level programming, preserving modularity. The `unique' function
implemented in terms of memoized constructors (hash consing) is
probably what I'm looking for. What I didn't realize is that this can
be defined in terms of a generic `memo' function.

Haskell does seem to have a standard memo function[3].

[1] http://conal.net/blog/posts/memoizing-polymorphic-functions-part-one/
[2] http://www.cs.chalmers.se/~rjmh/Papers/hughes_85_lazy.pdf
[3] http://www.haskell.org/ghc/docs/5.00/set/memo-library.html
[4] http://www.haskell.org/haskellwiki/Memoization

Entry: Data Parallel Haskell
Date: Sat May 8 11:13:15 EDT 2010

Simon Peyton-Jones on Data Parallel Haskell[1]. See paper[2][3]:
Harnessing the Multicores: Nested Data Parallelism in Haskell, Simon
Peyton Jones, Roman Leshchinskiy, Gabriele Keller, Manuel MT
Chakravarty, Foundations of Software Technology and Theoretical
Computer Science (FSTTCS'08), Bangalore, December 2008.

Rationale:
- 1000s of processors: you need data parallelism
- flat datapar is not enough
- nested datapar covers a larger set of algorithms

Key problem: handling nested dataparallel algorithms. Key insight: if
you've got the lifted version f^ (a vectorized version of a function
f, f^ = map f), you can implement the doubly lifted version
f^^ = map f^ in terms of it. The basic idea is:

  f^^ = unconcat . f^ . concat

This is flattening. Practically, the `unconcat' and `concat'
functions do not generate intermediate structure; they can be
implemented in constant time without copying.

Representation of an array needs to depend on the types of its
elements: data families. This is where fast `concat' comes from, as
the _representation_ is concatenated.

For higher order functions defunctionalization is used, such that the
environments can be represented as tuples which reduces them to the
data case.

[1] http://www.youtube.com/watch?v=NWSZ4c9yqW8
[2] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/index.htm
[3] http://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/fsttcs2008.pdf

Entry: Map + state threading.
Date: Sat May 8 14:14:27 EDT 2010

One of the patterns I use a lot in Scheme is a structure-preserving
recursion (map) over a data structure where some context is updated
as a side effect. I.e. map over a list with threaded state:

  ((state, in) -> (state, out)) -> state -> [in] -> (state, [out])

What is this abstraction called? See also [1].
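It turns out the standard library already has exactly this shape:
Data.List.mapAccumL, modulo currying of the (state, in) pair. A small
sketch (threadMap is just a local name):

import Data.List (mapAccumL)
-- mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b])

-- The pattern above, with the (state, in) pair curried away:
threadMap :: ((s, a) -> (s, b)) -> s -> [a] -> (s, [b])
threadMap f = mapAccumL (\s a -> f (s, a))

-- Using the `integrate' filter from a few entries back:
--   threadMap (\(s, a) -> let s' = s + a in (s', s')) 0 [1,2,3]
--     == (6, [1,3,6])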
As mentioned in [1], it's really just a state monad which can use the fmap function. The important thing is to see the function not as (s,i) -> (s,o) but as i -> s -> (s,o) which is State when the i is partially applied. [1] entry://../meta/20100224-220400 Entry: State monad with unit output Date: Sat May 8 17:47:57 EDT 2010 What's the purpose of a State monad that doesn't produce output, like (s -> ((),s)) ? The way I use it is to use fmap to map (In -> State s) over [In] to get [State s] which can then be sequenced to State s and started with runState. The only thing I'm interested in is the end state. But without output, this is really just a left fold (accumulator) :: In -> s -> s. What's the benefit of wrapping a fold up into a state monad? Monad transformers? It pops up in the Flatten.hs code for graph -> SSA conversion. Maybe this is related: merging monads and folds[1]. It talks about the two schools: fold vs. monads. [1] http://www.springerlink.com/index/768043006044675P.pdf Entry: Left folds in Haskell Date: Sat May 8 19:29:20 EDT 2010 Performing a left fold in Haskell can lead to stack overflows. Therefore it is suggested to use the strict function foldl' from Data.List instead. Does the same problem happen with State monads? [1] http://haskell.org/ghc/docs/6.12.1/html/libraries/base-4.2.0.0/Data-List.html#v:foldl Entry: Syntax directed vs. Semantics directed (fold vs. monad) Date: Mon May 10 14:24:05 EDT 2010 Merging monads and folds for functional programming[1]. For this Scheme nut & Haskell noob, the more interesting remarks are in the introduction which uses some terminology I wasn't familiar with: folds - syntax directed - organize on input types monads - semantics directed - organize on output types Here "syntax" refers to the structure of the data types. Generalized folds can be constructed systematically by replacing constructors with functions. (I.e. for the list constructor Nil this is a 0-argument function or a value). I briefly skimmed the paper; the basic idea seems to be that it's possible to define fold-like operators for monads and have the best of both worlds. Now I think I also understand why I have a natural affinity towards the fold approach, and have difficulty thinking in monads: * Scheme is strict and impure; monads are not really necessary. It's often useful to combine context/state with fold/map operations which are easy to express. * To a lesser extent, in a dynamically typed language, polymorphism can be used dispatching on inputs, while a statically typed language can dispatch on outputs. This is why types + monads go well together. One thing I'm interested in more is a monadic map, i.e. something that lifts (s,i)->(s,o) to (s,[i])->(s,[o]). How this this fit the bill? As a special case of monadic fold? [1] http://www.springerlink.com/index/768043006044675P.pdf [2] http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Data-Foldable.html Entry: Tree Grafting (Monads) Date: Mon May 10 19:03:40 EDT 2010 Dan Piponi's monad post-tutorial[1]. Quote from Oleg Kiselyov: ``Monads turn control flow into data flow where it can be constrained by the type system.'' [1] http://blog.sigfpe.com/2010/01/monads-are-trees-with-grafting.html Entry: S K combinators and the Reader monad Date: Tue May 11 08:50:46 EDT 2010 The S and K combinators form a complete set of combinators that can encode all lambda terms. The proof consists of a mechanised transformation T explained in [1]. The trick is in the process of abstraction elimination in rules 5 and 6. 5. 
T[λx.λy.E] => T[λx.T[λy.E]] (if x occurs free in E) 6. T[λx.(E₁ E₂)] => (S T[λx.E₁] T[λx.E₂]) Abstractions provide a means to access values passed in by applications in "outer shells" of a lambda term using their name. The S combinator works the other way around, it can be interpreted to operate on binary trees that correspond to applications, propagating values to branches of a binary tree. I.e. the S combinator passes some values "under water". Starting from lambda terms, this binary tree is created by the T transform, where each application creates a fork point. Rule 5 makes sure that each abstraction is directly followed by an application, and rule 6 eliminates the abstraction by representing it as an S combinator. Short: all abstractions can be represented by passing values down branches of a binary expression tree. From this perspective it is not surprising that S and K pop up as (<*>) and return from the Reader (environment) monad. newtype Reader e a = Reader { runReader :: (e -> a) } instance Monad (Reader e) where return a = Reader $ \e -> a (Reader r) >>= f = Reader $ \e -> (runReader $ f (r e)) e Compare these to S and K. The K combinator is a non-tagged version of Reader's return. k :: a -> e -> a k x e = x The S combinator is a non-tagged version of Reader's (<*>) s :: (e -> a -> b) -> (e -> a) -> e -> b s x y e = (x e) (y e) (<*>) :: (Applicative f) => f (a -> b) -> f a -> f b [1] http://en.wikipedia.org/wiki/Combinatory_logic#Completeness_of_the_S-K_basis [2] http://en.wikipedia.org/wiki/SKI_combinator_calculus Entry: What does that mean? -- Denotational semantics Date: Thu Aug 5 19:33:57 CEST 2010 Conal about meaning[1]: ``In software design, I always ask the same question: "what does it mean?". Denotational semantics gave me a precise framework for this question, and one that fits my aesthetics (unlike operational or axiomatic semantics, which leave me unsatisfied).'' He then mentions Christopher Strachey[2] and Dana Scott[3]: ``Beware that denotational semantics has two parts, from its two founders Christopher Strachey and Dana Scott: the easier & more useful Strachey part and the harder and less useful (for design) Scott part.'' It seems that the Scott part is Domain Theory[4]. What is the Strachey part? [1] http://stackoverflow.com/questions/1028250/what-is-functional-reactive-programming/1030631#1030631 [2] http://en.wikipedia.org/wiki/Christopher_Strachey [3] http://en.wikipedia.org/wiki/Dana_Scott [4] http://en.wikipedia.org/wiki/Domain_theory [5] http://en.wikibooks.org/wiki/Haskell/Denotational_semantics Entry: Models of dataflow Date: Sat Aug 7 08:54:11 CEST 2010 I know of 4 different ways of looking at dataflow: * Functional dataflow (FRP): nodes are functions of time, functionally related. * Channels and Processes (CSP): nodes are programs reading from and writing to channels. * Synchronous state space models (SSM): functions transfer (state,input) into (next_state,output). * The Observer pattern: objects get notified of state changes of other objects through a notify() method call. In my current problem, the big issue seems to be to define the meaning of an "event" and "state". In FRP an event seems to be more of an implementation issue; i.e. how to represent the functions that make up the meaning. In CSP, an event is very explict: a write operation that triggers the "unlocking" of its corresponding read. Each process can have local state. In SSMs there are no events, only data changes, but there is a concept of "current state". 
In the observer there are explicit states and explicit events, but not necessarily parallelism as in CSP. The observer pattern can get messy, as it has very little high level structure apart from message passing. My question seems to be: Is an "event" necessarily "operational?", meaning here: is it related to a state transition (or is it a state transition)? On to the practical: I'm implementing a Ractive Network[1] in an environment that includes stateful objects that follow the Observer pattern. The approach I use is invalidation + lazy evaluation (I/LE). * input: write causes all nodes that recursively depend on the written node to be invalidated. * output: read recursively (re-)computes all nodes that have been invalidated. The main question for my application is: how much do we gain (and loose) from using a reactive pattern vs. a more low-level and ad-hoc observer approach? As hinted in [2], the main issues are algorithm complexity and granularity. If the granularity is large, the algorithm complexity might not really matter. So.. Can we take the best of both worlds? How can this be tied into an Observer pattern in a correct (and efficient) way? By introducing strict evaluation: a read transaction initiated after invalidate caused by a write transaction. In the I/LE implementation we still have events in the true OO sense: node invalidation. The trick is to propagate them correctly from network inputs to network outputs. Inside the network we have the benefits of the I/LE model (dependency management + linear evaluation complexity), outside we have the benefit of Observer: a clear (strict not lazy) event semantics. Some more remarks.. * About Pure Data. The Pd design has hot/cold inlets to "be done" with synchronization problems. Using it is not always easy (the trigger object) but it does have a very simple meaning: a patch is a sequential program. * Strict semantics seems to mesh a lot better with OO design. In the I/LE model, it seems best to have every write trigger a read, such that there is a direct path from write -> event handler. The 2-phase algorithm is still useful to avoid exponential complexity, but the lazy semantics is too hard to keep right when it's used in an imperative environment, by people used to OO programming. [1] http://en.wikipedia.org/wiki/Reactive_programming [2] http://en.wikipedia.org/wiki/Reactive_programming#Similarities_with_Observer_pattern Entry: Condition variable vs. semaphore Date: Thu Aug 12 12:05:35 CEST 2010 In PDP I used condition variables to signal queue writes. This seems to be incorrect. Semaphores are actually a lot better for managing work queues. First, they are simpler to use, but second they also can ensure that no events are missed. I.e. during the handling of a changed condition, the condition might change multiple times, which is missed by the handling thread. Entry: Coroutines Date: Sat Aug 14 09:14:32 CEST 2010 Relation between coroutines and one-shot (partial) continuations. This comes up very naturally in the implementation of PF: the continuation is a linear data structure that is transformed and consumed at runtime, while non-linear code is "ROM", i.e. constant to the linear core. The PF compiler (meta-system) is non-linear for a good reason: entirely ephemeral. Code is linear to mesh better with hardware, which is a finite resource. 
[1] http://lambda-the-ultimate.org/node/2868
[2] http://lambda-the-ultimate.org/node/803
[3] http://lambda-the-ultimate.org/node/438#comment-3228
[4] http://lambda-the-ultimate.org/node/558

Entry: Graphs without mutation
Date: Thu Aug 26 11:24:28 CEST 2010

I'm facing the following problem:

* Construct a graph data structure (pointers in C structs) without
  using mutation in the sub-structures. This would need some kind of
  "magic" Y combinator-like operation.

* Represent a graph using a non-cyclic _constant_ data structure, so
  no zipper-like tree rewriting that requires new constructors to be
  called.

Entry: Treasure trove: Faré Rideau's pointers
Date: Sat Sep 18 14:52:38 CEST 2010

[1] http://fare.tunes.org/pointers.html

Entry: Always-on / Image-based computing
Date: Sun Sep 19 19:33:04 CEST 2010

What does it mean to take a snapshot of a memory image? (I.e. OS
hibernate). The bottom line is that memory structure is simple, but
if memory points to external (non-memory) resources, one gets a hairy
problem of re-initialization on bootup.

Is it possible to design a computer that does not have any external
references? I mean, start at the hardware: is it possible to design
hardware without hidden state? Meaning, all state is exposed as RAM.
Probably not realistically.. However, it should be possible to at
least isolate initialization and limit it to an absolute minimum.

What does this mean? Is initialized hardware maybe an "intermediate
state", and is the real, natural state of hardware the OFF position?
Is an interrupted machine an interrupted transaction that can simply
be restarted? Can hardware initialization be seen as a transaction
that never finishes?

What about this:

- An OS image (a "soft" object) is a graph structure represented in
  RAM which is transformed by an interpreter (CPU).

- Some of the leaves in this tree are opaque objects that represent
  connections to the outside world (stateful objects): ports.

- Opaque ports have a single method: they can be initialized.
  Necessary parameters for the initialization are present in the
  image and are transparent.

- To reflect real-world scenarios, ports can depend on each other.
  I.e. a port initializer can be parameterized by another, low-level
  initialized port.

The point is that instead of seeing the whole OS as an opaque object,
it might be beneficial to reduce the granularity of opaqueness.
Booting is "compilation", and hibernation and snapshotting is caching
of compiled results.

Moral: it takes a long time to restart a whole system. It takes
significantly less time to restart only the non-memory resources of a
system. Most of the beef in modern computer systems is in the in-RAM
data structures, not the hardware configuration. Rebooting mostly
rebuilds those data structures from more primitive (serialized,
non-linked) representations.

Entry: Mark-sweep GC
Date: Sun Sep 19 20:44:51 CEST 2010

Trade-off: time vs. space. Apparently it's quite a bit slower than a
copying collector. How does it compact? What about mark-compact?
[1][2][3].

[1] http://en.wikipedia.org/wiki/Mark-compact_algorithm
[2] http://comjnl.oxfordjournals.org/content/10/2/162.full.pdf
[3] md5://43bfb9905329b1cac86ec1391efe5e67

Entry: void events
Date: Sat Sep 25 11:25:37 CEST 2010

I'm building a reactive programming engine for a consulting project.
An interesting concept I keep running into is that of "void events".
Let's define an event as a (time stamp, value) pair. Reactive
programming can then be seen as functions defined on events.
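In types, just to pin down the vocabulary (a sketch only; the actual
engine is an OO implementation, not Haskell):

type Time    = Double
type Event a = (Time, a)      -- an event: a time stamp plus a value

-- Reactive behaviour as functions on events.
type Reaction a b = Event a -> Event b

-- A "void event" carries no payload besides the fact that it happened.
type VoidEvent = Event ()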
My implementation is strict (non-lazy) to allow integration of side
effects for coupling with the surrounding OO system. This definition
as pairs is a slightly more concrete definition than Conal's
"functions of time" definition[1][2]. The main reason is that I don't
see how to otherwise add strict side-effecting code.

One could see the pairs as a piecewise constant representation of
continuous functions. I.e. this helps reasoning about combinations of
events with different time stamps: just think "what would the
continuous function do?".

Now, think of the qualitative difference of these two entities:

1. A push-button measurement. 1 = pressed, 0 = released
2. A stream of push events.

The second one is an abstraction of the first one, but it is
radically different as there is no longer a piecewise continuous
function associated. The question is: is this thing "real" or an
artifact of some modeling confusion?

In essence, number 2 is a variant of the Dirac Impulse[3], a
generalized function[4] that allows the bridge between continuous
functions and discrete structures. A Dirac Impulse represents the
derivative of a step function, allowing for discrete state updates as
an integral of impulse.

In my system they seem to serve the same purpose: a void event
resembles a state change (a derivative) in the cases where it is not
necessary or possible to indicate "by how much" the state changes. It
thus seems to be an artifact of the side-effecting OO integration.

[1] http://conal.net/blog/posts/why-program-with-continuous-time/
[2] http://stackoverflow.com/questions/1028250/what-is-functional-reactive-programming/1030631#1030631
[3] http://en.wikipedia.org/wiki/Dirac_delta_function
[4] http://en.wikipedia.org/wiki/Generalized_function

Entry: Values and Transitions
Date: Mon Sep 27 11:00:50 CEST 2010

This places the previous post[1] in another light. If you want to
model a dynamical system (a system with internal state), you need to
have concepts of both state and transition.

For physical (Newtonian) dynamical systems this is very obvious: you
always need two state variables: position and velocity.

For discrete systems there is a clear analogy. To be able to detect
and compute in terms of value change, one needs to have access to the
_previous_ value, so there are also two state variables: current and
previous position, or current and difference.

[1] entry://20100925-112537

Entry: Pointer Reversal for Graph Traverse
Date: Fri Oct 8 00:52:21 CEST 2010

When traversing a graph (i.e. mark phase of a GC) it is possible to
encode the traversal stack in the graph data structure using "pointer
reversal". ( Related to zipper? )

[1] http://www.cs.arizona.edu/~collberg/Teaching/520/2005/Html/Html-39/index.html

Entry: Real Time GC
Date: Fri Oct 8 01:18:41 CEST 2010

Essentially, the problem with RT GC is that you need to guarantee
that the collector will always catch up with the mutator. If there is
enough memory to spare (buffer) then this can effectively work in
practice, though it doesn't seem that hard guarantees can be
obtained. For systems with little spare memory this poses a problem.

[1] http://www.cs.wustl.edu/~mdeters/doc/slides/rtgc-history.pdf

Entry: Refactorer: dependency graph visualisation
Date: Sun Oct 10 15:52:26 CEST 2010

Refactoring is mostly about changing dependencies between code. It
would be so cool to have a tool that can be used to _visualise_
dependency graphs of a whole project, such that changes can be made
_while walking in that virtual space_.
Entry: Pure functional programming and object identity
Date: Tue Oct 19 10:35:32 CEST 2010

One of the most difficult ideas to let go of, I find, is object
identity. It pops up in my expression to SSA conversion code that
needs to process code as a graph. The usual imperative approach (is
this node "x" ?) doesn't seem to work very well.

Semantically there is no problem: equality is well defined. Identity
isn't necessary for that. However, identity makes it easier to
_implement_ equality. There I think I still need some experience to
see how this would be handled correctly.

( Obscure maybe : ) This seems to be a representation problem.
Representing a graph as a list of named nodes makes implementation of
equality trivial and O(1). However, mapping an original expression to
a dictionary does require a full traversal. The problem is to keep
dictionaries in sync: node names are _centralized_ data. I.e.
combining 2 expressions needs a merge of dictionaries, and they might
have different names for the same nodes.

Entry: fexprs
Date: Tue Oct 26 00:22:37 CEST 2010

The ultimate tension between static and dynamic.

Looking from afar, the only argument that I distill from the static
"eval-is-bad" camp is that reasoning about code is complicated by
late-bound semantics. The argument from the dynamic "eval-is-good"
side is that late-bound semantics are the most flexible (and simple?)
starting point and thus preferable as a language base.

( Macros float a bit in the middle between the fully dynamic
Smalltalk approach where meaning is completely defined at run time,
and the fully static typed functional languages where a large part of
the meaning of code (types) can be used at compile time. I.e. in
Racket, macro bindings are always well defined (a macro name maps to
a precise function that is known at compile time) but the way it
transforms code does not preserve any other invariants. )

As Dave puts it [4]:

  Fexprs are bad for two reasons: they make the language hard to
  compile efficiently and they make programs hard to understand by
  subjecting the basic program definition to dynamic reinterpretation.

Thomas Lord's comment[5] is quite interesting though. Also check out
the Kernel[6] programming language.

[1] http://kazimirmajorinc.blogspot.com/2010/10/on-pitmans-special-forms-in-lisp.html
[2] http://en.wikipedia.org/wiki/Fexpr
[3] http://lambda-the-ultimate.org/node/3861
[4] http://calculist.blogspot.com/2009/01/fexprs-in-scheme.html
[5] http://lambda-the-ultimate.org/node/3861#comment-57967
[6] http://web.cs.wpi.edu/~jshutt/kernel.html
[7] http://lambda-the-ultimate.org/node/3861#comment-57972

Entry: Algebraic Datatypes
Date: Thu Dec 2 11:15:54 EST 2010

An ADT constructor application is a function application that can be
undone. This is a simple and powerful idea.

Entry: Representation of impedance/admittance duality
Date: Thu Dec 2 11:42:46 EST 2010

-- Instead of having to perform numerical inverse all the time, why
-- not represent numbers as (x, 1/x), where possibly one of the two is
-- not given?
-- EDIT: This turned out to be quite interesting: dual.hs
-----------------------------------------------------------------------
import Data.Complex

---- Dual representation of 2-terminals.

-- "Half" two-terminal: 2 primitive elements, one composition.
data HTT a = Resistive a              -- V = R I       ; I = G V
           | Reactive a               -- V = L dI/dt   ; I = C dV/dt
           | Composite (TT a) (TT a)  -- V1 + V2 (ser) ; I1 + I2 (par)
           deriving (Show, Eq)

-- A two-terminal is a dualized HTT.
data TT a = Primal (HTT a) -- keep duals | Dual (HTT a) -- flip duals (invert) deriving (Show, Eq) dual (Primal htt) = Dual htt dual (Dual htt) = Primal htt -- The main idea of the dual representation is generality. One -- consequence is that building a s-parameterized -- impedance/admittance/transfer function is straightforward. trans :: (RealFloat a) => (TT a) -> (Complex a) -> (Complex a) trans = flip a'' where a'' s = a' where a' (Dual htt) = 1 / (a htt) a' (Primal htt) = a htt a (Resistive r) = r :+ 0 a (Reactive r) = (r :+ 0) * s a (Composite x y) = (a' x) + (a' y) ---- Absolute representation of 2-terminals in terms of dual rep. -- To represent networks we need to tag the TT to indicate which one -- of Impedance or Admittance it represents. data ATT a = Impedance (TT a) | Admittance (TT a) deriving (Show, Eq) -- Primitive absolute elements are constructed as I/A tagged Primal TTs. -- Resistive: Impedance / Admittance independent of frequency. res r = Impedance (Primal (Resistive r)) -- resistor (Ohms) cnd g = Admittance (Primal (Resistive g)) -- conductor (Siemens,Mhos) -- Reactive: Impedance / Admittance proportional to frequency. ind l = Impedance (Primal (Reactive l)) -- inductor (Henry) cap c = Admittance (Primal (Reactive c)) -- capacitor (Farad) -- Project ATT to TT, interpreting it as impedance or admittance. imp (Impedance tt) = tt imp (Admittance tt) = dual tt adm = dual . imp -- Primitive absolute 2-terminal operations are constructed in terms -- of a Primal Composite operation. ser a b = Impedance (Primal (Composite (imp a) (imp b))) par a b = Admittance (Primal (Composite (adm a) (adm b))) j x = (0 :+ x) -- Voltage divider divider a b s = a' / (a' + b') where a' = trans (imp a) s b' = trans (imp b) s Entry: Understanding Referential Transparency Date: Fri Jan 14 14:52:28 EST 2011 If you look at the definition of Referential Transparency[1] (RT) which says that a computation can always be replaced by its value, it seems to be relatively straightforward. What I found really hard to understand is why this does away with object identity. The `eq?' function in Scheme which compares pointers has no place in Haskell, because it has a result that depends on whether its arguments come from the same evaluation (value copied) or not (value obtained by running the same computation twice). In essence this means that a value does not have an implicit, unique name. A value is just a value and nothing more. To name values, a function needs to be defined that associates a name to a value or the other way around. Of course, pointer comparison does show up in the Haskell trenches, because it is too useful for implementing some low-level forms of memoization that would otherwise lead to exponential complexity. See makeStableName[3]. It soils your code with the IO monad though, unless you go down the bumpy road of unsafePerformIO[4]. [1] http://en.wikipedia.org/wiki/Referential_transparency_%28computer_science%29 [2] http://stackoverflow.com/questions/1717553/pointer-equality-in-haskell [3] http://www.haskell.org/ghc/docs/7.0.1/html/libraries/base-4.3.0.0/System-Mem-StableName.html [4] http://www.mail-archive.com/haskell-cafe@haskell.org/msg52544.html Entry: Functional Programming is Fantastic Date: Sat Feb 12 10:45:23 EST 2011 It is great to be able to completely isolate parts of a program without _any_ worry of whether you missed a covert communication channel or side effect. On the other hand, this can be a bitch to program. 
It makes you think about how much you normally use covert channels
and side effects to fit a square peg into a round hole.

Entry: Merging
Date: Wed Mar 2 11:08:17 EST 2011

In the back of my head I've gotten interested in the merging problem.
Mostly because of conflicts in source control, but also as a general
idea of data updates and the `what is a change' question.

I ran into Pierce's bidirectional programming work[2] before. Seems
there is also a sync utility called Unison[3] based on similar ideas.

[1] http://apps.ycombinator.com/item?id=2266071
[2] http://lambda-the-ultimate.org/node/2828
[3] http://www.cis.upenn.edu/~bcpierce/unison/

Entry: Filling in gaps
Date: Wed Mar 30 14:27:33 EDT 2011

Programming is filling in unspecified gaps. Experience helps to not
make too many bad choices.

Entry: 0,1,2 stacks : 3 kinds of programming?
Date: Wed Apr 6 23:02:25 EDT 2011

These are not the same machine.

0 = FSM (regular languages)
1 = PDA (push-down automaton / context free grammars)
2 = Turing equivalent

[1] http://en.wikipedia.org/wiki/Pushdown_automaton

Entry: Parser-oriented programming
Date: Sat Apr 9 10:38:31 EDT 2011

The USB driver is moving forward. I found a way to use structs in
Forth, by turning them into streams, and writing parsers.

The golden rule seems to be: don't use data structures in Forth, use
streams, tasks and/or state machines. The point-free style works
better with ``parser-oriented'' programming. Or stream-oriented if
you want. Optimize the protocols to make use of this. I.e. a simple
trick to reduce memory usage is to always prefix the size of an array
instead of using a termination condition. (Pascal strings vs. C
strings)

If you think of it, data structures are only postponed execution.
This is very apparent in a language like Haskell: think of the
deforestation optimization where constructors and pattern matching
get combined to eliminate the data structure entirely[1]. On an
embedded platform this is even more true since you really are more
interested in process and IO than for a normal computer, which would
be data-storage central.

EDIT: Found something on LtU that advocates quite the opposite[2].
The thing is that time in computation is not of the same quality as
space: it doesn't support random access (until it's buffered, i.e.
"rotated" into space).

EDIT2: It's an implementation issue: what kind of high-level
description will produce a low-memory implementation?

[1] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html
[2] http://lambda-the-ultimate.org/node/925

Entry: Reforestation
Date: Fri Apr 15 14:35:59 PDT 2011

Instead of starting from a functional description and "hoping" all
constructors can be optimized out, is it possible to start from
guaranteed elimination, and see what subset of a high-level language
fits on top of that? This has always been the core of my quest. I
suppose this is the Hume project[1].

EDIT: Reminds me of some of the stuff I read around PEGs: a
parser-centric view instead of a language-centric view, because what
you care about is not so much properties of the language itself
(grammar, generation) but properties of implementation and efficient
information flow.

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.1710&rep=rep1&type=pdf

Entry: Protocol-oriented programming
Date: Sat Apr 23 00:42:20 EDT 2011

Staapl, PIC18 USB driver: I got it into my head that I want to
program in Forth without using data structures (random access data)
but just using linear time-ordered streams or channels.
Forth allows very dense code construction. This is a direct consequence of its implicit data access: simply not having to store memory locations saves space. The price you pay is extra effort necessary to properly factor code. It becomes very important to have a procedure "do only one thing". This is a skill that can be learned, and in practice it works amazingly well. The trouble is, that approach doesn't work well with structured data (trees and graphs). The obvious reason is that such data representation requires the re-introduction of names or random access. Is it possible to do a similar "compression" for data? The idea is that if it's possible to use local access variables instead of random access variables in Forth code, it should be similarly possible to replace all data structures with serial, uni-directional protocols. The underlying idea is the operation of deforestation that is possible in pure functional programming. Deforestation eliminates constructor+match pairs and rewrites the code's control flow to turn such a pair into a function call or let-abstraction. The big idea here is that data structures are really just postponed function calls, or "buffers". Depending on the fan-out of the data structures (= the number of times a certain structure is used) it is often possible to not store intermediate data structures in RAM but push components to consumers directly. In practice this can often be done by introducing parallelism. I would assume this doesn't work for all kinds of code. I.e. code that has intrinsic "data storage" features, such as a database. However it should be possible to employ the principle in cases where data use is just buffers or temporary storage. In that case it should be possible to eliminate it with different control-flow factoring. One of the drawbacks of this approach seems to be that the protocol definition should be part of the design process. Just as function prototypes are really important to write good Forth code, serial data protocols should be optimized for ease of parsing. It seems better to call this approach "parser-oriented" programming: make sure that protocols between different components are defined in such a way that local resource requirements are minimized. In my current problem (USB driver for PIC18) I'm facing the problem that the input data structures are fixed, and clearly defined with a random-access approach in mind (think C structs). So to make this work in practice it is probably necessary to preprocess the input to turn it into a usable event stream. So what is the really big idea? Make data dependencies explicit. Entry: Low-level C : Exceptions or error codes? Date: Wed Apr 27 10:27:52 EDT 2011 The trouble with exceptions is memory management. If all data is stored in local variables on the call stack, exceptions implemented with setjmp are not a problem. If at any point there is non-local state manipulation, be it memory allocation or any other global state update, exceptions are very hard to get right, and incremental error passing that can undo any global changes is probably a better idea. Entry: Goldmine Date: Thu May 5 00:34:19 EDT 2011 [1] http://homepages.kcbbs.gen.nz/tonyg/projects/thing.html Entry: Object Identity Date: Thu May 5 13:16:47 EDT 2011 Baker on Object Identity[1]. 
I can't keep these two apart, so here's some examples: - extensional def[2]: structure fully specified, exhaustive - intensional def[3]: implementation hidden, only properties specified [1] http://home.pipeline.com/~hbaker1/ObjectIdentity.html [2] http://en.wikipedia.org/wiki/Extensional_definition [3] http://en.wikipedia.org/wiki/Intensional_definition Entry: Haskell: functions vs. structures Date: Mon May 9 09:32:38 EDT 2011 Something funny happens when data structures are immutable: data and code become more alike. The difference between a (immutable) structure and a (pure) function is that a structure is like a function where there is a time-disconnect and multiplicity-disconnect between call and function entry, meaning that data pasted in a structure will be interpreted later, and possibly multiple times. Data pasted in a function will be interpreted immediately and only once. If data structures are consumed only once, and the context of data interpretation is explicit, the difference almost disappears and it is often possible to re-arrange code such that constructor/deconstructor pairs can be simplified into function calls. This is called deforestation[1]. [1] http://homepages.inf.ed.ac.uk/wadler/topics/deforestation.html Entry: Simple checksums Date: Sun May 22 17:31:14 CEST 2011 On a tiny uC, which simple checksums are most effective? The common ones are: - add all bytes together - perform XOR Without really thinking, I'd say that XOR is worse since it doesn't really "smear out" the errors: all 8 bits are just independent parity checkers, though there are 8 of them. Entry: Fault-tolerant, stateful code Date: Tue Jun 14 16:13:08 CEST 2011 There seems to be only one guiding principle: keep the invariants of the data structure as simple as possible. It seems to make sense to split the problem of fault recovery into two parts: - temporary (local) inconsistency due to transient faults These are quite easy to handle by simply retrying the operation. - permanent inconsistencies due to permanent state mutations These are really hard if there is no redundancy to bring the state back to consistency, or if the invariants are simply too complicated to "try to be smart". Entry: intension / extension Date: Thu Jun 16 11:50:59 CEST 2011 Seems the way this is used in CS is due to Church's lambda calculus[1]: In developing his theory of lambda calculus, the logician Alonzo Church (1941) distinguished equality by intension from equality by extension: It is possible, however, to allow two functions to be different on the ground that the rule of correspondence is different in meaning in the two cases although always yielding the same result when applied to any particular argument. When this is done, we shall say that we are dealing with functions in intension. The notion of difference in meaning between two rules of correspondence is a vague one, but in terms of some system of notation, it can be made exact in various ways. In the previous section[2] it is said that: The rule that defines a function f:AB as a mapping from a set A to a set B is called the intension of the function f. The extension of f is the set of ordered pairs determined by such a rule: [1] http://www.jfsowa.com/logic/math.htm#Lambda [2] http://www.jfsowa.com/logic/math.htm#Function Entry: Referential transparency and object identity Date: Sun Jul 10 14:48:29 CEST 2011 An expression e of a program p is referentially transparent[1] if and only if e can be replaced with its evaluated result without affecting the behavior of p. 
This is a very strong requirement and completely destroys the ability to use object identity. From [2]: Identity is a property that an object may contain aspects that are not visible in its interface. These "aspects" might simply be references by other objects, as the just the act of referencing already creates a "hidden" relationship between the pointing objects. This relationship cannot be expressed by any value that would substitute the object. An interesting application of this is that while it is possible to recover input->output dependency information for Haskell functions of the Num class through abstract interpretation, it is impossible to recover _internal_ sharing information from a Haskell program by observing just the input->output behaviour of that program. Here "sharing" refers to bindings defined by `let' and `where' forms that have more than one reference, thus producing a dependency graph instead of a tree. In short: values do not encode who they are used by. In the presence of sharing, the output of such an abstract interpretation will be a tree with duplicate nodes. With careful input prepration, external input nodes can be unified using equality in a straightforward way, but internal nodes need to be "scanned" based only on structural equivalence of their dependency graph that traces back to the input nodes. This is really just common subexpression elimination and has not much to do with the original structure. [1] http://en.wikipedia.org/wiki/Referential_transparency_(computer_science) [2] http://en.wikipedia.org/wiki/Identity_(object-oriented_programming) Entry: A letter to a C programmer Date: Sat Jul 16 11:09:59 CEST 2011 If you ever wonder where my tendency to write weird C preprocessor constructs comes from, it is most likely from spending too much time with Racket, a Scheme dialect. http://racket-lang.org/ That language contains the current state of the art of untyped macro systems, which integrates a very powerful and simple name scope management system (modules) with simple templates ("syntax-rules") and full multiple-stage code generation ("syntax-case"). It is an incredibly powerful system. Most of what it makes possible you can't do in CPP. What seems to have happened for me though is that working with Racket macros made it possible to point a finger at exactly what is wrong with CPP and how to hack around it in some cases. Entry: Getting used to Monads Date: Fri Jul 22 11:14:12 CEST 2011 I (re-)derived my first Monad implementation, peeking left and right but luckily making some mistakes in the process. It's hard to say what actually clicked in my mind, but it seems that exposing what `bind' and `return' actually do in some situations clears up a lot of magic dust. The real problem is that, for a "low-level" programmer like me, the Monad is too abstract to start with initially. The consequences of this high level abstract construct are vast and profound and make room for understanding that is hard to find in an impure language. But still, all that space it covers needs to somehow be part of your mental framework to make sense of the usefulness. In short: this knowledge is hard to bootstrap. Entry: Monads and evaluation order Date: Sat Jul 23 13:28:51 CEST 2011 The name Monad refers to "1 output"[1]. This output is referred to as a parametric type (M t). 
The interesting thing is that while bind takes functions that produce a single monadic (wrapped) _output_ from a single naked input, there is no reason those functions could not be partially applied functions. However, trying to do this immediately raises the issue of order, i.e. for partial application of a function of type: bind2 :: M a -> M b -> (a -> b -> M c) -> M c there are 2 natural implementations that have some symmetry but have a different behaviour, because one bind operator is "executed first": bind2 ma mb f = ma >>= \a -> mb >>= \b -> f a b bind2' ma mb f = mb >>= \b -> ma >>= \a -> f a b The same, but in do notation: bind2 ma mb f = do a <- ma b <- mb f a b bind2' ma mb f = do b <- mb a <- ma f a b From the significance of the order of lines in the do statement (or equivalently, the data dependency of the application of >>= operators) it seems plausible to accept that monads can be used for implementing behaviour that requires a certain order, i.e. state updates or CPS computations. Note that there are monads for which the order does not matter. These are called commutative monads[2]. [1] http://www.haskell.org/haskellwiki/Monad [2] http://www.haskell.org/haskellwiki/Monad#Commutative_monads Entry: Monad is a type class Date: Sat Jul 23 14:13:30 CEST 2011 Some important things to realize about monads, from my earlier misconceptions which missed the (awesome) generality of this concept: * `Monad' is a type class and is used to say something _about_ a parametric type. For a parametric type (M t), the expression (Monad M) declares that the parametric type M implements the operations: bind :: M a -> (a -> M b) -> M b return :: a -> M a an `instance' declaration makes this explicit. * The abstract type Monad M => (M t) can carry a lot of hidden information next to "something of type t". I.e. a typical declaration of a monadic type is: instance Monad (M a b c ...) where the types a b c ... are type parameters that do not take part in the monad interface. The full type of M would be: M a b c ... t where the parameter `t' is the one that takes part in the monadic interface. The type predicate `Monad' expects a parametric type with one parameter. The "type" of `Monad' is called a "kind", and is * -> *. [1] * The occurrence of type parameter t in a monadic type (M t) does not have to correspond to a naked data item in the implementation. It can just as well be the input or output type of a function or a parameter in any parametric type. I.e. values of the (Cont v) monad do not contain concrete values v; they are functions that take a continuation (a function of v) as input. In some sense it is not monads that are the difficult concept, it is type classes in general. The ladder of abstraction is the list: 1. basic types, not parameterized 2. parametric types, parameterized by basic types 3. type classes, parameterized by parametric types [1] http://www.haskell.org/haskellwiki/Kind Entry: Composing monads Date: Sat Jul 23 14:46:19 CEST 2011 Apparently that's not such a well-known subject[1]. What I can take from that post+comments is that the implementation of a monad is too low-level to make any general statements about, and that what would help is a more disciplined way to build monad implementations from composition of primitive monads/transformers that are better behaved.
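To make that concrete, here is a hedged sketch (my own example, not from the post): stacking the standard StateT transformer on top of Maybe composes a state effect with failure without writing the combined monad by hand. The names `Counter' and `tick' are mine.

import Control.Monad.State   -- from the mtl package

-- A counter that fails when it goes past a limit.
type Counter = StateT Int Maybe

tick :: Int -> Counter Int
tick limit = do
  n <- get
  if n >= limit
    then lift Nothing        -- failure comes from the inner Maybe monad
    else do put (n + 1)
            return n

-- runStateT (tick 3 >> tick 3 >> tick 3) 0  ==  Just (2,3)
-- runStateT (tick 1 >> tick 1)           0  ==  Nothing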
[1] http://www.randomhacks.net/articles/2007/03/05/three-things-i-dont-understand-about-monads Entry: Learning Haskell Date: Sun Jul 24 19:03:32 CEST 2011 In Haskell I find it quite difficult to guess whether I can do something or not, i.e. combining different abstraction mechanisms seems to not always work as expected. It's hard to make this more explicit but it's as if the abstraction can get so high that you completely lose all intuition. Maybe it's just my learning curve still, but I have had this going on for a while. Maybe it's also that I just don't write any really difficult code in C, and that in Scheme I resort a lot to "interpretation" or loose runtime typing because the real structure of the code isn't so clear. The good thing in Haskell seems to be that once you do manage to express your idea with a lot of static structure, the result is beautiful and likely correct and very general(izable). Entry: do notation algebra Date: Mon Aug 1 21:22:42 CEST 2011 let_ var body = do v <- var body $ return v Why is the above not equivalent to the one below? let var body = body var The monad I'm using is a CPS monad used to implement sharing. Entry: Awesome Prelude Date: Tue Aug 2 14:15:49 CEST 2011 Funny that in the "JavaScript types" example in [2] the same mechanism of language-specific type constructors using phantom types is used as in the tagless paper[4]. The general idea of replacing datatypes with type classes is to abstract what you want to do with it. For data types this is construction and destruction. For the BoolC class the constructors are `true' and `false' while `bool' is the destructor. data Bool class BoolC dsl where false :: dsl Bool true :: dsl Bool bool :: dsl a -> dsl a -> dsl Bool -> dsl a Here the `dsl' parameter is the parameterized type constructor for the DSL that implements the BoolC class. The cool thing is that the same strategy works for functions class FunC dsl where lam :: (dsl a -> dsl b) -> dsl (a -> b) app :: dsl (a -> b) -> dsl a -> dsl b Here `lam' takes a Haskell function and maps it to a function in the DSL representation, and `app' does the reverse. The downside is no syntactic support, which makes it difficult to use in practice. The best approach atm seems to be to write an explicit syntactic frontend when you're designing a language to sidestep these issues. [1] http://tom.lokhorst.eu/media/presentation-awesomeprelude-dhug-feb-2010.pdf [2] http://tom.lokhorst.eu/2010/02/awesomeprelude-presentation-video [3] https://github.com/tomlokhorst/AwesomePrelude [4] http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf Entry: CPS vs SSA Date: Tue Aug 2 17:31:35 CEST 2011 CPS has well-defined binding structure and parallel assignment. In SSA[1] this seems to be somewhat looser. Is there a real difference here? (Context: for me the point is to make _really fast code_ that goes straight onto a DSP or FPGA.) The WP article on SSA[1] mentions that SSA has non-local control flow[2] while CPS has none. (With this is meant things like exceptions and continuations, so that's not relevant for me.) Let's look for something that compares CPS and SSA[3][4], and not to forget ANF[4]. The interesting bit about SSA is the Phi functions, which are placed at control-flow joins. Wingo cites the interpretation that each basic block is a function, and that a Phi function indicates that the basic block has an argument. Wingo goes on to say that SSA is really for first-order programs and aggressive optimization of loops, while CPS is for higher-order programs.
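A tiny sketch of that interpretation (my own illustration, not from the cited posts): in CPS the argument of the join-point continuation plays the role of the Phi function.

-- Both branches pass their result to the same continuation k;
-- k's parameter is the "phi" of the two branch values.
absCps :: Int -> (Int -> r) -> r
absCps x k = if x >= 0
             then k x           -- branch 1
             else k (negate x)  -- branch 2, joining at k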
[1] http://en.wikipedia.org/wiki/Static_single_assignment_form [2] http://en.wikipedia.org/wiki/Control_flow#Structured_non-local_control_flow [3] http://lambda-the-ultimate.org/node/3467 [4] http://wingolog.org/archives/2011/07/12/static-single-assignment-for-functional-programmers Entry: Lambda or struct? Date: Thu Aug 4 09:18:38 CEST 2011 One thing one could conclude from the embedding of typed languages as functions instead of data is that functions seem to be strictly "more magical" than data structures. Is this merely a restriction of algebraic data types in a typed setting? Hence the existence of GADTs. I never ran into this kind of "difference" between data and code in Scheme, i.e. following the idea that data is completely free-form and can always be interpreted. Entry: initial / final Date: Thu Aug 4 23:56:46 CEST 2011 I found a comment explaining the concepts "initial" and "final" in category theory. See section 1.4 example 1.4.4 in Pierce's Category Theory for Computer Scientists[1]. In a category, an initial object is an object FROM which there is a unique arrow to each object in the category. In Set there is only one, the empty set, where each arrow is an empty function. A final object is an object TO which there is a unique arrow from each object in the category. In Set every one-element set is a final object. [1] isbn://0262660717 Entry: Phantom types Date: Sat Aug 6 12:37:15 CEST 2011 Recently while styding haskell I ran into the trick of using phantom types to "tag" information at compile time. I.e. a data type data Str t = Str String is internally just a string, but it is possible to use the type parameter to specify operations like: data Blessed data Raw bless :: Str Raw -> Str Blessed The function `bless' could then maintain some kind of invariant on String. The presence of that invariant can be indicated at compile time by the type Str Blessed. Hiding the constructor `Str' will make it impossible to create a `Str Blessed' data type. A limited constructor could be exported to create non-blessed strings: str :: String -> Str Raw Entry: Generalized Algebraic Data Types Date: Sat Aug 6 14:18:45 CEST 2011 Soundbyte: an ADT constructor can't have types such as T x -> T y, which are useful (necessary?) to represent the typed lambda calculus. ( Note, it is possible to use type *classes* to do this: C repr => repr x -> repr y, but that's a different story. ) In what sense exactly is a GADT generalized? [1] http://en.wikipedia.org/wiki/Generalized_algebraic_data_type Entry: SSA vs CPS Date: Sun Aug 7 23:08:18 CEST 2011 This[1] looks like a nice starting place. [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.3.6773&rep=rep1&type=pdf [2] http://www.delicious.com/doelie/ssa Entry: Type-level computations / meta systems Date: Mon Aug 8 10:44:12 CEST 2011 Metaprogramming. I've seen it now from many sides, and it is fascinating how they are all subtly different but still quite similar. From logics that say something "about" code, to macro systems that generate code, either providing types (MetaOCaml) or not (Scheme/Racket). The main "force" seems to be the tension between simple logic systems that can be reasoned about, and full-blown programming systems that are limited in analysis by the halting problem (undecidability[3]). - Typed metaprogramming: * Type systems proper: Hindley-Milner[4] (just complex enough to make inference work) up to the "lambda cube"[2] which contains systems that can only be checked. Main benefit: prove things about programs. 
* Type level computations in Haskell using functional dependencies in type classes[1]. Benefit: allow limited form of computation without getting into undecidedness. * MetaOCaml[5]: proper multi-stage code generation in a typed setting. - Untyped metaprogramming: * Scheme's hygienic macros: generating code programmatically respecting binding structure. * Typed scheme & macros. Similar to MetaOCaml but approaching the problem from the untyped->typed side. (find some links). [1] http://hackage.haskell.org/package/type-level [2] http://en.wikipedia.org/wiki/Lambda_cube [3] http://en.wikipedia.org/wiki/Undecidable_problem [4] http://en.wikipedia.org/wiki/Type_inference [5] http://www.metaocaml.org/ Entry: Flattening expressions using liftM2 Date: Mon Aug 8 18:47:59 CEST 2011 One of the revelations of my recent Haskell study sprint is that it is possible to "serialize algebra" using liftM2. This might be idiosyncratic language for something that has a proper name, but what I mean is that the function liftM2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c is the bridge between "parallel" computations that have a binary tree structure, where both legs of the tree (types a and b in the input function above) are independent, and "sequential" computations that have a fully specified order imposed by the monad structure. Entry: QuickCheck as an API design guide Date: Sun Aug 14 10:43:25 CEST 2011 Don Stewart mentions in an xmonad talk[1] that they've been using QuickCheck as a guide to designing good APIs. If the QC properties are very hard to write down, your API sucks. Another tip I've heard before: keep all your functionality pure. Only use thin layer of IO to interface with the outside world. [1] http://www.ludd.ltu.se/~pj/hw2007/xmonad.mov Entry: Applicative Transformers Date: Sun Aug 14 16:25:07 CEST 2011 Some observations to make precise: * A DSP language (of combinators) would benefit from connections that happen behind the scenes. Examples are state relations over time. * A recurrence relation / difference equation is essentially a state monad. * A Monad is also Applicative * Audio DSP is essentially a mix of State and List monads. * Monad transformers are a bit of a kludge. There is not a lot known about the algebra of monad transformers. * Applicative Transformers do not exist because applicatives are "naturally composable". [1] -> Section 4. * Is it true that the direction that is inherent in the State monad -- the function s -> (s,t) -- is what causes State to be specific enough to be a Monad? Is causality the essential element? * If a state space model can be run in reverse, would it stop being Monad? This reminds me of Steele's parallel language.. [1] http://www.haskell.org/haskellwiki/Applicative_functor Entry: Arrow = Applicative + Category Date: Sun Aug 14 18:21:43 CEST 2011 What does this mean? From my own experience trying to think about abstractions it seems that Applicative uses a curried interface, while Arrow uses tuples. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Category.html#t:Category [2] http://cdsmith.wordpress.com/2011/07/30/arrow-category-applicative-part-i/ [3] http://cdsmith.wordpress.com/2011/08/13/arrow-category-applicative-part-iia/ Entry: Existential types Date: Sun Aug 14 21:29:21 CEST 2011 When using existential quantification[1], you can't actually do anything with the values, because you don't know the type! i.e. the difference between these two: data Foo = forall a. 
Foo a data Bar a = Bar a is that for Foo we don't know what type a is, and because Foo doesn't have type parameters, we can't specify it elsewhere either. Practically that means that for Foo, we can't really do anything with the values it wraps because we just don't know the types. I see two ways around this. One is explained in [1] and it boils down to giving information about a, i.e. data FooShow = forall a. Show a => Foo a Here we still don't know what type a is, but we know that we can apply the operations from the Show class to values wrapped by the Foo constructor. ( It seems this can't be done using a field accessor, but it is possible with pattern matching. ) Another way is to use the quantified variable more than once. data FooApp = forall a. Foo a (a -> Int) Here we still don't know anything about the type, but we know that we can apply Foo's second argument to its first to obtain an Int. This is the trick I've used in the a state-space model (SSM) to record initial state and state transition together. The only operation that's ever performed on state is to pass it to the state transition function. I used existential types to make the following composition operation on state machines fit the Category class. This required that the state parameter is hidden. chainSSM :: ((a, s1) -> (b, s1)) -> ((b, s2) -> (c, s2)) -> (a, (s1,s2)) -> (c, (s1,s2)) chainSSM f g = fg where fg (a, (s1,s2)) = (c, (s1',s2')) where (b, s1') = f (a, s1) (c, s2') = g (b, s2) data SSM i o = forall s. SSM s ((i, s) -> (o, s)) -- If this would be (SSM s i o) then the composition operation would -- have been :: SSM s2 b c -> SSM s1 a b -> SSM (s1,s2) a c -- which doesn't fit :: SSM b c -> SSM a b -> SSM a c instance Category SSM where (.) (SSM f0 f) (SSM g0 g) = SSM (g0,f0) $ g `chainSSM` f id = SSM () $ id As is mentioned in a post in [2], universals give generics: we don't know what the type is, and we don't care since we only pass values around. Existentials give interfaces: we don't know what the exact type is but we do care that we can perform a number of operations from a given interface. [1] http://www.haskell.org/haskellwiki/Existential_type [2] http://stackoverflow.com/questions/292274/what-is-an-existential-type Entry: The Haskell Learning Curve Date: Mon Aug 15 10:38:39 CEST 2011 Haskell is a tremendous trip. As I keep telling my fellow C trench dwellers, it's really different. Even with a couple of years of Scheme (Racket) to get used to higher order functions, it's still a different world. The reason of course is types. Here's a list of things I've learned. - Existential types[1]. This can be useful for hiding type parameter when making class instances[6]. - Phantom types [2]. Used for typed language embedding and encoding data structure invariants in the type system. I.e. "blessed strings". - Parametric polymorphism [3]. A basic tool for building generic functions that operate on data (Algebraic data types) with type parameters, i.e. list. - Ad-hoc polymorphism [4] or overloading, implemented by type classes which abstract over collections of parameterized types. What is interesting is the "1-up" that is possible by moving from a set of operations over a parametric data types, to a set of operation over a collection of different data types (class instances)[9]. - Common abstractions implemented as classes: Monad, Applicative, Functor, Eq, Show, Num, ... - Thinking of monads as computations instead of containers. 
An early misconception of mine instilled by one of the many monad tutorials is that parametric types need to be data containers. It is quite possible for a type parameter of a parametric type to refer to the input and/or output type of a function. Obvious in retrospect, but a big revelation when I finally got it. This is used in the Cont (CPS) & State monads. The monad operations simply chain computations together. The end result is a function requiring a value (the initial state in State) or a higher order function requiring a function argument (the final continuation function in CPS). - Seeing Monad as a DSL with a custom variable binding structure, the ">>=" operator, which is reflected in the do notation's left arrow. - Finding out that Haskell has no meaningful object identity. Object identity is not referentially transparent[7][10]. - Related to the above, finding out that sharing structure (let) is not observable from the outside of a function definition. This is important in abstract interpretation when the intention is to recover sharing structure. Embedding a DSL where sharing structure is important (i.e. static single assignment SSA) then needs a specific binding form. This can be done by ">>=" in a Monad, or by embedding the language in a type class representing HOAS[8]. [1] http://www.haskell.org/haskellwiki/Existential_type [2] http://www.haskell.org/haskellwiki/Phantom_type [3] http://www.haskell.org/haskellwiki/Parametric_polymorphism [4] http://www.haskell.org/haskellwiki/Ad-hoc_polymorphism [5] http://www.haskell.org/haskellwiki/Algebraic_data_type [6] entry://20110814-212921 [7] http://en.wikipedia.org/wiki/Identity_(object-oriented_programming) [8] http://www.cs.rutgers.edu/~ccshan/tagless/aplas.pdf [9] entry://20110723-141330 [10] entry://20110710-144829 Entry: From Applicative to Monad Date: Mon Aug 15 19:50:05 CEST 2011 Each Monad gives rise to an Applicative where pure = return <*> = ap Where ap is: ap mf ma = do f <- mf a <- ma return $ f a or: ap = liftM2 ($) but not all Applicatives are Monads. See [1][2] for examples. So, does it make sense to say that I was not able to encode a certain behaviour as Applicative, but was able to do it as Monad? Yes it does, since requiring Monad is requiring more structure. What I tried to accomplish could be implemented as composition of Kleisli arrows (a -> M b), which is something an Applicative can't do. ( I'm implementing recurrence relations represented as data Sig s a = Sig { init :: s, next :: s -> (a, s) } and the corresponding signal operators. I settled on signals as monad values and operators as Kleisli arrows. Note that this isn't a true monad due to the dependence on the `s' parameter, which isn't constant for join ) So which primitives to implement to define both Monad and Applicative? It seems this only needs: (pure/return, join, ap) (pure/return, join, fmap) So Functor to Monad needs return+join, while Applicative to Monad only needs join since it already has return. The join operation is what implements the "monadness", the piercing of monad structure to "get stuff out" which is necessary to chain Kleisli arrows, while pure/return only put stuff inside the Monad. [1] http://haskell.1045720.n5.nabble.com/Applicative-but-not-Monad-td3142155.html [2] http://en.wikibooks.org/wiki/Haskell/Applicative_Functors#ZipLists Entry: Kleisli arrows Date: Tue Aug 16 14:47:50 CEST 2011 According to Dan Piponi, Kleisli arrows and their composition are the whole point of monads[1].
Dan's explanation is something like the following: If you want to chain (a -> M b) and (b -> M c), a straightforward but wrong thing to do is to use a function (M b -> b) that throws away all the extras. However, because M is a functor it is always possible to use fmap :: (a -> b) -> (M a -> M b) to convert (b -> M c) to (M b -> M (M c)) which composes nicely with (a -> M b). The result of this is that we end up with a double wrapping. Therefore a monad needs to have a function join :: M (M a) -> M a that restores the output of this chain to something that can be chained again. Summarized: (>=>) :: (Functor m, Monad m) => (a -> m b) -> (b -> m c) -> (a -> m c) (>=>) f g = f .> (fmap g) .> join where (.>) = flip (.) So how does this relate to the usual do notation? do a <- ma b <- mb return $ a + b or explicitly ma >>= \a -> mb >>= \b -> return $ a + b This is not simply chaining of arrows. The nesting here makes the arrows somewhat special. They are all arrows that go from some type to the result type of the do expression. The focus here is on the result of the expression, i.e. monadic values (M a) instead of arrows (a -> M b). Maybe the following makes sense: using (>=>) is pointfree or function-oriented programming, while using (>>=) or do is applicative or value-oriented programming. Anyways, in this light, comonads are straightforward to grasp, and as Dan mentions, it's not clear if there is a "codo" because comonads don't map well to the idea of binding structure. [1] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html [2] http://en.wikipedia.org/wiki/Kleisli_category [3] http://www.haskell.org/haskellwiki/Arrow_tutorial#Kleisli_Arrows Entry: Arrows Date: Tue Aug 16 16:11:55 CEST 2011 The prerequisites for an Arrow instance are (.) and id from Category, which give the basic composition mechanism, and the operations arr and first. The arr operation simply lifts functions. arr :: (b -> c) -> a b c While the first operation provides basic communication first :: a b c -> a (b, d) (c, d) I.e. it's like using a stack to temporarily stash something away, here the type d, in order to perform an operation and pop it back. This is essentially a disguised form of the basic "stack shuffling" mechanism behind concatenative languages such as Forth. Apparently the other operations can be derived from arr and first. The second operation is simply the mirror of first. second :: a b c -> a (d, b) (d, c) Parallel composition takes two cables and puts them in the same tube. (***) :: a b c -> a b' c' -> a (b, b') (c, c') Fanout takes two cables that come from the same point (&&&) :: a b c -> a b c' -> a b (c, c') Note that binary algebraic operations can be applied to arrow outputs by making tuples and applying lifted uncurried operations. I do see that some people think of this as a clumsy interface. I'm familiar with this kind of structure through working with graphical data flow languages. And indeed, it's not easy to put this inherently graphical construct in a textual form. While point-free style can be quite powerful, it can also require a lot of intricate plumbing that would be more straightforward to express in an applicative style using named intermediates. In my first state-space model implementation, I naturally came to the ssmSer and ssmPar operations, where ssmSer is (.) and ssmPar seems to be (***) in the Arrow class. I also had lifted functions from ssmPure. This looks like a complete set. The new GHC has Arrow notation[2].
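A small standard illustration of lifting a binary operation over two arrow outputs (my own addition, adapted from the Control.Arrow documentation, not part of the original entry):

import Control.Arrow

-- Run both arrows on the same input, tuple the results, add them.
addA :: Arrow a => a b Int -> a b Int -> a b Int
addA f g = f &&& g >>> arr (uncurry (+))

-- With the plain function arrow: addA (*2) (+10) 3 == 19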
TO READ: [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Arrow.html [2] http://www.haskell.org/ghc/docs/7.0.2/html/users_guide/arrow-notation.html [3] http://www.haskell.org/haskellwiki/Monad_Laws [4] http://lambda-the-ultimate.org/node/2799 [5] http://en.wikibooks.org/wiki/Haskell/Understanding_arrows [6] http://blog.downstairspeople.org/2010/06/14/a-brutal-introduction-to-arrows/ [7] http://cs.yale.edu/c2/images/uploads/AudioProc-TR.pdf Entry: Monad from Kleisli Arrow Date: Tue Aug 16 18:46:32 CEST 2011 So I have an Arrow implementation that I know is the Kleisli arrow (K a b) of some Monad (M a). Is this enough to derive the Monad instance from the Arrow instance? The idea is that to implement bind, we need to "cut off the head" of the arrow. Maybe this can be done by the correspondence M a <--> K () a If then wi can use a correspondance such as this we might be able to pull it off: a -> M b <--> K a b The catch is probably that the latter correspondence only exists if the arrow is a true Kleisli arrow. Actually, I rediscovered something (at least the conjecture ;) from [1]. It says that indeed Monads are equivalent to Arrows that satisfy the type isomorphism K a b <-> a -> K () b ( In my particular implementation I ran into a blocking problem that has to do with bad organization, so can't pull it of in its current form. I'm still struggling with the existentials problem though.. ) [1] http://homepages.inf.ed.ac.uk/wadler/papers/arrows-and-idioms/arrows-and-idioms.pdf Entry: Existential types Date: Thu Aug 18 00:10:54 CEST 2011 I was trying to understand why pattern matching followed by application works fine, but pattern matching and returning doesn't work. In other words: why don't existentials support record selectors, and why is passing them as arguments to functions not an issue? A typical example: a type with a hidden value type and a function with hidden input type. Both are the same so the value can be passed to the function. data Exst a = forall b. Exst b (b->a) The following doesn't type check, for the simple reason that it is completely unknown what type the function produces. It can be anything. value (Exst v f) = f It's not that the pattern matching itself doesn't work. The following works fine: eval1 :: Exst a -> a eval1 (Exst v f) = f v The reason is that we don't know the type of v and f, only how they are related: we do know that if we apply f to v we get the type that the Exst is parameterized by. It is also possible to pass the value to a function that expects a parametric type that is the same as the one that's specified: eval2 :: b -> (b -> a) -> a eval2 v f = f v eval3 (Exst v f) = eval2 v f * * * Now for the real problem I'm facing, given a data type that represents a Kleisli arrow data Kl i o = forall s. Kl (i -> s -> (s, o)) Construct the type isomorphism: iso :: (i -> Kl () o) -> Kl i o iso = undefined No matter what I try. Case statements or CPS, I still can't get the types to match: it always sees the type variable in the data declaration as different from the one in any other specification. Wait, I think I finally get it. There is simply no way of knowing that the s that is passed to kl is the same as the s of kl. data Kl i o = forall s. 
Kl (i -> s -> (s, o)) iso :: (i -> Kl () o) -> Kl i o iso f = Kl $ \i s -> (\(Kl kl) -> kl () s) (f i) It's possible to write that line with an implicit s, but the point is still the same: the type that was fixed when the Kl that's being unpacked was created could be completely different from the new instance we're creating here. The information on what that type was is no longer present in the type of f. iso f = Kl $ \i -> (\(Kl kl) -> kl ()) (f i) In a Monad, the join operation flattens two layers of wrapping into one. Doing this with an existentially qualified type doesn't work if this data needs to be combined in any way, because all information that they might be of compatible types has been deleted. The same goes for bind: it takes information from outside the monad and inserts it inside, crossing a border where type information has been deleted. What does work is to unpack, combine, repack. Stuff that comes out of a single wrapper can all be combined together. [1] http://www.haskell.org/pipermail/haskell-cafe/2011-August/094718.html Entry: Existential Monad problem: solved! Date: Sat Aug 20 22:14:28 CEST 2011 One last attempt at trying to understand why this can't work: (.>) v f = f v -- Patern binding doesn't work with existentials. type Kl i o = i -> Kl1 o data Kl1 o = forall s. Kl1 (s -> (s,o)) bind :: (Kl1 i) -> (i -> Kl1 o) -> (Kl1 o) bind mi f = Kl1 $ \(s1,s2) -> mi .> (\(Kl1 u1) -> (u1 s1) .> (\(s1', i) -> (f i) .> (\(Kl1 u2) -> (u2 s2) .> (\(s2', o) -> ((s1',s2'),o))))) The problem here is that the s1,s2 we feed into the unpacked Kl1 are not compatible. The type of u1 and u2 is completely unknown in the expression of bind. To understand this, let's try to take a look at Ryan's answer[1]. Paraphrased, look at what happens if we have an f doing something like: f i :: Bool -> Kl1 o f i = if i then kl1 else kl2 where kl1 and kl2 could have different state types. Because the above is possible, you really can't assume anything. The problem is really that the dependence on the i input is "too powerful". In the arrow approach for the type forall s. (s, (s, i) -> (s, o)) it seems to work because everything is neatly tucked in; no state change possible. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92609.html Entry: ArrowApply and ArrowMonad Date: Sun Aug 21 09:41:35 CEST 2011 As mentioned in the last post[1], it's not possible to use that approach because you can't express compatibility of states in bind. However, it is possible to make an Arrow instance which has composition. What I find here in Felipe's post[2] is that it is possible to create the associated Monad[3] when making an instance of ArrrowApply[4]. So it should not be possible to do so, or it's not really isomorphic, or my previous explanation in [1] is wrong. ArrowApply[4] generalizes ((a -> b), a) -> b, which is a curried form of (a -> b) -> a -> b, from functions (->) to Arrows (-->), as an arrow that applies another arrow to an input ((a --> b), a) --> b. I tried to write it down but the wrapping confuses me. First, what does this mean in terms of non-wrapped types? a --> b == a -> s -> (s, b) ((a --> b), a) == ((a -> s -> (s, b)), a) ((a --> b), a) --> b == ((a -> s -> (s, b)), a) -> s -> (s, b) For the wrapped types this gives a very straightforward definition data Kl i o = forall s. Kl (i -> s -> (s, o)) instance ArrowApply Kl where app = Kl $ \((Kl f), a) -> f a However, it doesn't type-check. The construction of app requires the hidden type to be fixed when app is defined. 
However, this type depends on the _behaviour_ of app just as in [1], so there is a dependency problem which is what the error message is trying to say: Couldn't match type `s0' with `s' because type variable `s' would escape its scope This (rigid, skolem) type variable is bound by a pattern with constructor Kl :: forall i o s. (i -> s -> (s, o)) -> Kl i o, in a lambda abstraction In the pattern: Kl f In the pattern: ((Kl f), a) In the second argument of `($)', namely `\ ((Kl f), a) -> f a' * * * The bottom line in this whole discussion seems to be that these 2 types are different: forall i o. (forall s. (i -> s -> (s, o))) forall i o. (i -> (forall s. s -> (s, o))) In the latter the type s can depend on the value of i while in the former it cannot. [1] entry://20110820-221428 [2] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92643.html [3] http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/Control-Arrow.html#t:ArrowMonad [4] http://hackage.haskell.org/packages/archive/base/4.3.1.0/doc/html/Control-Arrow.html#t:ArrowApply [5] entry://20110821-123701 Entry: Data.Typeable Date: Sun Aug 21 15:47:57 CEST 2011 According to Felipe[1] there is a way around the problem by using Data.Typeable[2]. I still need to read again to make sure I get it fully. The idea is to move some of the type checking to run-time. Of course this makes it possible to have run-time errors or "default behaviour" when the types do not match. Maybe it's possible to enforce well-behavedness using some other wrapper? The bad behaviour seems to come from control flow, i.e. pattern matching (case) or if .. then .. else. Assuming this is the case then one can say this works "if the user doesn't use control flow". I'm not so sure this is a good idea. It's a Monad, except when it's not. And you'll find out when you run the program. I think I'm sticking to Arrow unless I get a non-dynamic solution. It's a nice trick to know though. TODO: read again, test it and reply. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92649.html [2] http://haskell.org/ghc/docs/latest/html/libraries/base/Data-Typeable.html Entry: From Applicative to Num Date: Sun Aug 21 18:55:42 CEST 2011 It's been a couple of times that I've written something similar to the following instance declarations for Num t, Applicative (X t), where X is some kind of "bigger t". Is there a way to abstract this lifting to Num (X t)? instance (Eq (SigOp i o)) where (==) _ _ = False instance Show (SigOp i v) where show _ = "#" instance (Num o, Show (SigOp i o), Eq (SigOp i o)) => Num (SigOp i o) where (+) = liftA2 (+) (*) = liftA2 (*) abs = fmap abs signum = fmap signum fromInteger = pure . fromInteger To do it generally it might be best to restrict this so it doesn't include all Applicative instances by defining a blessing class: class Applicative a => NumericApp a The rest is straightforward. The following is for NumericPrelude: instance (Algebra.Additive.C n, NumericApp a) => Algebra.Additive.C (a n) where (+) = liftA2 (+) zero = pure zero instance (Algebra.Ring.C n, NumericApp a) => Algebra.Ring.C (a n) where (*) = liftA2 (*) instance (Algebra.Field.C n, NumericApp a) => Algebra.Field.C (a n) where (/) = liftA2 (/) Entry: Numeric Prelude Date: Sun Aug 21 20:30:29 CEST 2011 Looks like an interesting project. Standard Num is indeed a bit hackish. "Numeric Prelude provides an alternative numeric type class hierarchy. ... The hierarchy of numerical type classes is revised and oriented at algebraic structures. 
Axiomatics for fundamental operations are given as QuickCheck properties." And more [1]. Orphaned? [3]. Maybe not, there's activity [4]. [1] http://www.haskell.org/haskellwiki/Numeric_Prelude [2] http://hackage.haskell.org/package/numeric-prelude [3] http://archlinux.2023198.n4.nabble.com/Please-orphan-haskell-numeric-prelude-td2967163.html [4] http://web.archiveorange.com/archive/v/uW8vzWzyzFGR2S6TubjS Entry: >>= vs. >=> Date: Mon Aug 22 21:29:59 CEST 2011 The monad laws in terms of >=> just say that >=> is associative and return is the identity: f >=> (g >=> h) = (f >=> g) >=> h return >=> f = f >=> return = f So, why is do-notation (nested >>=) more prevalent? Because it provides nested variables and sequentiality, the usual playing field for effectful computations. Anyways, coming back to my SigOp language, it seems clear now that it can't be a monad because I want the structure of the computation to be fixed, because I want to use it to generate static code. Monad is too powerful. ArrowChoice might be an interesting compromise: it allows processors to be switched into different modes, where they could exhibit different types. Though it doesn't seem like I really need different types, just different paths. [1] http://www.haskell.org/wikiupload/e/e9/Typeclassopedia.pdf Entry: NOT & CPS Date: Mon Aug 22 22:36:37 CEST 2011 With a' == not a and f == false, why is a' == (a => f)? Because a => b == a' v b, so a => f == a' v f == a'. Double negation is then a'' == ((a => f) => f). This is related to the type of a function that takes a continuation argument: ((a -> r) -> r) See Oleg's explanation[1]. [1] http://okmij.org/ftp/continuations/undelimited.html Entry: Concatenarrow Date: Mon Aug 22 23:41:57 CEST 2011 Instead of using Arrow notation to perform "tuple plumbing" when composing Arrow computations, it might also be useful to use a stack approach. Somebody has to have thought of that before... Essentially Arrows use binary trees for product (and ArrowChoice uses binary trees for sum using Either). Represent the empty stack by (). Applying an arrow to the top of the stack is "first". The question is then: what is the "default" representation for unary, binary, ... operations. Arrows are naturally unary and tupled (uncurried) binary. So maybe it's best to make lift/unlift from that rep to: (a -> b) -> (a, s) -> (b, s) ((a,b) -> c) -> (a, (b, s)) -> (c, s) The reason why Arrows are not curried is that there is no apply operator. This would give Monad power (ArrowApply), since arrows (whole computations) can depend on input values, which makes structure value-dependent. Anyway, it seems important to note that arrows with non-binary tuple inputs can't take inputs from other arrows, so it makes sense to standardize on a way to provide multiple arguments. The stack approach seems to be a good compromise. So let's start with that: tuple <-> stack conversion. liftStack1 f (a,s) = (f a, s) liftStack2 f (a,(b,s)) = (f a b, s) liftStack3 f (a,(b,(c,s))) = (f a b c, s) The problem with those is that they only work for functions. To make these compatible with Arrows we need to stick to something that's accessible through tupling. Wait... it's always possible to lift plumbing functions so this is really not a big deal. Things do need to be uncurried though, so these look better: Entry: Eliminating Existentials Date: Thu Aug 25 10:43:38 CEST 2011 I'm trying to represent a sequence as an initial value :: s and an update function :: s -> (o,s).
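For concreteness, a minimal example of that representation (my own, with names chosen here): a counter as an (initial value, update function) pair, and a function that unfolds such a pair into its output stream.

-- initial value and update function
counter :: (Int, Int -> (Int, Int))
counter = (0, \s -> (s, s + 1))   -- output the current state, then increment it

-- observe the pair as a lazy list of outputs
runSeq :: (s, s -> (o, s)) -> [o]
runSeq (s0, next) = go s0 where
  go s = let (o, s') = next s in o : go s'

-- take 5 (runSeq counter) == [0,1,2,3,4]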
According to Oleg's reply[2] pointing to [1] it's possible to use laziness to avoid these kinds of existentials, by applying them. Some points from [1] that might be useful: - Replace functions that operate on hidden types with type class constraints (bounded quantification) if functions are constant over types. - Replace other such functions as thunks: "apply away" the hidden parameters. In my case this would mean to repesent the type as a list [o], or a list function [i] -> [o]. The problem then is of course that the original function is not observable. Maybe my original point is completely wrong then: the function is not observable anyway, unless the state is somehow part of a class that can allow initial values, "and" a run function that produces the result. class InitState r where initFloat :: Float -> r Float initInt :: Int -> r Int run :: (s, (s, i) -> (s, o)) -> ??? However, in case there is no structure that depends on input (No Monad?) it should be possible to evaluate the update function abstractly on a singleton list, obtaining the (i,s) -> o relation. But this does not expose the state output. It's actually not so hard: if I want to observe the state at some point, I can't hide it completely. Placing the evaluator in a type class seems to be the thing to do. This should allow evaluation to list processors, and machine code separately. But the stuff mentioned in the HC thread is quite interesting. I thought I understood then I see this weird other approach.. See next post. [1] http://okmij.org/ftp/Computation/Existentials.html Entry: Streams and the Reader Monad Date: Thu Aug 25 14:30:05 CEST 2011 I'm having trouble following the different ways of formulating (input-dependent) streams. Oleg wrote down the following bind function and also mentioned this[1]. See also thread[2]. data Kl i o = forall s. Kl s (i -> s -> (s, o)) instance Monad (Kl i) where return x = Kl () (\_ s -> (s,x)) (Kl s m) >>= f = Kl s (\i s -> case f (snd (m i s)) of Kl s' m' -> (s,snd (m' i s'))) This is different from what I've been trying to accomplish. But let's try to follow the diagonal bind described here[1]. In [1] the join operation for streams is written as producing a stream from the diagonal of the stream of streams input. So what is going on here? Let's just try an example. Let's try to bind the sequence [0,1,2,..] to i -> [0,i,2*i, ..] according to the definition 0. [0] 0 0 0 .. 1. 0 [1] 2 3 .. 2. 0 2 [4] 6 .. 3. 0 3 6 [9] .. .. .. .. .. .. .. ( This cannot represent streams with the kind of iterated (s,i)->(s,o) dependence I'm looking for. This is really an i -> Stream dependence, where i determines the whole stream. The trouble is that all rows are independent of each other: there can be no history relation between elements of the output stream, only *inside* a single row. ) So what is this stream combination useful for? Let's dig further. As Oleg mentions in [2], it's the Reader Monad. Very curious. See also first comment in [1]. Why? Because streams can be represented as Nat -> a and the bind would then have type: (>>=) :: (Nat -> a) -> (a -> Nat -> b) -> (Nat -> b) a >>= f = \n -> f (a n) n The reader's bind is a function that takes the environment n and uses it to evaluate the computation a, using its result to obtain another computation through f, which will be passed the environment n again. Defining the stream monad on explicit streams[1] seems confusing. Formulating it as the Reader monad makes it simpler; te body of the bind function then looks quite straighforward. 
So let's try to put the graph above in this formula: a n = n -- 0, 1, 2, .. f i n = n * i -- 0, i, 2i, .. b n = f (a n) n = n * n So it's clear, this cannot represent the discrete integral (partial sum) operation, because b_n is independent of a_m, m != n, and for the integral it would be dependent on a_m, for m <= n. [1] http://patternsinfp.wordpress.com/2010/12/31/stream-monad/ [2] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92702.html Entry: Learning Haskell Date: Thu Aug 25 18:33:49 CEST 2011 Funny.. the stuff from last couple of posts makes me realize I'm not really reading functional programs. I'm verifying them against a mental model. If I don't have the model yet, I need to *decode* the program to build the mental model, then after that it's simpler to read because I know what "trick" to expect. Entry: Stream transformers Date: Thu Aug 25 18:36:51 CEST 2011 Most of the posts have been about these types representing input-dependent streams, where the stream is represented as a recurrence relation: (s, i -> s -> (s, o)) (1) i -> (s, s -> (s, o)) (2) (s, s -> (s, i -> o)) (3) The central question is, does i influence the the state transition function (1), with one i for each update, does a single i produce an entire stream (2), which needs some kind of stream-of-streams flattening operation, or is the state independent of the input (3). This is Arrow(1), Monad(2) and Applicative(3). These are VERY DIFFERENT. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92702.html Entry: Are recursive signal processors applicative? Date: Fri Aug 26 13:02:04 CEST 2011 They are arrows[1]. They do not seem to be monads due to structural limitations; monads allow data-dependent computation structure while recursive signal processors have a fixed structure[2]. However, they seem to be a generalization of monads that supports do notation when the type of the computation is properly fixed. Formulated as data Signal a = forall s. Signal s (s -> (s, a)) they also do not seem to be applicative due to a data dependency problem. It is possible to write an Applicative instance for the above type, but it is not powerful enough to encode something isomorphic to (s,i) -> (s,o) instance Functor Signal => Applicative Signal where pure v = Signal () $ \_ -> ((), v) (<*>) (Signal sf f) (Signal sa a) = Signal (sf, sa) (signalApp f a) signalApp :: (s1 -> (s1, (a -> b))) -> -- fi (s2 -> (s2, a)) -> -- ai (s1, s2) -> ((s1, s2), b) -- bi signalApp fi ai = bi where bi (s1, s2) = ((s1', s2'), b) where (s1', f) = fi s1 (s2', a) = ai s2 b = f a The type of Signal (i -> o) is isomorphic to: s -> (s, (i -> o)) The state :: s cannot be influenced by the input :: i. The type signature forbids any connection. To give a more intuitive explanation of why this is impossible, think about what happens when the recursion is unfolded. The resulting type is [i->o]. Can the functions in that list still depend on each other? The answer is a clear no. Those functions have to be pure. Suppose the list is [f0,f1,..]. If f1 depended on the input of f0 it would not be possible to evaluate f1 without evaluating f0 first. If they are in a list there is no reason why we could not just ignore f0. Funny how I still don't trust type signatures ;) Side channels are always visible! [1] entry://../meta/20110816-153448 [2] entry://20110821-094135 Entry: Streams with extra input. Date: Fri Aug 26 13:34:18 CEST 2011 Let's look at the stream with an extra input, as suggested by Oleg[1]. 
data Op i o = Op (i -> Op i o, o) The trouble with this is that for my purpose they are not explicit enough: I need an explicit description of the state transformation process between successive outputs to generate code. [1] http://www.mail-archive.com/haskell-cafe@haskell.org/msg92746.html Entry: Haskell tuples vs. lists Date: Fri Aug 26 16:06:33 CEST 2011 It's interesting to note the symmetry between: f [] = ... f (x:xs) = ... f xs ... and instance F () where ... instance (F a, F b) => F (a,b) where ... The f operates on lists of values while the F "operates" on types that can have a nested structure known at compile time, i.e. (Int,Double) or (Int,(Int,(Double,()))) Entry: mapM Date: Fri Aug 26 17:44:30 CEST 2011 I keep running into the mapfold operation but don't see why nobody ever mentions this function. foldr :: (a -> b -> b) -> b -> [a] -> b mapfold :: (a -> b -> (b, c)) -> b -> [a] -> (b, [c]) Maybe it's because this is a combination of two monads: state and list, and it's usually written as a map? :: (a -> m b) -> [a] -> m [b] Indeed, that's the signature of mapM[1]. It might be a good idea to learn how to use this. At first glance it seems that this needs a monad class definition, and that can be a bit verbose. [1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/Prelude.html#v:mapM Entry: The List Monad - Generate and Test Date: Fri Aug 26 23:31:32 CEST 2011 When people talk about logic programming with the list Monad, what is meant are things like this: import Control.Monad f n = do x <- [1..n] y <- [1..n] z <- [1..n] guard $ (x^2 + y^2 == z^2) return (x,y,z) This is one of those super elegant Haskell tricks ;) Entry: The Essence of Functional Programming Date: Mon Aug 29 23:34:20 CEST 2011 I'm reading [1] again. What I see now when looking at the monad types is that essentially the beef is really hidden behind the type M. return :: a -> M a bind :: M a -> (a -> M b) -> M b I wrote about this before in different words[2], but the real click for me is that a Monad is 1. completely general wrt. the a and b in the types above, and 2. completely abstract in that all its magic is hidden behind the type M and the bind, return operations. So if it's completely abstract, how can you create a function of type a -> M b in the first place? To be useful, every monad needs to export some function to create values wrapped in M on top of the standard composition interface made up by bind and return. M a is always a value. However, it doesn't have to be a naked value of type a. It can be the output of a function as in the Reader monad :: e -> a, or the input of a function as in the Cont (CPS) monad :: (a -> r) -> r. Wadler mentions[1] that the basic idea when converting a pure function into a monadic one is to convert a function of type a -> b to one of type a -> M b. The return and bind operations can then compose these functions, or more intuitively[3], they can compose Kleisli arrows (a -> M b) -> (b -> M c) -> (a -> M c). What I've always found strange is that the do notation has such a peculiar form if you think about it. It has little to do with Kleisli composition where arrows are not nested. A do block has a single return type M r, but several nested functions with input a1, a2, a3, ... one for each arrow. The thing is that do notation is to let (applicative style) what Kleisli composition is to function composition (point-free style). It's not so much that do is strange, it's that nested let is strange!
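To make the parallel concrete, a toy example in the Maybe monad (my own, not from the paper): the same computation written with Kleisli composition and with do notation.

import Control.Monad ((>=>))

halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

-- point-free: compose the Kleisli arrows directly
quarter :: Int -> Maybe Int
quarter = halve >=> halve

-- applicative style: do notation names the intermediate value
quarter' :: Int -> Maybe Int
quarter' n = do
  h <- halve n
  halve h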
In some sense a point-free approach is more natural compared to
creating a context of many visible variables to allow random access.

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz
[2] entry://20110723-141330
[3] http://blog.sigfpe.com/2006/06/monads-kleisli-arrows-comonads-and.html

Entry: Needing GHC extensions
Date: Tue Aug 30 19:50:49 CEST 2011

I often run into issues that don't fit well in the standard Haskell
type system.  Am I too used to ad-hoc data structures?  Does my code
usually have corner cases I don't recognize because I don't need to
cast it in types?

Entry: Cover Condition
Date: Fri Sep 2 09:38:10 CEST 2011

What does this mean?

  the Coverage Condition fails for one of the functional dependencies;
  Use -XUndecidableInstances to permit this

Entry: a -> M b vs. M a -> M b
Date: Sat Sep 10 08:44:47 EDT 2011

What is this about[1]:

  A key observation of Moggi's was that values and computations should
  be assigned different types: the value type a is distinct from the
  computation type M a.  In a call-by-value language, functions take
  values into computations (as in a -> M b); in a call-by-name
  language, functions take computations into computations (as in
  M a -> M b).

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz

Entry: State Space model vs. Mealy Machine
Date: Sat Sep 10 09:21:40 EDT 2011

Time to clean up some terminology.  A State Space Model (SSM) is more
general than a Mealy Machine (MM) in that its space of states might be
infinite, while a MM has a finite set of states.  When an SSM is
implemented in hardware it is necessarily a MM, because the state is
approximated with finite word lengths, which leaves only a finite set
of states.

However, as a structural description (intension?) the two are usually
very different.  An SSM's state update is usually parametric: a single
rule is expressed in terms of state coordinates, while a MM is usually
case-based: each distinct state point corresponds to a separate rule
expression.

[1] http://en.wikipedia.org/wiki/Mealy_machine
[2] http://en.wikipedia.org/wiki/State_space_(controls)

Entry: Logic
Date: Sat Sep 10 20:49:38 EDT 2011

I know very little of mathematical logic.  When I think of logic, I
mostly think of boolean algebra and logic gates.  The thing is that,
being educated as an engineer and not a mathematician, I'm very much
biased towards semantics (sets and functions), or maybe even more the
relationship between some mathematical structure and the measurable
reality it models.

Semantically, Boolean algebra talks about operations (functions)
B^n -> B, defined on the set of truth values B = {0,1} and its powers.
It is the model for logic gates.  Such a system is easy to work with
in a muddy intuitive way because it is feasible to exhaustively verify
correctness of expression manipulations.

However, the approach used in formal logic is axiomatic.  The point of
an axiomatic system is to provide a means to derive new expressions
(theorems) from a collection of initial expressions and rules of
inference/construction.  Such a system is called a "calculus".  A
derivation that leads from axioms to theorems is called a proof.

In [1] Wadler mentions that:

  In a single landmark paper Gentzen (1935) introduced the two
  formulations of logic most widely used today, natural deduction[3]
  and the sequent calculus[2], in both classical and intuitionistic
  variants.
Reading further, it seems that the importance of the sequent calculus
is mostly due to the cut-elimination theorem[4], which is about
"composition of proofs": if there exists a chain of proofs that can be
combined using the cut rule, there also exists a direct proof.

[1] http://homepages.inf.ed.ac.uk/wadler/papers/dual/dual.pdf
[2] http://en.wikipedia.org/wiki/Sequent_calculus
[3] http://en.wikipedia.org/wiki/Natural_deduction
[4] http://en.wikipedia.org/wiki/Cut-elimination_theorem
[5] http://en.wikipedia.org/wiki/Boolean_algebra_(logic)

Entry: Protocol-oriented programming (part 2)
Date: Tue Sep 20 17:22:06 EDT 2011

Started here[1].  Recently been thinking about this again because I
ran into some code that is gratuitously un-streamable, i.e. a byte
stream with non-causal data dependencies across large parts (packets)
that basically requires the use of large buffers.

Is there a way to turn the question around?  How to design a data
protocol such that small buffers can be used?  In absence of other
constraints (i.e. robustness), a good protocol is one that is easy to
print / parse.  How to formulate this in terms of languages and
automata?

[1] entry://20110423-004220

Entry: Minimal erase binary counter for Flash memory
Date: Fri Sep 23 15:31:29 EDT 2011

Problem with Flash memory: 0->1 is costly and needs to happen in bulk,
while 1->0 is free.  How to implement a counter in Flash that has a
good tradeoff between few erase cycles and little redundancy.

  - Full redundancy: one bit per increment, no erase.
  - No redundancy: erase on every increment.

Something in the middle could be an XOR mask, setting one bit at a
time, based on something like a gray code.

Entry: Embedded patterns and translation
Date: Sat Sep 24 14:53:43 EDT 2011

* Conversion between a sealed "foreach" API and a wide-open
  open/access/close.  This is the universal traversal API idea[1], but
  in practice it doesn't work that well in C due to lack of partial
  continuations.

* Conversion from task code to state machines, i.e. if recursion is
  finite and there are no arbitrary pre-emption points this should
  always be possible to automate.  State machines are a pain to write.

[1] http://okmij.org/ftp/papers/LL3-collections-enumerators.txt

Entry: Inductive inputs / outputs
Date: Sun Oct 2 18:44:03 EDT 2011

From the LLVM code:

  -- An alias for pairs to make structs look nicer
  infixr :&
  type (:&) a as = (a, as)

  infixr &
  (&) :: a -> as -> a :& as
  a & as = (a, as)

This is useful as it's possible to inductively build function types
with multiple inputs and outputs.  See below for a 4 in, 4 out
function.  Curried:

  (* -> (* -> (* -> (* -> (* :& (* :& (* :& (* :& ()))))))))

Does it also exist in the non-curried variant?  It seems that in that
case it's no longer a sequence, but a tree split by the `->' type
constructor:

  (* :& (* :& (* :& (* :& ())))) -> (* :& (* :& (* :& (* :& ()))))

Why is induction (linear list structure) useful?  It allows the
definition of an enumerable set of type classes, starting with the
base case and working up through induction, just like one would write
recursive functions on recursive types.

It doesn't seem that functionally the non-curried case is less
powerful, just that it's more of a hassle to deconstruct the type in
the inductive rule.

See next post for an example of how a function arity can be obtained
from a function type using a type class and 3 instances: one base case
() and one for each induction.
Entry: Function arity Date: Tue Oct 4 20:22:14 EDT 2011 {-# LANGUAGE TypeOperators, TypeSynonymInstances #-} -- Small test for parameterizing over types that represent multi in / -- multi out functions like: -- (* -> (* -> (* -> (* :& (* :& (* :& (* :& ()))))))) infixr :& type (:&) a as = (a, as) infixr & (&) :: a -> as -> a :& as a & as = (a, as) -- First test is to map the type signature to a pair of numbers -- representing the I/O arity. class (NbIO f) where nbIO :: f -> (Int, Int) instance NbIO () where nbIO _ = (0,0) instance NbIO os => NbIO (o :& os) where nbIO os = (ni, no + 1) where (ni, no) = nbIO (snd os) instance NbIO f => NbIO (i -> f) where nbIO f = (ni + 1, no) where (ni, no) = nbIO (f undefined) # nbIO (\a b c -> (a,(b,(c, ())))) # => (3,3) Entry: Learning Haskell Date: Fri Oct 7 20:09:53 EDT 2011 And the saga continues. I spend long times in total frustration not understanding what a cryptic type error means when I'm playing with type classes. Usually I just try to go about differently and succeed to my great surprise. In short: I often really don't get it yet. At those times I don't know where to actually look to understand why a particular construct will not work. The moral of the story is that just adding type signatures will often solve the problem, and if it really doesn't work, it's probably a conceptual error, i.e. try harder! Entry: Apply pure function in monad Date: Fri Oct 7 20:44:38 EDT 2011 Often I run into something like this: m >>= \x -> return $ f x Often with `f' being a data constructor. Does that have a name? Indeed it does: *Main> :t liftM liftM :: Monad m => (a1 -> r) -> m a1 -> m r Entry: Haskell overlapping type class instances Date: Sun Oct 9 11:50:39 EDT 2011 Sometimes it can be very useful to have overlapping instances when encoding data structures at type-level. Especially so when implementing embedded languages in Haskell. [1] http://www.haskell.org/haskellwiki/GHC/AdvancedOverlap Entry: CBN & CBV Date: Sun Oct 9 14:43:47 EDT 2011 Is it possible to use a simple interface to abstract over both these types of binary operators: a -> b -> M c M a -> M b -> M c This would make it possible to combine nested expressions (unnamed intermediates) and explicit bindings using the same interface. Something that might work is this: mi a -> mi b -> mo c Where we can have mi = mo or mi = 1, the identity monad. Let's play with this a bit. It seems that this would work, but there is a problem making it implicit. I.e. the identity monad would always need a wrapper. Overall it seems just simpler to work with (M a -> M b -> M c) functions and use an explicit return when binding nodes. See before. Entry: Lifting pure functions to dataflow functions Date: Thu Oct 13 13:55:44 EDT 2011 One of the problems that has been puzzling me for a while is to find a good representation of the transformation that maps a pure "applicative" function to a dataflow function. The simplest case is one input, one output: (i -> o) -> (i' -> o' -> m ()) Here i, o are value types, while i' and o' are reference types, and m is some kind of monad that keeps track of the reference one-time binding or multiple assignment. Generalizing this to multiple in/out is straightforward when the outputs are encoded in a recursive tuple type as mentioned before. 
  (i1 -> i2 -> o1 :& o2)  ->  (i1' -> i2' -> o1' -> o2' -> m ())

An ad-hoc way would be to implement this for assignable references,
though with some more effort it should be possible to use one-time
binding as in the data flow language in Concepts, Techniques, and
Models of Computer Programming[1].  Here the input and output types
could be constrained by type classes.

I wonder, isn't this just lifting functions to Arrows?

[1] http://www.info.ucl.ac.be/~pvr/book.html

Entry: Types
Date: Thu Oct 13 15:20:10 EDT 2011

There is something extraordinary about programming with strong typing.
I spend a lot of time trying to come up with correct types for doing
program transformations.  Sometimes it seems like a total waste, but
when things fall into place, usually the elegance shines.

At this point it feels as if I'm never really going to "get it".  As
if there is so much structure hidden behind the deceptively simple
constructs of type classes.

Entry: Functional Dependencies and Undecidable Instances
Date: Sat Oct 15 11:09:03 EDT 2011

Since I rely on them for some type-level hackery, let's look at what
these actually do:

  UndecidableInstances
  FunctionalDependencies
  ScopedTypeVariables
  FlexibleContexts
  FlexibleInstances

All except ScopedTypeVariables[2] are explained here[1].

[1] http://cvs.haskell.org/Hugs/pages/users_guide/class-extensions.html
[2] http://www.haskell.org/haskellwiki/Scoped_type_variables

Entry: Monads
Date: Mon Oct 24 17:38:58 EDT 2011

Learning Haskell I sometimes get the wrong intuition.  An example: it
was not at first clear to me that these two are not the same.

  (CBN)  f mx my

  (CBV)  do x <- mx
            y <- my
            f (return x) (return y)

Here I'm using the Call-By-Name (CBN) and Call-By-Value (CBV)
terminology from [1] to indicate the different meaning between the
two.  The first one passes two (named) computations to f, while the
second one passes two values (they are monadically wrapped, but they
are "pure" due to the return).  The main difference is that in the
second case, the sequencing has already happened before f is invoked.

Now why is this?  It's probably simpler to see with a single case.
Why are these two expressions not always the same?

  mx >>= \x -> f $ return x

  f mx

The answer is that there is absolutely no reason they should be, and I
don't understand why I had the idea in the first place.  From the
perspective of f the two cases are vastly different.  In the former f
will always get a value that's the result of a return, which means it
is "simple" or "pure", while in the latter f can receive a value that
could be more complex.

Here's a counterexample using the list monad that illustrates this
difference.  The first test passes a one element list to f and does
that 3 times, collecting the result of each evaluation in a list.  The
second test just passes the list.

  t1, t2 :: Monad m => (m a -> m b) -> m a -> m b
  t1 f mx = mx >>= (\x -> f (return x))
  t2 f mx = f mx

  f l = [length l]
  l = [1,2,3]

  (t1 f l, t2 f l)  -- ([1,1,1],[3])

[1] http://homepages.inf.ed.ac.uk/wadler/papers/essence/essence.ps.gz

Entry: Monad transformers
Date: Mon Oct 24 22:30:34 EDT 2011

Two questions:

1. How can it be?

Following [1].  Dropping wrappers for clarity, we have:

  StateT s m a  =  s -> m (s, a)

which is a state monad parameterized by a monad.
If m = Id we get the ordinary state monad: s -> (s, a) if two of these are chained we get StateT t (StateT s Id) a = t -> (StateT s Id) (t, a) t -> s -> (s, (t, a)) So the deal is: a monad transformer is parameterized by a monad in such a way that simply substituting a different type will give a meaningful result. 2. How can it be used? Each transformer needs to define a method that provides access to the wrapped monad: lift :: (MonadTrans t, Monad m) => m a -> t m a It looks like the MTL has some type classes defined that prevent the use of multiple lift applications to dig into the onion[2]. [1] http://web.cecs.pdx.edu/~mpj/pubs/modinterp.html [2] http://www.haskell.org/haskellwiki/Monad_Transformers_Explained Entry: Zipper: data structure derivatives. Date: Tue Oct 25 13:22:11 EDT 2011 I need an inverted tree for the syntax representation of a simple flowchart language with lexical scope for primitive data bindings and mutually recursive functions (only tail recursion, no call stack). data Expr = Let Bind Expr | LetRec [Fun [Var] Expr] Expr | If Var Expr Expr | App Fun [Var] How to systematically derive? Each recursive Expr node needs to be turned around. Writing this as a polynomial with variable: x = Expr And coefficients: b = Bind f = Fun [Var] -- no recursion here, so can be combined in 1 term v = Var a = App -- same We get for the constructors in the order above: b x + (f x)^n x + v x^2 + a The derivative of the polynomial is: b + (n+1) (f x)^n + 2 v x The 2 numeric constants that appear distinguish between the different branches of the LetRec and If trees. I'm puzzled that Bind is just a value though. Let's try to reconstruct a type from the polynomial. Hmm.. there's too much information lost. The numeric constants refer to different constructors, and the x seem to refer both to original trees and inverted trees. But the general idea seems to work out: If has 2 choices, one for each branch, and LetRec has a couple for which of the branch we're at. Let has no selector, meaning there is only one constructor that refers to the inverted Let list. So let's do it manually. EDIT: It appears to not be necessary. The trick is to use delimited continuations (which are zippers implicitly). In [1] this takes the form of a state-continuation monad, which in my Haskell implementation looks like this: makeVar :: TypeOf (Code t) => (Code t) -> MCode (Code t) makeVar term@(Code sterm) = SC c where c (CompState n) k = Let var sterm (k state' ref) where state' = CompState $ n+1 ref = Code $ Ref $ var var = Var typ nam 0 typ = typeOf term nam = regPrefix ++ (show $ n) [1] http://en.wikipedia.org/wiki/Zipper_(data_structure) [2] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: OverlappingInstances, IncoherentInstances Date: Wed Oct 26 16:33:32 EDT 2011 Notes: - GHC's default behaviour is that exactly one instance must match the constraint it is trying to resolve. - It is fine for there to be a potential of overlap; an error is only reported if a particular constraint matches more than one. - The -fallow-overlapping-instances flag instructs GHC to allow more than one instance to match, provided there is a most specific one [1] http://www.haskell.org/ghc/docs/6.6/html/users_guide/type-extensions.html Entry: Commuation Date: Thu Oct 27 10:05:16 EDT 2011 How to name this pattern? 
What happens a lot to me when programming with type classes in Haskell is that I run into commutation problems, meaning that I run into operations like a (b t) -> b (a t) and their inverse that encode a morphism between the two types with different nesting order, basically saying that a and b commute. In general this doesn't hold: such functions usually do something significant, and might overall not be invertible. However in other cases the morphisms might be bidirectional and somewhat trivial. Is there a way to represent such a morphism in an abstract way? I.e. is there a way to automatically derive "trivial morphisms" for cummutative type constructors? Sorry, no example as this is just a vague hunch.. Entry: Invertible Functor? Date: Thu Oct 27 11:09:02 EDT 2011 I recently ran into a] post about invertible functors. Can't find it now. This is the type of application and abstraction in the embedding of a typed lambda calculus: _app :: r (a -> b) -> r a -> r b _lam :: (r a -> r b) -> r (a -> b) Where _app would just be from an applicative functor, here _lam is the inverse. Entry: Sussman: We Really Dont Know How To Compute Date: Thu Oct 27 22:24:00 EDT 2011 - Brains are fast - Computing: limiting factor is programmers / programming - Generics and abstract evaluation - Autodiff - dynamic reconfigurability - trade provability for flexibility - propagators -> recent breakthrough: cell = info about value, not value -> cells merge insformation monotonically [1] http://www.infoq.com/presentations/We-Really-Dont-Know-How-To-Compute Entry: Functional Dependencies Date: Fri Oct 28 13:47:56 EDT 2011 Main idea is that multi-param type classes are relations = sets of tuples, and can be seen as relational databases. The basic problem with multi-param type classes is ambiguity. When composing operations, it might be that some type parameters that are in the constraints no longer appear in the right hand side. I.e. TypeRel a b c => a -> b Here 'c' is ambiguous. See [1] section 2.4 for examples. [1] http://www.reddit.com/r/haskell/comments/7oyg5/ask_haskellreddit_can_someone_explain_the_use_of/ Entry: Type Scoping Date: Fri Oct 28 15:04:35 EDT 2011 Why are these 2 not equivalent? # as type variables are not the same _lambda = lambda where lambda :: Args Value as ras => (ras -> Identity (Value t)) -> Value (as -> Identity t) lambda f = Value rf where rf as = do Value t <- f $ unpack $ Value (as :: as) return t # as types are the same _lambda = lambda where lambda :: forall as ras t. Args Value as ras => (ras -> Identity (Value t)) -> Value (as -> Identity t) lambda f = Value rf where rf as = do Value t <- f $ unpack $ Value (as :: as) return t Entry: Continuations Date: Sun Oct 30 01:22:10 EDT 2011 It is rather instructive to creative use of a continuation monad for Let insertion[1] implemented in a pure functional style; I'm writing one in Haskell. It uses a combination of CPS and direct style to be able to create nested expressions, effectively manipulating the toplevel continuation. [1] http://www.cs.rice.edu/~taha/publications/conference/pepm06.ps Entry: Cross Stage Persistence Date: Mon Oct 31 10:59:05 EDT 2011 I was thinking about code generation, and didn't see what a true multi-stage language like MetaOCaml would bring over the typed-class-and-untyped-syntax approach that is easy to do in Haskell. Of course, as mentioned elsewhere (probably somewhere here[1]) the big deal is cross-stage persistence. For maximum flexibility, you want to be able to call library functions in generated code. 
For me that's not an issue, since I'm only using it for offshoring.
The target is limited in that there are mostly no libraries needed;
it's all low-level calculations.

[1] http://okmij.org/ftp/

Entry: DB normalization
Date: Mon Oct 31 17:18:46 EDT 2011

For abstracting my Apache logs I'd like to do something like this:

  (url, ip, client)

but represent it as this:

  (idu, idi, idc)
  (idu, url)
  (idi, ip)
  (idc, id-client)

because:

  1. many of the urls & ips & clients will be duplicated

  2. I'm interested in the set of unique url, ip, client, ...

  3. implementation: does it search faster, use less storage?

Is this non-normalized?  I'm not sure, though putting them back
together is a join[1]:

  If columns in an equijoin have the same name, SQL/92 provides an
  optional shorthand notation for expressing equi-joins, by way of the
  USING construct:

  SELECT * FROM employee INNER JOIN department USING (DepartmentID);

Thinking about this a bit, it makes sense to replace fields with IDs
in this way for the reasons above whenever the only useful operation
on that field is just equality.  Equality distributes over keys.
Other fields like date, number of bytes, ... that support other
operations (i.e. range) should stay in the main table.

[1] http://en.wikipedia.org/wiki/Join_(SQL)

Entry: Generalized Arrows
Date: Tue Nov 1 09:03:26 EDT 2011

I was intrigued by the use of the word "metaprogramming" in:

  Like Haskell Arrows, generalized arrows provide a platform for
  metaprogramming.  Unlike multi-stage languages and Haskell Arrows,
  generalized arrows allow for heterogeneous metaprogramming.  Arrows
  support metaprogramming only when the guest language is a superset
  of Haskell, because every Haskell function can be promoted to a
  guest language expression using arr.  Generalized arrows remove the
  assumption that this sort of promotion is possible.  This enables
  heterogeneous metaprogramming.

[1] http://www.cs.berkeley.edu/~megacz/garrows/

Entry: Edge filtering for Interrupt On Change
Date: Tue Nov 1 13:28:37 EDT 2011

Given CHANGE and STATE registers, and assuming that reading the CHANGE
register also clears it so it can detect new hardware events, how to
reliably filter out edges?

Suppose we have an ISR that is triggered by CHANGE becoming 1, reads
CHANGE, then reads STATE.  Then we filter an edge based on the value
of STATE.  For short pulses we filter on the trailing edge.

Can it reliably detect an arbitrarily short pulse that is long enough
to be handled by the hardware, but might be too short to be seen by a
port readout, or that is shorter than the interrupt to CHANGE readout?

In the case of a short pulse, these are the possible orderings of edge
and register read events, with l,t the leading and trailing edges and
C = CHANGE read and S = STATE read.

  (1) l C S t C S     both C S are in time to see first edge
  (2) l C t S C S     S is too late ...
  (3) l t C S         both C S are too late ..

In the first 2 cases, 2 interrupts are triggered, while in the last
case only one is triggered, as the two edges are merged into a single
CHANGE readout.

Using just the value of S is not enough to detect only one edge per
pulse, as this would trigger on both edges in case (2).

Remarks:

* The above seems to work for pulses that are long enough or short
  enough, but the middle case is puzzling.

* Is this a problem?  How to fix it?

* Why didn't I hear about this before?  Probably because it's solved
  by using long enough pulses such that interrupt -> C,S reads all
  happen before the trailing edge.
* For level triggered interrupts of course one triggers on the leading
  edge, because it is never going to be missed: the interrupt is only
  lifted after acknowledgement, which is necessarily after detection.

The scenario above is for an Atmel AT91SAM7.  After a bit of Googling
I find this in Microchip AN566[1], which talks about using the IOC
pins for handling external interrupts:

  An interrupt pulse with a small pulse width requires less overhead
  than a wide pulse width.  A small pulse width signal must be less
  than the minimum execution time of the interrupt service routine,
  while a wide pulse width must be greater than the maximum time
  through the interrupt service routine.

They mention these 2 good cases (1) and (3), but not the bad case (2).
Maybe for the PIC there is no problem because there is an atomic read?
I'm not sure exactly how the mechanism works there..  Indeed, it looks
like there is only a single interrupt flag, not a per-pin flag as on
the AT91, and this flag is cleared when the input pins are read.  So
here there are only two cases; the C and S operations are atomic:

  (p1) l CS t CS
  (p2) l t CS

Is there a way to fix the AT91 problem?  Does a 2nd read solve it?
Let's try C S C.

  l C S C  r C S C
  l C S r C
  l C r S C
  l r C S C

At first sight it looks like this makes it at least possible to
distinguish the cases, but it makes it harder to do in parallel for a
number of pins..  FIXME: check with clear head..

[1] http://ww1.microchip.com/downloads/en/AppNotes/00566b.pdf

Entry: Ad-hoc syntax design
Date: Tue Nov 1 16:14:19 EDT 2011

There are two ways to look at languages:

  - Properties of the grammar.
  - Properties of the parser.

For an ad-hoc language the latter is far more useful to focus on than
the former.  It's cool to be able to derive parsers from grammars
automatically, but the restrictions that are necessary to make this
work well require some getting used to.  On the other hand, if you
focus on keeping the parser simple so it can be done by hand using
recursive descent, there are a couple of ways to make language design
decisions that keep the parser simple.

In general this is: "avoid backtracking".  Usually, some amount of
backtracking is necessary, but it's probably best to keep it local and
bounded such that it can be implemented by a simple linear succession
of tries that doesn't need a large context store, i.e. read something
that's finite size and stick it in a buffer, then try to parse it in
any of a couple of ways.

Entry: Relational Databases
Date: Wed Nov 2 15:03:27 EDT 2011

I don't know much about relational databases[1].  However, I did read
the chapter on relational programming in CTM[2] again yesterday, and
was reminded that relational programming is essentially logic
programming.

CTM uses the Oz language to unify (!) a lot of different programming
concepts, essentially by separating variables and values, allowing to
separate:

  - variable creation (here's this variable)
  - variable binding (that variable is the same as this variable)
  - value creation (this variable has that value)

The interesting thing here is that you can have both directed
(functional, dataflow) but also undirected information flow, which is
essentially logic programming.  The big deal in programming with
relations (predicates) is exactly that: information can flow in many
directions.  This is the SQL "WHERE" clause.

[1] http://en.wikipedia.org/wiki/Relational_database

Entry: Lifting & subclassing
Date: Mon Nov 7 19:39:16 EST 2011

Is lifting the same as subclassing?  I.e.
in OO, a derived class can call it's superclass' methods, which is really similar to lifting an operation over a larger type, i.e. what 'return' ('pure') does. The difference is that in the OO case it happens automatically, while in the (typed) FP case it uses an explicit conversion. Entry: Integer Programming Date: Fri Nov 11 10:50:38 EST 2011 I got the coefficient data for an integer programming (IP) problem in a spreadsheet. I want to get it into Haskell. How to do this? I first tried to copy & paste them into emacs and use keyboard macros to beat them into shape so they can be directly represented in Haskell source code. This doesn't seem to be such a good plan. Is there a way to dump out a part of a spreadsheet in a format that's easy to get into Haskell? Maybe CSV is the simplest way, so I'm trying that first: copy and paste part of a table into a new spreadsheet, then save as CSV. Entry: Elementary function evaluation Date: Tue Nov 15 20:12:23 EST 2011 Probably for PIC/dsPIC: sin/cos/exp/... Entry: Condition Variables are not Semaphores Date: Sat Dec 10 15:28:21 EST 2011 A condition_signal() will only unblock a condition_wait(), but it will not cause semaphore-like behavior where condition_wait() will not block if there is a condition waiting. A condition variable is really just a queue of threads that are waiting to be woken up. ( Observed in some obscure thread-priority bug where one thread was not allowed to start up to the point it was actually waiting on a condition variable, so it missed a signal causing a deadlock. ) [1] http://en.wikipedia.org/wiki/Monitor_(synchronization)#Blocking_condition_variables Entry: Composite return values in C Date: Thu Dec 15 09:49:21 EST 2011 This is something which I never use because I thought it was not possible, but returning composite values is not a problem in C. What I wonder is how the ABI handles this. struct foo { int a; int b; int c; }; struct foo make_foo(void) { struct foo foo = {}; return foo; } As mentioned in [1], see the -fpcc-struct-return and -freg-struct-return options in [2]. It seems that this is in essence not a problem. [1] http://stackoverflow.com/questions/161788/are-there-any-downsides-to-passing-structs-by-value-in-c-rather-than-passing-a [2] http://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html Entry: Continuation Monad & compilation Date: Sun Dec 18 11:14:07 EST 2011 It looks like the continuation monad is a very important/useful abstraction for compiling tree structures with lexically scoped identifiers, especially when you want to have an idea of "current context" in which those identifiers are defined, i.e. the operation: Insert at current point in the subtree a definition of identifier ID and evalate the rest of the syntax generation in a deeper subtree that cannot escape the context of this definition. Essentially what a continuation monad can do, used in this way (as a partial continuation) is to make sure that subsequent continuation manipulations can't escape a subtree. I find this remarkable to the point of leading me to change my mental picture of a partial continuation as a "guaranteed consistent context". What I don't understand though is why partial continuations appear so naturally in Haskell's contunation monad. Maybe the way I'm using it in the meta/dspm compiler is just a bit ad-hoc special-cased for this to show up that way.. 
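As a minimal sketch of that let-insertion idea, stripped of the name
supply that the earlier makeVar code threads through (the toy Expr
type and the names bindLet and gen are made up for this example): the
continuation k is "the rest of the code generation", and wrapping it
in a Let means everything generated downstream stays inside the scope
of the new binding.

  import Control.Monad.Cont

  data Expr = Lit Int | Add Expr Expr | Var String | Let String Expr Expr
    deriving (Show)

  -- Insert a Let at the current point; the continuation k receives a
  -- reference to the new variable and produces the body of the Let,
  -- so the generated code cannot escape the binding's scope.
  bindLet :: String -> Expr -> Cont Expr Expr
  bindLet nam e = cont $ \k -> Let nam e (k (Var nam))

  gen :: Expr
  gen = runCont prog id where
    prog = do
      a <- bindLet "a" (Add (Lit 1) (Lit 2))
      b <- bindLet "b" (Add a a)
      return (Add a b)

  -- gen ==> Let "a" (Add (Lit 1) (Lit 2))
  --             (Let "b" (Add (Var "a") (Var "a"))
  --                  (Add (Var "a") (Var "b")))

The answer type of the continuation is Expr, so runCont prog id closes
the generation at the top level; fixing that answer type is what gives
the "delimited" behaviour.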
Entry: Functors
Date: Tue Dec 27 08:50:22 EST 2011

Writing a typed embedded language I've been running into a pattern
which I call "type commutation", which turns up when writing
representations of functions/structures in terms of
functions/structures of representations, i.e.:

  2x1D
  (1)  r (a -> b)  <->  (r a) -> (r b)
  (2)  r (a, b)    <->  (r a, r b)

  1x1D
  (3)  r (x a)     <->  x (r a)

Case (1) is almost the Functor type in Haskell, which expresses this
for functions.  To be exact it is a bidirectional functor[1].

What is this generic pattern called?  I.e. not just functions but
generic 1,2,3,... dimensional type constructors?

Also, these all seem to be commutations between * -> * (i.e. r in the
above), and * -> * or a multi-argument kind (i.e. x, (->), (,) in the
above), but never between multi-argument kinds.  Or maybe it does.
This is a relation between two * -> * -> * kinds:

  2x2D
  (4)  (a -> b, c -> d)  <->  ((a,c) -> (b,d))

Anyways, such a relation has many more degrees of freedom that don't
seem to make sense, like swapping c (input) and d (output) in the rhs
of (4).

There's a clear pattern here.  I'm missing some language to talk about
it.  Probably category theory.  Is this a natural transformation[2]?
As a diagram, representing (,) tupling as =>

  a -> b        a        b
    ||          ||  ->   ||
    \/          \/       \/
  c -> d        c        d

With lots of handwaving, I guess so.  A natural transformation maps
functors to functors.  In the left diagram, -> is an arrow and => is a
Functor; in the right diagram, => is an arrow and -> is a Functor.

Anyways... for later deconfusion.

[1] http://hackage.haskell.org/packages/archive/fclabels/0.4.2/doc/html/Data-Record-Label.html#3
[2] http://en.wikipedia.org/wiki/Natural_transformation

Entry: Mapping trees to integers
Date: Sat Dec 31 09:46:36 EST 2011

More specifically in Haskell: given a tree which is only terminated in
nodes that are themselves mappable to integers, how to map such a tree
to an integer (i.e. for exact hashing)?

Mapping (positive) integer sequences to integers is quite
straightforward.  As a base use the primes, and express the tuples as
prime powers.  Anything that can be mapped to positive integer
sequences can be encoded that way.

So how about trees?  Since part of the problem is solved, the question
remains: how to map an arbitrary tree into a sequence of positive
integers?  Let's start with this [1]

Actually..  It's a lot simpler maybe to just map everything to bits.
Here are the two implementations mapping the datatype Type to
integers.

  data TypeName = AFloat | AInt | ABool | AVoid  -- atomic
                | ATree TypeTree                 -- composite
                | AType Int                      -- indexed type (see PrettyC.hs)
                deriving (Eq,Show)

  data TypeTree = AAtom Type | ANil | ACons Type Type
                deriving (Eq,Show)

  data Type = Type TypeName TypeOrder
            deriving (Eq,Show)

  type TypeOrder = Int

Prime-encoded positive sequences:

  primes :: [Integer]
  primes = sieve [2..] where
    sieve (p:xs) = p : sieve [x | x <- xs, x `mod` p > 0]

  hashPos :: [Integer] -> Integer
  hashPos is = hp is primes where
    hp [] _ = 1
    hp (i:is) (p:ps) = p ^ i * hp is ps

  typePos :: Type -> [Integer]
  typePos = typ where
    name AFloat     = [1]
    name AInt       = [2]
    name ABool      = [3]
    name AVoid      = [4]
    name (AType i)  = [5, 1 + toInteger i]
    name (ATree t)  = [6] ++ tree t
    tree ANil          = [1]
    tree (AAtom t)     = [2] ++ typ t
    tree (ACons t1 t2) = [3] ++ typ t1 ++ typ t2
    typ (Type n o) = name n ++ [1 + toInteger o]

Binary sequences:

  hashBin :: [Integer] -> Integer
  hashBin = hb where
    hb [] = 1
    hb (b:bs) = b + 2 * (hb bs)

  typeBin :: Type -> [Integer]
  typeBin = typ where
    -- One case, no prefix.
typ (Type n o) = (name n) ++ (num $ toInteger o) -- 6 Unique prefixes. name AFloat = [0,0,0] name AInt = [0,0,1] name ABool = [0,1,0] name AVoid = [0,1,1] name (AType n) = [1,0] ++ (num $ toInteger n) name (ATree t) = [1,1] ++ (tree t) -- 3 Unique prefixes tree ANil = [0] tree (AAtom t) = [1,0] ++ (typ t) tree (ACons t1 t2) = [1,1] ++ (typ t1) ++ (typ t2) -- Self-delimiting numbers. num 0 = [0] num n = [1, mod n 2] ++ (num $ div n 2) [1] http://stackoverflow.com/questions/3596502/lazy-list-of-prime-numbers Entry: Register Allocation Date: Wed Jan 11 11:47:31 EST 2012 It looks like both Staapl[1] and the DSPM language in meta[2] will eventually need some form of register allocation/reuse if I'm going to compile down to PIC code without relying on a C compiler to do that for me. (OTOH, I wonder how good AVR GCC is doing that optimization job.) [1] entry://../staapl [2] entry://../meta [3] http://en.wikipedia.org/wiki/Register_allocation Entry: State machines Date: Wed Jan 25 17:22:06 EST 2012 State machines have popped up a lot lately: - Current consulting project: how to make an exhaustive test for a relatively implicitly specified state machine. - Staapl: defining state machines (protocol-oriented programming) in a concatenative language: functional specification and instantiation (register / global variable allocation). - meta/dspm/Loop: SSA / CPS / ANF without procedure calls. For most embedded work they seem to be a good solution, but can sometimes be hard to test. Is there a good way to link the high level specification and low level implementation with a good testing strategy. Entry: State machines / parallellism and resource allocation Date: Thu Jan 26 06:59:00 EST 2012 And then.. I'm thinking that in this whole parallellism debate, shouldn't we go back to "writing" electronics instead of programs? So I wonder, is that really just a problem of resource allocation? SM's are finite, but most programming models are infinite (infinite memory for storage and execution stacks/continuations). Of course this model breaks down because this infinite general model has to be "small enough" to be implemented on a finite machine. Entry: Reader monad and order Date: Sat Feb 18 10:07:32 EST 2012 I'm trying to capture the idea of "context dependent state transformation". Something doesn't quite add up here... There seem to be a couple of ways to formulate this. Let c be context, s be state and R be a Reader monad, which is a context-dependent computation: R a b == a -> b ( Concretely: c is "world state" of an animation, i.e. read-only or "stiff" background info like current time, and s is "object state" of an animation, meaning the animation's dynamics = state of its equations of motion. ) A) c -> s -> s or R c (s -> s) B) s -> c -> s or s -> R c s C) (s,c) -> s Where R is a reader monad. The computations I want to fit in a framework have initial s and c available at the same time, so C) is the type that corresponds best to reality. Why is there an ambiguity when trying to write this as a Reader monad? Which of A) or B) is the correct/appropirate one? Is the Reader monad the appropriate model? Is this a Co-Monad? (EDIT: The answer seems to be Yes[1]). Something I ran into before while trying to capture state machines / state space models is that the following correspondences are not really bijective. How to make that "really" precise? (a,b) -> c <=> a -> b -> c (a,b) -> c <=> b -> a -> c EDIT: Above isn't expressed well. According to [1] these really are the same. 
I just changed the animation types from m (s -> s) to s -> m s It seems that while this doesn't make a difference for the Reader monad, for other monads it does. I.e. I could use a state monad to thread the RNG state without trouble. [1] http://comonad.com/reader/2008/kan-extensions-ii/ Entry: State Space Models: Arrow and generalized functors. Date: Sat Feb 18 10:28:05 EST 2012 I'm trying to find a good way to represent state-space models in the standard Haskell type classes (categories?). The basic form is the relationship between an update equation and (infinite) sequences. ((s,i) -> (s,o)) -> ((s,[i]) -> (s,[o])) This is a generalization of a haskell Functor in terms of Arrows instead of functions. The update equation is an Arrow: U s i o = (s,i) -> (s, o) Normal Haskell Functor F: fmap :: (i -> o) -> (F i -> F o) Generalized Haskell Functor in terms of arrow A instead of arrow/function (->): fmap' :: A i o -> A (F' i) (F' o) What I'm interested in is then the less general F' = [] (U s) i o -> (U s) [i] [o] Where A = U s is the Arrow parameterized in the threaded state object. So the final types are something like this: fmap :: Functor f => (i -> o) -> f i -> f o gfmap :: Arrow a, GFunctor f => a i o -> a (f i) (f o) What is GFunctor? It's a pattern I don't recognize. It pops up in less general form (GFunctor == []) in iterated functions: gfmap :: ((s,i) -> (s, o)) -> (s,[i]) -> (s,[o]) gfmap f (s,[]) = (s,[]) gfmap f (s, i:is) = (s'', o:os) where (s',o) = f (s,i) (s'',os) = gfmap f (s', is) See next post. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Applicative.html#t:WrappedArrow Entry: Functor in terms of Arrow Date: Sat Feb 18 11:23:55 EST 2012 Dear HC, Does AFunctor below have a standard name? It's a generalization of the Functor class in terms of Arrow instead of (->): fmap :: Functor f => (i -> o) -> f i -> f o afmap :: Arrow a, AFunctor f => a i o -> a (f i) (f o) It pops up in less general form (AFunctor = []) in iterated functions (difference equations / state space models), where the arrow is the update function parameterized by state type: data Iter s i o = Iter ((s,i) -> (s,o)) instance Arrow (Iter s) More concretely: > afmap :: ((s,i) -> (s, o)) -> (s,[i]) -> (s,[o]) > afmap f (s,[]) = (s,[]) > afmap f (s0, i:is) = (sn, o:os) where > (s1, o) = f (s0,i) > (sn, os) = afmap f (s1, is) > f (s,i) = (s', s') where s' = s + i > is = [1,1,1,1,1] > os = afmap f (0,is) -- (5,[1,2,3,4,5]) Cheers, Tom Entry: Forking a random number generator? Date: Sat Feb 18 13:39:19 EST 2012 For an animation framework I need random numbers in the "leaf nodes" of an animation tree. However, I don't want to introduce a serial dependency over the tree, i.e. through a state monad to track RNG state. Is it possible to "split" an RNG such that it has a tree-like (Reader monad / S-combinator) dependency graph, while keeping the sequences generated in the leaf nodes of this tree "random enough". There is a random number generator that is seeded by integers: mkStdGen :: Int -> StdGen so maybe the question is: how to fork integers? Or, how to fork them enough such that collisions are rare? A very un-informed way would be to just multiply the seed by a prime number. This will not reduce the configuration space and give a reasonable "decorrelation" if the prime number is large enough? The funny thing is that the decorrelation itself will be simpler to express as a serial operation when forking a random number of states. To parallellize this again a list of primes could be used.. 
Then when lists of primes arrive, it's probably also possible to just use binary trees: shift by one and fork +0, +1 though that seems to run out of states faster. Entry: Monad transformers Date: Sun Feb 19 09:58:44 EST 2012 Forget the creative forking of last post, I'm going to use Reader + State. I had to write a small example program to understand the wrapping / unwrapping mechanism. > f :: s -> M s > f = undefined which is wrapped in this monad onion: > type M = ReaderT String (StateT Int Identity) Given a value :: s and the function :: s -> M s, we can unwrap one layer at a time. First peel off the ReaderT, then the StateT and last the Identity. > run s = s' where > mStateT = runReaderT (f s) "Context" > mIdentity = runStateT mStateT 123 > Identity (s', _) = mIdentity ACCESS: > getInt :: M Int > getInt = lift get > getString :: M String > getString = ask (Check these later; timing out..) [1] http://hackage.haskell.org/packages/archive/mtl/2.0.1.0/doc/html/Control-Monad-Reader.html [2] http://hackage.haskell.org/packages/archive/mtl/2.0.1.0/doc/html/Control-Monad-State.html [3] http://cvs.haskell.org/Hugs/pages/libraries/mtl/Control-Monad-State.html [4] http://cvs.haskell.org/Hugs/pages/libraries/mtl/Control-Monad-Reader.html Entry: Constraint programming, layout & choreography Date: Fri Mar 23 20:47:42 CET 2012 The trick is to be able to non-causally describe convergence. It's easier to do this "backwards in time". Same goes for spacial parameters. Example: have two animiation meet in one point. It's simpler to directly specify that they meet at time t=t' than to figure out when to start them such that this property is met implicitly. Example: how to center an asymmetric animation based on its final configuration. I need a way to convert a constraint-based description to a linear sequence. This most likely requires a 2-step approach: 1) solve all unknowns and 2) perform a coordinate transformation or lookup. So how to do this? I'm probably going to be helped with just a local constraint propagation solver (equations as bi-directional functions). Entry: Relational, Logic, Constraint: CTM Date: Sat Mar 24 16:50:17 CET 2012 Logic is Relational w. inference (composite relations) Constraint is Relational w. fundeps + conversion to functional. Better way? See CTM. Entry: Invertibility through sparseness Date: Sat Mar 24 17:05:35 CET 2012 Thinking a bit more about constraint solving. If all constrains are linear, then GE is the way to go. What makes local propagation interesting is that it allows solution of nonlinear equations that remain unique (invertible) through sparseness. I.e. there's a difference between an equation like xy = 1 and x^2 - y = 0 The former is a bijection while the latter isn't. The interesting observation is that many practical systems are nonlinearly constrained but remain invertible, or at least locally invertable for a wide range around the solution. Entry: Invertible nodes constraint solver: limitations? Date: Sat Mar 24 17:16:31 CET 2012 So what are the limitations of an invertible node solver? I'd say: no dependency loops. If there is a loop, more powerful "primitives" are necessary. Each loop needs a "global" solver. Though, note that it's OK to have loops in the (undirected) network, but not in the derived DAG. So how would this be called? An N-in, M-out function that's multi-directional, i.e. any pair of in/out can be swapped. In case information is lost in one direction this will be related to bi-directional lenses[1]. See next post. 
Looks like I need bijective lenses. [1] entry://20120325-111325 Entry: Pierce's lenses - Bidrectional programming Date: Sun Mar 25 11:13:25 CEST 2012 Bi-directional is less strict than bijective: sometimes information is lost in one way, in which case the other direction is an update operation that takes into account some of the information present in the original source. get :: S -> T putback :: T x S -> S get (putback (t, s)) = t putback (get s, s) = s + putback (t2, putback (t1, s)) = putback (t2, s) The last one is forgetfulness and is optional. Has to do with delete vs. undo. If putback ignores the S argument it is bijective. Sometimes too strong but nice when it holds. [1] http://www.cis.upenn.edu/~bcpierce/papers/lenses-etapsslides.pdf Entry: Adding arrows to a network Date: Sun Mar 25 11:39:36 CEST 2012 A bijective constraint network (equation network) is a collection of nodes and relations, where each node takes part in a number of relations, but is output to only one relation. Solving the network is to determine by which relation a node is determined. The fact that this is local allows focus on compositon. Can this be done using a simple bitmask? I.e. this is a form of 3-state logic: 0 output 1 input x unknown Probably the I/O assignment and actual function "production" can be separated. - When a node is asserted, for each connected equation, determine if the degrees of freedom are reached and propagate for all relations not self. Using an imperative algorithm this seems straightforwad. Is there a functional way? Taking a break. This probably needs some more background processing.. The input is a network so I'm thinking of using some kind of spanning tree representation. The result is also a DAG, so a simple tree rep won't work for the result. (If this where possible, the network might have been represented by "a" solution, where changing of I/O configurations would simply transform this solution). Entry: The usefulness of local state / working with graphs in Haskell Date: Sun Mar 25 13:10:56 CEST 2012 Some handwaving to follow. I find it hard to do real work in Haskell. Many, many algorithms take advantage of in-place updates of data structures, moving from one consistent configuration to another one in a (conceptually) single step. Often this incremental update is key to the efficiency of the algorithm. Creating a full duplicate of the data structure at each algorithm step is often too expensive and probably also annoying, since it often needs more info than a local update only. It seems that most functional algorithms use some kind of inside-out representation using zippers, which allows an abstraction of the current edit point while keeping updates cheap: most of the deep structure is reused while the only the local edit point needs a new construction. So let's dig into this a bit deeper. Given a generic graph structure, how to practically represent it in Haskell? Google gives me this[1]. It describes how to work with graphs, focusing on the operations: cata/ana-morphisms. It's interesting how abstraction is kept fairly high, while the nitty gritty uses an imperative algorithm. How to load this in my head? Some take-away points: - knot tying requires node-equality to allow recursive traversal. since there is no pointer equality, this requires unique identifiers for each node. overall it seems more of a hassle than anything else; finite lists of nodes/edges may make more sense from an implementation pov. - imperative algos (i.e. in ST[2]) aren't necessarily evil. 
for graphs they are probably way too useful/efficient to dismiss (i.e. node marking vs recording node tags in a dictionary). [1] http://www.haskell.org/haskellwiki/The_Monad.Reader/Issue5/Practical_Graph_Handling [2] http://www.haskell.org/haskellwiki/Monad/ST Entry: Evaluation of equation network Date: Sun Mar 25 17:29:34 CEST 2012 Input: - set of nodes - set of relations that refer nodes - subset of nodes with initial values Output: - compute the corresponding output values or fail. It's important to not change the form of the algorithm when the subset of specified values changes. I.e. in the problem at hand, what is important is the value of all nodes. Which nodes that serve as input should be shielded from the code that uses the node values. However, it might be useful to perform the search in 2 steps: compile I/O config to function, then use function multiple times. Thinking about this, it seems that the most important part here is to actually specify the inputs. How to do this without loosing composition? I.e. if it's encoded in types then the types already specify the structure.. I lack experience to find a way to express this properly.. ( input spec ? ) -> X..XX.XX... partially completed -> XXXXXXXXXXX fully completed, after evaluating equations The structure is known at compile time. f x y = do [a,b,c] <- nodes 3 in a x sum [a, b, c] -- sum == 0 prod [a, b] -- product == 1 solve 'solve' returns a structure of all the nodes, currently a list (of floats) but this should probably be a heterogenous type 'nodes' creates a list of node variables 'in' initializes a node value 'sum' and 'prod' are multi-directional constraints. Trouble with this approach is that it doesn't compose: the whole network needs to be defined at once. Well, maybe not. Let's try to implement this with the ST[1] monad and see where it breaks. EDIT: it seems there is definitely some advantage here to separate content from structure, which in Haskell can often be done using type classes. It seems that the whole "network building" could be done as a compile-time computation. Would this make it easier, or is this just one of those neat side-tracks that doesn't add much to the end result? [1] http://www.haskell.org/haskellwiki/Monad/ST Entry: Compile time mutable operations Date: Mon Mar 26 12:00:11 CEST 2012 I want to perform some operations on a network data structure at compile time, i.e. using type classes and fundeps. Is it worth finding this out, or should I look for a better way to do compile time programming in Haskell? I sure do miss Scheme macros. The direct but incremental approach is a lot easier to navigate. Entry: The ST monad Date: Mon Mar 26 12:18:17 CEST 2012 I need some mutable arrays, i.e. Data.Array.ST and have no idea how it works. So, here's some Q&A. - How to create a mutable array of size 10 with all 0 elements? arr <- newArray (1,10) 0 - How to convert a mutable array to an immutable one? The ST Array lives in the ST monad, so use newArray and operations on the array to construct a monadic value ST (STArray). runSTArray can be used to convert this to Array. That solves construction and final output. - How to iterate over the elements of an array? Probably simplest is to use forM_, which is essentially a combination of map (construct a list of monadic computations) and sequence (weave a list of monadic computations into one computation). 
The function `indices' caused some confusion as this is only defined for Array, so I use the following approach: forA a f = do (a,b) <- getBounds a forM_ [a..b] f - Updating array elements: use writeArray. [1] http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Monad.html#v:forM-95- [2] http://www.haskell.org/haskellwiki/Arrays Entry: Indexing binomial combinations Date: Mon Mar 26 15:55:16 CEST 2012 What is the simplest way to "name" the different combinations in a select I from N problem? I.e. select 2 from 3: 1 -> .|| 2 -> |.| 3 -> ||. Select 2 from 4: 1 -> ..|| 2 -> .|.| 3 -> .||. 4 -> |..| 5 -> |.|. 6 -> ||.. What I used above is just binary counting (digit reversed) and skipping all entries that don't have the proper number of elements. This seems quite straightforward, but I prefer something that doesn't need a search. For my app I is usually a lot less than N (mostly 1 and in some cases 2), how do I get the sequence number in the list above starting from the coordinates of the empty spots? For I=1 this is already the desired encoding: 1 -> .||| | 2 -> |.|| | ... n -> |||| . Actually it's quite trivial when using a nested datastructure: - N lists, first element from N - N-1 lists, second element, each from N-1 possibilities Problem solved. EDIT: Sun Apr 1 12:02:11 CEST 2012 I made it work with the code below. Question is if this is really useful, and if an implicit table ref isn't a better way to go because: * final result needs to be a table * a single function acting on a table might be simpler to use than a bunch of functons + shuffler routines. EDIT: Actually this is not even correct. It indexes proper permutations while what I need is just selections (all permutations of outputs and inputs separately can use the same function) I'm moving to an implicit, imperative array-based implementation. See next post. -- A multi-directional equation node is represented by a tree of -- functions, one for each possible combination if inputs and outputs. -- Each level in the tree fixes one output. Canonical ordering of -- outputs is from left to right starting at index 0, and spanning all -- remaining nodes at a particular level (i.e. first level has N -- elements, second level N-1, ...) -- I.e. for a 5 node, 1 output network -- IIOII <-> [2] -- OIIII <-> [0] -- IIIIO <-> [4] -- -- for a 5 node, 2 output network: -- -- IIOOO <-> [0,0] -- OOOII <-> [4,3] -- IOOOI <-> [0,3] -- The ordering of the EquFun list follows a canonical list of -- permutations (defined elsewhere). See equFunIndex data EquImpl = EquImplFun EquFun | EquImplSelect [EquImpl] equImplRef' :: EquImpl -> [Int] -> EquFun equImplRef' = f where f (EquImplFun fn) [] = fn f (EquImplSelect fns) (i:is) = f (fns !! i) is -- Convert list of I/O to permutation tree index. data EquIO = EquIn | EquOut deriving (Eq, Show) equIO2Ref = f [] 0 where f c n [] = c f c n (EquOut:e) = f (c ++ [n]) n e f c n (EquIn:e) = f c (n+1) e -- For debugging: using 0,1, instead of EquOut/EquIn equIO = map f where f 0 = EquOut f 1 = EquIn Entry: Node binding (equation solving) : sequential approach Date: Sun Apr 1 12:11:48 CEST 2012 It seems too much hassle to find a "family of functions" approach, especially because the result just needs to be a table. Might be simpler to just do it in-place, and define a single "fill in solver" for each primitive equation type. Still, that makes it difficult to separate the structural compilation step (network -> function) from the evaluation step. Still very confused. 
So what about this: - structural compilation (abstract evaluation) produces indices for the shared data function. - function is an array -> array map, parameterized by 2 lists of indices for input and output. Let's make an example for a linear functional. I'd like to be able to work with mutable arrays inside the implementation, so which should be the array at the interface boundary? This is determined by the type of runSTArray :: Ix i => (forall s. ST s (STArray s i e)) -> Array i e which is Array[1], for which the simplest constructor is: listArray :: (IArray a e, Ix i) => (i, i) -> [e] -> a i e After playing with direct imperative access and looping indices for a bit, it turns out that there are simpler ways. I.e. the function below solves a sum functional given a vector of coordinates and the index to update. sumA = Data.Foldable.foldr1 (+) type Nodes = Array Int Double eqSum :: [Int] -> Nodes -> Nodes eqSum [o] ia = runSTArray $ do oa <- thaw ia writeArray oa o $ (ia ! o) - sumA ia return oa But.. this doesn't take into account that the equations should have references to nodes. It seems that it's best to put the lowlevel stuff inside an ST monad and work with references. ( The excursion to arrays was actually just a roundabout way of working with references.. maybe also to not have to specificy holes by keeping them implicit. ) type Node s = STRef s (Maybe Double) foldNodes f = foldM (\accu el -> do me <- readSTRef el return $ me >>= f accu) foldNodes1 f (x:xs) = foldNodes f x xs What about this as basic datastructure: - nb of nodes to satisfy - ordered list of references to nodes - op: fold over Just components Probably an array of STRefs is easier to work with since it allows for direct indexing. [1] http://www.haskell.org/ghc/docs/latest/html/libraries/array/Data-Array-IArray.html#t:Array Entry: Imperative programming in Haskell Date: Tue Apr 3 12:52:25 CEST 2012 So I'm kind of fed up with this inability to express imperative algorithms in Haskell. Let's go for it. 1. collection of nodes -> Set STRef 2. collection of equations bound to nodes -> [STRef] For specification we don't need to use String as node names; all can be embedded in a monad such that lexical names can be used for nodes. Ze Monad: [n1,n2,...] <- makeNodes n eq1 <- newEq [n1,n2,...] eq2 <- newEq [n1,n2,...] input n1 v1 -- (*) input n2 v2 -- (*) solve [eq1,eq2] return $ values [n1,n2, ...] MAIN IDEA: The part marked (*) is what we'd like to change easily (in the code) without having to change all the other code. Entry: Product of State and ST? Date: Tue Apr 3 13:16:26 CEST 2012 Does it actually make sense to use a product of State and the ST monad? Cant the global state go in an STRef? Anyways, I couln't quite figure out how to do this (too abstract). But I'm starting to get confused again. Is it possible to build a structure and its contents separately? Structure can be reused. I planned to not think of that but maybe it's actually simpler to keep them separate. I.e. we don't need solvers that know how to scan for undefined variables: this can be done completely generic. The output of a network compilation step is a program (which solvers to run in what sequence connected to which nodes. Entry: Interesting Recursion Patterns Date: Thu Apr 5 22:44:36 CEST 2012 I ran into an interesting tree recursion pattern where there is both a globally threaded state and a "fan-out" environment. This doesn't seem to be such an exceptional pattern though. 
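A minimal sketch of that pattern (Tree, label and go are made-up names
for this example): the ReaderT layer is the fan-out environment,
extended independently per branch with local, while the State layer is
the single counter threaded across the whole traversal.

  import Control.Monad.Reader
  import Control.Monad.State

  data Tree a = Leaf a | Node (Tree a) (Tree a) deriving (Show)

  -- Annotate every leaf with (depth, unique number, original value).
  label :: Tree a -> Tree (Int, Int, a)
  label t = evalState (runReaderT (go t) 0) 0 where
    go (Leaf a) = do
      depth <- ask                 -- environment: fans out down the tree
      n <- lift get                -- state: threaded across the traversal
      lift $ put (n + 1)
      return $ Leaf (depth, n, a)
    go (Node l r) = do
      l' <- local (+ 1) (go l)     -- extend the environment for this branch only
      r' <- local (+ 1) (go r)
      return $ Node l' r'

  -- label (Node (Leaf 'x') (Node (Leaf 'y') (Leaf 'z')))
  --   ==> Node (Leaf (1,0,'x')) (Node (Leaf (2,1,'y')) (Leaf (2,2,'z')))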
Entry: Equations for box layout
Date: Fri Apr 6 01:16:06 CEST 2012

Turns out there is a specific kind of equation that doesn't seem to be easier to implement than with a dedicated algorithm that reflects its recursive structure directly:

- sum child box sizes into one box, propagate upward
- receive 1 location and scale info and propagate downward

The algorithm is a very elegant approach using continuations to chain everything together. More later!

Entry: Why are "stateful maps" not part of standard functions?
Date: Fri Apr 6 11:35:38 CEST 2012

Meaning:

  (s,a) -> (s,b) -> [a] -> (s, [b])

Maybe this is because it's more easily handled in the state monad, in combination with forM? I'm not convinced, compare:

  test = flip runState 0 $ do
    forM [1..10] $ (\i -> do
      s <- get
      put $ s + 1
      return $ i + s)

With:

  -- foldr with output stream
  foldo :: (a -> s -> (s, b)) -> s -> [a] -> (s, [b])
  foldo f = fld where
    fld s []     = (s, [])
    fld s (a:as) = (s'', b:bs) where
      (s',  b)  = f a s
      (s'', bs) = fld s' as

  test1 = foldo (\a s -> (s+1, a+s)) 0 [1..10]

Why is 'foldo' not a standard library function? Actually, it is! [1]

  mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])

[1] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-List.html

Entry: forM
Date: Fri Apr 6 19:59:52 CEST 2012

Maybe it's simpler to just provide a forM-like abstraction for each container datastructure instead of a fold.

  forM  :: [a] -> (a -> m b) -> m [b]
  forM' :: D a -> (a -> m b) -> m (D b)

It seems simpler to use than a fold. I wonder though if this is somehow equivalent. Probably it is, but in what way? Actually, this seems to be Data.Traversable[1].

[1] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Traversable.html

Entry: Data.Traversable vs. Data.Foldable
Date: Sat Apr 7 11:51:09 CEST 2012

Follow-up from last post. From [1], a Traversable is a Foldable that is also a Functor. This can be understood by observing that Foldable will just iterate and thread a state, while Traversable will iterate, thread state, and build a new data structure. If you cancel the output datastructure you get a fold, while if you cancel the threading you get a map.

What I find strange is that all of these can seemingly be defined in terms of forM. Let's try. The missing insight seems to be that the constraint on traverse is Applicative, and not Monad. Why is this? Below, t is the container type.

  traverse :: Applicative f => (a -> f b) -> t a -> f (t b)

The question is: why is a monad not necessary? Or, why is the "monad part" of a monad that's more powerful than the "applicative part" not used? Last time I was looking into this, the take-home argument was: monads have value-dependent control flow, while arrows do not. Not sure about applicative though. The important part for the Monad version of traverse is that it imposes sequential traversal, eventually imposed by the composition of calls to (>>=). Can the same be done with applicative? Maybe my main problem is that I don't see how applicative imposes sequencing? Let's explore that in a separate thread.

BING! The missing link is sequenceA, which bridges lists of computations and the sequencing operator of an applicative functor. This makes it sort of obvious why Traversable just needs sequenceA. However it seems simpler to define `for' directly, since this can be converted to fmap by inserting the identity Applicative.
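To make the "cancel" intuition concrete, a small sketch (mirroring fmapDefault and foldMapDefault from Data.Traversable; module locations assume a reasonably recent base): instantiating traverse at the Identity applicative cancels the effect and gives back fmap, while the Const applicative cancels the output structure and gives a fold.

  import Control.Applicative (Const(..))
  import Data.Functor.Identity (Identity(..))

  -- fmap recovered from traverse: Identity is the "no effect" Applicative.
  fmapT :: Traversable t => (a -> b) -> t a -> t b
  fmapT f = runIdentity . traverse (Identity . f)

  -- foldMap recovered from traverse: Const only accumulates the monoid
  -- and throws away the rebuilt structure.
  foldMapT :: (Traversable t, Monoid m) => (a -> m) -> t a -> m
  foldMapT f = getConst . traverse (Const . f)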
[1] http://www.haskell.org/haskellwiki/Foldable_and_Traversable

Entry: Applicative and sequential operations
Date: Sat Apr 7 13:21:05 CEST 2012

Monads impose sequential operation through bind (>>=, >>) or the Kleisli composition (>=>). This translates easily to "do" notation (let* for the schemers), which makes the sequential nature obvious. However, Applicative functors also supposedly abstract side effects, though I've never really understood what these "less powerful" side effects really are. In essence, what's the real difference between Applicative and Monad?

This SO article[1] says:

  Compared to monads, applicative functors cannot run its arguments
  selectively. The side effects of all the arguments will take place.

Obviously the difference is to be found in the API, so this would be the difference between <*> and >>=. The bridge between lists and sequencing seems to be this function:

  \a as -> (:) <$> a <*> as  ::  Applicative f => f a -> f [a] -> f [a]

which can be used in a foldr to turn a list of computations [f a] into a computation that produces a list f [a]. I think I just rediscovered sequenceA for lists:

  sequenceA :: Applicative f => [f a] -> f [a]
  sequenceA cs = c where
    c = foldr push (pure []) cs
    push c cs = (:) <$> c <*> cs

That fills a gaping hole in my understanding. Life will be different from now on ;) Combining sequenceA and map then gives traverse/for.

It's interesting how sequenceA is part of Traversable, i.e. that there is no less generic version in Prelude that only works on lists. Maybe that's a good thing actually - more general from the start.

Letting this sink in for a bit, it makes perfect sense: the sequential nature of Applicative functors should really be compatible with sequentially traversing a complex data structure.

[1] http://stackoverflow.com/questions/2104446/how-do-you-use-control-applicative-to-write-cleaner-haskell
[2] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Foldable.html
[3] http://www.haskell.org/ghc/docs/6.12.1/html/libraries/base/Control-Applicative.html
[4] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-List.html#5
[5] http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Traversable.html

Entry: Are state space models Applicative?
Date: Sat Apr 7 13:58:40 CEST 2012

Finally discovering sequenceA (which bridges lists and the 'sequencing operator' <*> of an Applicative functor) makes me think that Applicative is probably also a better abstraction for state space models, as it's a lot easier to use than Arrow.

Entry: Left/Right confusion in mapAccum
Date: Sat Apr 7 15:18:50 CEST 2012

Basically, I used mapAccumR and my datastructure was reversed. This is kind of strange, and it doesn't seem to be explained in the docs. Have to find the source later. But the basic idea is that "left" and "right" refer to the order in which the list is traversed. The result list is always in the same order.

  mapAccumL :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumR :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])

  *Main> mapAccumL (\acc x -> (acc, x)) 0 [1,2,3]
  (0,[1,2,3])
  *Main> mapAccumR (\acc x -> (acc, x)) 0 [1,2,3]
  (0,[1,2,3])
  *Main> mapAccumR (\l e -> ((e:l),e)) [] [1,2,3]
  ([1,2,3],[1,2,3])
  *Main> mapAccumL (\l e -> ((e:l),e)) [] [1,2,3]
  ([3,2,1],[1,2,3])

From the last example it's clear that L starts at the beginning while R starts at the end. So as long as the accumulation doesn't depend on traversal order (e.g. the operation is associative and commutative), the result shouldn't matter.
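Side note: mapAccumL is itself just a traversal with threaded state, which is where the next entry picks up. A sketch using mtl's State (Data.Tuple.swap only adapts the argument/result order to match Data.List):

  import Control.Monad.State
  import Data.Tuple (swap)

  mapAccumL' :: (acc -> x -> (acc, y)) -> acc -> [x] -> (acc, [y])
  mapAccumL' f acc0 xs = swap $ runState (traverse step xs) acc0
    where step x = state $ \acc -> swap (f acc x)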
Entry: foldr from traverse + monad
Date: Sat Apr 7 16:53:35 CEST 2012

Defining foldl is simple using the State monad:

  foldl f s b = s' where
    (_, s') = runState (traverse f' b) s
    f' b = modify (flip f b)

Original question: instead of "updating" state we could build a nested computation with a hole in it. Is this somehow dual? Indeed, and the approach is quite straightforward.

  foldr f s b = k' s where
    (_, k') = runState (traverse f' b) (\s -> s)
    f' b = modify (\k -> (\s -> k $ f b s))

Instead of getting a new state value by applying a function to it, it works the other way around. It takes the hole in which the final value of the computation is inserted and replaces it with a different hole, observes what is put in that hole, modifies it and puts the result in the original hole. Both approaches are dual: one updates values, the other updates holes.

Once duality pops up you find it everywhere of course:

- foldl / foldr are dual in the order they traverse and "cons" a list
- the above also works with State and RState (reverse State)
- in Data.Foldable the dual of a monoid is used to implement foldr/foldl in terms of foldMap (library/base/Data/Foldable.hs[1])

[1] http://www.haskell.org/ghc/dist/7.0.4/ghc-7.0.4-src.tar.bz2

Entry: Reverse State monad
Date: Sat Apr 7 18:09:01 CEST 2012

It uses knot tying to construct a bi-directional data flow. From [1]:

  newtype RState s a = RState { runRState :: s -> (a,s) }
  evalRState s f = fst (runRState f s)

  instance Monad (RState s) where
      return x = RState $ (,) x
      RState sf >>= f = RState $ \s ->
         let (a,s'') = sf s'
             (b,s')  = runRState (f a) s
         in  (b,s'')

  get = RState $ \s -> (s,s)
  modify f = RState $ \s -> ((),f s)
  put = modify . const

Probably best to do this instead of the last 3 hardcoded methods:

  -- Is this allowed to be in here?
  instance MonadState s (RState s) where
      get = RState $ \s -> (s,s)
      put s = RState $ \_ -> ((),s)

And also:

  instance Functor (RState s) where
      fmap = liftM
  instance Applicative (RState s) where
      pure = return
      (<*>) = ap

I don't find this in the standard libraries. Is it removed? Maybe this is now a combination of ReverseT and State? I used this monad to implement the bridge between traverse and foldr. This works as long as the hardcoded "modify" is used. Making RState part of MonadState results in an infinite list of the same element. Maybe this is just bottom, a consequence of the knot-tying interfering with the default "modify" from MonadState.

  foldr f s b = s' where
    (_, s') = runRState (traverse f' b) s
    f' b = modify (f b)

[1] http://lukepalmer.wordpress.com/2008/08/10/mindfuck-the-reverse-state-monad/

Entry: Applicative and Functor in terms of Monad
Date: Sat Apr 7 18:26:18 CEST 2012

A pain in the ass, but can it be done automatically?

  -- If we have Monad M then:
  instance Functor M where
      fmap = liftM
  instance Applicative M where
      pure = return
      (<*>) = ap

Entry: Duality: values <-> continuations
Date: Sun Apr 8 01:15:54 CEST 2012

Values and continuations are supposedly dual (for an interesting example see [1]). From [2]:

  We can think of continuations as a lack of, or request for, values.

We can see this duality in another way: values are the present's view of past computations, while continuations are the present's view of future computations. So if a function takes a value to a value, what takes a continuation to a continuation?
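One concrete (if partial) answer, as a sketch: a plain function a -> b does not act on continuations directly, but it induces a map on continuations going the other way, by pre-composition. This is the usual CPS / contravariance story, and it matches the past/future flip above: composition on values shows up reversed on the continuation side.

  -- A function on values induces a function on continuations,
  -- in the opposite direction: pre-composition.
  mapK :: (a -> b) -> ((b -> r) -> (a -> r))
  mapK f k = k . f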
[1] entry://20120407-165335
[2] http://www.google.be/url?sa=t&rct=j&q=values%20and%20continuations%20are%20duals&source=web&cd=12&ved=0CC4QFjABOAo&url=http%3A%2F%2Fciteseerx.ist.psu.edu%2Fviewdoc%2Fdownload%3Fdoi%3D10.1.1.48.4255%26rep%3Drep1%26type%3Dpdf&ei=HMqAT6HPHurJ0QXC9ej9Bg&usg=AFQjCNH4Tl0WmfN-p9Gb17FvGVV60N_UAA

Entry: Applicative Functors
Date: Sun Apr 8 10:29:43 CEST 2012

As is obvious in a sort of Zen way, just look at the API to see what an abstraction does. Look at what it really means without getting lost in the story around it.

  1. Needs to be a functor (i.e. container)  :: (a -> b) -> f a -> f b
  2. It needs to support application         :: f (a -> b) -> f a -> f b

What I missed before is sequenceA:

  sequenceA :: Applicative f => [f a] -> f [a]
  sequenceA cs = c where
    c = foldr push (pure []) cs
    push c cs = (:) <$> c <*> cs

Think of this as: convert a collection of element PRODUCERS into a PRODUCER of a collection of elements. What happens above is construction using (:), but what if construction is something like: tie the output state of the first to the input of the second.

Something didn't yet click... But one of the key elements is sequenceA. Trying the original paper again[1]. Some observations from [1]:

* These all start from pure functions in the examples (a pure function applied to funny arguments). However, after the first <$>, the result is no longer a pure function, but a collection of partially applied functions. The ability to store such a collection is just a property of a Functor, i.e. up to here we just used fmap = <$>. After that, once we have a *collection* of functions, to then further apply it to elements that are also in collections requires <*>.

* A reason to define traverse directly is that the usual definition as sequence . map will traverse the structure twice (if the compiler can't optimize this out, that is..)

* The Monoid / phantom Applicative relation seems interesting but doesn't quite snap into place for me.. Try later.

* Difference between Applicative and Monad: for a Monad the value of one computation can influence the second, while for Applicative the structure of a computation is fixed. Moral:

    If you've got Applicative, that's good
    If you've got Monad, that's even better!
    If you need Monad, that's good
    If you need only Applicative, that's even better!

* Applicative functors are composable.

* Monoidal = symmetric interface for Applicative:

    unit :: f ()
    <,>  :: f a -> f b -> f (a,b)

  This separates the "combination" from the "computation", which can be done by fmap, i.e. :: (a,b) -> c. Which illustrates an important point: it's the combination of two values into one that allows sequentiality to emerge.

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Experimenting with Applicative
Date: Sun Apr 8 11:18:39 CEST 2012

EDIT: Probably mostly meaningless..

Let's play with this a bit. Take the Functor (s ->). A functor value F o is then (s -> o), i.e. an output where state is abstracted. The <*> for the Applicative version of this would chain together

  (s -> (i -> o))   (s -> i)   to   (s -> o)

This is actually the environment Applicative from [1]. Can this implement state machines? Doesn't look like it: 's' can't depend on the values of the i->o or i types, because those types are completely general (container!) and thus not accessible to the implementation of <*>. What could happen though is a fixed update s->s that's applied on every <*>, i.e. the increment of a counter. Let's construct an example of "just" threading state.
  import Control.Applicative

  data Thread s a = Thread (s -> s) a
  runThread (Thread inc a) = (a, inc 0)

  appThread :: Thread s (a -> b) -> Thread s a -> Thread s b
  appThread (Thread cf f) (Thread ca a) = Thread cb b where
    b  = f a
    cb = ca . cf

  instance Applicative (Thread s) where
    (<*>)  = appThread
    pure a = Thread (\n -> n) a

  instance Functor (Thread s) where
    fmap f = (pure f <*>)

An example of this would be a counter, which counts the number of elements used in a computation.

  makeCount x = Thread (\n -> n + 1) x

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Monoidal
Date: Sun Apr 8 11:55:42 CEST 2012

Called Monoidal in [1] but doesn't seem to be used in this way in general. Basic point: define Applicative in a more orthogonal way by extending the properties of a Functor by:

  unit :: f ()
  <,>  :: f a -> f b -> f (a,b)

where f (a,b) -> f c can just be handled by fmap.

[1] http://www.soi.city.ac.uk/~ross/papers/Applicative.pdf

Entry: Stateful iteration (for / forM) or unfold?
Date: Fri Apr 13 14:29:47 CEST 2012

In writing the layout engine, I run into a dilemma when writing loops. Either I use a classical approach (for / forM) where all side effects can be kept in a Monad or Applicative instance, or I work with a combination of map/zip and unfold, where the "stateful" part is constructed separately.

The conclusion I tend to draw is that it doesn't matter much. Sometimes it's just useful to perform unfold before map to have some decoupling. Other times it's simpler to combine iteration and state.. But if you can do one, you can of course do the other..

Entry: Just update state.
Date: Mon Apr 16 08:19:57 EDT 2012

What's the simplest way to have a collection of mutable variables? I'm writing an animation state update routine and the number of parameters is getting quite large, so it becomes hard to do something like:

  update (Params a b c ...) = Params a' b' c' where
    a' = ...
    b' = ...
    c' = ...

also because not all parameters are updated at each step. Maybe this is a symptom of bad factoring, but frankly I don't have the time right now to look into it too deeply, so I wonder how to just have "no fuss mutable state". From memory this seems to be the ST monad and STRef, which are like IO and IORef, except that they can be executed like this:

1. create a bunch of STRefs from immutable data (= thaw)
2. perform mutable computations on the STRefs
3. copy the contents of the STRefs back to immutable data (= freeze)

Entry: Monad transformers
Date: Mon Apr 16 11:24:20 EDT 2012

Question: I know (intuitively) I want to combine 2 monads. How do I know which order they go in? For some the order doesn't matter, for others it does. Practically, I want to combine state and reader. So I just used this:

  runSR m = evalState (runReaderT m r0) s0

and the nice surprise is that operations are automatically lifted: even if state is the inner monad, get/put/modify "just work". See also "Monad Transformers Step by Step" [1].

To find out whether monads commute, it's probably best to just write out the types (without wrappers) and verify manually. An example of a pair of transformers that does not commute is MaybeT and StateT[2].

[1] http://www.cs.virginia.edu/~wh5a/personal/Transformers.pdf
[2] http://en.wikipedia.org/wiki/Monad_transformer

Entry: Updateable state by name
Date: Wed Apr 18 15:38:12 EDT 2012

Sometimes mutable state combined with a hierarchical namespace is really what you want. How can the following be handled in Haskell?

  hierarchical.namespace.variable = 123

I guess the variable would be an STRef.
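A minimal sketch of what that could look like (all names made up), using a flat Map keyed by the dotted path, with one STRef per variable:

  import Control.Monad.ST
  import Data.STRef
  import qualified Data.Map as Map

  -- one STRef per variable, keyed by its path in the namespace
  type Namespace s = Map.Map [String] (STRef s Double)

  -- "hierarchical.namespace.variable = 123", then read it back.
  demo :: Double
  demo = runST $ do
    ref <- newSTRef 0
    let ns = Map.fromList [(["hierarchical","namespace","variable"], ref)]
    case Map.lookup ["hierarchical","namespace","variable"] ns of
      Just r  -> writeSTRef r 123
      Nothing -> return ()
    readSTRef ref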
The hierarchy itself can then be implemented with whatever data structure is suitable.

Entry: Maybe and monoid
Date: Wed Apr 18 16:14:07 EDT 2012

Is this function meaningful in general?

  f :: Monoid m => Maybe m -> m
  f Nothing  = mempty
  f (Just m) = m

( This is just fromMaybe mempty, or Data.Foldable's fold at the Maybe instance. )

Entry: Real-world programming in Haskell
Date: Sun Apr 22 09:38:40 EDT 2012

A beginner's critique on using Haskell for real-world programming (i.e. not just writing a compiler or other data -> data converter).

Haskell is great for refactoring code in small increments. However, introducing state is often a fairly large structural change. I found this to be a non-trivial learning step. Contrary to what is often advertised, adding state to a functional algorithm isn't easy: it's a large syntactical change touching a lot of code. Once the framework of state is in place (Monad or Applicative or Arrow ...), THEN it becomes easy to pick-and-choose the side effect. However, going from pure to side-effect isn't always straightforward.

The pattern I've seen is that state is usually necessary to make the leaf nodes of a tree processing algorithm share some information. In an imperative language this is a no-brainer. In a pure functional language this is a truly big change which has to be anticipated, i.e. better always think that state is going to be part of the picture.

As seen on HN[1]:

  "Functional programming" in the real world isn't about dogmatically
  eliminating mutable state. It's about managing it. It's about
  modularity. As always, abstraction is your friend.

The basic idea in pure functional programming seems to be that "recursion is a low-level operation". You should always abstract recursion behind an operator. This way, moving from pure to side-effects is a lot easier. One way I've found useful to do this is by using Traversable (or Foldable) instances. Traversable gives you a "map with side-effects", i.e. the good old for loop with local loop state. Once a for loop is present, ad-hoc, local monad contraptions can be used to perform any kind of side-effect during traversal. This is a nice approach, but requires some getting used to.

The big revelation while working with this kind of approach is that making all connections explicit shows quite clearly how imperative programming often relies on "arbitrary connections" hidden in access to global state, breaking modularity.

[1] http://news.ycombinator.com/item?id=3858698

Entry: Compiling FRP to state machines
Date: Mon Apr 23 13:22:07 EDT 2012

FRP (playing with functions of time) seems to be a nice way to construct any kind of "choreography", be it animation, music, user interface, ... State machines are a de-facto way of representing reactive systems in a low-level, low-resource form. Is there a way to connect both? Maybe it's time to read Conal's paper[1] again..

[1] http://www.delicious.com/redirect?url=http%3A//conal.net/papers/icfp97/

Entry: Resist structure
Date: Mon Apr 23 15:01:25 EDT 2012

Existing structure that can't reflect a new requirement is always the cause of development slowing down. Is there a way to avoid it altogether? Does that make sense in any way? Probably not. What's the closest point that actually makes sense? Make structure as general as possible? For every constraint (== structure) provide maximum flexibility == keep things orthogonal.

Entry: Database (push vs. pull)
Date: Mon Apr 23 15:28:45 EDT 2012

Whenever a program becomes highly parameterized (i.e. text layout) it seems best to structure the parameter store as a database, i.e.
instead of using explicit data structures and pushing information to the correct place, it seems simplest to push only a DB reference, and let the app query the necessary parameter.

Entry: Params with defaults
Date: Wed Apr 25 09:47:50 EDT 2012

Find a different approach to do this:

  alpha = execState (forM attrs scan) False where
    scan Alpha = put True
    scan _     = return ()

which sets a default and overwrites it with the last matching attribute in the list (if any). Trouble: this seems to require a default case (scan _). Is it possible to do it in a different way such that this default case can be abstracted? I.e. can mismatches be mapped to nothing instead of raising an error? ( One option: map each attribute to a Maybe and combine with the Last monoid, so the default only appears once, in a fromMaybe. )

Entry: Forth Direct vs. Indirect threading?
Date: Mon May 14 09:18:05 EDT 2012

I forget.. What is the difference again? IIRC, indirect threading is easier to implement in C... Starting out with just straight-line code, it might be simplest to forget about code fields altogether. So, what is pointed to by IP? It's essentially an opcode for a VM.

Again: what's the simplest threading mechanism to use to implement Forth in C? I think it's called switch threading, where the main loop is something like:

  switch(*ip++) {
  case CMD_EXIT: ... ;
  case CMD_LIT:  ... ;
  }

Disadvantage: this needs a "call" opcode for composite words. So it's essentially one layer on top of subroutine threading (ST for a stack VM). Probably, CALL can use half of the address space. Also, this needs both CALL and JUMP for tail calls + RETURN for primitive words. Is it really simpler?

What about making the instruction stream abstract? Calling a word == pushing a new instruction stream, and popping it at the end (EXIT).

Entry: Representing self-delimiting numbers
Date: Mon May 14 17:36:51 EDT 2012

Start out: one byte = 8 bits = 256 values. To make this self-delimiting, some of the values can be used as extensions. Using one bit is simple, i.e. MIDI-style, but 2^7 is an awkward number.

Entry: Model-View-Controller vs. functional GUI
Date: Mon Dec 24 21:11:31 EST 2012

I'm trying to find out why writing GUIs is such a pain, and whether it is possible to simplify the approach using some functional programming tricks. MVC[2] might make sense in the OO world, but why all this state? The problem is really simple: a GUI is an animation that responds to user input:

  frame0 -> input -> frame1 -> ...

Each frame represents a different interpreter. There's a loud voice in my head screaming FRP[1], but I don't see immediately how everything links up... There should be a way to relate the idea of a cursor into a model tree (zipper in a model) to a cursor into a gui tree. Can both be the same? Is there a way to relate model and gui in a more direct way such that the interface emerges somehow "automatically"?

I believe that MVC as used in the OO world often has a not-so-clear separation between concerns, but the basic idea is very simple (see figure in [2]):

  View is generated from Model
  Controller updates Model

Here, the model is usually fairly obvious. Generation of View from Model should be a no-brainer (projection to gui parameters followed by injection into the drawing system). Updates from the Controller to the model should also be simple. However, the tricky bit is how View and Controller are tied to each other. I.e. a mouse click on a canvas probably needs to be dispatched depending on how exactly it is rendered by the model. In my own use, I've always found the V/C distinction to be quite arbitrary.
Maybe I just don't understand, but there seems to be something missing: something that resembles the "physical". Maybe there should be an intermediate point? Abstract Model - Physical Model Where interactions to the physical model are simple physical events, like one would expect from a game engine, and the constraints of the physical model are directly mapped to the constraints of the abstract model. What about this: define 2 maps a -> p, p -> a, relating the abstract and physical models in such a way that changes to the physical model that are not consistent do not map back to the same representation. What I describe above is almost exactly this[4]: One of mind-opening ideas behind Functional Reactive Programming is to have an event handling function producing BOTH reaction to events AND the next event handling function. Thus an evolving system is represented as a sequence of event handling functions. Reactive Banana[5] sounds interesting. [1] http://en.wikipedia.org/wiki/Functional_reactive_programming [2] http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller [3] http://broadcast.oreilly.com/2008/10/mvc-as-anti-pattern.html [4] http://stackoverflow.com/questions/2672791/is-functional-gui-programming-possible [5] http://www.haskell.org/haskellwiki/Reactive-banana Entry: Problem with FP Date: Mon Dec 24 21:35:02 EST 2012 Is that it is too easy to fall into the state trap by just hiding it in "extended datastructures", i.e. a pure structure "with some metadata attached", which changes its meaning. Hard to explain.. Basically whay I want to say is that FP design is more about finding data representations that do not require encoding of exceptional or intermediate situations, or at least to make them more explicit so they stand out more. Maybe what I'm trying to say is that things that are solved in OO programs by setting a variable, are solved in FP programs by adding another constructor to create a non-orthogonal concept: "what I would like it to be + some ugly diff with reality". Entry: ping-pong layouter in C Date: Tue Dec 25 23:16:52 EST 2012 I have some Haskell code that performs a hierarchical box layout as: stacker: box1, how big are you box1: I'm .... (possibly recursively determined) stacker: box2, how big are you ... stacker: parent, I'm this big parent: ok, go here, and shrink/grow stacker: box1, go here and s/g... box1: here's my list of allocated children stacker: box2, go here and s/g ... box2: here's my list of allocated children stacker: parent, here's my list of allocated children ... See function boxLayout in [1]. This is essentially a 2-pass algorithm with a bottom-up and a top-down information flow. In Haskell this is implemented as something like this layout :: abstractbox -> (boundingbox, (location, stretchedbox) -> children) Basically, an abstractbox is mapped to a boundingbox (determined by calling this kind of function recursively on contained children) and a continuation. The continuation is passed the final location and dimension of the box as determined by the parent after gathering bounding box information. The call to this continuation then recurses down the chain a second time to place children and return the result upstream again. This is a lot more elegant that implementing a similar approach using mutable state, since it requires only local reasoning, and no manual flow control. I would like to do the same thing in C. I can use a GC so I could create closures manually. The GC is limited however: just a CONS cell memory. 
A closure in C could then be abstracted as: closure(env, arg1, arg2, ...) where env is a data structure built on top of CONS cells. The question is then: where does the magic come from? Is it merely in the GC - i.e. not having to keep track of structures that have been used up? So, is it worth using a 3rd party collector (I don't want the Boehm monster) or can the simple CELL collector in libprim be used without too much notation overhead? Alternatively, for a simple algo like this it might be good enough to keep track of the intermediate strucutres and free them once per iteration. The data should be fairly contained. That might be easier since proper C structs can be used directly. Also, it might be possible to do the layout part off-line and generate just the (static) event processor chain. [1] http://zwizwa.be/darcs/hgl/Box.hs [2] http://code.google.com/p/dpgc/ Entry: Ad-hoc polymorphism is higher order abstract syntax Date: Wed Jan 23 13:52:49 CET 2013 It seems that Haskell type classes, implemented at run time using extra "implementation" parameter passing, is sufficient to implement any kind of higher order syntax representation, to the point that they are somewhat equivalent. Entry: Promises Date: Sun Feb 17 20:46:58 CET 2013 Also mentioned in [1]. Looks like promises are means of constructing DAGs, i.e. result events in terms of (pure) operations on intermediate events, but they also have a sort of "Maybe monad" built in, in that breaking of promises can propagate, just like proper result values. The promises are based on the idea of Futures, which Racket has an implementation of[2]. [1] http://www.youtube.com/watch?v=b0EF0VTs9Dc [2] http://docs.racket-lang.org/reference/futures.html Entry: Love the Lambda Date: Wed Feb 20 15:12:22 CET 2013 ( From email correspondence. ) Teaching this stuff is not easy. I applaud the brave ones that attempt it. I remember struggling very hard to understand the one-argument function thing when I was learning OCaml (similar to F#). It's hard to track back what actually changed in my understanding, but it has something to do with changing "basic substrate" in how I think about programming. My understanding at this point is that *pure* functional programming is fundamentally different from imperative programming. I find the "you already know X" approach in a lot of teaching to be a bit misleading. The problem is in the reliance on weird ways of combining higher order functions. Because lambda in essence is actually quite simple once it becomes natural, all the difficulty is in building these intricate combination patterns out of different types of functions. I've been discussing with Antti a bit that "lambda is low-level". It is somewhat of a machine language in pure FP in that it is the only thing that can be used to build any kind of program structure. Anything else builds on top of it. For me it took several attempts spread over about 2 years before the a -> b -> c thing started to make sense. Then another 2 years to stop being afraid of higher order functions (functions taking function as input) and type constructors entering the picture, e.g. map :: ( a -> b ) -> ( List a -> List b ) What made me understand this, is to write programs, get deeply confused and suddenly end up with a new way of looking at things. I remember that starting with Scheme, I found "map" to be such a revelation. One way of looking at it is that everything else is sort of a generalization of "map". 
I'm thinking that teaching the use of, and then the implementation of, map might be a sort of optimal point in teaching functional programming. Nobody cares about dorky factorials, but lists are intuitively obvious. Discussing map ties into functions, recursive data structures, anonymous functions (lambda), higher order functions and recursion.

At this point I think that pure FP

1. is really not rocket science, the underlying ideas are fairly simple, but
2. is *very* different from thinking sequentially and it takes some getting used to.

The big change is to wrap your head around thinking in data-flow and "composable things" instead of state and sequential manipulation of state. Basically, I think that there is really no shortcut in learning this, but it is definitely learnable if you have a basic understanding of programming. Pure FP is a new "computer architecture" that requires a different way of thinking. But there are ways to make the transition from imperative to pure FP a bit less painful. My journey was

  C -> Perl -> Python -> Scheme -> OCaml -> Haskell

Each transition required a little shift in thinking. There are probably other ways tailored to one's specific experience.

Love the lambda ;)

Entry: Don't write a one-pass compiler
Date: Fri May 10 19:20:44 EDT 2013

One of those non-obvious obvious things.. It's often hard to do in one pass because of a global->local information flow. At least a pass "leaving holes" is necessary, i.e. a lazy approach. That is, unless you're using a lazy language, in which case some magic might be possible, performing in one pass what ordinary mortals need to do in two.

Entry: Computers for Cynics - Ted Nelson
Date: Wed May 29 18:43:08 EDT 2013

Very funny.

[0] http://www.youtube.com/watch?v=KdnGPQaICjk
[1] http://www.youtube.com/watch?v=Qfai5reVrck
[2] http://www.youtube.com/watch?v=c6SUOeAqOjU
[3] http://www.youtube.com/watch?v=bhzD2FKEEds
[4] http://www.youtube.com/watch?v=_xL19f48m9U
[5] http://www.youtube.com/watch?v=_9PmIkAYhI0
[6] http://www.youtube.com/watch?v=gWDPhEvKuRY

Entry: State Machines
Date: Fri Aug 2 11:36:00 EDT 2013

Writing a deeply embedded program as a collection of state machines removes the need for an RTOS. This can be good, because an RTOS by itself is a big requirement, and threads are usually not very efficient. Especially so if a proper factorization of a problem would contain many different threads.

So, I've been thinking about writing a (simple) Racket system that combines ideas from

- Antti's Bream
- Tom Hawkins' Atom
- MyHDL

Basically, tail-recursive Scheme code (without any real recursion or re-entry) is a relatively high-level way of writing state machines, as it solves the variable binding problem: Scheme's lexical scope avoids a global state structure with lingering "don't care" variables as would be the case in an explicit C implementation. Such a Scheme program can be turned "inside-out" to yield a C implementation with all blocking points turned into exits + state entry (condition==true is mapped to state change). An optimized implementation could use register re-use. So a simple, no-frills approach could work well. ( What I'm describing here is probably 90% of what is in an existing good C compiler for small microcontrollers. )

So what's the concrete problem I want to solve? Build a translator from shallow (non-properly-recursive) tail-recursive blocking Scheme code to non-blocking state machines with a condition abstraction.
Write it the way a microcontroller works: - allow for polling operation: do state transition when READY flag is set. - allow for interrupt: avoid polling, just run the update when an interrupt occurs, notifying a certain flag is set. Looking at VHDL's sensitivity lists: these seem to be there only for simulation, i.e. to update the state of a process whenever an input changes. Synthesized logic will do whatever its circuit does. Essentially this is the bridge between a physical system (the circuit) and a simplified model: an event-driven digital system. Entry: MyHDL Date: Sat Aug 3 16:45:00 EDT 2013 Looked at MyHDL and found out that my initial understanding was wrong: it does not use generators to implement state machines (FSMs). It uses generators to implement events, i.e. signal change wait points. Values returned by a generator are used by the MyHDL scheduler to wake up simulations. The reason has probably to do with synthesizability. I'm not sure exactly how that works in MyHDL, but it seems that an *implicit* FSM in the form of a generator is too opaque to recover state. This is an essential part of the abstract-syntax-oriented approach I'm proposing: access to syntax is essential because a syntacting transformation is necessary to map yield/wait syntax to state machines. Entry: State machine generator Date: Sun Aug 4 10:38:00 EDT 2013 * Convert MyHDL-style yield/wait syntax to explicit state machine + wakup list. * Implmenent MyHDL-style event simulator Entry: Suspend / Resume syntax translation Date: Tue Aug 6 19:56:00 EDT 2013 Essentially, a yield point in a piece of code needs to: - capture all variables visible at exit - re-instate them at re-entry Instead of re-instating variables (C implementation), it seems simpler to just replace variables with object references. However, the C optimizer can eliminate re-instantiation of variables that are not used. Still, re-using storage space is not something that happens this way. How to go about that? It requires some form of register-allocation algorithm on the variables visible from different states. The trade-off is that copying is wasteful, but indirect accesses are more expensive. Keeping all state in an indirect object makes task switching very fast, and maybe that is what we should optimize for when aiming at a design that can support high task count. A dumb direct approach is to prefix all variables with level names that unambiguously encode the position of the variable in a scope nesting. E.g. l_1_2_ is the variable in the second level 2 block nested in the first level 1 block. An additional optimization step could use this encoding to perform sharing, i.e. l_1_ and l_2 can be shared, since they are never visible at the same time. In a first iteration, we could stick to a single function body, i.e. no nesting. Later, nesting can be implemented using inlining. Is this all there is? - Functionality: flatten variable scope into a structure, using block coordinate name prefixing. - Optimize: - Leave "temp" variables on the stack, i.e. those that do not cross a yield point. - Share storage space between mutually exclusive block scopes. Entry: State machine translator - how to start? Date: Sun Aug 11 10:20:05 EDT 2013 This is an exercise in dealing with control flow. What are the necessary elements? - function inlining: any (higher order) function call that contains a yield note - variable binding analysis. - function body rewrite bridging yield statements. It's hard to get a good overview, so maybe best to start writing code. 
Approach: - Abstract interpretation of Haskell Language.C Entry: Tasks vs state machines Date: Fri Aug 16 21:22:10 EDT 2013 I wonder if it is really just about the switching mechanism: - task: switch stack pointer, registers and other global/CPU state - state machine: switch current object i.e. from a system's perspective, the difference is in task switching speed. If nothing has to be copied, this can be quite fast. However, the cost is amortized through slower indirect access. What non-preemptive tasks/state machines (let's consider them equivalent) have in common is a better handle on where the data actually is: no time is wasted in "caching" per-thread state. Entry: Salea Logic in Racket? Date: Sun Aug 25 12:27:40 EDT 2013 Maybe that's something to try out the design of the state machine generator. I.e. bootstrap it in Scheme macros first, then see what can be done at the C language level. An interesting avenue there is to make a translator to Staapl, i.e. work on a way to do abstract state machine specification that gets compiled down to fixed memory addresses. There are two interesting problems: - Folding code "inside out" at yield points. - Optimizing data / state allocation. Entry: Objects are names for things you don't control Date: Sun Sep 8 18:21:40 EDT 2013 Basic idea: when data is just data, don't turn it into an object. Work with "dumb data" as much as possible. The border between the two seems to be: " IS THE THING MADE ENTIRELY OF BITS IN MEMORY ? " Objects are things that encapsulate state, i.e. a printer is an object. It contains paper, not something you can influence as a programmer. However, a document is not an object, it's a data structure. Other ways f putting it: - self-contained pieces of information should be values, not objects. - using an object to representi a value sequence is just an optimization - replace objects with processes (another manifestation of sequences). Entry: Iterator Blocks Date: Sat Sep 21 23:57:45 EDT 2013 The task/SM translation I was thinking about is the same as "iterator blocks" mention here [1]: Iterator blocks allow to have both advantages at the same time: - Their code looks pretty much the same as with internal iteration - The compiler transforms this code into a class/object/type implementing the interface of an external iterator - The generated iterator object can often be implemented very efficiently via a simple FSM This[2] aso mentions "yield" in C# being converted to a state machine. Another one[3]. Looks like the C# semantics is same/similar to python's: only yield in main method. So looking again at python generators: a 'def' creates a generator when the body has a yield operation in it. A yield operation in a called function doesn't work as expected: control nesting can't be implemented in function calls: all control nesting needs to be local in the definition. [1] http://michaelwoerister.github.io/2013/07/26/Iterator-Blocks.html [2] http://www.marshut.com/nxyuu/the-future-of-iterators-in-rust.html [3] http://blogs.msdn.com/b/shawnhar/archive/2010/10/01/iterator-state-machines.aspx Entry: Loop transformation algebra Date: Fri Dec 13 14:37:47 EST 2013 Is there a way to separate a canonical representation of loop operations from its implementation as a management problem for intermediate storage? Entry: Good Tech Blogs Date: Sun Dec 29 20:26:23 EST 2013 https://gist.github.com/jvns/8172943 Entry: Data Direction and Control Flow: 2 x 2 = 4 Date: Sat Mar 1 00:51:13 CET 2014 A little arcane, but quite fun. 
Do you push or pull data? It depends. Some types, using dataflow parameters (as in Oz, easily poor-man-modeled using C++ references or C pointers):

- sink      : write(from x)
- source    : read(to x)
- operation : process(to y, from x)

These can be neatly composed:

  sink * op   = sink
  op * source = source
  op * op     = op

Now, what I often forget is that these have duals. There's a thing that "puts something in a sink", and a thing that "pulls something from a source". In practice, what are these co-objects (anti-objects)? If sink, source, and operations are models of data processing (push, pull and flow), the co-objects correspond somehow to physical ports, or the operating system transferring control flow to a program when an event occurs.

A co-sink is something that writes into a sink. Note that a co-sink is not a source! The asymmetry is the caller/callee relation.

         | caller    callee
  -------+--------------------
  sink   | sends     receives
  source | receives  sends

A process is caller for both send and receive. Then there's a missing 4th case: the buffer, which is callee for both send and receive.

It seems that "push" programming (sink-oriented programming) is the most natural, as it has physical time coinciding with execution on a CPU. So is "pull" programming (function evaluation) then only a model? Is the concept of evaluation just upside down?

Entry: Dynamic typing / eval and polymorphism
Date: Sun May 25 10:18:08 EDT 2014

Trying to fit a square peg into a round hole: designing a DB schema for work, for a model that is partly OO, i.e. it has some data-to-type mapping. If I understand correctly, this kind of value-to-type relation is not something that ordinarily works in an RDB, but it is possible to emulate it with conditional functions.

Entry: Qt Pyside layoutChanged recursion
Date: Tue Jun 17 13:15:43 EDT 2014

I can't help but think that cross-connecting view updates should just work. It doesn't. How does a human programmer solve this issue in Qt? It must pop up a lot. Simply redraw everything? Maybe have a look at QML[1].

[1] http://en.wikipedia.org/wiki/QML

Entry: A rust project: blog database!
Date: Thu Jun 19 00:59:10 EDT 2014

Something that needs speed to be useful is a word-based index into a body of text, and yes I do have a body of text!

Entry: State machine compiler
Date: Fri Jun 27 09:55:47 EDT 2014

Some ideas:

- In deeply embedded applications, there is no dynamic creation of state machines. This is important for architectures such as PIC where memory indirection is very expensive as compared to flat memory access. Optimize for static state machines.

- Language-wise, there are a couple of levels:
  - pure functions + recursion
  - blocking imperative procedures: explicit dynamic yield/suspend
  - non-blocking imperative event handlers (e.g. object or case statement)

- The most useful transformation is that from blocking procedures to non-blocking event-handler state machines. The essential operation is to capture the current environment into an object (see the sketch below).

- To tackle this problem: start with a scheme compiler, and perform the continuation capture operation at yield[6]. This needs:
  - lambda
  - apply
  - begin (imperative sequencing)

I started working on this before. Where's that code?
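A minimal sketch of the "capture the environment into an object" step (types and names are made up): a blocking procedure becomes a value that either finishes, or yields an output together with a resume function whose closure holds exactly the live variables.

  -- A suspended computation: Done, or an output plus a resume
  -- continuation.  The closure in Yield *is* the captured environment.
  data Task i o r = Done r
                  | Yield o (i -> Task i o r)

  -- Blocking-style code: ask for two inputs, then finish with their
  -- sum.  The first input survives the second yield because it is
  -- captured in the closure.
  addTwo :: Task Int String Int
  addTwo = Yield "first?"  (\a ->
           Yield "second?" (\b ->
           Done (a + b)))

  -- A trivial driver: feed a list of inputs, collect outputs.
  run :: [i] -> Task i o r -> ([o], Maybe r)
  run _      (Done r)    = ([],  Just r)
  run []     (Yield o _) = ([o], Nothing)
  run (i:is) (Yield o k) = let (os, r) = run is (k i) in (o:os, r)

Compiling this down to C is then a defunctionalization step: each Yield becomes a state number plus a struct holding the captured variables, which is exactly the non-blocking event-handler form.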
[1] entry://20130811-102005 [2] entry://20130804-103800 [3] entry://20130802-113600 [4] entry://20130816-212210 [5] entry://20130806-195600 Entry: CoArbitary Date: Thu Aug 14 19:39:15 EDT 2014 "the CoArbitrary class continues to confuse me"[1] To make an arbitrary function a -> b, make a generator for b based on a generator for b and some "shuffling" applied through the value of a. Why is there a 0 in the following list? *Main> shrink [1,2,3] [[],[2,3],[1,3],[1,2],[0,2,3],[1,0,3],[1,1,3],[1,2,0],[1,2,2]] because shrink 1 produces 0. [1] http://www.reddit.com/r/programming/comments/1mcu8/roll_your_own_window_manager_haskell_and/c1md04 Entry: The mess we're in Date: Sat Sep 20 16:43:53 CEST 2014 Joe Armstrong's condenser. - abolish names and places [1] https://www.youtube.com/watch?v=lKXe3HUG2l4 Entry: State machines Date: Sun Sep 28 18:56:37 CEST 2014 For testability, the important part is non-divergence, meaning that the effective state space / input space is rather small. For system design, the reason to pick state machines is synchronicity: i.e. design with *GEARS*. Somewhere in there is a simple formalism that allows reduction of complexity of state machines, making them verifiable, while at the same time providing a better language syntax to specity "gear" relations. Entry: LLVM haskell Date: Sun Sep 28 21:39:45 CEST 2014 [1] http://www.stephendiehl.com/llvm/ Entry: Mirage Date: Mon Oct 6 01:06:55 CEST 2014 This is truly amazing! [1] http://www.infoq.com/presentations/mirage-os Entry: Static actors Date: Fri Jan 2 17:18:53 EST 2015 So one would use actors to ensure robustness. According to Joe Armstrong, you need at least 2 machines to have robustness so concurrency is an essential element. Splitting tasks in supervisors and "happy path" application code allows one to not handle errors locally: just let it fail and let the supervisor restart = separation of concerns. Interesting, but a bit resource intensive. I wonder if it's possble to find subsets of this where the implementation is actually done by a static set of state machines executing with static scheduling, and statically known message queue sizes or getting rid of mailboxes altogether (i.e. size = 1: just a variable to pass to another state machine). So what are the sets of constraints that need to be verified to make actors reduce to static state machines with static or at least predictable scheduling? One particular element would be the need for a hidden "clock" property, meaning that there is a concept of logical time that would create equivalence between messages. Essentially, for each kind of event, there is a DAG that computes (input,state) -> (output, state). I.e. the computation is finite. For any kind of event, the response of all actors is to block waiting for a new message. All feedback should either be through internal state, or externally to the system (e.g. the real world). I actually have code for this. Maybe revive it? Entry: folds Date: Sat Jul 11 22:08:20 EDT 2015 If sequences are best exposed as folds, how do you reprsent a map over a fold? Basically, I want the fold itself as a variable, not a function call. Entry: Reactive Programming Date: Sun Jul 12 23:50:11 EDT 2015 Note related to [1], just primed.. A reactive program creates a dataflow graph. So focus on that graph and its evaluation. If a network is static, all code can be compiled to push mode only. For every event there is a static path of updates to be evaluated. 
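A minimal sketch of that idea (names made up): with a static two-node network the graph disappears at run time, and each event kind compiles to a fixed update path.

  -- One source value and one derived value.
  data Model = Model { slider :: Int, label :: String } deriving Show

  -- The dependency label <- slider is static, so the push path for a
  -- slider event is just a function; no dataflow graph is interpreted.
  pushSlider :: Int -> Model -> Model
  pushSlider x m = m { slider = x, label = "value: " ++ show x }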
[1] http://research.microsoft.com/apps/pubs/default.aspx?id=158828 Entry: The How and Why of Fitting Things Together - Joe Armstrong Date: Fri Jul 17 23:28:58 EDT 2015 Make everything look like an Erlang process: turn N^2 into N! [1] https://www.youtube.com/watch?v=ed7A7r6DBsM Entry: LING: Erlang on bare metal Date: Sat Jul 18 15:15:40 EDT 2015 "Linux is just a snowball of drivers." Basically, nobody cares about the OS; it's just that this particular one seems to have the right drivers. [1] https://www.youtube.com/watch?v=GIzTxuXvpxM Entry: Simplicity Date: Wed Aug 12 17:33:10 EDT 2015 There is another reason to "optimize" code: simplicity. Things are getting so ridiculously unreliable that it seems time to start over: toss out the OS. Entry: Two-object state? Date: Sat Sep 5 15:30:37 CEST 2015 Erlang's imperative part basically is message passing. I've noticed this pattern: A two-process combo where one implements the control flow, and the other implements memory access through RPC. What other side effects can be implemented that way? EDIT: Really? "Our alternative to a monad transformer stack is the single monad, for the coroutine-like communication of a client with its handler." [1] So this is the Eff Monad. I did run into that name before but don't recall the concept. Maybe it was the Eff language base on algebraic effects[2]. And also PureScript[3]. [1] http://okmij.org/ftp/Haskell/extensible/index.html#introduction [2] http://www.eff-lang.org/ [3] http://www.purescript.org/learn/eff/ Entry: CPS in Javascript Date: Sat Sep 5 21:08:59 CEST 2015 http://matt.might.net/articles/by-example-continuation-passing-style/ Entry: Ownership is Theft: Experiences Building an Embedded OS in Rust Date: Sat Oct 3 12:35:51 EDT 2015 However, embedded platforms are highly event-based, and Rust’s memory safety mechanisms largely presume threads http://amitlevy.com/papers/tock-plos2015.pdf Entry: Sequences as folds. Date: Fri Feb 12 11:45:16 EST 2016 Quick remark. Representing sequences as folds quickly gets into Arrow territory. e.g. trying to pass things "on the side" because the iteration scheme is fixed, and the list structure is gone. ( Formalize more? ) Entry: LL,LR polish,reverese polish Date: Tue Feb 23 18:14:34 EST 2016 http://blog.reverberate.org/2013/07/ll-and-lr-parsing-demystified.html Entry: USB and acks Date: Fri Feb 26 10:27:06 EST 2016 Thinking about how to make a reliable packet transport mechanism over an unreliable but ordered transport, using something like usb: - packet + checksum - ack - 1-bit sequence number to drop duplicates Entry: Sequences vs. folds Date: Tue Mar 22 13:54:58 EDT 2016 For a project, I'm using abstract folds to represent iteration over possibly infinite sequences. It meshes well with the actor model: tail recursion, processing an ordered message stream. However, it is interesting to see (or re-discover) the distinction between the two views. With some hand waving, it seems that I am rarely interested in the sequence iteself. The point is almost always a reduction of the sequence into some form of object that represents a property of the sequence. 99% of the time, that is best represented as a left fold in a strict language. Entry: A state machine language Date: Thu Mar 24 11:27:57 EDT 2016 It would be interesting to build a front end around a state machine model I've been using recently. See zwizwa.be/git/sm The core element is an event queue. An ordered list. Lock-free structures are currently not needed. There are a lot of possible ways to implement this. 
What is the canonical one? A priority queue: - remove max (min) - insert This[1] suggests a binary heap implementation. [1] http://algs4.cs.princeton.edu/24pq/ Entry: CRDT: eventual consistency Date: Thu Apr 28 17:04:16 EDT 2016 Set-based updates that are associative, commutative and idempotent, can be consistently and automatically merged. https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type https://www.youtube.com/watch?v=bhYKrSUqSlo Entry: Distributed systems theory for the distributed systems engineer Date: Sat May 14 09:24:55 EDT 2016 http://the-paper-trail.org/blog/distributed-systems-theory-for-the-distributed-systems-engineer/ Entry: unique identifiers Date: Sun Jun 5 22:20:02 EDT 2016 It's (likely?) not possible to generate unique IDs without a central authority. One more reason to be fault-tolerant. And an interesting illustration of how programming Erlang turns some age old assumptions on its head. Shared state is possible, but it requires a process to handle its sequentiality, so stands out like a sore thumb. Entry: why does printing code look so ugly? Date: Fri Jul 8 01:34:16 EDT 2016 same for graphics drawing code, and anything that produces a stream of instructions in a non-functional way. Entry: Futures in Rust Date: Sat Aug 20 15:47:30 EDT 2016 http://aturon.github.io/blog/2016/08/11/futures/ Might be interesting to use this instead of writing an ad-hoc state machine compiler. The above is already a sate machine compiler. Entry: Declarative distributed programming Date: Thu Aug 25 11:23:37 EDT 2016 Building (heterogenous) distributed applications is hard. Why? It seems inevitable to escape a design that relies on distributed state, and communicating stateful objects have very complex interaction patterns. I've enjoyed using Erlang to mainly make the problem known: - You will not understand the problem you're solving until you've seen your proposed solution fail - due to distributed interacting state. - The only hope you have to get anywhere near implementing your requirements is to iterate development. Updates are essential in this process. - It makes sense to focus on the happy path, and design for robustness through restarts: if something unexpected happens, give up and go back to a known state. Move design effort into designing supervisors. - There is probably no coincidence that the supervisor approach resembles the idea of the "germ line" in biological evolution: everything except the germ line is disposable, as long as it runs long enough to propagate the germ line. All this leaves the functional programming mumbo-jumbo out of the picture, or at least delegates it to the meta level. Entry: Blocking tasks to state machines Date: Mon Aug 29 12:07:03 EDT 2016 What I want is to be able to code something that is clearly a state machine, but would be easier expressed as a collection of communicating processes. Basically, stick to RPC / coroutines / generators. But implement it as synchronized state machines. Entry: Streams vs. coordinates (transposes) Date: Mon Aug 29 15:08:19 EDT 2016 Transposes are easy to express as operations on coordinates. Compositions of transposes then become compositions of functions. I wonder if it is possible to do the same kind of transformation on a stream of data? E.g. by caching the "block" coordinate. The trick: 1. Compute output stream as a stream of highlevel coordinates. 2. Convert these coordinates to physical coordinates 3. Memoize the reads 4. 
For monotonous input, output is monotonous and it can be implemented using a one-way reader. Clever, but nothing new -- links in to Feldspar's basic idea. This needs to sink in for a bit. Properties used to advantage: - monotinicity in -> monotonicity out (means random access is not needed, streaming + caching works) - composition of coordinate-processing functions is *much* simpler and easier to express easier than composition of functions containing loops that move bytes around. Entry: TDM closed formulas and derivatives Date: Mon Aug 29 17:39:32 EDT 2016 Additionally, is it possible to compute an update equation from an explicit formula? There must be a way if the translations are of a certain type! E.g. define "derivative", and implement it for the operations. The idea of "cycling through channels" should somehow correspond to exponential functions / roots of unity. Entry: Eliminate the OS Date: Fri Nov 25 12:32:28 EST 2016 Operating systems by themselves should not be too hard to eliminate : most code can be tucked away in a library. The main problem is still: what about the hardware? Currently the only way hardware is "cheap", is when it is adapted to an os through drivers. Can this be disentangled? Can devices just be devices? Loosely coupled things that send messages? What I've seen mostly from writing drivers, is that the problem is that on a chip, nothing is really independent: a lot of configuration needs to happend before something takes proper shape, and ties into somethnig else. Entry: device drivers Date: Fri Nov 25 12:50:03 EST 2016 So, what is the real problem? Writing device drivers. Talking to hardware, fixed things. Adapters. Translators. Basically, making new toys run old code. Device drivers are boring because they do not look glamorous, but they do pose a fundamental problem: nothing speaks the same language. And concerning device drivers, some problems are: - protocols that are very state-sensitive - tied into global infrastructure which has constraints (clock config, io config, ...): this makes disentanglement, abstraction hard - non-orthogonal configuration: only specific combinations supported - learning how to use hardware through datasheets is hard - reliance on sequential configuration changes: CPU writes to registers, with some constraints on order but some arbitrariness as well - not everything is idempotent Once you have a better way to writing device drivers, it should be easier to change hardware also, e.g. find more optimal representations. Also, check out rump kernels. Entry: Joe Armstrong & Alan Kay - Joe Armstrong interviews Alan Kay Date: Fri Nov 25 12:52:28 EST 2016 https://www.youtube.com/watch?v=fhOHn9TClXY - sketchpad - http://www.cba.mit.edu/events/03.11.ASE/docs/Minsky.pdf - question your beliefs (negate them) - monads are a kludge: why do you treat this as a religion? - inverse vandalism: making things because you can My beliefs: - FP is for writing compilers vs. do everything functional - device drivers are a necessary evil vs. optimize writing device drivers Entry: idempotency and desired state Date: Mon Nov 28 14:53:03 EST 2016 How to you intelligently bring something to a desired state? When the succession of state progressions is linear, steps could be performed conditionally. Entry: event-driven systems Date: Wed Dec 7 15:53:11 EST 2016 Why is there such a dichotomy between: 1. dispatch of multiple events from a single wait point 2. wait for one specific event, then proceed sequentially Likely this is artificial, e.g. 
these are two possible implementation forms of event-driven programming. When there is a clear ping-pong going on, the latter is more straightforward as it allows recursive decomposition, if the dispatcher handles more than one "client", the direct dispatch is better. There is a natural way to view these: they correspond respectively to the callee and the caller in an RPC call. So from there, maybe it is client systems that are better expressed sequentially, and server systems that are better expressed with dispatch. Entry: Joscha Bach 4rth C3 lecture Date: Sat Dec 31 01:23:13 EST 2016 DNA is not a blueprint, but an OS. At no point is there ever no cell. https://www.youtube.com/watch?v=K5nJ5l6dl2s Entry: Make illegal states unrepresentable Date: Fri Jan 20 11:08:27 CET 2017 What is the analogy for cases where it is not possible to do this, but it is possible to further constrain the data structure? The point is that the illegal states don't make it past any kind of machine interpretation. So whether this is a simple explicit constraint (the shape of the data structure), or some constraint that is expressed ad-hoc, it shouldn't matter. Maybe this really flies in the face of the original point? So let's refine: 1. try to express all constraints as structure, as types, or as proofs. (whatever your tool set allows). 2. if that doesn't work, express it as properties that are exercises by an automatic test generator such as quickcheck. https://vimeo.com/14313378 Entry: GUIs: constraint vs. reactive Date: Sat Feb 11 10:58:49 EST 2017 Maybe the correct paradigm for guis isn't reactive programming, it's constraint programming? The main problem in UI programming in OO-fashion, is to propagate changes. On input, the model changes, which should reflect other views. Ways to solve this: - manual notification spaghetti - recompute entire view once model updates - "directionalize" the constraint program that describes widget relations Entry: Sequences as Folds -> Fused Loops Date: Mon Feb 20 16:39:27 EST 2017 A great advantage of representing sequences as folds is that loop fusion is free. And even more general: arbitrary stream processing can be expressed like this where "chunk sizes" can vary between stream processors in a very straightforward way. Write this up, and turn it into a C or Rust code generator. It works well in practice because often it is not possible to pick chunks sizes, and not automating that step will always create a mess of ad-hoc for loops in C. Entry: Nested folds and intermediate results Date: Mon Feb 20 22:01:21 EST 2017 What I call "intermediate results", is a pattern in DSP processing, where the time-ordered data dependency is broken and where there is a "spatial" or "multi-pass" component. This is hard to formulate in a loop-folding based functional approach because there is no longer a single loop. Algorithms are essentially multipass if there is a global->local data dependency. Are nested folds as they appear in the "sequences as folds" approach a way to deal with this? Entry: Tree diffing Date: Tue Feb 21 16:51:39 EST 2017 http://stackoverflow.com/questions/5894879/detect-differences-between-tree-structures https://en.wikipedia.org/wiki/Graph_isomorphism_problem Entry: Pi Calculus Date: Mon Mar 6 12:04:41 EST 2017 Would it be useful to spend time learning the Pi Calculus? 
http://erlang.org/pipermail/erlang-questions/2003-November/010783.html

Entry: State machine notation
Date: Mon Mar 6 12:07:28 EST 2017

I need a good notation to represent a state machine as a set of equations in a way that allows some properties to be extracted, or at least quick-checked. I spend way too much time "hacking" machines using incomplete reasoning.

Events "act on" states. Events are operators, so let's represent them with capitalized identifiers, leaving lower case identifiers to represent states.

s1 A = s2

At this level of abstraction, there are no simultaneous events. Some possible extensions:
- To represent simultaneous events, compose two or more "proto events" into one event.
- Not sure how to implement dependency in events, such as timeouts.

Example: a debounced trigger.

events:
A - activating edge
R - releasing edge
T - timeout after A

states:
i  - idle
w1 - active edge seen, waiting for releasing edge or timeout
w2 - timeout expired
t  - releasing edge seen, fully triggered

not modeled: after the occurrence of A, a T is scheduled for a moment in the future. If multiple A happen, multiple T would be scheduled. (not a good way to model this).

i  A = w1
w1 T = w2  (passed, waiting for release)
w1 R = i   (filtered)
w2 R = t   (fully passed)
t  _ = t   (once triggered, ignore events)

Entry: State machines
Date: Mon Mar 6 12:43:32 EST 2017

The thing is really that often you don't want to write down a state machine's transition rules explicitly. Why? Because that is a very low-level description of the problem that often requires introduction of intermediate states that you really don't care about in your problem description. What you want is to write down some other model, one that talks about actual events -- often event filtering -- and leaves out the details. Some patterns I've run into:
- Sequentially perform a number of operations, possibly finitely nested in loops, and wait for a set of events. This is the most common one, and corresponds to what otherwise would be an execution thread in a typical multitasking OS.
- Perform resets / restarts when "exceptional cases" occur.

The debouncing problem in the previous thread is more easily expressed as a sequential process:

events:
A - activating edge
R - releasing edge
T - timeout after A

process:
1. Wait for A
2. with timeout_process {
3.   wait for T,R
     R -> reset to 1
     T -> continue }
4. output A
5. wait for R
6. output R
7. stop

Notes:
- The scope here indicates a resource that needs to be cleared on leaving the scope, in this case the timer process/state.
- The second timeout process is an _essential_ part of this. There is something on the outside of the main process that cannot be represented by something inside it (apart from emulating a scheduler).
- This can be implemented by CSP (channel-style) and Erlang-style multiprocessing.

Entry: Extending timer capture values
Date: Wed Mar 8 18:32:29 EST 2017

Code looks like this:

CR TR CR TR

CR = capture interrupt is checked, and if there is a value it is loaded. After this a new capture event can happen.
TR = current timer value is read.

The task is then to:
- extend the read capture value correctly
- update the extension counter on T rollover

The ambiguity is in the location of the actual capture event and the rollover event in the grid imposed by CR and TR.

TR CR C TR CR TR,  or
TR CR TR C CR TR
T0 T1 T2

It is straightforward to time-extend C, if we assume we have an extension counter E that represents the state of the extension after the "nearest" rollover in the count.
If C is near the
  small end -> extend with E
  large end -> extend with E - 1

The question is then, how to update E when a T rollover occurs? It seems this cannot be done without causing a race. I can't say exactly why, but I also can't answer the question about when to update E such that it can be trusted when it is read to extend the C value that is read at CR time. My intuition was to do the update "far away" from the point of use. The solution I had was to use two extension counters, using the assumption that a counter is not used for a very large margin around the time when it gets updated, effectively "double buffering" it.

counter   use     update
M(id)     Q2,Q3   Q4->Q1
W(rap)    Q4,Q1   Q2->Q3

Then based on whether the captured count is in one of these regions, the extension is easy to compute:

Q1      W
Q2,Q3   M
Q4      W-1

I currently believe it cannot be done by updating a single counter at or near the rollover, because the order of the events is not known.

EDIT: One possibility is to keep track of one extra bit in the extension to disambiguate whether the "nearest" rollover has been accounted for. This likely is equivalent to keeping two counters, because they always stay only one bit apart. Essentially, that one bit encodes the information "does the current extension account for the 'current' rollover or not?". This bit is the high bit of the last read timer:
0 : rollover has occurred
1 : rollover has not occurred
But then this still needs an extra bit to take into account _whether to look_ at that bit. And we wouldn't if we're in Q2,Q3, but we would if we're in Q4,Q1.

So an alternative way:
- roll over the main counter in the straightforward way. This gives a 32-bit value that is "near" the capture event.
- based on whether the capture event was before or after the rollover, extend it with the correct side.

EDIT2: "extending with the correct side" can be done more simply. Given the notation above, we know that
- C =< CB =< T  =>  C =< T
- The extension E:T can always be computed correctly
There are only two possibilities: (E-1):C or E:C
If E:C > T it must be (E-1):C
That's it. The condition that C happened at or before T is the property that allows disambiguation.

EDIT: So the fundamental event ambiguity is:
TR R C TR  vs  TR C R TR
I.e. did the rollover R happen before or after the capture event C. The second TR will catch the rollover, but the extension of C depends on the order of R and C. The disambiguation works because in the second case (C <= R), E:C will turn out to be past TR, which has to be wrong, so (E-1):C is the correct extension in that case.

Checks
TC C TR R TC TR
TC TR C R TC TR
The C<->TR swap doesn't seem to matter in this case.

Summary: The problem was finding the right way to look at this.

Another solution:
https://e2e.ti.com/support/microcontrollers/msp430/f/166/t/276588

Entry: Code is data because of right fold
Date: Sat Jul 1 12:18:12 EDT 2017

Any recursive data structure has a generalized right fold by making constructors abstract. This can remove the need for an explicit representation as a data structure. A fold is enough.
( More generally: data's only reason for existence is to be passed as input to code. The only thing that is important, really, is the code that produces real-world effect. )

Entry: recursive descent parsers
Date: Wed Jul 5 10:25:45 EDT 2017

About parsers. I don't really understand table parsers, and most parsers I've written are recursive descent parsers, as most languages I need to parse are very lisp-like, e.g. do not need backtracking.
I would like to get to a point where this "feedforward" nature is expressed in a more direct way. Take as example the GDB status language, which in a first approximation is a nested set of key-value bindings and is quite representative of a lot of configuration languages out there:

  <msg>     ::= <item> | ...                       ; not specified further
  <value>   ::= <atom> | <list>
  <list>    ::= "{" <items> "}"
  <items>   ::= "" | <item> | <item> "," <items>
  <item>    ::= <atom> | <binding>
  <binding> ::= <atom> "=" <value>

The property of the parser is to be able to pop a character and determine what to do with it without putting it back. This way the structure is a left fold. A natural way to represent this is to split the tokenizer and parser. The tokenizer will just collect non-control characters in a list, and upon encountering a control character, will push the atom somewhere. This "somewhere" is what this is all about.

Representation: the "current expression" is an inside-out term with a hole in it (a zipper or cursor). The tokenizer will fill this hole whenever an atom is parsed, and will create a new expression with a hole. The point is then to establish the meaning of the control characters as "hole transformers". E.g. each control character is a function that takes an atom and a hole, and produces a new hole. The "primitive hole" is then just an object. When the parser calls this it will terminate.

Now one by one, define the meaning of the control characters.
","  if the current hole is an atom, change it to a list hole; otherwise, append to the current list hole
"="  if the current hole is an atom, change it to a binding
"}"  delete the hole at the end of the list (or fill it with nil?)

This structure needs to know the type of the current hole to be able to transform it. That is not the same as a lisp parser. How to represent this? Because of continuation transformation it seems impossible to represent a continuation as a function. I can't really stabilize my thoughts about this. There is something about the idea of attaching concrete meaning to individual characters that is very appealing, but on the other hand it seems convoluted. There is no need for backtracking, but the "hole transformation" is definitely a form of going back and correcting a previous assumption.

Let's forget about "=", but do the list control characters first.
"}"  close the current list hole with a nil
"{"  insert a list hole in the current hole
","  push a pair into the current list hole

A hole is not a continuation, it is a continuation transformer: (Obj,K) -> K

Some context:
- parse starts with a continuation (a hole).
- a control character takes the current token, pushes it into the hole, and updates the hole

"}" is tricky as it has two meanings:
- empty atom, insert []
- non-empty atom, insert [a]

I lost it... it seems like a good idea but I can't get a hold of it. Maybe the reason that this is difficult is that I'm not separating out the tokens. Thinking a bit, it seems that what makes this difficult is exactly the postfix nature: "," and "=" change the meaning of the text that comes before. Maybe this can be solved in the tokenizer? E.g. the tokenizer should turn the input stream into a prefix-only stream, transforming [a = b] into [= a b]. Note that this is not possible for lists, as those are delimited, but that might not be such a problem. In any case this does start to look like a waste of time for the current task of parsing the GDB message format.
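To make the tokenizer/parser split above concrete, here is a minimal Haskell sketch (names and token set are hypothetical, chosen to match the informal grammar above; this is not the parser used later in this log). It collects non-control characters into atoms and emits control characters as their own tokens, so the parser can be a left fold over tokens that never pushes a character back.

import Data.Char (isSpace)

data Token = TAtom String | TOpen | TClose | TComma | TEq
  deriving (Show, Eq)

-- Control characters become their own tokens.
control :: Char -> Maybe Token
control '{' = Just TOpen
control '}' = Just TClose
control ',' = Just TComma
control '=' = Just TEq
control _   = Nothing

-- Accumulate non-control characters into an atom; flush the atom
-- whenever a control character (or end of input) is seen.
tokenize :: String -> [Token]
tokenize = go [] where
  flush acc ts = if null acc then ts else TAtom (reverse acc) : ts
  go acc []     = flush acc []
  go acc (c:cs) = case control c of
    Just t  -> flush acc (t : go [] cs)
    Nothing | isSpace c -> flush acc (go [] cs)
            | otherwise -> go (c:acc) cs

-- tokenize "a={b=1,c}"
--   == [TAtom "a",TEq,TOpen,TAtom "b",TEq,TAtom "1",TComma,TAtom "c",TClose]

Note that "{}" produces no empty atom, which is exactly the two-meaning problem of "}" mentioned above; that ambiguity is left for the parser to resolve.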
Entry: Parser combinators
Date: Wed Jul 5 12:21:59 EDT 2017

http://www.little-lisper.org/website/pc/index.html
http://www.goodmath.org/blog/2014/05/04/combinator-parsing-part-1/
http://eprints.nottingham.ac.uk/237/1/monparsing.pdf

Entry: The most general I/O processor?
Date: Wed Jul 5 15:33:04 EDT 2017

For the tokenizer, what worked was:
- input as an outer iterator
- output as a left fold

How to turn an inner iterator into a fold? You can't, because the control flow of the fold is one-shot. The way to do this is to block the fold when it's running, and that needs a task. Look at fold:gen(), it would be similar.

EDIT: Works, but... Converting a fold into a source is a leaky abstraction. This popped up only when writing parsers. I wonder if it makes sense to then turn the parser into a source, as sources are easy to convert to folds. Doesn't seem so. A parser is naturally written with individual (blocking) read/write calls. (Fold is write, while inner iterators are read).

Entry: 6 Iteration structures
Date: Wed Jul 5 15:58:59 EDT 2017

Some representations of sequences. https://github.com/zwizwa/erl_tools/tree/master/src
- fold.erl   : left fold
- pfold.erl  : left fold with early stop
- source.erl : inner iterator (stream)
- iseq.erl   : infinite sequences (almost special case of source.erl)
- sink.erl   : sink-parameterized generator
- igen.erl   : impure generators
- unfold.erl : pure sequences represented as (finite) unfolds

The difference is in which operations are explicit:
fold,pfold:  functional write (state update)
source,iseq: functional read
unfold:      functional read with explicit state
sink:        imperative write (abstract function or process send)
igen:        imperative read

Since these are a nice orthogonal mix of classes, there might be a more appropriate naming scheme. These are duals in the caller/callee sense. For the functional ones there are finite/infinite vs full/truncate.
(EDIT: this was edited to add unfold.erl)

Entry: Manual parser
Date: Wed Jul 5 22:25:34 EDT 2017

So I ended up writing a manual recursive parser. I had to resort to a hack to be able to handle the equal infix operator, which is patched in two places: close and atom.

%% I: input
%% Q: current queue
%% S: stack of queues
p([open    |I], Q,  S)               -> p(I, [], [Q|S]);
p([close   |I], Q1, [[{eq,K}|Q2]|S]) -> p(I, [{K,r(Q1)}|Q2], S);
p([close   |I], Q1, [Q2|S])          -> p(I, [r(Q1)|Q2], S);
p([{atom,V}|I], [{eq,K}|Q], S)       -> p(I, [{K,V}|Q], S);
p([{atom,A}|I], Q, S)                -> p(I, [A|Q], S);
p([equal   |I], [K|Q], S)            -> p(I, [{eq,K}|Q], S);
p([comma   |I], Q, S)                -> p(I, Q, S);  %% (1)
p([],           Q, [])               -> r(Q);

So equal quite literally changes the meaning of the last object parsed. It also changes the continuation: we're no longer putting the result in the queue, but in the second slot of the pair. {Q,S} is a representation of the continuation. S is always a stack of Qs, but there are two kinds of Qs:
- list    the hole at the end of a list
- {eq,K}  the hole in the second slot of the pair

The rules likely become simpler if the continuations are made abstract. Let's give that a try.
{Q,S} -> {[],[Q,S]}
I find CPS hard. Maybe it's just lack of training, but making that translation really doesn't come naturally.
Initial Q=[], S=[]
This is a push to a list, which as an ordinary function is:
fun(V) -> [A|V] end
What I miss is muscle memory.. Functions in CPS form look like this (e.g.
the Haskell do block):

(define (pyth x y)
  (sqrt (+ (* x x) (* y y))))

(define (pyth& x y k)
  (*& x x (lambda (x2)
    (*& y y (lambda (y2)
      (+& x2 y2 (lambda (x2py2)
        (sqrt& x2py2 k))))))))

Note that the only reason to use CPS is to be able to also pass the input as part of the state using just tail recursion. So here's a systematic approach:
- write a recursive parser in direct style
- convert it to CPS mechanically
- add the extra input argument

So in Erlang it is actually not necessary to write it in CPS, because the input can be abstracted away into a read() call by using another process. But it's nice to keep things pure of course..

EDIT: actually, this needs some form of back-patching for "=" unless that is changed to do it directly.

Entry: line assembler iteration pattern
Date: Mon Jul 10 15:47:29 EDT 2017

Currently: state machine. Push in a chunk, get a chunk in reply or not. This fits the map+filter operation, or fold + unfold.

Entry: definition control-dominates use
Date: Tue Jul 11 11:59:38 EDT 2017

Can this principle be used to ensure caches are coherent? The title is a quote from Olin Shivers from a talk on control flow and CPS, I believe while presenting his Scheme loop macro. The basic idea is that a variable can't be referenced before it is initialized, by construction.

Entry: Intersection between igen.erl and source.erl
Date: Sat Jul 15 10:58:06 EDT 2017

Problem: can't turn a list into an igen without creating a separate process. But it is possible to constrain the interface such that:
- the reader will only "pop once".
- the "next" thunk is explicitly updated

How to guarantee single use of the "next" thunk? This doesn't seem possible without storing state somewhere -- same problem as implementing proper lazy evaluation. So it's not possible to constrain this at run time. Can it be expressed in types?

EDIT: no generic solution, so solve it in the implementation. I.e. if a function uses the input source in a "pop once" fashion, it can manually convert an igen using igen:to_source_leaky/1 and guarantee proper usage such as performing igen:close/1.

Entry: Just put the constructors in a dictionary
Date: Mon Jul 17 16:51:52 EDT 2017

It's such a cool pattern to abstract a data structure's constructors as functions in a dictionary, e.g. to implement a generalized right fold.

Entry: foldable
Date: Mon Aug 7 14:25:15 EDT 2017

Is a monoid necessary? Associativity, but in a left fold there is a certain notion of "time" that seems to contradict associativity? Probably looking at this wrong.

Entry: CCC Conal
Date: Mon Aug 7 15:38:07 EDT 2017

http://conal.net/papers/compiling-to-categories/
https://www.youtube.com/watch?v=vzLK_xE9Zy8

Presented as an alternative to EDSLs. Generalize the standard lambda form to CCC representation, with overloadable operations for id, const, abstract, apply.

45:00 Interesting: if the derivative is taken to be the linear map (s.t. the derivative of a linear function is that linear function), it follows that the derivative distributes over composition: Df o Dg = D(f o g)

The remark about reuse is interesting in the autodiff example. In the beginning I thought this would not be the case. It's a problem I've run into many times. However, in the graphs it is the operations that are not reused, i.e. the adders, the multipliers, but I guess that is the whole idea: to parallelize it.

- Interval analysis.
- SMT Constraint solving

Bottom line: do not invent new vocabulary just to get different interpretations. Any monadic computation gives rise in a natural way to a CCC.
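As a sketch of the "overloadable vocabulary" idea (my own illustration with made-up names, not Conal's actual API): one small categorical vocabulary, two interpretations, one of which is a reified term that could be printed, analyzed, or compiled further.

{-# LANGUAGE GADTs #-}

class Cat k where
  idC  :: k a a
  (.:) :: k b c -> k a b -> k a c

class Cat k => NumCat k where
  addC :: k (Int, Int) Int
  dupC :: k Int (Int, Int)

-- Interpretation 1: plain functions.
newtype Fun a b = Fun { runFun :: a -> b }
instance Cat Fun where
  idC = Fun id
  Fun g .: Fun f = Fun (g . f)
instance NumCat Fun where
  addC = Fun (uncurry (+))
  dupC = Fun (\x -> (x, x))

-- Interpretation 2: a reified term (syntax), same vocabulary.
data Syn a b where
  IdS   :: Syn a a
  CompS :: Syn b c -> Syn a b -> Syn a c
  AddS  :: Syn (Int, Int) Int
  DupS  :: Syn Int (Int, Int)
instance Cat Syn where
  idC  = IdS
  (.:) = CompS
instance NumCat Syn where
  addC = AddS
  dupC = DupS

-- One term, two meanings: runFun double 21 == 42, or a Syn tree.
double :: NumCat k => k Int Int
double = addC .: dupC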
So it's clear, this is what RAI is going to be built on! It's really what I've been missing.

Entry: Declarative vs OO
Date: Fri Aug 25 16:16:02 EDT 2017

A declarative presentation model can act as an impedance match between a stream of incoming user edit events, and a stream of outgoing view update events.

Elements:
- (Event, PM, DM) -> (PM, DM)
- (PM, PM) -> Commands

EDIT: This is a powerful abstraction, and is currently reshaping the way I think about cache synchronization across high-latency links. Essentially, you "encode the setters" in the presentation model. More later... This can be made fully generic, where the "diff command interpreter" sends only one type to the view: update.

Entry: Paths: trees as flat key-value maps
Date: Sat Aug 26 10:26:13 EDT 2017

Updating leaf nodes in a hierarchical data structure can be done using paths, where each element in the path represents one layer of hierarchical wrapping. It can be beneficial to keep in mind that there is a bi-directional map between:
- The hierarchical data structure as embedded in a (functional) programming language.
- The "flattened" version of this structure, represented as a key-value map, where keys are (encoded versions of) data structure paths.

An example of the latter is:
- A database table or flat key-value store
- The 'id' attribute to DOM element association in a web browser

An added advantage is that the "path" representation is easily diff-encoded, bridging the "declarative" and "object-oriented" worlds.

Entry: Declarative vs. OO
Date: Sat Aug 26 11:00:55 EDT 2017

This generalizes quite a bit, all the way to any stateful API. It likely makes sense to build a declarative state model on top of such a stateful API, where the differences ARE the API. And for cases where it is difficult to do this, the approach could be seen as a way to structure a stateful API. Stateful APIs will likely always be necessary for efficiency reasons.

It could also be a good way to test state transitions using property-based testing: generate different states and have the system transition between them using the generated commands.

Entry: differentiating constructors
Date: Tue Aug 29 03:54:31 EDT 2017

Differentiate constructors. Basically this is about updating web views from small changes in the algebraic data types representing the view model. React is a nice idea, but it is too much focused on the DOM. I believe there is a simpler way by focusing on re-interpreting the rendering function.

Rendering a web page is a function from VM -> DOC. The VM can be "diffed" into path ins/del/set operations: (VM0,VM1) -> [DVM]. It should be possible to programmatically derive those operations from the original VM -> DOC function.

( VM0, VM1, VM -> DOC ) -> [ DDOC ]

Probably based on some ( VM -> DOC ) -> [ DVM -> DDOC ] function. Conal hinted at this. Maybe look it up. There must be some information about this in the Haskell world.

Entry: Trees vs. Paths
Date: Sat Sep 2 13:33:47 EDT 2017

This correspondence has been on my mind. It is actually quite trivial, but has far-reaching consequences for organizing data in a "functional way", e.g. representing data structures as functions, databases. Maybe this is what Kmett's lens library is about.

Entry: Trees vs. Paths: derivatives?
Date: Thu Sep 7 16:22:51 EDT 2017

For finite differences it is possible to define a derivative operator that behaves as the ordinary derivative. Can this be done for finite differences of trees as well? The "zipper" is more like an analog derivative in that it is centered at a single point.
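A small sketch of the two flavors side by side (standard Haskell containers; the Change type is hypothetical): the forward difference operator on a sequence, and a key-wise diff on finite maps, which is the trees-as-paths analogue of a difference.

import qualified Data.Map as Map
import Data.Map (Map)

-- Forward difference on a sequence: delta [1,4,9,16] == [3,5,7].
delta :: Num a => [a] -> [a]
delta xs = zipWith (-) (drop 1 xs) xs

-- A per-key change, the flat-path analogue of an edit at a leaf.
data Change v = Insert v | Delete | Set v v   -- Set old new
  deriving (Show, Eq)

-- Key-wise difference of two finite maps.
diffMap :: (Ord k, Eq v) => Map k v -> Map k v -> Map k (Change v)
diffMap old new = Map.unions
  [ Map.map Insert         (Map.difference new old)
  , Map.map (const Delete) (Map.difference old new)
  , Map.mapMaybe id (Map.intersectionWith
      (\o n -> if o == n then Nothing else Just (Set o n)) old new)
  ]

Applying the changes of diffMap old new to old reconstructs new; the zipper, by contrast, localizes attention at a single position rather than describing a change.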
Entry: Twitter thread on "derivatives"
Date: Fri Sep 8 21:12:39 EDT 2017

https://twitter.com/tom_zwizwa/status/906316251559022592
Phil mentions "incremental lambda calculus".
https://github.com/paf31/purescript-incremental
http://www.informatik.uni-marburg.de/~pgiarrusso/ILC/
http://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf

The rules in the paper are:
- free:   D(x) = dx
- abs:    D(\x.t) = \x.\dx.D(t)
- app:    D(s t) = D(s) t D(t)
- closed: D(c) = []

The interesting thing is that both abstraction and application "double up". It is interesting that this is the curried version of the "dual number" form, in the same way this can be done for numerical automatic differentiation:
- abs: D( \x -> t ) -> \(x,dx) -> D(t)
- app: D( s t )     -> D(s) (t, D(t))

So it's probably possible to write the entire thing bottom-up, building trees from primitive trees by composing their "dual number" representation. So how would this be represented exactly?

So this is lambda calculus. How to make this practical? All primitives need to be translated as well. For data structures this boils down to constructors and accessors. For maps, this is:
fun(K,V) -> #{ K => V } end
maps:merge/2
maps:get/2
These need to be curried, then transformed in the normal way. So the problem boils down to:
- write a tree transformation function as the composition of the 2 constructor functions and 1 destructor function.
- combine it with other Erlang constructs expressed as functions
- perform the transformation

In the paper, Bags are used as an example of a primitive data structure. What I used in diff.erl are "bags with tags", i.e. finite functions. Can the fact that these are _actual_ functions be used? E.g. the getter is definitely function application, the constructors are some form of function abstraction.

So where does the computational saving come from? From not re-evaluating parts of the expression if the input is constant. The paper mentions something about "nil detection".

In ILC, the "plugins" define the primitive operations.
v0 (+) d  = v1
v1 (-) v0 = d
Derivative definition:
f (a (+) da) = f a (+) f' a da

So the part I don't get is the need for the "Nil changes are derivatives" detour. It is clear though that the way forward is to really understand how the function changes work: it contains the entire point of "pushing this through" abstraction and application. Also, trying Conal's Haskell plugin would probably be a good idea to get a more grounded understanding of all of this.

EDIT: Reading again the part about change structures on functions. The important thing to note is that df is a function of two arguments: df a da. I still don't get how this is introduced, defined... Go back to section two and read it again. The nil change is important.

Entry: Why is cache invalidation so hard?
Date: Sun Sep 24 00:21:48 EDT 2017

https://martinfowler.com/bliki/TwoHardThings.html
( Otoh, naming is hard because it is arbitrary but still constrained by common cultural denominators. )

Is it the same problem as garbage collection? How to predict if a piece of data will be needed in the future? It would require an oracle -- something that uses inaccessible information. GC is a lower bound to that -- we're sure something won't be used if it can't be reached by traversing a program's data graph.
Entry: Re-inventing computing: no I/O Date: Mon Oct 9 14:37:38 EDT 2017 http://www.haskellforall.com/2017/10/why-do-our-programs-need-to-read-input.html http://conal.net/blog/posts/can-functional-programming-be-liberated-from-the-von-neumann-paradigm Thinking about this, the "rube goldberg machine" is a reality. For me, it mostly manifests as a build and test system. And a collection of code that depends on infrastructure. On a deeper level: I've been writing code that is "actor-heavy", where this explicit I/O pattern is very apparent. It is important to realize in the context of Conal's post, that actors are an implementation mechanism. This is exactly what I've felt in implementing erl_tools library functions: the composable parts are almost inescapably pure. The need for multiple processes almost feels like a failure (e.g. when using intermediate tasks or partial continuations to perform between iteration representations). It's interesting that Conal takes the input to brains, and the output from brains as just boundaries of a system as a whole. A good point is: What looks like imperative output can be just what you observe at the boundary between two subsystems. I/O is what's on a wire between two systems. It is an implementation detail. Can actor systems be designed in this way? That would be my main question at this point: looking at the Erlang system I've just built, is there a way to look at the protocols and see them as an implementation of something more abstract? Thinking about this, I really want to revisit the idea of re-defining what a computer is: an augmentation of the human brain/body system. Anything else will ensure you get stuck into a local optimum created by current technology. Too stuck in the obvious to discover new truths. So let's keep this in mind: explicit I/O is a consequence of how systems are implemented, and can likely be eliminated, and turned into function application, or shorter: TURN I/O INTO FUNCTION APPLICATION Looking at the application, I see the following uses of message sending: - Server RPC, essentially abstracting some state, and exposing it to otherwise isolated processes. - Tracing / notifications: allowing some other task (or tasks) to check progress of a particular sequential process. RPC already looks like function application. It just misses referential transparency. But this can always be traced back to the necessity to keep state in a single place. The notifications are essentially used to also modify state. I.e. once a particular (physical) operation has occured, some other operations become possible on the modified state. To map this into a more functional approach, one has to do away with the essential sources of state: the hardware, and the hardware implementation constraints. Who is to blame for this state, really? The need for explicit state seems to always be rooted in physicality. I/O can be eliminated as implementation detail upto that phusicality. At some point this needs to be made explicit. The question is then: can you avoid exposing the physicality to the user? In my case: no. The idea that data is in a certain location is a key element of the device as it presents to the user. The only thing anyone is interested in is the data, but the stateful interface is a limiting factor. One pattern: if the RPC is 1-1, i.e. one client and one server, it disappears in the model: it can be abstracted away almost completely. Only in the error handling and time delays it is apparent. 
If the RPC is N-1, there is an essential part that cannot be eliminated. And this is not I/O (RPC already looks like normal function calls), but it is state. In my case, this is state that in some sense is essential (exposed to the user), or very hard to eliminate due to implementation constraints.

It seems the core idea is still to get rid of explicit state, much more than to get rid of explicit I/O, which seems to already happen if there is an incentive to write functional-style code. ( I'm counting environment-parameterized code as functional "enough". Read-only environments are still much less restrictive than shared state. )

Entry: source.erl vs unfold.erl
Date: Mon Oct 23 15:01:46 CEST 2017

( Also added unfold.erl to the previous post. )

To make this work, I defined unfold.erl which has a type {S,F} with
F :: S -> {E,S}
This is roughly equivalent to pure lazy sequences in source.erl
L :: () -> eof | {E, L}
The difference is that S and F are explicit in the former, but hidden in the latter, which likely makes it a better candidate. Odd that I was thinking about explicit unfold as something better than a lazy sequence. Just an arbitrary fluke?

Entry: hans followup
Date: Tue Oct 24 16:54:08 CEST 2017

- look at Agda
- check out courseware (ask link)
- fully abstract compilation

Entry: Dependent types and proofs
Date: Sat Nov 4 13:39:50 EDT 2017

Why are dependent types and formal methods linked? Agda and Coq have them. They are used to encode quantification for predicate logic.

Entry: Stream processing patterns
Date: Fri Nov 10 08:44:01 EST 2017

I'd like to generalize the lessons learned from fold.erl and source.erl into a Rust-style library -- i.e. not dependent on GC. There is something about sequences and state machine traces, and how they merge and transform, that is very interesting to me. There is some structure that I cannot yet fully exhibit.

Entry: Treat things that are isomorphic as the same
Date: Thu Nov 16 12:17:03 EST 2017

Basically, it seems more important to not break interfaces than it is to "clean up" naming schemes. As long as renames are "total" it should be fine, but for many things, especially when long-lingering state is involved, this cannot be guaranteed. Main problem: maintaining some form of consistency as systems evolve over time is very hard without "binding" being well defined (i.e. collapsing the decoupling name to an actual reference).

Entry: Unknowingly making the wrong assumptions
Date: Thu Nov 16 13:52:25 EST 2017

Those damn unknown unknowns. Therefore it helps to make all decision points explicit. Expose the design as a decision tree.

Entry: Functional vs. Idempotent
Date: Mon Nov 20 15:59:10 EST 2017

Is idempotent just another word for improperly optimized lazy evaluation?

Entry: Fibers are not Threads
Date: Fri Nov 24 23:28:58 EST 2017

http://lambda-the-ultimate.org/node/5478#comment-95163
https://en.wikipedia.org/wiki/Fiber_(computer_science)
Basic ideas: fibers are a sequential programming construct with indeterminism, where concurrency is an optimization.

Entry: the real problem is integration
Date: Sun Nov 26 16:52:14 EST 2017

And that's where we're all stuck. That's where non-orthogonality shows up.

Entry: Are monads just sequential execution?
Date: Wed Nov 29 12:02:21 EST 2017

( a -> m b )
( b -> m c )
( a -> m c )

They guarantee that something (implemented inside the 'join') is always interleaved between two independent subcomputations. It replaces ordinary composition with something customized.
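To make "something customized" concrete, a tiny standard-Haskell illustration (using the mtl Writer monad, nothing project-specific): Kleisli composition is ordinary composition with a 'join' spliced in between the two steps, and the Writer log makes the interleaving visible.

import Control.Monad (join)
import Control.Monad.Writer (Writer, tell, runWriter)

-- ( a -> m b ) and ( b -> m c ) composed into ( a -> m c ):
-- ordinary composition plus a 'join' in the middle.
-- (This is (>=>) from Control.Monad.)
kleisli :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisli f g = \a -> join (fmap g (f a))

step1 :: Int -> Writer [String] Int
step1 x = tell ["step1"] >> return (x + 1)

step2 :: Int -> Writer [String] Int
step2 x = tell ["step2"] >> return (x * 2)

-- runWriter (kleisli step1 step2 3) == (8, ["step1","step2"])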
Similar to arrows, but for arrows allegedly the structure of the computation is fixed, while for monads it is value-dependent. How so?

Entry: parsing
Date: Thu Nov 30 01:15:57 EST 2017

https://softwareengineering.stackexchange.com/questions/338665/when-to-use-a-parser-combinator-when-to-use-a-parser-generator

Entry: Broadcast events are late binding
Date: Sat Dec 2 13:22:40 EST 2017

They are an inversion of a/multiple RPC call/s.

Entry: LALR parsers
Date: Sat Dec 2 18:19:37 EST 2017

Time to understand this: how to use it, and what the normal form means wrt. how it is implemented. But first, implement some parser combinators. The problem, really, is to write down the grammar.

Entry: Transitioning
Date: Tue Dec 19 09:59:08 EST 2017

Here's a pattern I run into. I have a collection of state machines that operate independently, and I want to create a new state machine that is the abstraction of a parallel bundle, but conceptually behaves as one. What I'm arriving at is the need for the concept of "transitioning" in the abstraction, i.e. the state diagrams cannot be the same. For each lower level state transition:
A_n -> B_n
the higher level bundle needs
HA -> HAB -> HB
where HAB happens before each A_n -> B_n transition, and HB happens after. E.g.
(HA, A, A) -> (HAB, A, A) -> (HAB, B, A) -> (HAB, B, B) -> (HB, B, B)
It might even be better to not make up new states but indicate that
HA = (A,A)
HB = (B,B)
and have several explicit intermediate states:
HAB = (A,B)
HBA = (B,A)
where it doesn't matter which of these two traces happens:
(A,A) -> (A,B) -> (B,B)
(A,A) -> (B,A) -> (B,B)
So what am I actually looking for? A way to name these intermediate states. Conclusion: it seems simplest to use the Cartesian product and derive the higher level states from the lower level states.

Entry: contained vs. association
Date: Tue Dec 26 11:13:34 EST 2017

This is an implementation detail. Works for code and data:
- extend a record vs. create an associated table
- extend a program vs. subscribe to events
Is this the expression problem?
https://en.wikipedia.org/wiki/Expression_problem
- add cases to data
- add functions over data without recompilation.
Solved in Racket through mixins.

Entry: category theory is type theory
Date: Tue Dec 26 15:46:52 EST 2017

https://cs.stackexchange.com/questions/3028/is-category-theory-useful-for-learning-functional-programming
https://cs.stackexchange.com/a/3256

The basic topics you would want to learn are:
- definition of categories, and some examples of categories
- functors, and examples of them
- natural transformations, and examples of them
- definitions of products, coproducts and exponents (function spaces), initial and terminal objects
- adjunctions
- monads, algebras and Kleisli categories

Following http://events.cs.bham.ac.uk/mgs2012/lectures/ReddyNotes.pdf
- elements can be emulated by morphisms: 1->A.
- some categories are not "well-pointed", so in general a category is more general than a set
- not all denotational models for programming languages are well-pointed, even for simply typed LC.

It's really in those small remarks: Recalling that categories are graphs with certain closure properties, we would expect that maps between categories would be first of all maps between graphs.

Entry: merging forks
Date: Thu Dec 28 11:27:11 EST 2017

A lot of parallel processing seems to be about merging modifications to an initial fork point. How to look at this?
https://en.wikipedia.org/wiki/Merge_algorithm
A "pure" merge is a set/relation union, and is in that sense trivial.
In practice, there are impure merges, where nodes (key->value maps) are replaced based on some other order relation. So how does a 3-way merge work in this context?
https://en.wikipedia.org/wiki/Merge_(version_control)
It is a hack. Given a diamond graph:
P     = parent
B1,B2 = branches
M     = merge

B1 != P  = B2  ->  M=B1, and conversely
B1  = B2 != P  ->  M=B1=B2
B1 != B2 != P  ->  conflict

Entry: julia
Date: Sat Dec 30 01:12:32 EST 2017

https://docs.julialang.org/en/stable/manual/parallel-computing/
https://devblogs.nvidia.com/parallelforall/gpu-computing-julia-programming-language/
https://github.com/JuliaGPU/OpenCL.jl

Entry: haskell dsp
Date: Sat Dec 30 01:19:14 EST 2017

https://idontgetoutmuch.wordpress.com/2017/06/02/1090/
https://hackage.haskell.org/package/accelerate

Entry: bloom filters
Date: Sat Jan 6 01:20:05 EST 2018

https://en.wikipedia.org/wiki/Bloom_filter
Negative outcomes are always correct; positives can be false. This works for applications where individual false positives are not a problem, e.g. caching, optimizations.

Entry: Hans meeting
Date: Mon Feb 19 11:11:42 CET 2018

- Markus Voelter, DSL Engineering
- Engineering consulting vs. product development
- You have to be able to explain it

Entry: Foldee?
Date: Thu Feb 22 12:08:20 CET 2018

What do you call the function that is folded? (e,s) -> s ? Candidates:
- update (left fold)
- concat (right fold)
It's not exactly a monoid operator, because the types can be different.

Entry: Embedding Erlang in Haskell
Date: Wed Feb 28 16:38:39 CET 2018

In general, what is missed when doing so is that doing this for the language isn't so much of a problem, but doing it in a way that can "mock" the standard library is a very different problem. This is a lot of work either way:
- rewrite primitives
- write mock functions, possibly introducing errors
This only really makes sense for very low-level code, e.g. uC code without any kind of OS. For anything else, it is probably best to just write the code in Haskell, and create some kind of API. It's probably possible to make an RPC API that is typed on the Haskell side, to be able to test code.

Entry: Loops: MIMO
Date: Sun Mar 4 11:29:01 CET 2018

A pattern that comes back:
- loop over multi in/out chunked buffers
- suspension on in or out
Use this as the base abstraction, then implement it based on the substrate.

Entry: Equality is relative
Date: Mon Apr 9 13:30:24 EDT 2018

Equality is for two things to be indistinguishable relative to an API, i.e. the API is not able to tell them apart by manipulating them in any way to produce different results. Where of course "difference" is defined in terms of some other, more primitive notion of equality.

Entry: Presentation model from #{}
Date: Sun Apr 15 13:34:47 EDT 2018

(gwtest_tom@panda.zoo)13> diff:diff(#{}, #{a => #{b => #{ c => 123}}}).
[{insert,[a],#{b => #{c => 123}}}]

So it will return a tree as a node, which can then be diffed recursively to produce multiple insert commands. I'd like to make this canonical.

Basic idea: I currently do not have the machinery to derive updates from a presentation model from scratch, but it might be possible to make the "insert" canonical, such that only a single mental model needs to be used. Try this with the thermostat.

[{insert,[a],#{b => #{c => 123}}}]

would then become

[{insert,[a]}, {insert,[a,b]}, {insert,[a,b,c], 123}].

This requires the notion of an empty container. Note that 'insert' is a full construction (all bells and whistles), while 'update' might be a small mutation inside that node. So there is still some duplication.
What I really want is: [{insert,[a]}, {insert,[a,b]}, {insert,[a,b,c],_} {update,[a,b,c],_,123}] I.e. separate the creation of the hole, and the filling of the hole. It could even use the '_' atom to indicate this. This is best rephrased in the language of (nested) environments, variables and bindings: (exo@10.1.3.2)13> [diff:split(C) || C <- diff:diff(#{},#{z => #{a => #{b0 => 0, b1 => 1}}})]. [[{env,[z]}, {env,[z,a]}, {var,[z,a,b0]}, {bind,[z,a,b0],0}, {var,[z,a,b1]}, {bind,[z,a,b1],1}]] As a side effect, this also cleans up cluttered nested layout definitions. And it would be easy to implement, because all the commands will be generated from an empty diff. As a slight variation, if the structure is known, the creation of the environment can already create all the structure for the variables such that {var,_} can be ignored. Same for nested env. Summary: 3 concepts are important: - nesting, which maps to layout structuring - leaf nodes that contain a value - values EDIT: Tried this. Not 100% perfect, but it's a start. Entry: Transactions, diffs, editable views, lenses Date: Sat May 5 07:15:45 EDT 2018 Maybe time to revisit Kmett's tutorial again? Although I do have the feeling the hyper-abstraction done there is overkill. What I need at this point, is a practical way to just edit the damn database tables. So what _is_ an editable view? And how to make them less ad-hoc? An edit is a state to state map: s->s. An editable view v is something that can take a user event e, and map it to a model edit. v = e -> (s->s). There is no way around specifying the enumeration of edits. However, once a primitive set of edits is completed, they can likely be composed and their compositions transacted. Entry: Tree to DB Date: Thu May 31 18:01:10 EDT 2018 1. Iterate, creating (path,val) pairs 2. For any substructure where path has the same shape, map it to some coordinate vecor 3. Insert coordinate ++ val into database Et voila. Finite functions. Entry: Pattern matching netlists Date: Wed Jun 27 08:16:19 EDT 2018 Problem: given a flattened circuit netlist. Replace a given sub network structure by another. On schematics this is easy: draw a circle around the subcircuit. Cut it out, and replace it with an n-port. How to automate the circle drawing part? What is a subcircuit? An algorithm for finding its boundary nets and internal components. An example: opto coupler network: - remove 4-port IC U231 (type HMHA2801) - remove 2-port resistor (U231:4 , 3.3V) - remove 2-port diode (U231:1, U231:2) - remove 2-port resistor (U231:1, U231:2) - remove 2-port resistor (U231:1, input) Now this is a heuristic. If there are any other components connected to those nets that look the same, the heuristic will not be able to choose. Unlikely that this will happen but this is not an automated algorithm. It could be used to generate a representation that needs human intervention. I see no other way than to explicitly name the components and the nets. Instead of replacing networks, a simpler approach might be to model individual components directly. Then, by manually assigning semantics such that the overall semantics is the desired one, it would be possible to simulate. In particular, the HMHA2801 optocoupler network is current driven, while the simulation I'm interested in is only logic level. This means that some resistors will need to be modeled as shorts, some as open circuits and the optocoupler itself as a logic gate (inverter in this case). That should work, leaving the problem of directionality. 
Either model a component as a relation, or as a function, explicitly identifying inputs and outputs. The latter might be more appropriate, but it does not allow for bi-directional signals such as I2C. How to cope? An example is the I2C bridge TCA9509. The model needs to be a relation. How to express that a relation has a funcional dependency? The trick is likely to represent it as a set of functions, one for each I/O configuration. However it might be possible to side-step the problem by introducing "tests" on the circuit as a whole. E.g. assert both ends with random inputs and see whether it is part of the relation or not. This is maybe what logic validation is: assign a random value to all nodes of interests, and compute whether it is part of the circuit or not. E.g. a resistor and diode can be modeled as open, short: short: \(p1,p2) -> p1 == p2 open: \(p1,p2) -> True And the opto coupler is: short_opto \(p1,_,_,p4) -> p1 == p4 But it misses the constraint that p1 is an input. How to model the idea that: - if p1 driven -> p4 driven - if p1 not driven -> p4 undefined Without encoding it as a function? This requires multi-valued logic: Just $ Just True Just $ Just False Just $ Nothing (driven but invalid,undefined) Nothing (not driven) By taking a circuit and assigning some inputs (Just $ Just x), all relations can be evaluated and either lead to a contradiction, or a solution. The solution can then still have Nothing or Just Nothing. In this representation, short: [(a,a) | a <- [Just (Just True), Just (Just False), Just Nothing, Nothing]] open: [] But instead of reinventing, it might be better to map everything to a logic program encoded in Haskell. The example here uses the List Monad to filter out a solution based on exhaustive search and the 'guard' operation from Control.Monad https://wiki.haskell.org/Logic_programming_example I'm all for simple solutions. How to start this out? Another question: how about using a logic program to turn the circuit into a function? Or a collection of functions parameterized on directionality of some signals? EDIT: After a walk, I know what I want. I want functions. Not all subcircuits are functions, so they can be parameterized somehow, giving a family of functions. Representation can be untyped, just an endofunction on netlist partitions. Relations are too cumbersome to work with. How to turn a netlist into a function? The necessary information at each node is to identify the driver node. There can be only one driver node. All nodes that do not have a driver will be circuit inputs. EDIT: Lost train of thought after getting interrupted again. EDIT: Other problem solved, brain free again. So how to tackle this. The thing to do with a function is to evaluate it. That already determines which nets are driven. From the nets that are driven, a depth-first evaluation can be propagated using a decision procedure. To drive a pin with a value: - Find net associated with pin - Find all other components in the net - For each component that has all its inputs driven, propagate The entire circuit will then be modeled as a function from input connector pins to output connector pins, parameterized by some directionality configuration. There will only be 3 kinds of nets: - inputs (provided) - outputs (provided) - internal nodes (rest) I've been here before, but the structure is somehow different. Make a generic evaluator? EDIT: Once representation is there, algorithm seems straightforward. I do wonder what the core problem is. Directionalize a graph? 
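A rough sketch of the propagation just described (Haskell, with made-up types -- not the actual implementation): drive a pin, copy the value onto its whole net, then fire every component whose inputs have just become fully available, recursing on the outputs it drives. It assumes an acyclic (combinational) circuit.

import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
import Data.Set (Set)

type Pin = (String, String)             -- (component name, pin name)
type Net = Set Pin                      -- one element of the netlist partition

data Comp = Comp
  { inputs :: [Pin]                     -- pins that must be driven before firing
  , fire   :: [Bool] -> [(Pin, Bool)]   -- input values -> driven output pins
  }

-- The net a pin belongs to (falling back to a singleton net).
netOf :: [Net] -> Pin -> Net
netOf nets p = head ([n | n <- nets, Set.member p n] ++ [Set.singleton p])

-- Drive one pin: propagate the value over its net, then fire any
-- component that has just become fully driven.  Assumes no loops.
drive :: [Net] -> [Comp] -> Map Pin Bool -> (Pin, Bool) -> Map Pin Bool
drive nets comps vals (p, v) = foldl (drive nets comps) vals' outs where
  vals' = foldr (\q m -> Map.insert q v m) vals (Set.toList (netOf nets p))
  newlyReady c = all (`Map.member` vals') (inputs c)
              && not (all (`Map.member` vals) (inputs c))
  outs  = concat [ fire c (map (vals' Map.!) (inputs c))
                 | c <- comps, newlyReady c ]

-- Evaluating a subcircuit: fold 'drive' over the externally provided
-- input pins; the resulting map contains every net value reached.

The "input wait list" mentioned in the EDIT below is essentially an index that avoids rescanning all components for the newlyReady check.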
EDIT: Took a long refactoring stretch to finally express it in a simple way. It all boils down to data structures, again. EDIT: Indeed, been there before. The key data structure is the input wait list, updated in response to pins being asserted, in turn propagating changes. The rest is to make it easy to translate net values to input port values. Entry: Representing I/O Date: Wed Jun 27 13:59:29 EDT 2018 Abstract it as a tristate triplet? Entry: MyHDL Date: Wed Jun 27 17:10:08 EDT 2018 So instead of going through building a network evaluator, what about generating MyHDL code and have it do the lifting? Maybe not such a problem since the evaluator isn't very difficult. Entry: Haskell graphs Date: Wed Jun 27 17:39:25 EDT 2018 https://hackage.haskell.org/package/algebraic-graphs Entry: A theorem in a context Date: Thu Jun 28 10:56:17 EDT 2018 Something mentioned in Robert Harper's Homotopy Type Theory (HoTT) lectures is the idea that theorems can be local. I do not understand why this would not be possible in classical logic, but it is clear that "locality" is something that is desirable in programming. Local theorems show up in contexts all the time in programming, as functions defined in some lexical context. Entry: Circuit netlists: structural and semantic operations Date: Sat Jun 30 12:05:43 EDT 2018 See also haskell.txt The main ideas: - A netlist is a partition (a set of disjoint sets). This realization provides some guidance about how to represent netlist operations as set and partition operations. See Partition.hs - A short is a relation and I/O behavior is a function. These are fundamentally different concepts. - Shorts are netlists. Implementing shorts is computing the union of two partitions, based on a "coalesce" operation. See Partition.hs - The set of named nets is the quotient set of the netlist partition. The key insight here is that a net name is just another element of one of the disjoint sets, where "component" is the PCB, and "pin" is the net name. - Evaluating functions on a netlist is best done wrt. the quotient set. - If there is a strict order on pins, this can be used to pick a representative, e.g. as the minimum of a partition element. - Evaluating a net can be done by introducing semantics to each component, represented by the set of inputs, and the function that computes outputs from inputs. Evaluation is then depth-first recursion for all components that have all inputs available. - A practical consequence is that it is possible to evaluate only a subset of a netlist, making it clear what is the fanout. Entry: Idris Date: Sat Jun 30 22:05:26 EDT 2018 It might be time for dependent types. After reading a bit (Agda, Idris or Coq?), I'm inclined to go for Idris for its focus on being a programming language over a proof assistant. http://docs.idris-lang.org/en/latest/tutorial/typesfuns.html Entry: Monoids Date: Sun Jul 1 23:45:34 EDT 2018 It clicks.. https://bartoszmilewski.com/2014/12/05/categories-great-and-small/ Monoid elements are morphisms of a 1-object category, with monoid operator mapping to composition. "... we can always recover a set monoid from a category monoid. For all intents and purposes they are one and the same." "A lot of interesting phenomena in category theory have their root in the fact that elements of a hom-set can be seen both as morphisms, which follow the rules of composition, and as points in a set." 
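A concrete way to see the morphism/point duality in plain Haskell: every monoid element can be read as a morphism of the one-object category (an endofunction on the carrier), composition of those morphisms is mappend, and the element is recovered by applying the morphism to mempty.

import Data.Monoid (Sum(..))

-- Element as morphism: x becomes the arrow (x <>).
toMorphism :: Monoid m => m -> (m -> m)
toMorphism = mappend

-- Morphism back to element ("point"): apply it to the unit.
fromMorphism :: Monoid m => (m -> m) -> m
fromMorphism f = f mempty

-- By associativity and the unit law:
--   toMorphism a . toMorphism b == toMorphism (a <> b)
--   fromMorphism (toMorphism a) == a
--
-- e.g. fromMorphism (toMorphism (Sum 2) . toMorphism (Sum 3)) == Sum (5 :: Int)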
Entry: Incremental Relational Lenses
Date: Thu Jul 12 10:07:06 EDT 2018

https://arxiv.org/abs/1807.01948
Extends the original work on Relational Lenses by allowing small changes to cause possibly small changes.

Entry: optimizer generators
Date: Sun Aug 5 22:25:33 EDT 2018

1. run a "compiler compiler" for a long time, possibly incrementally improving the "compiler".
2. snapshot the process in 1. to "release" the current compiler.

Entry: Effects
Date: Thu Aug 23 13:17:17 EDT 2018

https://arxiv.org/abs/1710.10385
"We show that, in any language with exceptions and state, you can implement any monadic effect in direct style."

Entry: Glue code isn't "dirty"
Date: Tue Sep 4 11:14:11 EDT 2018

It's actually quite an involved process of defining semantics through protocol translation. What is "dirty" is the hoop-jumping that is often involved to find a place where both ends can meet. ( Cognitive reframing :)

Entry: transforming blocking code into state machines: Rust Async transform
Date: Sun Dec 9 16:07:22 EST 2018

https://news.ycombinator.com/item?id=18641796
https://blag.nemo157.com/2018/12/09/inside-rusts-async-transform.html

Entry: difference structures
Date: Sat Feb 9 11:46:21 EST 2019

There are a lot of places where difference structures are useful:
- incremental build systems
- user interfaces
However, it seems almost impossible to create these without proper language support. Especially for build systems, which are usually quite ad-hoc, it is almost impossible to fit them with a good incremental system.

EDIT: This is very similar to multi-pass structures, where a first pass can record some index information that makes subsequent passes easier to perform. The index information doesn't contain anything new, i.e. it is a cache.

Entry: Processor vs. generated state machines
Date: Wed Feb 20 07:31:11 EST 2019

The reason I built a processor is to be able to write nested sequential programs as opposed to flat state machines. But I already have a structure that does just that (sm.h), so maybe figure out a way to turn a monad into a way to nest state machines?

Everything for which you can imagine an interpreter (e.g. the CPU in this case) can be turned into a compiler. What I want is a Forth language that is then statically analyzed and compiled into a state machine. It doesn't have to be Forth, just something that can capture the finite nesting structure of a call sequence and reify it into state machines. Maybe do it in two steps: in C first, then try to map it onto Seq. So the basic idea is to generate sm.h code.

EDIT: The main bit is to split a function into suspend points. Let's try to solve that first. This is mostly CPS transformation. Partially, because any calls that do not block can be kept as ordinary calls. That part is easy. The hard part is to represent the continuation as nested C structs. So an explicit representation of a stack would probably be a good idea. E.g. ANF.

EDIT: Another important part is to be able to trade off between doing things in multiple steps, and all at once.

Summary: why use a CPU vs. a state machine?
- sequential programming
- function decomposition

Entry: glue code
Date: Thu Feb 28 14:10:27 EST 2019

If glue is just mapping between things, this usually means there are not a whole lot of cases to be handled. Once the data types are mapped, that's that. Code like that doesn't need to change a whole lot, so it can be dynamically typed.
Reasons for dynamic typing: - fast compilation and late binding allows for instantaneous deployment - if there are protocols, there is going to be a dynamic matching step anyway To make dynamic types work, the tooling to reduce feedback is absolutely essential. Once it gets complex, or when computation and algorithms are necessary, it's probably time to switch to static types to structure things in a way the compiler can understand and verify. Entry: representing algebraic data types as folds Date: Wed Mar 13 12:12:33 EDT 2019 It is the formal version of what I've been trying to do at a low level as well, to avoid the need for actually constructing data. https://en.wikipedia.org/wiki/Mogensen%E2%80%93Scott_encoding https://stackoverflow.com/questions/16426463/what-constitutes-a-fold-for-types-other-than-list Inspired by some posts by Brian McKenna @puffnfresh "I don't wish for pattern matching when writing Java, Kotlin, TypeScript, etc. I just represent data structures using their folds. Scott-encoding. Super useful!" So the idea is straightforward. However for "protocol oriented programming", these are not functions that return results, but functions that perform a side effect. Maybe construction can be thought of that anyway: perform a side effect in a store, and return a pointer to the constructed data. Entry: Fold with cons into queue Date: Mon Mar 25 10:32:19 EDT 2019 EDIT: This takes a detour, leading up to the core idea There is no way I can think about this that makes the approach actually useful, except what I already have, which is to conceptually thranslate the nested form into a path list, and then applying a fold. No matter how I turn this: the easiest approach is going to be to present the data in the way that the client will actually want it. Which probably means nested C structs, where pointers can be easily translated. Practically: - receiver has a flat queue and can receive flat messages. - flat messages have a default encoding for nested structures To implement this without copying: each "flat" message should be transformed into a nested message at the point it enters the queue. If the granularity of the constructors is small, the queue implementation will be efficient. The client only ever sees nested data structures. So what is the difficulty here? Representing pointers. Both sender and receiver need to know where they are located. So it seems best to avoid representing pointers directly, but represent a structure that can be reCONStructured. So implement a fold anyway, but the fold will actually perform allocation in a queue! That's it. The key insight is to "bring your own constructors". So indeed, folds. When combined with a queue, the alloc/free problem is solved: alloc can do one constructor at a time, filling data structures with raw machine pointers, and dealloc skips the entire message. Then, to simplify, use only a handful of types. Or generate the ser/deser code? Summary: fold with cons into queue. Remark: What about overflow? This needs to be handled using backpressure. I.e. assume that it is ok to just return an error code if a message doesn't fit, so it can be handled at the sender side (which will be a more complex system : this is for leaf nodes ). Entry: What are services? Date: Mon Mar 25 12:20:44 EDT 2019 Many things, but practically they are objects that - grant restricted mutually exclusive access to state, and - perform actions parameterized by this state Entry: Is all state just cache? 
Entry: What are services?
Date: Mon Mar 25 12:20:44 EDT 2019

Many things, but practically they are objects that
- grant restricted, mutually exclusive access to state, and
- perform actions parameterized by this state

Entry: Is all state just cache?
Date: Mon Mar 25 14:38:04 EDT 2019

No, but when thinking about it, a lot of what we treat as unique state is
actually cache, i.e. a compiled form of some other state.  A good example is
a running OS, "compiled" from the state on disk.

Entry: Metamorphic testing
Date: Tue Mar 26 17:31:12 EDT 2019

Generate tests by picking a random input, and for perturbations of the
input, test how the different outputs relate based on the known
perturbations (the Metamorphic Relation).

https://www.hillelwayne.com/post/metamorphic-testing/
https://lobste.rs/s/lp14cm/metamorphic_testing

I think a mathematician would state the underlying observation as:

(A) If we know the output of a function f is invariant under the action of a
    group G, we can generate tests for f by selecting a single element x and
    comparing the values of f on the orbit elements g.x for g in G.

(B) We can further simplify our job by applying some hashing function to
    each f(g.x) to avoid the construction and caching of expensive elements
    (in the example in the article, the hashing function is the audio
    transcription).

It's a very nice observation.  I'm not shocked that the practice isn't more
well known, because coming up with properties of functions or systems that
are invariant under some large, easily generated group is hard.

Entry: Statecharts
Date: Wed Mar 27 09:32:38 EDT 2019

Harel Statecharts (HSC) solve the problem of representing "common"
transitions through composition.

- A hierarchy of states is defined (parent->child).
- The parent states are not actual states; they are abstract and point
  (recursively) to a default actual child.
- Child states inherit all parent transitions.
- A transition can point to any state.

See also Decision Tables below.

https://www.hillelwayne.com/post/formally-specifying-uis/
http://gameprogrammingpatterns.com/state.html
https://statecharts.github.io/
http://www.inf.ed.ac.uk/teaching/courses/seoc/2005_2006/resources/statecharts.pdf

Entry: State machines, groups and geometry
Date: Wed Mar 27 10:19:36 EDT 2019

Geometry is intuitive: we have a way to build a mental model of something
that has a configuration space.  Is there a way to represent a state machine
as a group action?  It doesn't seem so, because not all elements compose: in
a given configuration only a limited set of transitions is available.

Entry: Decision Tables
Date: Wed Mar 27 10:36:21 EDT 2019

I'm not quite sure what the big deal is here, apart from being similar to
flattening down nested pattern matches.

https://www.hillelwayne.com/post/decision-tables/

Entry: Lock free programming
Date: Mon Apr 29 16:28:09 EDT 2019

https://preshing.com/20120612/an-introduction-to-lock-free-programming/

Entry: Path indexing vs. nested dictionaries
Date: Fri May 10 07:53:52 EDT 2019

Do you nest dictionaries, or compose keys?  This is really just currying:

  a -> b -> c   vs   (a, b) -> c
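A quick Haskell sketch with Data.Map to make the analogy concrete (Flat,
Nested, nest and flatten are made-up names):

  import qualified Data.Map as M

  -- Composed (path) keys:   (a, b) -> c
  type Flat a b c   = M.Map (a, b) c
  -- Nested dictionaries:    a -> (b -> c)
  type Nested a b c = M.Map a (M.Map b c)

  -- Converting between the two is the dictionary version of curry/uncurry.
  nest :: (Ord a, Ord b) => Flat a b c -> Nested a b c
  nest = M.foldrWithKey
           (\(a, b) c -> M.insertWith M.union a (M.singleton b c))
           M.empty

  flatten :: (Ord a, Ord b) => Nested a b c -> Flat a b c
  flatten = M.foldrWithKey
              (\a inner acc -> M.union acc (M.mapKeys (\b -> (a, b)) inner))
              M.empty

The practical difference shows up in the operations: the nested form makes
"everything under prefix a" a single lookup, while the flat form keeps the
key space one-dimensional.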
Entry: Folds for mutually recursive types
Date: Wed May 15 09:58:09 EDT 2019

Creating a fold for a type is straightforward: replace all the constructors
with functions mapping to the result type.  But what if the type is mutually
recursive with another (recursive) type?  Something that happens often is
trees with a list of nodes.

  data Tree t = Leaf t | Node [Tree t]

What is the canonical way to write this?  I would think that inlining would
work, but that only works with finite types.  E.g.

  data Tree t = Leaf t | NodeNil | NodeCons (Tree t) (Tree t)

could represent the same information, but it doesn't have the same
structure.

It seems this is representative of a long-lasting confusion of mine: many
data structures take this mutually recursive form, and that means there is
always a mutual recursion between a function processing a single element,
and a function processing a list of elements.

So how do you write a fold in the mutually recursive form?  It's quite
straightforward.  The trick is to see that there are 2 "accumulator types".
Below, a is the return type accumulator, and a' is an intermediate
accumulator used in the list foldr.

  data Tree t = Leaf t | Node [Tree t]

  foldTree leaf node cons nil = tree where
    tree (Leaf t)  = leaf t
    tree (Node ts) = node $ foldr cons nil ts

  foldTree :: (t -> a)              -- leaf
           -> (a' -> a)             -- node
           -> (Tree t -> a' -> a')  -- cons
           -> a'                    -- nil
           -> (Tree t -> a)

Now, to reflect the composition of the data types, the fold could also be
parameterized.  E.g. instead of passing in 'cons' and 'nil', the list foldr
can be abstracted away:

  foldTree foldr' leaf node = tree where
    tree (Leaf t)  = leaf t
    tree (Node ts) = node $ foldr' ts

  foldTree :: ([Tree t] -> a')  -- foldr'
           -> (t -> a)          -- leaf
           -> (a' -> a)         -- node
           -> (Tree t -> a)

This is the flip side of thinking of the data type as:

  data Tree a t = Leaf t | Node (a t)

where the container type 'a' has its own associated fold.

Summary: a parameterized container type is associated to a parameterized
fold.  Types and folds are essentially the same thing.

( Note that to make a full fold, this needs to be done mutually recursively,
with a foldr that recurses into foldTree again. )

Entry: Futamura projection
Date: Fri Jun 14 14:27:54 CEST 2019

So is it safe to say that the Futamura projection in a pure functional
setting is rather trivial?  Keep performing reductions until some normal
form is reached.

https://www.cs.purdue.edu/homes/rompf/papers/wei-preprint201811.pdf

The introduction says that partial evaluation is in general hard because of
binding time analysis, and that in practice manual annotations are used
(multi-stage programming).

Entry: Hierarchical state machines
Date: Thu Aug 15 18:57:00 EDT 2019

Machines whose states can be other state machines.

https://link.springer.com/content/pdf/10.1007/3-540-44929-9_24.pdf

Entry: Reinventing lenses
Date: Sun Nov 17 11:56:47 EST 2019

Let's try to do this from first principles, covering the very practical case
that I have in mind: mapping a database to a text representation of that
database.

There are two data representations A and B.  These are isomorphic, in that
there are information-preserving maps A->B and B->A.  In addition, there are
information-preserving maps that relate edits: e.g. dA -> dB and dB -> dA.

Note that the maps between the edits do not need to be isomorphisms, but
their effect does: e.g. if applying dA takes A to A' and applying the
corresponding dB takes B to B', then A' and B' are again related by the maps
above.

The reason to look at this differentially is that the data itself might be
large, and applying an edit might be much simpler.  E.g. it is then
sufficient to download a local copy just once, and from that point on just
operate on edits.

I am interested mostly in these lenses:
- operations on views that map to operations on tables (editable views)
- operations on s-expressions in an editor, that map to operations on db
  tables/views.
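Here is a rough Haskell sketch of that structure, just to pin down the types
(DLens and its field names are made up; this is not an existing lens
library's API):

  -- Full maps plus edit translation, for the (assumed) isomorphic case.
  data DLens a da b db = DLens
    { get     :: a -> b    -- A -> B
    , put     :: b -> a    -- B -> A
    , getEdit :: da -> db  -- translate an edit on A into an edit on B
    , putEdit :: db -> da  -- and back
    }

  -- The coherence requirement from above: translating an edit and applying
  -- it on the B side agrees with applying it on the A side and re-mapping.
  coherent :: Eq b
           => DLens a da b db
           -> (da -> a -> a)   -- apply an A edit
           -> (db -> b -> b)   -- apply a B edit
           -> da -> a -> Bool
  coherent l applyA applyB e a =
    applyB (getEdit l e) (get l a) == get l (applyA e a)

The database/text case would instantiate a to the set of tables, b to their
text rendering, and the edit types to row updates and text patches
respectively.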
Entry: OpenComRTOS
Date: Sun Dec 22 14:53:34 CET 2019

Reading this one: isbn://1441997350
https://www.amazon.com/Formal-Development-Network-Centric-RTOS-Engineering/dp/1441997350

In my own work I've been more inclined towards dumb schedulers and
non-preemptive state machines.  However, this might give some new insights,
as it makes the case that pre-emption is really necessary for some more
complex applications.

Entry: Erlang, CSP
Date: Mon Dec 23 01:32:11 CET 2019

https://www.youtube.com/watch?v=3gXWA6WEvOM

- CSP is synchronous.
- Interact with channels, not processes.

So someone explain to me: why does the CSP I see described only talk about
composition of events, and not about sending and receiving?  Are these two
levels of the theory?

https://en.wikipedia.org/wiki/Communicating_sequential_processes

Ok, the Wikipedia page explains: earlier versions looked more like a
programming language with send/receive.  Later versions were formulated as a
process algebra.  I still don't quite see how they relate, other than that
input and output are events.

See chapter 4 in the CSP book.  In short: a send is a very specific event,
and a receive is a more general specification of a set of events.  I.e.
saying that a (generic) message on a (specific) channel is part of the
alphabet of a process means that it can receive messages on that channel.
Communication is a synchronization: the send event and the receive event are
the same event.  The avoidance of causality is one of the key
simplifications of the algebra.

EDIT: Continuing to read the CSP book.  After introducing some of the
notation and laws, the conclusion is made in 1.4 that a process can be
represented as a function that accepts zero, one or more (choice) inputs and
produces another process.

EDIT: I don't think I quite understood the point of the algebra and how it
relates to the language aspect on my last read.  It makes more sense now.

Entry: Revisit: Twitter thread on "derivatives"
Date: Mon Dec 23 17:42:53 CET 2019

See 20170908.

https://www.informatik.uni-marburg.de/~pgiarrusso/papers/pldi14-ilc-author-final.pdf

Maybe time to look at Phil's "purescript incremental" library?

https://github.com/paf31/purescript-incremental-functions/issues/8
https://github.com/paf31/purescript-incremental-functions/blob/master/src/Data/Incremental.purs

Entry: Eventually consistent
Date: Wed Jan 1 19:53:43 CET 2020

So with Raft I did run into a consistency vs. availability problem.  Raft
aims for consistency.  What I need for a current application is partition
tolerance and availability.  So what is the essence of "merging" when
partitions join again?

Entry: Theory of events
Date: Thu Jan 2 20:32:15 CET 2020

I need a theory of events as state differences.  I.e. each event is a delta
in a certain direction.  It seems that delta-to-delta mapping is really what
occurs in practice: some change in one model leads to a change in a related
but different model.

Entry: Incremental lambda calculus
Date: Thu Jan 2 22:05:47 CET 2020

See previous entry.  The rules in the paper are:

- free:   D(x)     = dx
- abs:    D(\x.t)  = \x.\dx.D(t)
- app:    D(s t)   = D(s) t D(t)
- closed: D(c)     = []

Also ran into this:
https://bentnib.org/posts/2015-04-23-incremental-lambda-calculus-and-parametricity.html

Maybe it's good to do a little survey of what happened in this field?

EDIT: The comment in that post is that (-) is only used to define the zero
change.

Let's give it a try.  Define a change structure for a tree.  Changes are
expressed as sets of add, remove, update.  I already have a difference
operation.  Differences are not unique.  What is then missing to derive a
tree-processing Erlang function?
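As a warm-up, a change structure sketch in Haskell for a finite map instead
of a tree (Change, apply and dTotal are made-up names, and the derivative is
hand-written, not the output of the ILC transform):

  import qualified Data.Map as M

  data Change k v = Add k v | Remove k | Update k v

  apply :: Ord k => M.Map k v -> Change k v -> M.Map k v
  apply m (Add k v)    = M.insert k v m
  apply m (Remove k)   = M.delete k m
  apply m (Update k v) = M.insert k v m

  -- A function on the base structure...
  total :: M.Map k Int -> Int
  total = sum . M.elems

  -- ...and its derivative: it maps a change on the input to a change on the
  -- output (here just an Int delta), given the old input.
  dTotal :: Ord k => M.Map k Int -> Change k Int -> Int
  dTotal m (Add k v)    = v - M.findWithDefault 0 k m
  dTotal m (Remove k)   = negate (M.findWithDefault 0 k m)
  dTotal m (Update k v) = v - M.findWithDefault 0 k m

  -- Invariant: total (apply m c) == total m + dTotal m c

The tree version has the same shape, with Change enumerating add/remove/
update edits on paths, and the Erlang function would follow mechanically
from the per-constructor cases.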
Entry: Wingo on CML
Date: Sun Jan 5 19:25:47 CET 2020

https://www.youtube.com/watch?v=7IcI6sl5oBc&t=305s

CML is the way to go.  Can I just do channels?  No: CML.

What does he mean here with the difference between channels and CSP?  The
rendez-vous property is very important: async is very different.

Not a really good talker though.  I didn't gather a whole lot apart from:
- for rendezvous, the second wakes up the first.
- abstraction over channels?
- design with CSP, implement with CML

https://en.wikipedia.org/wiki/Concurrent_ML

Following up
https://wingolog.org/archives/2017/06/29/a-new-concurrent-ml

  "You'd think this is a fine detail, but meeting-place channels are
   strictly more expressive than buffered channels."

EDIT: The example of using a non-buffered channel for RPC is a good one: it
is not well-defined for buffered channels.

https://wingolog.org/archives/2016/09/21/is-go-an-acceptable-cml
https://wingolog.org/archives/2016/10/12/an-incomplete-history-of-language-facilities-for-concurrency

Some interesting remarks:

- Callbacks (manual inversion) have a lot of mental overhead.  This work
  should be done by the compiler.
- Promises (async/await) lift some of the burden, but this style "infects"
  the entire program.
- Kernel threads don't have good information to know what to schedule next.
  Applications might have more information.
- On when to buffer: not buffering enables "select".  (how?)
- Select is ok, but not compositional.  Typical example: post-process
  channel output.  This is what CML events enable: use events to create new
  events.

Entry: Sperber on CML
Date: Sun Jan 5 21:06:12 CET 2020

Mentioned in the Wingo talk.  CML compared to the actor model: actors don't
compose.  As a functional programmer you want composition.

https://www.youtube.com/watch?v=pf4VbP5q3P0

The idea of a rendez-vous as a value seems to be important.  Then operations
on rvs can be used to create new rvs, and 'select' is the thing that maps
rvs to values.  In Racket these are called 'events'.  Also on the Wikipedia
page.

https://en.wikipedia.org/wiki/Concurrent_ML

That's what Wingo means with a lambda instead of a value.

Entry: Having an Effect by Oleg Kiselyov
Date: Fri Jan 10 18:30:43 CET 2020

https://www.youtube.com/watch?v=GhERMBT7u4w&feature=youtu.be&t=1680
http://okmij.org/ftp/Computation/having-effect.html

About Bluespec: "something slightly better than Verilog".

Denotational semantics: a compositional mapping from expressions to some
domain, i.e. "eval".  Compositional means that there is no dependence on the
structure of subexpressions, only on their meaning.

Effects as interactions.  In process calculi monads are totally natural.
They were invented there but nobody paid attention to it.  Monads are just
as interesting and important as parentheses.  Monads arise naturally in the
interaction view, by factoring out effect propagation.

So the thing here is to define a single monad, and change the effect
handler.

Entry: Freer monads and extensible effects
Date: Fri Jan 10 20:40:29 CET 2020

http://okmij.org/ftp/Haskell/extensible/index.html
https://legacy.cs.indiana.edu/~sabry/papers/exteff.pdf
https://mail.haskell.org/pipermail/haskell-cafe/2018-September/129992.html
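To keep the idea at hand, a minimal single-effect freer monad sketch (no
open union of effects as in the real extensible-effects library; State,
get', put', runState and tick are illustration names):

  {-# LANGUAGE GADTs #-}

  -- The effect request is stored unreflected (f x) with its continuation.
  data Freer f a where
    Pure   :: a -> Freer f a
    Impure :: f x -> (x -> Freer f a) -> Freer f a

  instance Functor (Freer f) where
    fmap f (Pure a)      = Pure (f a)
    fmap f (Impure fx k) = Impure fx (fmap f . k)

  instance Applicative (Freer f) where
    pure = Pure
    Pure f      <*> m = fmap f m
    Impure fx k <*> m = Impure fx (\x -> k x <*> m)

  instance Monad (Freer f) where
    Pure a      >>= f = f a
    Impure fx k >>= f = Impure fx (\x -> k x >>= f)

  -- One effect signature: a single Int cell.
  data State x where
    Get :: State Int
    Put :: Int -> State ()

  get' :: Freer State Int
  get' = Impure Get Pure

  put' :: Int -> Freer State ()
  put' n = Impure (Put n) Pure

  -- The handler gives the effect its meaning, separately from the program.
  runState :: Freer State a -> Int -> (a, Int)
  runState (Pure a)            s = (a, s)
  runState (Impure Get k)      s = runState (k s) s
  runState (Impure (Put s') k) _ = runState (k ()) s'

  tick :: Freer State Int
  tick = do n <- get'; put' (n + 1); return n
  -- runState tick 41 == (41, 42)

Swapping the handler (e.g. logging every Put) changes the effect's semantics
without touching tick, which is the "define a single monad, change the
effect handler" point from the previous entry.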
Entry: cbuf.h
Date: Sat Jan 25 15:14:56 EST 2020

A review.  Basic question: is it safe to use this from a pre-empting ISR?  I
think it is, but can I prove it?

I'm not really sure how to make this formal, but I know that these
principles are important:

- read and write pointers are non-decreasing
- reads and writes to the pointers are atomic
- each pointer is written by only one thread
- the write pointer is written after the data is written
- the read pointer is written after the data is read

This gives the guarantee that when the read and write pointers are read,
there is no way to get an invalid range.  I.e. there can never be less room
actually available than "room" indicates for the writer, nor fewer bytes
actually available than "bytes" indicates for the reader.

The only things that can go "wrong" are that:

- a reader reads an old value of the write pointer and thus underestimates
  "bytes".
- a writer reads an old value of the read pointer and thus underestimates
  "room".

It is however not portable.  It relies on reads and writes being atomic,
which is the case on ARM CM3 for a 32-bit aligned 32-bit access.

( Still, I don't find any reference to an implementation like this.  Is
there an error somewhere? )

Entry: cache invalidation
Date: Wed Feb 19 10:31:12 EST 2020

http://zwizwa.be/-/compsci/20170711-115938

  definition control-dominates use

Can this principle be used to ensure caches are coherent?  YES!

Entry: Chunking coroutines
Date: Mon Apr 6 15:46:47 EDT 2020

( not sure where to put this )

Context: I'm writing a blocking-call API on top of libuv and coroutines.  It
is based around a read() call that takes an exact number of bytes from a
stream.  That is the API the caller inside a task can see.

Now the question is: where should the buffer + chunking code go?  At the
push end, or at the pull end?  It seems easiest to do this at the read end.

Entry: Staging a flat description
Date: Sun Jul 12 11:18:39 EDT 2020

Start with a flat table.  Make sure it makes sense that way.  Then gradually
introduce structure and "staging", encoding bits either as language elements
(compile time) or as data.

Entry: Futamura projections
Date: Mon Aug 24 11:58:33 EDT 2020

I would like to understand why this is not feasible in practice, and, if
possible, how to simplify language semantics to make it feasible.
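A toy Haskell illustration of why the first projection looks trivial in a
functional setting, and where the real difficulty hides (Expr, interp and
compile are made-up names):

  data Expr = Lit Int | Add Expr Expr | Var   -- Var is the single input

  interp :: Expr -> Int -> Int
  interp (Lit n)   _ = n
  interp (Add a b) x = interp a x + interp b x
  interp Var       x = x

  -- "Specializing the interpreter to a program" is just partial application:
  compile :: Expr -> (Int -> Int)
  compile prog = interp prog
  -- compile (Add Var (Lit 1)) behaves like (+ 1), but it still walks the
  -- Expr tree on every call.

The hard part is not producing this closure but residualizing it into
first-order code with the interpretive overhead reduced away, which is where
binding-time analysis and staging annotations come in.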